This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TSG.2016.2628061, IEEE Transactions on Smart Grid
Pursuant to the DOE Public Access Plan, this document represents the authors' peer-reviewed, accepted manuscript. The published version of the article is available from the relevant publisher.
system load forecasting, which aims to provide highly accurate short-term load forecasting with high resolution utilizing a support vector regression (SVR) based forecaster and a two-step hybrid parameters optimization method. Specifically, because the load profiles in distribution systems contain abrupt deviations, a data normalization is designed as the pretreatment for the collected historical load data. Then an SVR model is trained by the load data to forecast the future load. For better performance of SVR, a two-step hybrid optimization algorithm is proposed to determine the best parameters. In the first step of the hybrid optimization algorithm, a designed grid traverse algorithm (GTA) is used to narrow the parameters searching area from a global to local space. In the second step, based on the result of the GTA, particle swarm optimization (PSO) is used to determine the best parameters in the local parameter space. After the best parameters are determined, the SVR model is used to forecast the short-term load deviation in the distribution system. The performance of the proposed approach is compared to some classic methods in later sections of the paper.

Index terms: Short-term load forecast, support vector regression, grid traverse algorithm, particle swarm optimization, distribution system

LIST OF SYMBOLS

L          The historical load data vector of a distribution system, where L_i is the ith element of set L, i ∈ {1, ..., n}.
L′         The normalized historical load data vector of a distribution system, where L′_i is the ith element of set L′, i ∈ {1, ..., n}.
L̂          The forecasting load data vector of a distribution system, where L̂_{i1} is the i1-th element
           The upper limit vector and lower limit vector of the parameter vector.
           A searching step vector for GTA, with elements indexed by j.
           A high-dimensional vector for GTA, with elements indexed by j.
H          A traversing vector, which contains all the combinations of different parameters in the high-dimensional vector.
m_j        The total number of the elements in the jth vector is defined as m_j.
H_{j2}     H_{j2} is the j2-th element of H, where j2 ∈ {1, 2, ..., m_1 × m_2 × ... × m_p}.
β_{j,m″j}  The parameter β_{j,m″j} is the m″j-th element chosen from the jth vector, where m″j ∈ {1, 2, ..., m_j}.
H_b        The best parameter vector for SVR; H_b is an element in H.
ω          ω is a linear combination coefficient of SVR; (1/2)ωᵀω indicates the flatness of the regression coefficients.
b          b is an offset coefficient of SVR.
C, γ, ε    C is a trade-off parameter, γ is a parameter of the (Gaussian) radial basis function, and ε is an adjustable precision parameter, which indicates the training error threshold.
R_CV       The objective function of the designed Cross-Validation.
ζ̂_τ        ζ̂_τ is the load data forecasted by the proposed approach to test in the Cross-Validation, and ζ_τ is the corresponding load data.
1949-3053 (c) 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
the end users has a much bigger impact on distribution systems than it does on transmission systems. Successfully operating an independent distribution system requires much more accurate and higher-resolution load forecasting than today's techniques can provide [6]–[12]. In this paper, a short-term and high-resolution load forecasting approach is proposed for distribution feeders, which contain the aggregated loads of a small section of distribution feeder load.

There are well established methods for load forecasting in general. In [13], an algorithm based on support vector machine (SVM) with simulated annealing is employed to forecast the system load. In [14], ant colony optimization is used to determine the best parameters for the SVR, which is implemented to forecast the system load. A load forecasting algorithm based on SVM and genetic algorithm (GA) is presented in [15]. Short-term load forecasting normally has more computationally intensive solutions. For example, ANN-based short-term load forecast approaches are presented in [16], [17]. In [18], an autoregressive integrated moving-average (ARIMA) model is used to forecast the short-term load. A Kalman filtering-based short-term load forecasting approach is presented in [19]. In [20], an approach based on double seasonal exponential smoothing is used for short-term load forecasting.

Although many load forecasting approaches exist, most of them focus on load forecasting in transmission systems. In transmission systems, the aggregated loads are three-phase balanced and always fairly smooth. In this paper, a small section of distribution feeder load contains several end customers, which are three-phase unbalanced and much smaller than the aggregated loads at a transmission feeder. This indicates that the load profile characteristics of a distribution feeder are different. In [21], [22], the impact of electric vehicles on the distribution systems is evaluated, and it indicates that abrupt stochastic load deviation is a feature of distribution systems. In addition, combined with distributed renewable energy resources, the control strategy of the distribution system requires a high-speed and high-resolution approach for stochastic nonlinear tracking and forecasting [6], [7], [23], [24]. Therefore, it is critical and imperative to provide a short-term and high-resolution forecasting approach for the aggregated loads of a small section of distribution feeder load.

In this paper, the proposed SVR-based short-term load forecasting approach is a supervised machine learning approach that requires the preprocessing of the input data to achieve better regression and forecasting performance. After preprocessing, the collected historical load data are used to train the SVR model. Meanwhile, compared to GA, ANN, and simulated annealing approaches, particle swarm optimization (PSO) has a higher convergence speed in determining the best parameters for SVR, but the PSO is not guaranteed to converge to a global optimal solution [25], [26]. To solve the above problems, a two-step hybrid parameters searching algorithm is proposed to determine the parameters effectively and efficiently.

The paper is organized as follows. In Section II, the problem of short-term distribution system load forecasting is formulated and the flowchart of the proposed approach is introduced. In Section III, the designed SVR-based load forecasting is introduced. In Section IV, the two-step hybrid parameter optimization method is illustrated for the SVR-based forecaster. In Section V, numerical results of the proposed approach are presented and compared to other methods.

II. PROBLEM FORMULATION

Fig. 1. Flowchart of the proposed short-term load forecasting approach. Blocks: (1) data pre-processing; (2) parameter initiation; (3) grid traversing algorithm; (4) cross validation error calculation; (5) threshold check; (6) particle swarm optimization; (7) cross validation error calculation; (8) threshold check; (9) best parameters determined; (10) SVR model with best parameters and short-term load forecasting; (11) result comparison and analysis.

In this paper, the proposed load forecasting approach consists of three major parts: data preprocessing, SVR model training, and forecasting with the SVR model. To determine the best parameters, the proposed two-step hybrid parameter optimization method is designed in the SVR model training part.

The flowchart of the proposed approach is shown in Fig. 1. Firstly, the collected historical load data is processed to reduce the excessive deviation. Next, a two-step approach is utilized to optimize parameters for the SVR solution. In the first step, the grid traverse algorithm (GTA) traverses the global solution space into local spaces, which can be processed in parallel and narrows the parameters searching area. In the second step, the PSO is used to determine the best parameters of the SVR in the local solution space. If multiple local solution spaces
are determined by the GTA, the PSO can calculate the best parameters in the local solution spaces in parallel. After the two-step parameter optimization, a best parameter vector can be decided for the SVR. Finally, the future load is forecasted based on the determined SVR function and historical data. The detailed procedures of the proposed approach can be found in Appendix A.

III. SVR FOR SHORT-TERM DISTRIBUTION SYSTEM LOAD FORECASTING

A. Data Preprocessing

The original data from the natural world usually contains a lot of excessively deviating instances, which distract the target of the supervised forecasting approach and significantly impact performance [27]. In this paper, we are trying to forecast load at the feeder level or distribution system level; therefore, it is necessary to process the original load data to improve the performance of the proposed approach. The detailed procedures can be found in Appendix B.

B. Basic Concept of SVR

Based on support vector machine, the SVR is a supervised learning method used for nonlinear regression and time series forecasting applications [28]–[30]. The historical data is treated as input training data, and the forecasting is based on the SVR model trained by historical data.

C. SVR for Short-Term Load Forecasting in Distribution Systems

In an SVR model, a nonlinear mapping function φ(·): R^{n′} → R^{m′} is defined to map the input data into a high-dimensional feature space, where n′, m′ ∈ {1, 2, 3, ...}, m′ > n′. In [30], [31], it is noted that the (Gaussian) radial basis function (RBF) kernel is widely used with high performance for regression and forecasting. In this paper, the RBF is designed as:

$$K_{RBF}(x_1, x_2) = \phi(x_1)^T \phi(x_2) \tag{1}$$

$$f(x) = \langle \omega, \phi(x) \rangle + b, \tag{2}$$

where f(x) is the regression function aiming to have at most ε deviation from the elements of the collected historical load data L′, ω is a linear combination coefficient, b is an offset coefficient, and ε is an adjustable precision parameter, which indicates the training threshold.

Subject to

$$L'_{i_2} - f(x_{i_2}) \le \varepsilon + \xi_{i_2},$$
$$-L'_{i_2} + f(x_{i_2}) \le \varepsilon + \xi^{*}_{i_2}, \tag{4}$$
$$\xi_{i_2},\ \xi^{*}_{i_2} \ge 0.$$

Here in (3), i2 is a time index of training data, i2 ∈ {1, ..., n}, (1/2)ωᵀω indicates the flatness of the regression coefficients, the second item is the cost function of the training errors between L′_{i2} and f(x_{i2}), C is a trade-off coefficient between the first two items, and ξ_{i2} and ξ*_{i2} indicate the training error above the precision parameter ε and below −ε, respectively. In [29], the risk function (3) with the constraint (4) can be solved as a dual problem, which can minimize the risk function and determine ω, b, ξ_{i2}, and ξ*_{i2}. Therefore, the parameters γ, C, and ε are chosen as the parameters to be optimized, since they are most important to the accuracy of the forecasting [15], [28], [30], [31]. The following section introduces a two-step hybrid method for parameters optimization.

IV. PARAMETER OPTIMIZATION OF SVR

Compared to the GA, ANN, and simulated annealing approaches in [13], [15], [16], [32], PSO is an effective algorithm for solving nonlinear problems; however, it does not guarantee that the algorithm can converge to a global optimal solution in the solution space [25]. In [26], it is noticed that the conventional PSO may become divergent or trapped in a local optimal solution. Conventionally, decreasing the global search step size and increasing the number of particles are two solutions to avoid local optima, but they may result in prohibitive computational load. Therefore, considering both the global convergence speed and accuracy of the algorithm, a two-step hybrid global optimization algorithm is proposed to determine the best forecasting parameters for the SVR.

A. GTA for Parameter Optimization

1) Design of GTA: In GTA, based on the parameter vector, its upper and lower limit vectors, and the searching
where
• j is the index of parameters in the parameter vector, and j ∈ {1, 2, ..., p}.
• The jth vector corresponds to the jth parameter and is computed with its upper and lower limits and its searching step.
• It is defined that the total number of the elements in vector
R^{m_1 × m_2 × ... × m_p} is built. Then, a traversing vector H is built as a finite multi-Cartesian product, which is illustrated in Appendix C.

2) Design of Cross-Validation: A designed cross-validation is used with the GTA to limit the overfitting problem and compute the best parameters [33]. The training data is divided into 10 groups; each time, one group is selected as the test data, and the rest of the data are training data. The objective function is designed as a prediction sum of squares (PRESS) statistic

$$R_{CV} = \min_{\gamma,\varepsilon,C} \sum_{\tau=2}^{10} \frac{(\zeta_{\tau} - \hat{\zeta}_{\tau})^2}{10} \tag{6}$$

where ζ̂_τ is the load data forecasted by the proposed approach to test in the Cross-Validation, and ζ_τ is the corresponding load data. For example, if τ = 2, data group 1 is the training data to forecast data group 2; if τ = 5, data groups 1, 2, 3, 4 are the training data to forecast data group 5. The difference between ζ_τ and ζ̂_τ is the residual error. In addition, because the normalized load data ranges over (0, 1) in the preprocessing block, this objective function has the same meaning as the mean square percentage error (MSPE), which can indicate the accuracy and deviation of the forecasting results [34], [35]. The detailed description of the GTA can be found in Appendix D.

The results of the parameter selection with the GTA are shown in Fig. 2. In Fig. 2, the x-axis denotes the parameter log2 C, the y-axis indicates the parameter log2 γ, and the curves from red to blue indicate the PRESS value from large to small. It is clear that the areas with blue curves have smaller PRESS values, and the next optimization step will only process this local area. In this case, the lower limit and upper limit of C are 2^8 and 2^15, respectively. The lower limit and upper limit of γ are 2^8 and 2^11, respectively.

B. PSO for the Optimal Parameter Selection

1) Basic Concept of the PSO: PSO is a numerical intelligence-based approach, which can solve many nonlinear optimization problems that cannot be solved analytically. In addition, PSO has a higher convergence speed in the solution space, is convenient to implement, and its convergence ability is not largely affected by the size and nonlinearity of the optimization problems [36].
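The grid traverse and the cross-validation objective (6) of Section IV-A can be sketched together in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the squared residuals are summed over every sample of the test group, the groups follow the expanding-window reading of the τ = 2 and τ = 5 examples, and the `fit`/`predict` pair is a toy stand-in forecaster rather than an SVR.

```python
import itertools


def build_traversing_vector(candidates):
    """Return every combination of per-parameter candidates (Cartesian product)."""
    return list(itertools.product(*candidates))


def press_cv(data, params, fit, predict, n_groups=10):
    """PRESS-style objective: groups 1..tau-1 train, group tau is the test set."""
    size = len(data) // n_groups
    groups = [data[k * size:(k + 1) * size] for k in range(n_groups)]
    total = 0.0
    for tau in range(1, n_groups):  # 0-based index, i.e. tau = 2..n_groups in (6)
        train = [x for group in groups[:tau] for x in group]
        model = fit(train, params)
        forecast = predict(model, len(groups[tau]))
        total += sum((z - z_hat) ** 2 for z, z_hat in zip(groups[tau], forecast))
    return total / n_groups


def grid_traverse(data, candidates, fit, predict):
    """Score every element of H and return the best element plus all scores."""
    scores = {h: press_cv(data, h, fit, predict)
              for h in build_traversing_vector(candidates)}
    return min(scores, key=scores.get), scores


# Hypothetical stand-in forecaster (NOT an SVR): forecast = gain * mean(train) + bias.
def fit(train, params):
    gain, bias = params
    return gain * sum(train) / len(train) + bias


def predict(model, horizon):
    return [model] * horizon


# Traversing vector for the m1 = 2, m2 = 2, m3 = 1 example of Appendix C.
H = build_traversing_vector([["gamma1", "gamma2"], ["C1", "C2"], ["eps1"]])

load = [0.5] * 20  # toy normalized load, constant on purpose
best, scores = grid_traverse(load, [[0.5, 1.0, 2.0], [0.0, 0.1]], fit, predict)
```

Because each element of H is scored independently, the dictionary comprehension in `grid_traverse` is the part that could be distributed across workers, which is the parallelism the text refers to.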
Fig. 2. A contour map of parameter selection of the GTA for a distribution system load data prediction (ε = 0.001). Axes: parameter log2 C (x-axis) and parameter log2 γ (y-axis); contour labels are PRESS values.

Fig. 3. Flowchart of the designed PSO for the best parameters. Blocks: PSO initialization with GTA result; velocity and position update; threshold check; best parameters of SVR.
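The loop in Fig. 3 can be sketched as a small particle swarm that searches only the local box handed over by the GTA. The update follows the usual form of the velocity and position updates (14) and (15); everything else is an assumption made for illustration: the function name, the default coefficients, the velocity clamp, the clamping of positions to the box, the search in log2 space, and the toy quadratic objective that stands in for the real cross-validation error.

```python
import random


def pso_minimize(objective, lower, upper, n_particles=20, n_iter=60,
                 phi1=1.5, phi2=1.5, v_max_frac=0.1, seed=0):
    """Minimize `objective` inside the box [lower, upper] with a basic PSO."""
    rng = random.Random(seed)
    dim = len(lower)
    # Assumption: velocities are clamped to a fraction of the box width.
    v_max = [v_max_frac * (hi - lo) for lo, hi in zip(lower, upper)]
    pos = [[rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    p_best = [p[:] for p in pos]              # best historical position per particle
    p_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=p_val.__getitem__)
    g_best, g_val = p_best[g][:], p_val[g]    # best position among all particles
    history = [g_val]
    for _ in range(n_iter):
        for i in range(n_particles):
            theta1, theta2 = rng.random(), rng.random()
            for d in range(dim):
                # Velocity update: previous velocity plus pulls toward both bests.
                v = (vel[i][d]
                     + phi1 * theta1 * (p_best[i][d] - pos[i][d])
                     + phi2 * theta2 * (g_best[d] - pos[i][d]))
                vel[i][d] = max(-v_max[d], min(v_max[d], v))
                # Position update, kept inside the local box from the GTA.
                pos[i][d] = max(lower[d], min(upper[d], pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < p_val[i]:
                p_val[i], p_best[i] = val, pos[i][:]
            if val < g_val:
                g_val, g_best = val, pos[i][:]
        history.append(g_val)
    return g_best, g_val, history


def toy_press(p):
    """Toy stand-in for the cross-validation error, minimal at (9, 10)."""
    return (p[0] - 9.0) ** 2 + (p[1] - 10.0) ** 2


# Local box in (log2 C, log2 gamma), taken from the limits 2^8..2^15 and 2^8..2^11.
lower, upper = [8.0, 8.0], [15.0, 11.0]
best, best_val, history = pso_minimize(toy_press, lower, upper)
```

Replacing `toy_press` with a function that trains the forecaster and returns the cross-validation error would turn this sketch into the second optimization step; the global best can only improve from one iteration to the next, so `history` is non-increasing.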
2) Optimal Parameter Determination with PSO: A Distribution System Load Data Prediction: Based on the results in Section IV-A3, the initialization area of the PSO is the blue area in Fig. 2. In this example, ε = 0.001, and the PSO is used to determine the best values of C and γ between their lower and upper limits as in Section IV-A3. Considering that the PSO is based on the result of the GTA, the range of velocity is set to be relatively small, as (0.8, 1.2). According to the PSO, the best parameters are determined as C = 532 and γ = 977. With the best parameters, the prediction results are demonstrated in Fig. 4.

As shown in Fig. 4(a), the prediction result is shown in red curves with triangles, and the original data is shown using blue curves. The training data section is from 0 s to 3,060 s, and the prediction data section is from 3,061 s to 3,600 s. It is noticed that the prediction data curve and original data curve are almost consistent except for several abrupt deviation points. This demonstrates the effectiveness of the proposed prediction approach.

In Fig. 4(b), the mean absolute percentage error (MAPE) is 1.57%, and more than 94% of the errors are accumulated in the range (−2%, 2%), which also indicates the effectiveness and accuracy of the proposed approach. Some forecast points with relatively large errors demonstrate that the stochastic and abruptly deviating load points in a distribution system are difficult to track and forecast. This also demonstrates that, compared to a transmission system, the stochastic characteristic of load deviation is more complicated and widespread in a distribution system.

Fig. 4. (a) Original data and prediction data, Load (KW) versus Time (S). (b) Density (%) of the prediction error; annotated value 94.45%.

from winter (Dec.-Feb.), spring (Mar.-May), summer (Jun.-Aug.), and autumn (Sep.-Nov.) for 20 days each season. With the sampling rate of 1 Hz, the total data length is 6,912,000. Different from the load data profile in transmission systems as discussed in Section I, there are a lot of abrupt stochastic deviations in the load profile of the distribution system, as shown in Fig. 5. The resolution of the proposed forecasting approach is 1 second. The simulations are executed using a server with a 3.60 GHz Intel Xeon CPU and 32 GB RAM. The SVR and PSO are implemented using LibSVM, the MATLAB global optimization toolbox, and the parallel computing toolbox.

Fig. 5. An example of the collected four season data: 20-day summer load data of a distribution system. Axes: Time (Day) and Load (KW).

A. Numerical Results

1) Numerical Results on Different Timescales: Considering different control strategies for transient and short-term disturbance, the proposed approach is implemented in different timescales: minutes-ahead and hours-ahead. In our study, the training data is 10 times as long as the testing data. To evaluate the proposed method comprehensively, the sliding window test is employed to traverse the whole load data. In detail, for example, the sliding window test employed with 5 minutes ahead forecasting can be illustrated as follows.

1) First, the load data section from 1 to 50 minutes is taken as training data to determine the best parameters and build the forecasting model.
2) In the second step, the load data section from 51 to 55 minutes is taken as the test data to evaluate the performance of the forecasting model.
3) In the third step, for the next round forecasting, the training data section moves forward from 6 to 55 minutes,
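The sliding window test for 5-minute-ahead forecasting can be sketched as an index generator. This is an illustrative sketch only: the function name is hypothetical, minutes are 1-based and inclusive as in the steps above, and it assumes that each round advances by one test horizon so that the second test section covers minutes 56 to 60. The last lines check the stated data length from four seasons of 20 days sampled at 1 Hz.

```python
def sliding_windows(total_minutes, train_len=50, test_len=5):
    """Yield ((train_start, train_end), (test_start, test_end)) minute ranges."""
    start = 1
    while start + train_len + test_len - 1 <= total_minutes:
        train = (start, start + train_len - 1)
        test = (train[1] + 1, train[1] + test_len)
        yield train, test
        start += test_len  # assumption: the window advances by one test horizon


windows = list(sliding_windows(total_minutes=70))

# Data length: 4 seasons x 20 days x 24 h x 3600 samples per hour at 1 Hz.
SAMPLES = 4 * 20 * 24 * 3600
```

With `total_minutes=70` the generator produces four rounds, each keeping the 10-to-1 ratio between training and test length.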
TABLE I
PERFORMANCE OF MINUTES-AHEAD FORECASTING (MAPE(%))

                    winter   spring   summer   autumn   Avg.
5-minute-ahead      1.304    1.336    1.312    1.298    1.313
10-minute-ahead     1.453    1.434    1.455    1.477    1.455
20-minute-ahead     1.545    1.578    1.531    1.523    1.544
40-minute-ahead     1.645    1.631    1.637    1.661    1.643

As shown in Fig. 6, it is noticed that more than 75% of the errors are accumulated between (−2%, 2%).

Fig. 6. Percentage error of minutes-ahead scales forecasting. Axes: Prediction Error (%) and Density (%); annotated value 75.03%.

Hours-Ahead Forecasting
As shown in the previous section, the proposed method has very good performance at the minutely level. But for some applications, short-term load forecasting also includes the hourly time range. For the hours-ahead forecasting, several forecasting scales are considered, such as 1-hour-ahead, 2-hour-ahead, 4-hour-ahead, 8-hour-ahead, and 16-hour-ahead forecasting. According to the forecasting performance in Table II, the 1-hour-ahead forecasting has the best performance and the 16-hour-ahead forecasting has the largest forecasting errors. However, the MAPEs of the five types of forecasting are below 3.5%, and the average MAPE is 2.271%. Compared to the results of the minutes-ahead forecasting, the errors of the hours-ahead forecasting slightly increase. On the other hand, this also illustrates that the hours-ahead forecasting is more complicated than the minutes-ahead forecasting [16], [18].

As shown in Fig. 7, it is noticed that more than 75% of the errors are accumulated between (−3.9%, 3.9%).

Fig. 7. Percentage error of hours-ahead scales forecasting. Axes: Prediction Error (%) and Density (%); annotated value 75.09%.

TABLE II
PERFORMANCE OF HOURS-AHEAD FORECASTING (MAPE(%))

                    winter   spring   summer   autumn   Avg.
1-hour-ahead        1.639    1.647    1.643    1.672    1.650
2-hour-ahead        1.995    1.934    1.981    2.019    1.982
4-hour-ahead        2.169    2.173    2.176    2.123    2.161
8-hour-ahead        2.388    2.372    2.331    2.401    2.373
16-hour-ahead       3.198    3.155    3.172    3.221    3.187

In this paper, the skewness is used to measure the asymmetry of the probability distribution of the forecast errors, which can be defined as follows [37]–[39]

$$S_{skw} = \frac{E(\epsilon - \mu)^3}{\sigma^3} \tag{7}$$

where S_skw is the skewness, ϵ is the forecast error, µ is the mean of the forecast error, and σ is the standard deviation of the forecast error.

Furthermore, the kurtosis is used to measure the outlier-proneness of the probability distribution of the forecast errors, which can be defined as follows [37]–[39]

$$S_{krt} = \frac{E(\epsilon - \mu)^4}{\sigma^4} \tag{8}$$

where S_krt is the kurtosis, and ϵ, µ, and σ are the same as above.

As shown in Table III, the skewness values of the minutes-ahead forecasting and hours-ahead forecasting are 0.072 and 0.081, which indicates that the proposed approach has a very small asymmetry of the probability distribution of the forecast errors. A few large forecasting errors, which reach 11.57%, result from the abrupt load deviation in the distribution system. The kurtosis values of the minutes-ahead forecasting and hours-ahead forecasting are 6.912 and 5.724, which indicate the high accuracy of the proposed approach.

TABLE III
THE SKEWNESS AND KURTOSIS OF THE PROPOSED APPROACH

                            skewness   kurtosis
minutes-ahead forecasting   0.072      6.912
hours-ahead forecasting     0.081      5.724

2) Peak Loads Forecasting: As shown in Table IV, the peak load forecasting is investigated with the 1-hour-ahead and 2-hour-ahead forecasting for the four seasons of load data. Based on [40]–[42], the procedure of peak load forecasting is as follows:

1) First, the peak load values {L_{p1}^{se}, L_{p2}^{se}, ..., L_{p20}^{se}} and their time indices {T_{p1}^{se}, T_{p2}^{se}, ..., T_{p20}^{se}} are collected, where se = {1, 2, 3, 4}, which indicates winter, spring, summer and autumn. The time range of peak load occurrence is between 11:00 AM and 3:00 PM.
2) In the second step, the historical data are collected to train the SVR models. Specifically, for day 1, the historical data are collected from 0:01 AM to 11:00 AM.
3) In the third step, compare the forecasted loads at time {T_{p1}^{se}, T_{p2}^{se}, ..., T_{p20}^{se}} with the collected peak loads, and compute the forecast errors.
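The error statistics reported in Tables I to III, namely the MAPE together with the skewness (7) and kurtosis (8) of the forecast errors, can be computed as in the sketch below. The helper names are hypothetical, the MAPE formula is the standard definition (the text does not spell it out), and the expectation in (7) and (8) is taken as a plain sample mean with the population standard deviation, which is an assumption.

```python
import math


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (standard definition)."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def standardized_moment(errors, order):
    """E(eps - mu)^order / sigma^order with sample means for the expectations."""
    n = len(errors)
    mu = sum(errors) / n
    sigma = math.sqrt(sum((e - mu) ** 2 for e in errors) / n)  # population std
    return sum((e - mu) ** order for e in errors) / n / sigma ** order


def skewness(errors):
    """Third standardized moment, as in (7)."""
    return standardized_moment(errors, 3)


def kurtosis(errors):
    """Fourth standardized moment, as in (8); not the excess kurtosis."""
    return standardized_moment(errors, 4)


# Toy values for illustration only, not data from the paper.
toy_errors = [-2.0, -1.0, 0.0, 1.0, 2.0]
toy_mape = mape([100.0, 200.0], [110.0, 190.0])
toy_skew = skewness(toy_errors)
toy_kurt = kurtosis(toy_errors)
```

Under this definition a normal distribution has a kurtosis of 3, so values above 3 point to heavier tails, that is, a few large errors among many small ones.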
Combined with the cross-validation in Block 4, the GTA traverses down their performance to narrow the possible solution from the global solution space R^p into a local solution space R_1^p, where R_1^p ⊂ R^p. In the GTA, all the elements in the searching step vector are set relatively large to reduce the time consumption yet not neglect the local possible solution space R_1^p.

In the second step of the parameter optimization, the particles of the PSO are initialized with the result of the GTA in the solution space (Block 6). Combined with cross-validation in Block 7, at each iteration, the position vector, velocity vector, and best position vector of the i4-th particle are updated with the result of the objective function, where i4 ∈ {1, 2, 3, ..., n_PSO}, n_PSO is the number of particles in the PSO, and a symbol is used to indicate that the particles of the PSO are initialized in the GTA solution space of the SVR parameters. In the PSO, the elements of the velocity vector are set relatively small to determine the best parameters in the local solution space R_1^p accurately. The best parameter vector H_b can be determined in Block 9. In the next step, L̂ is forecasted by the SVR forecaster with the best parameter vector H_b in Block 10. In Block 11, the report of the numerical result is generated and compared to other approaches.

In real-world applications, because the traverse result corresponding to each element in H can be calculated independently, a parallel computation of GTA is used to reduce consumption time. If there are several possible local solution spaces, R_1^p, R_2^p, ..., R_nr^p, determined by the GTA, a parallel computation of the PSO can be used to determine several best parameter vectors H_{bp1}, H_{bp2}, ..., H_{bpnr} for each small possible solution space, respectively, where R_{j1}^p ∈ R^p, j1 ∈ {1, 2, ..., nr}. Finally, comparing all the parameter vectors, a best parameter vector H_b can be determined for the SVR. After the SVR-based forecaster is built, the future load data L̂ can be forecast from historical data L.

APPENDIX C
PROCEDURE FOR BUILDING THE TRAVERSING VECTOR

The traversing vector H is built as follows:

1) For every different vector indexed by j, a parameter β_{j,m″j} is chosen, where β_{j,m″j} is the m″j-th element of that vector, j ∈ {1, 2, ..., p} and m″j ∈ {1, 2, ..., m_j}.
2) H_{j2} is the j2-th element of H, which can be built as a combination of different parameters [β_{1,m″1} ... β_{j,m″j} ... β_{p,m″p}], where j2 ∈ {1, 2, ..., m_1 × m_2 × ... × m_p}.
3) After every element vector of H is computed, the traversing vector H is finished.

In addition, because the three parameters γ, C, and ε are chosen, the dimension of the high-dimensional vector is reduced to R^{m_1 × m_2 × m_3}. Therefore, the total number of the element vectors H_{j2} in H is reduced, j2 ∈ {1, 2, ..., m_1 × m_2 × m_3}, which means the computation complexity of GTA is reduced. For example, if m_1 = 2, m_2 = 2, m_3 = 1, then the number of elements in H is 2 × 2 × 1 = 4: H_1 = [γ_1 C_1 ε_1], H_2 = [γ_1 C_2 ε_1], H_3 = [γ_2 C_1 ε_1], and H_4 = [γ_2 C_2 ε_1].

APPENDIX D
THE GTA PROCEDURE

The GTA is shown in Algorithm 1, which aims to traverse the global solution space into one or several local spaces for the PSO in the next step. In the first step, the three parameters are initialized and the traversing vector H is built. In the second step, for each element vector H_{j2} in H, the computation of the corresponding R_CV is independent and can be computed in parallel, which can further reduce the computation time. In the last step, if multiple local solution spaces are determined with the contour map, all of them are selected for the PSO step in parallel.

Algorithm 1 GTA for Parameter Optimization
Objective: Transfer the global optimization problem to one or several local optimization problems.

$$\boldsymbol{\nu}_{i_4} = [\nu_{i_4,1}\ \nu_{i_4,2}\ \cdots\ \nu_{i_4,n_{OBJPSO}}] \tag{12}$$

$$\boldsymbol{\eta}_{i_4} = [\eta_{i_4,1}\ \eta_{i_4,2}\ \cdots\ \eta_{i_4,n_{OBJPSO}}] \tag{13}$$

where
$$\mathbf{V}_{i_4}(t) = \mathbf{V}_{i_4}(t-1) + \phi_1 \theta_1 \left(\mathbf{P}_{i_4} - \mathbf{X}_{i_4}(t-1)\right) + \phi_2 \theta_2 \left(\mathbf{P}_{g} - \mathbf{X}_{i_4}(t-1)\right) \tag{14}$$

where $\phi_1$ and $\phi_2$ are the two positive acceleration coefficients, $\theta_1$ and $\theta_2$ are two independent random variables uniformly distributed on (0, 1), $\mathbf{P}_{i_4}$ is the vector of the best historical position of the $i_4$th particle, and $\mathbf{P}_{g}$ is the best position vector among all particles.

At each iteration, the position of the $i_4$th particle is determined as follows:

$$\mathbf{X}_{i_4}(t) = \mathbf{X}_{i_4}(t-1) + \mathbf{V}_{i_4}(t) \tag{15}$$

where $\mathbf{X}_{i_4}(t)$ is the position vector of the $i_4$th particle at time $t$, $\mathbf{V}_{i_4}(t)$ is the velocity vector of the $i_4$th particle at time $t$, and the parameter vector being optimized is $[\gamma\ \ C\ \ \varepsilon]$.

After the positions and velocities of all the particles are updated at each iteration, the performance of the parameters is computed with the designed cross-validation. Cross-validation is used with the PSO in the same way as in Section IV-A2, and the objective function is similar to (6).
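As a rough illustration of the velocity and position updates in (14) and (15), the following Python sketch runs a basic PSO over a box-shaped local parameter space. It is a sketch under stated assumptions rather than the authors' code: the function names, the bounds, the coefficient values, and the `toy_cv_error` objective (a stand-in for the cross-validation objective) are hypothetical, and clipping positions to the local space is an added assumption not stated in the text.

```python
import random


def toy_cv_error(p):
    """Hypothetical stand-in for the cross-validation objective over (gamma, C, epsilon)."""
    gamma, c, eps = p
    return (gamma - 0.1) ** 2 + (c - 10.0) ** 2 / 100.0 + (eps - 0.01) ** 2


def pso(objective, lower, upper, n_particles=20, n_iter=100, phi1=1.5, phi2=1.5, seed=0):
    """Basic PSO following (14)-(15); returns (best_position, best_cost)."""
    rng = random.Random(seed)
    dim = len(lower)
    # Random initial positions inside the local space, zero initial velocities.
    X = [[rng.uniform(lower[d], upper[d]) for d in range(dim)] for _ in range(n_particles)]
    V = [[0.0] * dim for _ in range(n_particles)]
    P = [x[:] for x in X]                      # best historical position of each particle
    P_cost = [objective(x) for x in X]
    g = min(range(n_particles), key=P_cost.__getitem__)
    G, G_cost = P[g][:], P_cost[g]             # best position among all particles
    for _ in range(n_iter):
        for i in range(n_particles):
            theta1, theta2 = rng.random(), rng.random()   # uniform on [0, 1)
            for d in range(dim):
                # Velocity update, as in (14).
                V[i][d] += phi1 * theta1 * (P[i][d] - X[i][d]) + phi2 * theta2 * (G[d] - X[i][d])
                # Position update, as in (15), clipped to the local space (assumption).
                X[i][d] = min(max(X[i][d] + V[i][d], lower[d]), upper[d])
            cost = objective(X[i])
            if cost < P_cost[i]:
                P[i], P_cost[i] = X[i][:], cost
                if cost < G_cost:
                    G, G_cost = X[i][:], cost
    return G, G_cost


# Assumed bounds of one local solution space (illustrative only).
lower = [0.05, 5.0, 0.005]
upper = [0.5, 50.0, 0.05]
best, cost = pso(toy_cv_error, lower, upper)
print(best, cost)
```

If several local solution spaces were selected, the same function could be called once per space, for example in parallel, and the lowest-cost result kept as the final parameter vector.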
[22] K. Clement-Nyns, E. Haesen, and J. Driesen, “The impact of charging
R EFERENCES plug-in hybrid electric vehicles on a residential distribution grid,” IEEE
Transactions on Power Systems, vol. 25, no. 1, pp. 371–380, 2010.
[1] J. M. Carrasco, L. G. Franquelo, J. T. Bialasiewicz, E. Galván, R. P.
[23] H. Jiang, X. Dai, W. Gao, J. Zhang, Y. Zhang, and E. Muljadi, “Spatial-
Guisado, M. A. Prats, J. I. León, and N. Moreno-Alfonso, “Power-
temporal synchrophasor data characterization and analytics in smart
electronic systems for the grid integration of renewable energy sources:
grid fault detection, identifcation and impact causal analysis,” IEEE
A survey,” IEEE Transactions on Industrial Electronics, vol. 53, no. 4,
Transactions on Smart Grid, vol. 7, no. 5, pp. 2525–2536, 2016.
pp. 1002–1016, 2006.
[24] M. Cui, D. Ke, Y. Sun, D. Gan, J. Zhang, and B.-M. Hodge, “Wind
[2] Y. Gu, H. Jiang, Y. Zhang, and D. W. Gao, “Statistical scheduling of
power ramp event forecasting using a stochastic scenario generation
economic dispatch and energy reserves of hybrid power systems with
method,” IEEE Transactions on Sustainable Energy, vol. 6, no. 2, pp.
high renewable energy penetration,” in 2014 48th Asilomar Conference
422–433, 2015.
on Signals, Systems and Computers, 2014, pp. 530–534.
[3] H. Jiang, J. J. Zhang, W. Gao, and Z. Wu, “Fault detection, identifcation, [25] Z. Tang and K. K. Bagchi, “Globally convergent particle swarm op-
and location in smart grid based on data-driven computational methods,” timization via branch-and-bound,” Computer and Information Science,
IEEE Transactions on Smart Grid, vol. 5, pp. 2947 – 2956, 2014. vol. 3, no. 4, pp. 60–71, 2010.
[4] F. He, Y. Gu, J. Hao, J. J. Zhang, J. Wei, and Y. Zhang, “Joint real-time [26] F. Van Den Bergh, “An analysis of particle swarm optimizers,” Ph.D.
energy and demand-response management using a hybrid coalitional- dissertation, University of Pretoria, 2006.
noncooperative game,” in 2015 49th Asilomar Conference on Signals, [27] S. Kotsiantis, D. Kanellopoulos, and P. Pintelas, “Data preprocessing for
Systems and Computers. IEEE, 2015, pp. 895–899. supervised leaning,” International Journal of Computer Science, vol. 1,
[5] F. Ding, P. Li, B. Huang, F. Gao, C. Ding, and C. Wang, “Modeling no. 2, pp. 111–117, 2006.
and simulation of grid-connected hybrid photovoltaic/battery distributed [28] A. J. Smola and B. Schölkopf, “A tutorial on support vector regression,”
generation system,” in CICED 2010 Proceedings. IEEE, 2010, pp. Statistics and Computing, vol. 14, no. 3, pp. 199–222, 2004.
1–10. [29] C. J. Burges, “A tutorial on support vector machines for pattern
[6] X. Fang, S. Misra, G. Xue, and D. Yang, “Smart grid - the new and recognition,” Data Mining and Knowledge Discovery, vol. 2, no. 2, pp.
improved power grid: a survey,” IEEE Communications Surveys and 121–167, 1998.
Tutorials, vol. 14, pp. 944–980, 2012. [30] C.-C. Chang and C.-J. Lin, “Libsvm: a library for support vector
[7] H. Jiang, J. J. Zhang, D. W. Gao, Y. Zhang, and E. Muljadi, “Syn- machines,” ACM Transactions on Intelligent Systems and Technology
chrophasor based auxiliary controller to enhance power system transient (TIST), vol. 2, no. 3, p. 27, 2011.
voltage stability in a high penetration renewable energy scenario,” in [31] A. Shashua, “Introduction to machine learning: Class notes 67577,”
2014 IEEE Symposium Power Electronics and Machines for Wind and arXiv preprint arXiv:0904.3664, 2009.
Water Applications (PEMWA), 2014, pp. 1–7. [32] M. Jiang, S. Jiang, L. Zhu, Y. Wang, W. Huang, and H. Zhang, “Study
[8] E. Dall’Anese, H. Zhu, and G. B. Giannakis, “Distributed optimal power on parameter optimization for support vector regression in solving the
fow for smart microgrids,” IEEE Transactions on Smart Grid, vol. 4, inverse ECG problem,” Computational and mathematical methods in
no. 3, pp. 1464–1475, 2013. medicine, vol. 2013, 2013.
[9] T. Hong, P. Pinson, and S. Fan, “Global energy forecasting competition [33] A. R. Webb, Statistical pattern recognition. John Wiley & Sons, 2003.
2012,” International Journal of Forecasting, vol. 30, no. 2, pp. 357–363, [34] S. Makridakis, S. C. Wheelwright, and R. J. Hyndman, Forecasting
2014. methods and applications. John Wiley & Sons, 2008.
[10] J. Xiao, X.-d. Guo, and L.-q. Bai, “An improved model of total supply [35] S. Fan and L. Chen, “Short-term load forecasting based on an adaptive
capability for distribution systems,” in Power and Energy Engineering hybrid method,” IEEE Transactions on Power Systems, vol. 21, no. 1,
Conference (APPEEC), 2012 Asia-Pacifc. IEEE, 2012, pp. 1–4. pp. 392–401, 2006.
Huaiguang Jiang received his B.E. degree in electrical engineering from the National University of Defence Technology, Changsha, China, in 2007, his M.S. degree in electrical engineering from the University of Electronic Science and Technology of China, Chengdu, China, in 2010, and his Ph.D. degree in electrical engineering from the University of Denver, Denver, Colorado, USA, in 2015. Currently, he is with the National Renewable Energy Laboratory in Golden, Colorado. His research interests are phasor measurement unit applications, renewable energy integration, smart grid, signal processing, time-frequency analysis, big data, and machine learning.

Yingchen ("Y.C.") Zhang (M'07) received his B.S.E.E. from Tianjin University, Tianjin, China, in 2003 and his Ph.D. in electrical engineering from Virginia Polytechnic Institute and State University, Blacksburg, in 2010. He is currently with the National Renewable Energy Laboratory in Golden, Colorado, USA. His research interests include power system stability with the large-scale integration of renewable energies, power system wide-area monitoring, and phasor measurement unit applications for renewable integrations.