
Review On ANN Based STLF Models

This document provides an extensive survey of artificial neural network (ANN)-based short-term load forecasting models. It discusses the most important factors that affect the accuracy and efficiency of ANN load forecasters, including backpropagation network structures, input variable selection, training set selection, modifications to the backpropagation algorithm, determining the number of hidden neurons, and parameters of the backpropagation algorithm.

Uploaded by

gsaibaba
Copyright
© Attribution Non-Commercial (BY-NC)

A Review of ANN-based Short-Term Load Forecasting Models

Y. Rui A.A. El-Keib

Department of Electrical Engineering


University of Alabama, Tuscaloosa, AL 35487

Abstract - Artificial Neural Networks (ANN) have recently been receiving considerable attention and a large number of publications concerning ANN-based short-term load forecasting (STLF) have appeared in the literature. An extensive survey of ANN-based load forecasting models is given in this paper. The six most important factors which affect the accuracy and efficiency of the load forecasters are presented and discussed. The paper also includes conclusions reached by the authors as a result of their research in this area.

Keywords: artificial neural networks, short-term load forecasting models


Introduction

Accurate and robust load forecasting is of great importance for power system operation. It is the basis of economic dispatch, hydro-thermal coordination, unit commitment, transaction evaluation, and system security analysis among other functions.

Because of its importance, load forecasting has been extensively researched and a large number of models were proposed during the past several decades, such as Box-Jenkins models, ARIMA models, Kalman filtering models, and the spectral expansion techniques-based models. Generally, the models are based on statistical methods and work well under normal conditions; however, they show some deficiency in the presence of an abrupt change in environmental or sociological variables which are believed to affect load patterns. Also, the techniques employed for those models use a large number of complex relationships, require a long computational time, and may result in numerical instabilities. Therefore, some new forecasting models were introduced recently.

As a result of the development of Artificial Intelligence (AI), Expert Systems (ES) and Artificial Neural Networks (ANN) have been applied to solve the STLF problems. An ES forecasts the load according to rules extracted from experts' knowledge and operators' experience. This method is promising; however, it is important to note that expert opinion may not always be consistent, and the reliability of such opinion may be in question.

Over the past two decades, ANNs have been receiving considerable attention and a large number of papers on their application to solve power system problems have appeared in the literature. This paper presents an extensive survey of ANN-based STLF models. Although many factors affect the accuracy and efficiency of the ANN-based load forecaster, the following six factors are believed to be the most important ones. In section 2, various kinds of Back-Propagation (BP) network structures are presented and discussed. The selection of input variables is reviewed in section 3. In section 4, different ways of selecting the training set are presented and evaluated. Because of the drawbacks of the BP algorithm, some efficient modifications are discussed in section 5. In sections 6 and 7, the determination of the number of hidden neurons and the parameters of the BP algorithm are respectively presented. Conclusions follow in section 8.


The BP network structures

Artificial Neural Networks have parallel and distributed processing structures. They can be thought of as a set of computing arrays consisting of series of repetitive uniform processors placed on a grid. Learning is achieved by changing the interconnection between the processors [1].

To date, there exist many types of ANNs which are characterized by their topology and learning rules. As for the STLF problem, the BP network is the most widely used one. With the ability to approximate any continuous nonlinear function, the BP network has extraordinary mapping (forecasting) abilities.

The BP network is a kind of multilayer feed forward network, and the transfer function within the network is usually a nonlinear function such as the Sigmoid function. The typical BP network structure for STLF is a three-layer network, with the nonlinear Sigmoid function as the transfer function [2-8]. An example of this network is shown in Figure 1.

In addition to the typical Sigmoid function, a linear transfer function from the input layer directly to the output layer as shown in Figure 2 was proposed in [9] to account
for linear components of the load. The authors of [9] have reported that this approach has improved their forecasting results by more than 1%.

Figure 1 A typical BP network structure

Figure 2 An ANN structure with linear transfer function

Because fully connected BP networks need more training time and are not adaptive enough to temperature changes, a non-fully connected BP model is proposed in [10,11]. The reported results show that although a fully connected ANN is able to capture the load characteristics, a non-fully connected ANN adapts better to temperature changes. The results also show that the forecasting accuracy is significantly improved for days with abrupt temperature changes. Moreover, [11] presents a new approach which combines several sub-ANNs together to give better forecasting results.

Recently, a recurrent high order neural network (RHONN) was proposed in [12]. Due to its dynamic nature, the RHONN forecasting model is able to adapt quickly to changing conditions such as important load variations or changes of the daily load pattern. It is reported in [12] that the forecasting error over the period of a whole year has improved considerably.

It is proven that a 3-layer ANN with suitable dimension is sufficient to approximate any continuous non-linear function. In [13], it is illustrated that the 4-layer structure is more easily trapped in a local minimum while possessing the other features of the 3-layer ANNs. However, attracted by the compact architecture and efficiency of the learning process of the 4-layer ANN, a load forecaster using this structure was recommended in [1,14] and promising results were reported.

Based on the above discussion, the topology of a BP network can have 3 or 4 layers, and the transfer function can be linear, nonlinear or a combination of both. Also, the network can be either fully connected or non-fully connected. From our experience we have found that the BP network structure is problem dependent, and a structure that is suitable for a given power system is not necessarily suitable for another.


Input variables of BP network

As was pointed out earlier, the BP network is a kind of array which can realize nonlinear mapping from the inputs to the outputs. Therefore, the selection of input variables of a load forecasting network is of great importance. In general, there are two selection methods. One is based on experience [1,3,9,14], and the other is based on statistical analysis such as ARIMA [11] and correlation analysis [6].

If we denote the load at hour k as l(k), a typical selection of inputs based on operation experience will be l(k-1), l(k-24), t(k-1), etc., where t(k) is the temperature corresponding to the load l(k).

Unlike those methods which are based on experience, [6] applies auto-correlation analysis on the historical load data to determine the input variables. Auto-correlation analysis shows that correlation peaks occur at lags that are multiples of 24 hours. This indicates that the loads at the same hours have very strong correlation with each other. Therefore, they can be chosen as input variables.

In [11], the authors apply ARIMA procedures and auto-correlation analysis to determine the necessary load related inputs. After load related inputs are determined, the corresponding temperature related inputs are determined.

The authors in [10] discuss the method of using ANN to forecast the load curve under extreme climatic conditions. In addition to using conventional information such as historical loads and temperature as input variables, wind-speed and sky-cover are also chosen.

In all, the input variables can be classified into 8 classes:
1. historical loads [1-3,6,7,9-12,15]
2. historical and future temperatures [1-3,6,9-11,15]
3. hour of day index [1,3,4,6,11]
4. day of week index [1,4,6,11]
5. wind-speed [4,10]
6. sky-cover [4,10]
7. rainfall [4]
8. wet or dry day [4].

There are no general rules that can be followed to determine input variables. This largely depends on engineering judgment and experience. Our investigations revealed that for a normal climate area, the first 4 classes of variables are sufficient to give acceptable forecasting results. However, for an extreme weather-conditioned area the latter 4 classes are recommended, because of the highly nonlinear relationship between the loads and the weather conditions.


Selection of training set

ANNs can only perform what they were trained to do. As for the case of STLF, the selection of the training set is a crucial one. The criterion for selecting the training set is that the characteristics of all the training pairs in the training set must be similar to those of the day to be forecasted. Choosing as many training pairs as possible is not the correct approach for the following reasons:
i) Load periodicity. The 7 days of a week have rather different patterns. Therefore, using Sundays' load data to train the network which is to be used to forecast Mondays' loads would yield wrong results.
ii) Because loads possess different trends in different periods, recent data is more useful than old data. Therefore, a very large training set which includes old data is less useful to track the most recent trends.

As discussed in i), to obtain good forecasting results, day type information must be taken into account. There are two ways to do this. One way is to construct different ANNs for each day type, and feed each ANN with the corresponding day type training sets [6,15]. The other is to use only one ANN but contain the day type information in the input variables [1,7,11]. The two methods have their advantages and disadvantages. The former uses a number of relatively small size networks, while the latter has only one network of a relatively large size.

In [9], the authors realized that the selection of the training cases significantly affects the forecasting result, and developed a selection method based on the "least distance criteria". Using this approach, the forecasting results have shown significant improvement.

It is worth noting that the day type classification is system dependent. For instance, in some systems, Mondays' load may be similar to that of Tuesdays, but in others this will not be true. A typical classification given in [1] categorizes the historical loads into five classes. These are Monday, Tuesday-Thursday, Friday, Saturday, and Sunday/Public holiday. A different way, used in [2], collects the data with characteristics similar to the day being forecasted, and combines these data with the data from the previous 5 days to form a training set.

In addition to the above conventional day type classification methods, some unsupervised ANN models are used to identify the day type patterns. The unsupervised learning concept, also called self-organization, can be effectively used to discover similarities among unlabeled patterns. An unsupervised ANN is employed in [5,14] to identify the different day types.

In all, because of the great importance of appropriate selection of the training set, several day type classification methods are proposed, which can be categorized into two types. One is the conventional method, which uses observation and comparison [1,2,9]. The other is based on unsupervised ANN concepts and selects the training set automatically [10,14].


Modification of the BP algorithm

The BP algorithm is widely used in STLF and has some good features, such as its ability to easily accommodate weather variables and its implicit expressions relating inputs and outputs. However, it also has some drawbacks. These are its time consuming training process and its convergence to local minima. The authors of [16] report their investigation of the problem and point out that one of the major reasons for these drawbacks is "premature saturation," a phenomenon in which the error remains constant at a significantly high value for some period of time during the learning process. A method to prevent this phenomenon by appropriate selection of the initial weights is proposed in [16].

In [17], the authors discuss the effects of the momentum factor on the algorithm. The original BP algorithm does not have a momentum factor and has difficulty converging. The BP algorithm with momentum (BPM) converges much faster than the conventional BP algorithm. In [3,18], it is shown that the use of the BPM in STLF significantly improves the training process.

The authors of [8] present extensive studies on the effects of various factors, such as the learning step and the momentum factor, on the BPM. They proposed a new learning algorithm for adaptive training of neural networks. This algorithm converges faster than the BPM, and makes the selection of initial parameters much easier.

A new learning algorithm motivated by the principle of "forced dynamic" for the total error function is proposed in [19]. The rate of change of the network weights is chosen
such that the error function to be minimized is forced to "decay" in a certain mode.

Another modified approach to the conventional BP algorithm is proposed in [20]. The modification consists of a new total error function. This error function updates the weights in direct proportion to the total error. With this modification, the periods of stagnation are much shorter and the possibility of being trapped in a local minimum is greatly reduced.


Number of hidden neurons

Determining the optimal number of hidden neurons is a crucial issue. If it is too small, the network cannot possess sufficient information, and thus yields inaccurate forecasting results. On the other hand, if it is too large, the training process will be very long [1].

The authors in [21] discuss the number of hidden neurons in binary value cases. In order to make the mapping between the output value and input pattern arbitrary for I learning patterns, the necessary and sufficient number of hidden neurons is I-1. The authors of [22] also state that a multilayer perceptron with k-1 hidden neurons can realize arbitrary functions defined on a k-element set.

To our knowledge, there are no absolute criteria to determine the exact number of hidden neurons that will lead to an optimal solution. Different numbers of hidden neurons are used in [1,10,11,14].

Based on our experience, the appropriate number of hidden neurons is system dependent, mainly determined by the size of the training set and the number of input variables.


Parameters of the BP algorithm

Three parameters need to be determined before a BP network can be trained and used to forecast. These are:

i) Weights:
The initial weights should be small random numbers. It is proven that if the initial weights in the same layer are equal, the BP algorithm cannot converge [18].

ii) Learning step:
The effectiveness and convergence of the BP algorithm depend significantly on the value of the learning step. However, the optimum value of the learning step is system dependent. For systems which possess broad minima that yield small gradient values, a large value of the learning step will result in a more rapid convergence. However, for a system with steep and narrow minima, a small value of the learning step is more suitable [24].

There are no general rules to obtain an optimal learning step. The values used in [1,4,14] are 0.9, 0.25, and 0.05 respectively.

iii) Momentum factor:
Like the learning step, the momentum factor is also system dependent. The values chosen by [1,4,14] are 0.6, 0.9, and 0.9 respectively.

In contrast to the learning step, whose value can be larger than 1.0, the upper limit of the momentum factor is 1.0 [18]. This upper limit can be obtained from the physical meaning of the momentum factor: it is the forgetting factor of the previous weight changes. The algorithm diverges if a momentum factor greater than 1.0 is used.

The authors of [8] compare the efficiency and accuracy of the neural network using different learning steps and momentum factors, and show that with an adaptive algorithm, the parameters can be chosen from a much wider range.

In our investigation, we have observed that initial weights with values between -0.5 and 0.5 yield good results. As for the learning step and the momentum factor, they should not be fixed but gradually decreased with the increase of the iteration index. Using an adaptive algorithm such as the one proposed by [8] would yield a more stable algorithm.


Conclusions

A summary of an extensive survey of existing ANN-based STLF models is presented. Six factors which are believed to have a considerable effect on the accuracy, reliability, and robustness of the models are emphasized. The surveyed publications and the authors' own experience lead to the conclusion that the ANN structure, input variables, number of hidden neurons, and BP algorithm parameters are mainly system dependent. The development of a more general ANN model to handle the STLF problem is a challenging problem and should be investigated in a timely manner.


References

[1] D. Srinivasan, A neural network short-term load forecaster, Electric Power Research, Vol. 28, pp. 227-234, 1994.
[2] O. Mohammed, Practical Experiences with an Adaptive Neural Network short-term load forecasting system, IEEE/PES 1994 Winter Meeting, Paper # 94 210-5 PWRS.
[3] D.C. Park, Electric load forecasting using an
artificial neural network, IEEE Trans. on Power Systems, Vol. 6, No. 2, pp. 412-449, May 1991.
[4] T.S. Dillon, Short-term load forecasting using an adaptive neural network, Electrical Power & Energy Systems, pp. 186-191, 1991.
[5] M. Djukanovic, Unsupervised/supervised learning concept for 24-hour load forecasting, IEE Proc.-C, Vol. 140, No. 4, pp. 311-318, July 1993.
[6] K.Y. Lee, Short-Term Load Forecasting Using an Artificial Neural Network, IEEE Trans. on Power Systems, Vol. 7, No. 1, pp. 124-131, Feb. 1992.
[7] C.N. Lu, Neural Network Based Short Term Load Forecasting, IEEE Trans. on Power Systems, Vol. 8, No. 1, pp. 336-341, Feb. 1993.
[8] K.L. Ho, Short Term Load Forecasting Using a Multilayer Neural Network with an Adaptive Learning Algorithm, IEEE Trans. on Power Systems, Vol. 7, No. 1, pp. 141-149, Feb. 1992.
[9] T.M. Peng, Advancement in the application of neural networks for short-term load forecasting, IEEE/PES 1991 Summer Meeting, Paper # 451-5 PWRS.
[10] B.S. Kermanshahi, Load forecasting under extreme climatic conditions, Proceedings, IEEE Second International Forum on the Applications of Neural Networks to Power Systems, April 1993, Yokohama, Japan.
[11] S.T. Chen, Weather sensitive short-term load forecasting using nonfully connected artificial neural networks, IEEE/PES 1991 Summer Meeting, Paper # 449-9 PWRS.
[12] G.N. Kariniotakis, Load forecasting using dynamic high-order neural networks, pp. 801-805, Proceedings, IEEE Second International Forum on the Applications of Neural Networks to Power Systems, April 1993, Yokohama, Japan.
[13] J. Villiers, Back-propagation Neural Nets with One and Two Hidden Layers, IEEE Trans. on Neural Networks, Vol. 4, No. 1, pp. 136-146, Jan. 1992.
[14] Y.Y. Hsu, Design of artificial neural networks for short-term load forecasting, IEE Proc.-C, Vol. 138, No. 5, pp. 407-418, Sept. 1991.
[15] A.D. Papalexopoulos, Application of neural network technology to short-term system load forecasting, pp. 796-800, Proceedings, IEEE Second International Forum on the Applications of Neural Networks to Power Systems, April 1993, Yokohama, Japan.
[16] Y. Lee, An Analysis of Premature Saturation in Back Propagation Learning, Neural Networks, Vol. 6, pp. 719-728, 1993.
[17] V.V. Phansalkar, Analysis of the Back-Propagation Algorithm with Momentum, IEEE Trans. on Neural Networks, Vol. 5, No. 3, May 1994.
[18] Y. Rui, P. Jin, The modelling method for ANN-based forecaster, CDC' 94, China, 1994.
[19] G.P. Alexander, An Accelerated Learning Algorithm for Multilayer Perceptron Networks, IEEE Trans. on Neural Networks, Vol. 5, No. 3, pp. 493-497, May 1994.
[20] A.V. Ooyen, Improving the Convergence of the Back-Propagation Algorithm, Neural Networks, Vol. 5, pp. 465-471, 1992.
[21] M. Arai, Bounds on the Number of Hidden Units in Binary-Valued Three-Layer Neural Networks, Neural Networks, Vol. 6, pp. 855-860, 1993.
[22] S.C. Huang, Bounds on the Number of Hidden Neurons in Multilayer Perceptrons, IEEE Trans. on Neural Networks, Vol. 2, No. 1, pp. 47-55, Jan. 1991.
[23] Y. Rui, P. Jin, Power load forecasting using ANN, Journal of Hehai University, 1993.
[24] J.M. Zurada, Introduction to Artificial Neural Systems, West Publishing Company, 1992.
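
As an illustration of how the elements discussed in this paper fit together (a three-layer network with the Sigmoid transfer function, inputs l(k-1) and l(k-24), initial weights between -0.5 and 0.5, a learning step, and a momentum factor), the following is a minimal sketch in Python. It is not the implementation of any surveyed paper. The function names (make_toy_load, make_pairs, train_bpm), the synthetic sinusoidal load, the per-pattern weight update, the number of hidden neurons, and the parameter values 0.25 and 0.6 are all assumptions chosen only for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_toy_load(hours=24 * 14):
    """Synthetic hourly load with a 24-hour cycle (illustrative data, not real load)."""
    return [0.5 + 0.3 * math.sin(2.0 * math.pi * k / 24.0) for k in range(hours)]


def make_pairs(load):
    """Training pairs: inputs [l(k-1), l(k-24)], target l(k)."""
    return [([load[k - 1], load[k - 24]], load[k]) for k in range(24, len(load))]


def train_bpm(pairs, n_hidden=4, step=0.25, momentum=0.6, epochs=100, seed=0):
    """Train a three-layer sigmoid network by back-propagation with momentum.

    Returns (predict, history); history holds the mean squared error per epoch.
    """
    rng = random.Random(seed)
    n_in = len(pairs[0][0])
    # Small random initial weights in [-0.5, 0.5]; the last entry of each row is a bias.
    w_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_o = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    dw_h = [[0.0] * (n_in + 1) for _ in range(n_hidden)]  # previous weight changes
    dw_o = [0.0] * (n_hidden + 1)

    def forward(x):
        xb = list(x) + [1.0]
        hb = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_h] + [1.0]
        y = sigmoid(sum(w * v for w, v in zip(w_o, hb)))
        return xb, hb, y

    history = []
    for _ in range(epochs):
        sq_err = 0.0
        for x, target in pairs:
            xb, hb, y = forward(x)
            err = target - y
            sq_err += err * err
            delta_o = err * y * (1.0 - y)  # output-layer error signal
            # Hidden-layer error signals, computed before the output weights change.
            delta_h = [delta_o * w_o[j] * hb[j] * (1.0 - hb[j]) for j in range(n_hidden)]
            for j in range(n_hidden + 1):
                # New change = step * gradient term + momentum * previous change.
                dw_o[j] = step * delta_o * hb[j] + momentum * dw_o[j]
                w_o[j] += dw_o[j]
            for j in range(n_hidden):
                for i in range(n_in + 1):
                    dw_h[j][i] = step * delta_h[j] * xb[i] + momentum * dw_h[j][i]
                    w_h[j][i] += dw_h[j][i]
        history.append(sq_err / len(pairs))

    return (lambda x: forward(x)[2]), history


if __name__ == "__main__":
    pairs = make_pairs(make_toy_load())
    predict, history = train_bpm(pairs)
    print("first-epoch MSE:", history[0], "last-epoch MSE:", history[-1])
    print("forecast for first pattern:", predict(pairs[0][0]), "target:", pairs[0][1])
```

Setting momentum to 0.0 in this sketch gives the conventional BP update, which makes it easy to compare the two on the same data. Real load and temperature values would need to be scaled into the output range of the Sigmoid function before training.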
