
Forecasting electricity spot price for Nord Pool market with a hybrid k-factor GARMA-LLWNN model

Article in Journal of Forecasting · July 2018
DOI: 10.1002/for.2544

Authors: Souhir Ben Amor (University of Sousse), Heni Boubaker (University of Sousse), Lotfi Belkacem (LaREMFiQ, University of Sousse, Tunisia)


Journal of Forecasting

Forecasting electricity spot price for Nord Pool market with a hybrid k-factor GARMA-LLWNN model

Journal: Journal of Forecasting
Manuscript ID: FOR-17-0129.R1
Manuscript Type: Special Issue Article
Date Submitted by the Author: 26-Jun-2018
Complete List of Authors: Ben Amor, Souhir (Institute of High Commercial Studies of Sousse (IHEC), Quantitative Methods in Finance); Boubaker, Heni (IPAG LAB, IPAG Business School, Applied Mathematics); Belkacem, Lotfi (Institute of High Commercial Studies of Sousse (IHEC), Quantitative Methods in Finance)
Keywords: Electricity price, k-factor GARMA, LLWNN, Hybrid, Forecasting

Abstract: This paper proposes a new hybrid approach, based on the combination of parametric and nonparametric models adopting a wavelet estimation approach, to model and predict electricity prices for the Nord Pool market. Our hybrid methodology consists of two steps. The first step models the conditional mean of the time series, using a generalized fractional model with k Gegenbauer factors, termed the k-factor GARMA model; the parameters of this model are estimated using a wavelet approach based on the discrete wavelet packet transform (DWPT). The second step estimates the conditional variance, for which we adopt the local linear wavelet neural network (LLWNN) model. The proposed hybrid model is tested on the hourly log-returns of the electricity spot price from the Nord Pool market. The empirical results are compared with the predictions of the ARFIMA-LLWNN, the k-factor GARMA-FIGARCH, and the individual LLWNN models. The proposed hybrid k-factor GARMA-LLWNN model outperforms all competing methods; hence, it is a robust tool for forecasting time series.

http://mc.manuscriptcentral.com/for
Forecasting electricity spot price for Nord Pool market with a hybrid k-factor GARMA-LLWNN model

Souhir Ben Amor a,*, Heni Boubaker a,b, Lotfi Belkacem a

Abstract

This paper proposes a new hybrid approach, based on the combination of parametric and nonparametric models adopting a wavelet estimation approach, to model and predict electricity prices for the Nord Pool market. Our hybrid methodology consists of two steps. The first step models the conditional mean of the time series, using a generalized fractional model with k Gegenbauer factors, termed the k-factor GARMA model; the parameters of this model are estimated using a wavelet approach based on the discrete wavelet packet transform (DWPT). The second step estimates the conditional variance, for which we adopt the local linear wavelet neural network (LLWNN) model. The proposed hybrid model is tested on the hourly log-returns of the electricity spot price from the Nord Pool market. The empirical results are compared with the predictions of the ARFIMA-LLWNN, the k-factor GARMA-FIGARCH, and the individual LLWNN models. The proposed hybrid k-factor GARMA-LLWNN model outperforms all competing methods; hence, it is a robust tool for forecasting time series.

Keywords: Electricity price, k-factor GARMA, LLWNN, Hybrid, Forecasting.

* Corresponding author. Email: [email protected]
a Institute of High Commercial Studies of Sousse (IHEC), 40 Route de Ceinture Sahloul III, 4054 Sousse, Tunisia.
b IPAG LAB, IPAG Business School, 184 Boulevard Saint-Germain, 75006 Paris, France.
Nomenclature

ARFIMA: Fractionally Integrated Autoregressive Moving Average model,
ϕ(L) and θ(L): two polynomials of orders p and q of the ARFIMA model (L is the lag operator),
υ_t: white noise of the ARFIMA model,
(1 − L)^d: fractional differencing operator associated with the ARFIMA model,
GARMA: Gegenbauer Autoregressive Moving Average model,
λ: the Gegenbauer frequency,
Φ(L) and Θ(L): polynomials in the delay operator L related to the k-factor GARMA model,
λ_i: the Gegenbauer frequencies, i = 1, 2, …, k,
d_i: the long memory parameters associated with the k-factor GARMA model,
P(L): the Gegenbauer polynomial,
DWPT: the discrete wavelet packet transform,
{h_l}_{l=0}^{L−1} and {g_l}_{l=0}^{L−1}: the Daubechies wavelet and scaling filters, respectively,
W_{j,n}: wavelet packet coefficients; each doublet (j, n) (also called a node) belongs to a wavelet packet tree represented by T = {(j, n): j = 0, …, J; n = 0, …, 2^j − 1},
MODWPT: maximal overlap discrete wavelet packet transform,
FIGARCH: Fractionally Integrated Generalized Autoregressive Conditional Heteroskedasticity model,
y_t: the electricity price time series, t = 1, 2, …, N, where N is the number of observations,
µ_t: the conditional mean of y_t,
ε_t: the white noise of y_t,
σ²_{ε_t}: the conditional variance of the residuals ε_t,
h_t: the conditional variance modelled with a FIGARCH process,
ψ(L) and β(L): the polynomials in the delay operator L related to the FIGARCH model,
ANN: Artificial Neural Network,
WNN: Wavelet Neural Network,
LLWNN: Local Linear Wavelet Neural Network,
ϕ(x): the mother wavelet,
a_i and b_i: the scale parameter and the translation parameter, respectively, of the i-th unit of the hidden layer,
W_i: the local linear model of the weights w_{i0}, w_{i1}x_1, …, w_{in}x_n; i = 1, 2, …, M,
BP: the back-propagation algorithm,
C: the BP objective function to minimize,
θ: the vector of parameters related to the BP algorithm,
e: the error between the output values ŷ and the real values y,
r: the learning rate of the BP algorithm,
c: the convergence criterion of the BP algorithm.
1. Introduction

Electricity price forecasting has become a challenge for all market participants. An accurate electricity price forecast represents an advantage for market players and is important for risk management. More precisely, forecasting electricity prices is crucial for the analysis of cash flows, investment budgets, financial markets, the development of regulatory rules, and integrated resource planning. Thus, modeling and price forecasting can help evaluate bilateral contracts. In addition, electricity price forecasts are extremely important for production companies, which must, in the short term, prepare their offers for the spot market; in the medium term, define their contract policies (or strategies); and, in the long term, define their development plans.

However, the behavior of electricity prices differs from that of other commodity and financial markets, since electricity has specific characteristics that are related to its physical particularities and can considerably affect prices. The most obvious of these differences is that electricity is a non-storable commodity, so a small variation in output over a matter of hours or minutes can generate enormous price changes. In particular, electricity spot prices exhibit seasonality (annual, weekly, and daily), long-term memory, mean reversion, and extreme volatility. Therefore, methods applied to predict other commodity prices are of only limited validity for electricity price prediction and can produce large errors.

As a result, price forecasting can be a complex problem. Electricity price forecasting has been a most challenging task, and it has motivated researchers to develop intelligent and efficient approaches to forecast prices, from which market stakeholders can benefit.

In this framework, the objective of this paper is to describe an adequate model that can capture these characteristics in order to provide an accurate analysis and prediction of the data. More precisely, our research focuses on examining the performance of a hybrid system in resolving the complexity of electricity prices.

In this paper, the log-returns of spot prices in the Nord Pool electricity market are used to illustrate the effectiveness and accuracy of the proposed hybrid model for electricity price forecasting. The remainder of this paper is organized as follows: the next section reviews recent studies in the field; Section 3 presents the econometric methodology adopted in this research; the estimation results are presented in Section 4; Section 5 gives the forecasting results and the comparative study; and Section 6 wraps up the conclusions.
2. Literature Review

Based on market needs, several methods have already been proposed in the literature to model electricity prices. These empirical works tend to focus on several features: mean reversion, spikes, high volatility, and long-memory persistence. An appropriate model for electricity price forecasting must capture the characteristics mentioned above. In the field of electricity price forecasting, two approaches have been applied: the first is econometric or statistical time series models, which are considered parametric tools, and the second is soft computing models, which are considered non-parametric tools.

Among statistical models, the Auto Regressive Integrated Moving Average (ARIMA) model [Contreras et al 2003; and Erdogdu 2007] has been extensively used for load forecasting. However, these models do not take into consideration the long memory behaviour characterizing electricity prices. To overcome this limitation, Granger and Joyeux (1980) and Hosking (1981) introduced the Fractionally Integrated Autoregressive Moving Average (ARFIMA) model, and recent works have applied these methods to electricity prices [Koopman et al 2007; and Saâdaoui et al 2012]. In the spectral domain, these models are characterized by the presence of a peak at very low frequencies, close to the zero frequency. Hence, it is noteworthy that ARFIMA models do not allow for cyclical or persistent periodic behaviour in the data. To grapple with the limitation of such models, Gray et al (1989) introduced a second category of long-memory model, named the generalized long-memory or Gegenbauer Autoregressive Moving Average (GARMA) model, which was proposed to estimate both long memory and seasonality in time series. In the frequency domain, the spectral density of this process is not necessarily unbounded at the origin, as in the case of the ARFIMA model, but can be unbounded at any given frequency λ on the interval [0, π], 0 ≤ λ ≤ π. This frequency is called the Gegenbauer frequency or G-frequency. However, the GARMA model exhibits long-memory periodic behaviour at only one frequency λ, thus implying just a unique persistent cyclical component. Woodward et al (1998) generalized the single-frequency GARMA model to the so-called k-factor GARMA model, which permits the spectral density function to be unbounded not at a single frequency but at a finite number k of frequencies in [0, π], identified as the Gegenbauer frequencies or G-frequencies. The k-factor GARMA model has been applied by several authors to reproduce seasonal patterns as well as persistent effects in stock markets [Boubaker and Sghaier 2015; Caporale and Gil-Alana 2014; and Caporale et al 2012], and in some economic time
series [Ferrara and Guégan 2001]. Despite the compatibility of this model with the characteristics of electricity prices, few applications have addressed the electricity market [Diongue et al 2009; Soares and Souza 2006; and Diongue, Dominique and Bertrand 2004]. In this paper, with the aim of providing an accurate forecast for electricity spot prices, we suggest using the generalized long memory process (k-factor GARMA), which allows us to consider many features observed in electricity spot prices, especially seasonal long memory behaviour.

However, the k-factor GARMA model estimates only the conditional mean of the time series, under the assumption that the residuals are white noise with constant variance over time. Empirical studies show that this hypothesis is not confirmed: the residuals are often characterized by a time-varying variance. For that purpose, some authors suggest extending the k-factor GARMA model using GARCH or FIGARCH processes [Baillie et al 1996; and Bollerslev and Mikkelsen 1996]. Boubaker (2015) includes the GARCH class of models proposed by Engle (1982) and Bollerslev (1986); the resulting model, called the k-factor GARMA-GARCH process, allows for long-memory behaviour related to the k frequencies and includes a GARCH-type model to estimate the time-varying volatility. In addition, Boubaker and Boutahar (2011) suggest the k-factor GARMA-FIGARCH model to reproduce the long-range dependence behaviour in the conditional variance of exchange rates. Recently, Boubaker and Sghaier (2015) developed a new category of semiparametric generalized long-memory processes with FIAPARCH errors, which extends the k-factor GARMA model to include a nonlinear deterministic trend and allows for time-varying variance, applied to some MENA stock markets.

Although statistical time series methods are well recognised to perform well, due to their reliance on linear modelling most of them have difficulty forecasting strongly nonlinear behaviours and rapid price changes. Since the electricity price is a nonlinear function of its input features, the features of electricity prices cannot be completely estimated by statistical linear methods.

In the second category, to grapple with these limitations and capture both the nonlinear patterns and the time-varying variance that exist in real cases, several nonlinear, non-parametric models have been suggested. In this context, artificial neural networks have been extensively studied and used in modelling and forecasting electricity spot prices: Wang and Ramsay (1998) and Szkuta et al (1999) applied artificial neural networks (ANN) to model and
forecast the dynamics of price series. The main advantage of this model is its ability to resolve the complex behaviour related to nonlinear problems; such processes exhibit error tolerance and excellent robustness. Therefore, with little information and without imposing assumptions concerning the nature of the process, ANNs have the capacity to estimate any nonlinear deterministic process. Nevertheless, traditional sigmoid ANNs present some limits. In fact, the initial weight values of the NN model are chosen randomly, and initialization with random weights usually leads to an extension of the training time. Moreover, when a sigmoid transfer function is adopted, the learning algorithm may converge to local minima. To overcome these limitations, another useful technique proposed in recent years is the wavelet-based NN method, in which wavelets are merged with NNs [Pati and Krishnaprasad 1993], termed the Wavelet Neural Network (WNN). Zhang and Benveniste (1992) suggested WNNs, which alleviate the above-mentioned weaknesses associated with the classical NN model. WNNs are networks with one hidden layer that adopt a wavelet as the activation function, as an alternative to the conventional sigmoidal functions. This model has been successfully applied in the field of short-term electricity price forecasting [Bashir and El-Hawary 2000; Benaouda, Murtagh, Starck, and Renaud 2006; Gao and Tsoukalas 2001; Ulugammai, Venkatesh, Kannan, and Padhy 2007; and Yao, Song, Zhang, and Cheng 2000]. However, the major shortcoming of the WNN is that, for higher-dimensional problems, it needs many hidden layer units; hence, it is difficult to apply the WNN to high-dimensional problems. To take advantage of the local capacity of wavelets while not using too many hidden units, an advanced type of wavelet neural network, called the Local Linear Wavelet Neural Network (LLWNN), was developed by Chen et al (2004). This model replaces the connection weights between the hidden layer units and the output units by a local linear model, thus reducing the number of wavelets needed for a given problem compared to the WNN model. In addition, this local capacity offers certain advantages such as efficiency and transparency of the learning structure. In the literature, the LLWNN model has been used for electricity price forecasting [Pany 2011; Chakravarty et al 2012; and Pany et al 2013].

On the other hand, some researchers argue that time series are usually complex in nature and an individual model may not be able to capture their different features, so no single method is best for all situations. Thus, hybrid models, or combinations of several methods, have frequently been adopted to overcome the limitations of individual component models and to enhance prediction accuracy. The fundamental idea of the combination model consists
of using the strength of each model to estimate different patterns in the data. In the literature, the most popular hybrid techniques decompose the time series into linear and nonlinear patterns. In this vein, several authors combine the Auto Regressive Integrated Moving Average (ARIMA) and artificial neural network (ANN) models [Zhang 2003; Khashei and Bijari 2010; Valenzuela et al 2008; and Tseng et al 2002].

However, the aforementioned hybrid methods combine models that are not able to capture some features of electricity time series, such as seasonal long memory behaviour. With the aim of overcoming the shortcomings associated with such models, we combine the k-factor GARMA model with the LLWNN, given the superiority of these models compared to their previous versions. In this way we can benefit from each model's unique strength to estimate the different patterns existing in the electricity time series.

Our proposed hybrid methodology consists of two steps. In the first step, a k-factor GARMA model is used to analyse the conditional mean of the problem. In the second step, an LLWNN model is developed to model the residuals from the k-factor GARMA model, which are used as a proxy for the conditional variance. Hence, this hybrid model takes advantage of the strengths and features of both the k-factor GARMA model and the LLWNN model to estimate different features in the data. Therefore, in contrast to research that adopts either a parametric method or a nonparametric method for forecasting problems in financial and economic series, we adopt a hybrid approach combining these two methods.

To sum up, our method has the peculiarity of modeling simultaneously the conditional mean and the conditional variance of the electricity spot price. We adopt the k-factor GARMA model, which simultaneously accounts for long- and short-term dependence and seasonal fluctuations, to model the conditional mean; concerning the conditional variance, we use the LLWNN model, as a nonlinear and non-parametric method, to forecast the time-varying variance. For comparison purposes, and to assess the effectiveness of our proposed hybrid k-factor GARMA-LLWNN model, we also estimate the individual LLWNN model, the ARFIMA-LLWNN model, and the k-factor GARMA-FIGARCH model.
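The two-step structure described above (fit a conditional-mean model, then fit a neural network on its residuals as a conditional-variance proxy) can be sketched in code. This is a minimal illustrative sketch, not the authors' implementation: an AR(1) least-squares fit stands in for the k-factor GARMA step, and a tiny one-hidden-layer network trained by back-propagation stands in for the LLWNN step; all function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_ar1(y):
    """Step 1 stand-in: least-squares AR(1) fit of the conditional mean.
    Returns the AR coefficient and the residual series."""
    x, z = y[:-1], y[1:]
    phi = float(x @ z) / float(x @ x)
    return phi, z - phi * x

def fit_variance_net(resid, hidden=4, epochs=200, lr=0.01):
    """Step 2 stand-in: train a tiny one-hidden-layer net mapping
    eps_{t-1}^2 -> eps_t^2, a rough proxy for the conditional variance h_t.
    Returns a predictor function."""
    x, y = resid[:-1] ** 2, resid[1:] ** 2
    W1 = rng.normal(scale=0.5, size=hidden)
    b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.5, size=hidden)
    b2 = 0.0
    for _ in range(epochs):
        h = np.tanh(np.outer(x, W1) + b1)        # hidden activations, shape (T, hidden)
        pred = h @ W2 + b2
        err = pred - y
        dh = np.outer(err, W2) * (1.0 - h ** 2)  # back-propagated error signal
        W2 -= lr * (h.T @ err) / len(x)
        b2 -= lr * err.mean()
        W1 -= lr * (dh.T @ x) / len(x)
        b1 -= lr * dh.mean(axis=0)
    return lambda e2: np.tanh(np.outer(e2, W1) + b1) @ W2 + b2
```

In the paper's actual pipeline, the AR(1) would be replaced by the wavelet-estimated k-factor GARMA model of Section 3.1 and the toy network by the LLWNN; only the division of labour (mean first, variance from the residuals) is what this sketch illustrates.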
3. Econometric methodology

3.1. The k-factor GARMA model

The family of autoregressive integrated moving average processes, widely used in the analysis of time series, is generalized by allowing the degree of differencing to take fractional values. Granger and Joyeux (1980) and Hosking (1981) introduced the ARFIMA model as a parametric tool to capture the dynamics of long-range dependence. It is a parsimonious method of modeling the long-term behavior of a time series; compared to existing models of long-term persistence, the family of ARFIMA models offers greater flexibility in simultaneously modeling the short- and long-term behavior of a time series. An ARFIMA(p, d, q) process is represented as follows:

Φ(L)(1 − L)^d (y_t − µ) = Θ(L) υ_t    (1)

where υ_t is a white noise with zero mean and variance σ², Φ(L) and Θ(L) are two polynomials of orders p and q, Φ(L) = 1 + ∑_{i=1}^{p} φ_i L^i and Θ(L) = 1 + ∑_{i=1}^{q} θ_i L^i (L is the lag operator), and (1 − L)^d is a fractional differencing operator.
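To make the fractional differencing operator concrete, here is a short illustrative sketch (not the authors' code) that expands (1 − L)^d with the binomial series, π_0 = 1 and π_j = π_{j−1}(j − 1 − d)/j, and applies it to a finite series; the function names are hypothetical.

```python
import numpy as np

def frac_diff_weights(d, n):
    """First n coefficients pi_j of the expansion (1 - L)^d = sum_j pi_j L^j."""
    w = np.empty(n)
    w[0] = 1.0
    for j in range(1, n):
        w[j] = w[j - 1] * (j - 1 - d) / j
    return w

def frac_diff(x, d):
    """Apply the fractional differencing operator (1 - L)^d to a series x,
    truncating the expansion at the available history."""
    x = np.asarray(x, dtype=float)
    w = frac_diff_weights(d, len(x))
    # y_t = sum_{j=0}^{t} pi_j * x_{t-j}
    return np.array([w[: t + 1] @ x[t::-1] for t in range(len(x))])
```

For d = 1 the weights collapse to (1, −1, 0, …), i.e. the ordinary first difference; for fractional d the weights decay hyperbolically, which is exactly the slow damping that produces long memory.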
In the spectral domain, ARFIMA processes have a peak at very low frequencies, near the zero frequency. Accordingly, these models do not take into consideration the seasonality present in the data. Therefore, Gray et al (1989) introduced a second category of long-memory model, named the generalized long-memory or Gegenbauer Autoregressive Moving Average (GARMA) model, which was proposed to estimate both long memory and seasonality in time series. The autocorrelation function of such a model shows a hyperbolic decay at seasonal lags instead of the slow linear decay that characterizes the seasonal differencing model (SARIMA). On the other hand, in the frequency domain, the spectral density of this process is not necessarily unbounded at the origin, as in the case of the ARFIMA model, but can be unbounded at any given frequency λ on the interval [0, π], 0 ≤ λ ≤ π. This frequency is called the Gegenbauer frequency or G-frequency. However, the GARMA model exhibits long-memory periodic behaviour at only one frequency λ, thus implying just a unique persistent cyclical component. Woodward et al (1998) generalized the single-frequency GARMA model to the so-called k-factor GARMA model, which permits the spectral density function to be unbounded not at a single frequency but associated
with a finite number k of frequencies in [0, π]. The principal characteristic of the k-factor GARMA process is that it enables more diversity in the covariance structure of a variable, witnessed through both the spectral density function and the autocorrelation function, which presents k singularities.

The multiple-frequency GARMA model is defined as follows:

Φ(L) ∏_{i=1}^{k} (I − 2ν_i L + L²)^{d_i} (y_t − µ) = Θ(L) ε_t    (2)

where Φ(L) and Θ(L) are polynomials in the delay operator L such that all the roots of Φ(z) and Θ(z) lie outside the unit circle. The parameters ν_i provide information about the periodic movement in the conditional mean, ε_t is a white noise perturbation sequence with variance σ²_ε, k is a finite integer, |ν_i| ≤ 1, i = 1, 2, …, k, the d_i are the long memory parameters of the conditional mean showing how slowly the autocorrelations are damped, µ is the mean of the process, and λ_i = cos⁻¹(ν_i), i = 1, 2, …, k, are the Gegenbauer frequencies (G-frequencies).

The GARMA model with k frequencies is stationary when |ν_i| < 1 and d_i < 1/2, or when |ν_i| = 1 and d_i < 1/4; the model exhibits long memory when d_i > 0.

The main characteristic of this process is given by the introduction of the Gegenbauer polynomial:

P(L) = ∏_{i=0}^{k} (I − 2ν_i L + L²)^{d_i}    (3)

This polynomial is considered a generalized long memory filter that estimates the cyclical long memory behavior at k + 1 different frequencies. Consider the λ_i as the driving frequencies of a seasonal pattern of length S, λ_i = 2πi/S, and k + 1 = [S/2] + 1, where [.] stands for the integer part.

To highlight the contribution of P(L) at the frequencies λ = 0 and λ = π, equation (3) can be written as:

P(L) = (I − L)^{d_0} (I + L)^{d_{k+1} I(E)} ∏_{i=1}^{k} (I − 2ν_i L + L²)^{d_i}    (4)

where I(E) = 1 if S is even and zero otherwise, and k + 1 = [S/2] + 1 − I(E).

For a GARMA model with a single frequency, when ν = 1 the model reduces to an ARFIMA(p, d, q) model, and when ν = 1 and d = 1/2 the process is an ARIMA model. Finally, when d = 0, we get a stationary ARMA model.

Cheung (1993) determines the spectral density function and proves that for d > 0 it has a pole at λ = cos⁻¹(ν), which varies in the interval [0, π]. It is important to note that when |ν| < 1, the spectral density function of a GARMA process is bounded at the origin, and thus does not suffer from many of the problems associated with ARFIMA models.
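A single Gegenbauer factor can be made concrete through its series expansion: the inverse filter (I − 2νL + L²)^{−d} has coefficients given by the Gegenbauer polynomials C_j^{(d)}(ν), via the standard generating-function identity (1 − 2νz + z²)^{−d} = ∑_j C_j^{(d)}(ν) z^j. The sketch below (illustrative only, not the authors' estimation code; the function name is hypothetical) computes these coefficients with the three-term Gegenbauer recurrence.

```python
import numpy as np

def gegenbauer_coeffs(d, nu, n):
    """First n+1 coefficients c_j of (1 - 2*nu*z + z^2)^(-d) = sum_j c_j z^j,
    i.e. c_j = C_j^{(d)}(nu), computed by the Gegenbauer recurrence:
    c_j = (2*nu*(j + d - 1)*c_{j-1} - (j + 2d - 2)*c_{j-2}) / j."""
    c = np.empty(n + 1)
    c[0] = 1.0
    if n >= 1:
        c[1] = 2.0 * d * nu
    for j in range(2, n + 1):
        c[j] = (2.0 * nu * (j + d - 1.0) * c[j - 1]
                - (j + 2.0 * d - 2.0) * c[j - 2]) / j
    return c
```

These coefficients are the MA(∞) weights of a pure Gegenbauer process y_t = (I − 2νL + L²)^{−d} ε_t; their slow hyperbolic decay for 0 < d < 1/2 is what produces the persistent cycle at the G-frequency λ = cos⁻¹(ν).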
3.2. Wavelet-based estimation procedure

To estimate the k-factor GARMA model, we adopt an estimation procedure based on wavelets, following the methodology proposed by Whitcher (2004). Indeed, the complex behavior characterizing the electricity spot price has stimulated researchers to develop and test methods of statistical analysis that achieve a parsimonious model with greater accuracy. Recently, wavelet analysis has been suggested as a very practical tool for dealing with complex phenomena. The ability of wavelets to capture variability over both time and scale can provide more insight into the nature of data on energy markets, and these techniques can be effectively applied to decompose a time series into different time scales. Hence, wavelet analysis is likely to reveal seasonality, discontinuities, and volatility clustering [Gençay et al 2002, 2003; Gençay, Selçuk and Whitcher 2001a, 2001b, 2001c]. Compared to Fourier analysis, the strength of the wavelet methodology lies in its capacity to localize a process simultaneously in both time and frequency. To illustrate, at high scales the wavelet exhibits a small, centralized time support, enabling it to concentrate on short-lived time phenomena, while at low scales the wavelet possesses a large support, allowing it to analyse long-range dependence behaviour. By moving from low scales to high scales, the wavelet zooms in on a process's behaviour, detecting jumps, singularities, and cusps [Mallat and Zhang 1993; and Mallat 1999]. Accordingly, the wavelet transform adapts itself intelligently to capture patterns across a wide range of times and frequencies. For these reasons, wavelet transforms are applied to energy market data to capture nonlinear patterns and hidden patterns that exist between variables.
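The scale-by-scale decomposition described above can be illustrated with the simplest orthonormal wavelet, the Haar wavelet. The following is a minimal sketch (illustrative only, not the authors' code; function names are hypothetical): each pyramid level splits the series into a low-pass approximation (long-run behaviour) and a high-pass detail (short-lived phenomena).

```python
import numpy as np

def haar_dwt(x):
    """One pyramid level of the orthonormal Haar DWT on a length-2^J signal:
    returns (approximation, detail) coefficient vectors, each half as long."""
    x = np.asarray(x, dtype=float)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)  # low-pass: smooth, long-run part
    detail = (x[1::2] - x[0::2]) / np.sqrt(2.0)  # high-pass: local changes, jumps
    return approx, detail

def haar_decompose(x, levels):
    """Multi-level decomposition: returns the list of detail vectors
    (finest scale first) plus the final coarse approximation."""
    details = []
    for _ in range(levels):
        x, d = haar_dwt(x)
        details.append(d)
    return details, x
```

Because the transform is orthonormal, the energy of the series is exactly redistributed across scales, which is what allows features such as seasonality at one scale and jumps at another to be inspected separately.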
Page 11 of 38 Journal of Forecasting

1
2
3 The discrete wavelet packet transform (DWPT) [Whitcher 2004] generalized the discrete
4
5 wavelet transform DWT that splits the entire frequency band [ 0,1 2] into individual and
6
7 regularly spaced intervals. For a given temporal series X of dyadic length N = 2 J , the j th
8
9
level of DWPT is an orthonormal transform giving a vector of dimension N of wavelet packet
10
coefficients (W j , 2 j −1 ,W j , 2 j − 2 ,KW j ,0 )′ where each W j ,n , n = 0,K,2 j − 1 , has N j = N / 2 j
11
12
13 dimension and it is associated with the frequency interval κ j , n = n / 2 j +1 , (n + 1) / 2 j +1 . [ ]
14
15
Let {hl }l = 0 and {g l }l = 0 the Daubechies wavelet and scaling filters respectively, starting with
L −1 L −1
16
17
18 the recursion X = W0,0 , the t th elements of W j ,n are calculated by the following steps of
19
filtering
$$W_{j,n,t} = \sum_{l=0}^{L-1} u_{n,l}\, W_{j-1,\,[n/2],\,(2t+1-l) \bmod N_{j-1}}, \qquad t = 0, \ldots, N_j - 1 \qquad (5)$$

where

$$u_{n,l} = \begin{cases} g_l & \text{if } n \bmod 4 = 0 \text{ or } 3 \\ h_l & \text{if } n \bmod 4 = 1 \text{ or } 2 \end{cases} \qquad (6)$$

Here $[\,\cdot\,]$ denotes the integer part operator.
It is worth noting that the collection of doublets $(j, n)$ (also called nodes) is known as a wavelet packet tree and is represented by $T = \{(j, n) : j = 0, \ldots, J;\ n = 0, \ldots, 2^j - 1\}$. An orthonormal basis $B \subset T$ is obtained when a collection of DWPT coefficient vectors gives a disjoint, non-overlapping complete covering of the frequency range $[0, 1/2]$, called a "disjoint dyadic decomposition". Hence, in matrix notation, a vector of DWPT coefficients is obtained via $w_B = W_B X$, where $W_B$ is an orthonormal $N \times N$ matrix defining the DWPT through the basis $B$. To identify the "best basis" $B$ among all possible orthonormal partitions, we use statistical white noise (portmanteau) tests, following the method of Boubaker (2015). Analogously to the maximal overlap discrete wavelet transform (MODWT), the down-sampling step in the DWPT can also be removed, using a variant of this transform known as the maximal overlap DWPT (MODWPT), which relies on rescaled versions of the filters $u_{n,l}$ introduced above.
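The filtering recursion of equations (5)-(6) can be sketched directly in code. The following minimal numpy illustration is our own and uses the length-2 Haar filter pair rather than a longer Daubechies filter, purely for brevity:

```python
import numpy as np

# Haar scaling filter {g_l} and wavelet filter {h_l}.
g = np.array([1.0, 1.0]) / np.sqrt(2.0)
h = np.array([1.0, -1.0]) / np.sqrt(2.0)

def dwpt_level(parent, n):
    """Compute W_{j,n} from its parent W_{j-1,[n/2]} via equation (5)."""
    u = g if n % 4 in (0, 3) else h        # filter choice of equation (6)
    Nj = len(parent) // 2
    child = np.zeros(Nj)
    for t in range(Nj):
        for l, ul in enumerate(u):
            child[t] += ul * parent[(2 * t + 1 - l) % len(parent)]
    return child

X = np.arange(8, dtype=float)              # W_{0,0}: a dyadic-length series
W10 = dwpt_level(X, 0)                     # low-pass node (j = 1, n = 0)
W11 = dwpt_level(X, 1)                     # high-pass node (j = 1, n = 1)

# The transform is orthonormal, so the energy of the series is preserved.
print(np.allclose(np.sum(W10**2) + np.sum(W11**2), np.sum(X**2)))
```

Iterating `dwpt_level` over the children of each node builds the full wavelet packet tree described above.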
In sum, as mentioned above, electricity price series are characterized by periodic long-memory behavior (Weron, 2006). For that purpose, the k-factor GARMA model seems appropriate for analyzing the periodic long-memory behavior that characterizes such series. Indeed, this model has the ability to estimate the seasonality at different frequencies, which ensures a better fit to the data and consequently enhances the forecasting results. Moreover, we adopt a wavelet estimation approach, whose analytic capacity is able to resolve the intricate features of electricity prices.
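The Gegenbauer filter $(1 - 2\nu L + L^2)^{d}$ that gives the k-factor GARMA model (equation (8) below) its periodic long memory admits a standard power-series expansion. The following sketch is our own illustration of that classical recursion, not the wavelet-based estimator used in the paper; the parameter values are chosen merely in the spirit of the estimates reported later:

```python
import numpy as np

# Coefficients of (1 - 2*nu*L + L^2)^(-d) = sum_j C_j(d, nu) L^j via the
# classical Gegenbauer recursion:
#   C_0 = 1,  C_1 = 2*nu*d,
#   C_j = (2*nu*(j + d - 1) C_{j-1} - (j + 2*d - 2) C_{j-2}) / j.
def gegenbauer_coeffs(d, nu, n):
    C = np.zeros(n)
    C[0] = 1.0
    if n > 1:
        C[1] = 2.0 * nu * d
    for j in range(2, n):
        C[j] = (2.0 * nu * (j + d - 1.0) * C[j - 1]
                - (j + 2.0 * d - 2.0) * C[j - 2]) / j
    return C

d, nu = 0.2657, np.cos(2.0 * np.pi * 0.1295)   # illustrative values
C = gegenbauer_coeffs(d, nu, 200)

# Sanity check against the generating function at a point well inside
# the radius of convergence.
z = 0.1
series = np.polyval(C[::-1], z)                # sum_j C_j z^j
closed = (1.0 - 2.0 * nu * z + z * z) ** (-d)
print(abs(series - closed) < 1e-10)
```

The coefficients $C_j$ decay hyperbolically, which is exactly the long-memory signature that distinguishes the Gegenbauer filter from short-memory ARMA filters.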
However, the k-factor GARMA model exhibits two main shortcomings. The first is that it does not handle a nonlinear deterministic trend. Through simulation experiments, Beran (1999) shows that omitting the deterministic trend in a fractionally integrated model leads to high variability and serious bias in its parameters. The second is that it assumes the residuals are white noise with constant variance over time, since it focuses only on modelling the conditional mean of the time series. The empirical study shows that this hypothesis is not confirmed: the residuals are often characterized by a time-varying variance, the series exhibiting both high and low volatility periods. Hence, we cannot rely on the k-factor GARMA model alone to capture all the features of electricity prices. To reproduce these patterns, we extend the k-factor GARMA model described above by inserting a fractional filter in the equation of the conditional variance. For this reason, we suggest the k-factor GARMA-FIGARCH model, which can estimate long-memory behavior in the conditional mean as well as in the conditional variance [Boubaker and Boutahar 2011; Boubaker and Sghaier 2015].
3. 3. The k-factor GARMA-FIGARCH model

Baillie et al (1996) and Bollerslev and Mikkelsen (1996) introduced the Fractionally Integrated Generalized Autoregressive Conditional Heteroscedasticity (FIGARCH) model in order to capture the persistence of the conditional variance. In the time domain, in contrast to short-memory ARMA models, whose autocorrelation function decays at a geometric rate, the autocorrelation function of a FIGARCH model is characterized by a slow, hyperbolic decay. In the spectral domain, these models show a peak at a very low frequency, close to the zero frequency. Thus, we estimate the k-factor GARMA process with FIGARCH-type innovations to take into account the presence of long-memory behavior in the conditional variance. This model is written as follows:
$$y_t = \mu_t + \varepsilon_t = \mu_t + \sigma_t z_t \qquad (7)$$

where $\mu_t$ is the conditional mean of $y_t$, modeled using the following k-factor GARMA process:

$$\Phi(L) \prod_{i=1}^{k} (I - 2\nu_i L + L^2)^{d_i} (y_t - \mu) = \Theta(L)\, \varepsilon_t \qquad (8)$$

$$\varepsilon_t \mid I_{t-1} \sim N(0, \sigma_{\varepsilon_t}^2) \qquad (9)$$

where $\sigma_{\varepsilon_t}^2$ is the conditional variance, $I_{t-1}$ is the information set up to time $t-1$, and $z_t$ is an i.i.d. random variable with zero mean and unit variance.

$$h_t = \omega + \left[ 1 - \beta(L)^{-1}\, \psi(L)\, (1 - L)^{\delta} \right] \varepsilon_t^2 \qquad (10)$$

where $\psi(L) = 1 - \sum_{i=1}^{s} \psi_i L^i$ and $\beta(L) = 1 - \sum_{i=1}^{r} \beta_i L^i$ are suitable polynomials in the lag operator $L$, with roots assumed to lie outside the unit circle, and $0 < \delta < 1$ is the fractional differencing (long-memory) parameter. The FIGARCH$(r, \delta, s)$ model nests the GARCH$(r, s)$ and integrated GARCH (IGARCH) models, in the sense that for $\delta = 0$ the FIGARCH model reduces to a GARCH model, while for $\delta = 1$ it becomes an IGARCH model.
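The variance equation (10) implies an ARCH($\infty$) representation $h_t = \omega + \sum_j \lambda_j \varepsilon_{t-j}^2$ whose weights decay hyperbolically. The sketch below is our own, for the special case $r = s = 1$, and simply composes the truncated power series of the three filters in equation (10); the numeric parameter values are illustrative:

```python
import numpy as np

# Truncated ARCH(inf) weights lambda_j of a FIGARCH(1, delta, 1) model,
# with lambda(L) = 1 - beta(L)^(-1) psi(L) (1 - L)^delta,
# psi(L) = 1 - psi1*L and beta(L) = 1 - beta1*L as in equation (10).
def figarch_weights(delta, psi1, beta1, n=100):
    j = np.arange(1, n)
    frac = np.concatenate(([1.0], np.cumprod((j - 1.0 - delta) / j)))  # (1-L)^delta
    psi = frac.copy()
    psi[1:] -= psi1 * frac[:-1]                   # psi(L) * (1-L)^delta
    beta_inv = beta1 ** np.arange(n)              # beta(L)^(-1) as a power series
    lam = -np.convolve(beta_inv, psi)[:n]         # -beta^(-1) psi (1-L)^delta
    lam[0] += 1.0                                 # 1 - beta^(-1) psi (1-L)^delta
    return lam

lam = figarch_weights(delta=0.4, psi1=0.2, beta1=0.5)
print(lam[:4])    # lambda_0 = 0 by construction; later weights decay slowly
```

The slow (hyperbolic rather than geometric) decay of these weights is the long-memory feature in the conditional variance discussed above.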
Although the k-factor GARMA-FIGARCH model is well established and performs well, as a parametric tool it captures all of its information about the data within its parameters, so predictions of future values from the current state of the model depend only on those parameters. A nonparametric model, by contrast, can capture more subtle aspects of the data: it allows more information to pass from the current data set attached to the model into predictions of future values. In this case, the parameters are usually said to be infinite-dimensional, so such a model can express the features of the data much better than a parametric one; it has more degrees of freedom and is more flexible, and observing more data helps it make better predictions about future values. Therefore, given the nonlinear and nonstationary behaviour of electricity prices, soft computing models are better equipped for forecasting such series. In this framework, the Local Linear Wavelet Neural Network (LLWNN) is adopted for electricity price prediction. This nonparametric model can improve the predictive results thanks to its favourable ability to model nonstationary, nonlinear and high-frequency signals such as electricity prices.
3. 4. A local linear wavelet neural network (LLWNN) model

The Local Linear Wavelet Neural Network (LLWNN) was developed by Chen et al (2004) and has shown its accuracy for time series forecasting compared with the traditional wavelet neural network (WNN). In this model, a local linear model replaces the connection weights between the hidden layer units and the output units, and the number of neurons in the hidden layer is equal to the number of inputs. Recall that in wavelet transform theory the wavelets take the following form:

$$\varphi = \left\{ \varphi_i = |a_i|^{-1/2}\, \varphi\!\left( \frac{x - b_i}{a_i} \right) : a_i, b_i \in \mathbb{R}^n,\ i \in \mathbb{Z} \right\} \qquad (11)$$
where

$$x = (x_1, x_2, \ldots, x_n), \quad a_i = (a_{i1}, a_{i2}, \ldots, a_{in}), \quad b_i = (b_{i1}, b_{i2}, \ldots, b_{in}) \qquad (12)$$

are families of functions generated from a single function $\varphi(x)$ by the operations of translation and dilation. The function $\varphi(x)$, which is localized in both the frequency and time scales, is termed the mother wavelet, and the parameters $a_i$ and $b_i$ are called the scale and translation parameters, respectively. The vector $x$ represents the inputs introduced to the WNN model.

In the traditional WNN, the output is given as:

$$f(x) = \sum_{i=1}^{M} \omega_i \varphi_i(x) = \sum_{i=1}^{M} \omega_i\, |a_i|^{-1/2}\, \varphi\!\left( \frac{x - b_i}{a_i} \right) \qquad (13)$$
where $\varphi_i$ denotes the wavelet activation function of the $i$th unit of the hidden layer, and $\omega_i$ represents the connection weight from the $i$th unit of the hidden layer to the output layer unit. Note that the multivariate wavelet basis function for the $n$-dimensional input space can be determined by calculating the product of $n$ univariate wavelet basis functions as follows:

$$\varphi(x) = \prod_{i=1}^{n} \varphi(x_i) \qquad (14)$$
Obviously, the scale parameter $a_i$ and the translation parameter $b_i$ determine the location of the $i$th unit of the hidden layer. According to previous research, these two parameters can be predefined either by a learning algorithm or on the basis of wavelet transform theory. The main feature of networks with such basis functions is the local activation of the hidden layer units: the connection weights attached to the units can be considered locally significant constant models whose validity, for a given input, is specified by the activation functions. To take advantage of the local nature of the approximation, several basis functions are adopted to approximate a given system. One limitation of the WNN model is that, for high-dimensional problems, many hidden layer units are needed.

To minimize the number of hidden units while retaining the local capacity of the wavelet basis functions, another category of WNN is introduced, termed the local linear wavelet neural network (LLWNN). In this model, a local linear model is introduced between the hidden layer units and the output units in place of the simple connection weights, and the number of neurons in the hidden layer is equal to the number of inputs. In the literature it is recognized that local linear models offer a more parsimonious interpolation in high-dimensional spaces, which makes them suitable for forecasting. The local capacity of the LLWNN model offers several advantages, such as transparency and efficiency of the learning structure.

Figure 1 illustrates the architecture of the LLWNN model; its output from the output layer is computed by the following equation:
$$y = \sum_{i=1}^{M} (w_{i0} + w_{i1} x_1 + \cdots + w_{in} x_n)\, \varphi_i(x) = \sum_{i=1}^{M} (w_{i0} + w_{i1} x_1 + \cdots + w_{in} x_n)\, |a_i|^{-1/2}\, \varphi\!\left( \frac{x - b_i}{a_i} \right) \qquad (15)$$

where $x = [x_1, x_2, \ldots, x_n]$. A linear model replaces the simple weight $w_i$ (a locally constant model):

$$W_i = w_{i0} + w_{i1} x_1 + \cdots + w_{in} x_n \qquad (16)$$
The linear models $W_i$ $(i = 1, 2, \ldots, M)$ are weighted by the associated locally active wavelet functions $\varphi_i(x)$ $(i = 1, 2, \ldots, M)$; thus each $W_i$ is locally significant. The motivations for introducing local linear models into the WNN model are the following: (1) local linear models have been adopted in several neuro-fuzzy systems and have shown good performance; and (2) in large spaces with dispersed samples, local linear models offer a more parsimonious interpolation.
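The output computation of equations (15)-(16) can be sketched compactly. The code below is our own minimal illustration; in particular the Mexican-hat mother wavelet and the random parameter values are assumptions for demonstration, since the paper does not fix a specific mother wavelet at this point:

```python
import numpy as np

def mother(z):
    # Mexican-hat mother wavelet (illustrative choice).
    return (1.0 - z ** 2) * np.exp(-0.5 * z ** 2)

def llwnn_output(x, a, b, w):
    """Equation (15): x (n,) input; a, b (M, n) scale/translation;
    w (M, n+1) local linear weights, w[i] = (w_i0, w_i1, ..., w_in)."""
    y = 0.0
    for i in range(len(a)):
        z = (x - b[i]) / a[i]                                  # (x - b_i) / a_i
        phi_i = np.prod(mother(z)) / np.sqrt(np.prod(np.abs(a[i])))  # |a_i|^(-1/2) phi(.)
        W_i = w[i, 0] + np.dot(w[i, 1:], x)                    # local linear model (16)
        y += W_i * phi_i
    return y

rng = np.random.default_rng(0)
n = 3                            # number of inputs = number of hidden units M
a = rng.uniform(0.5, 2.0, size=(n, n))
b = rng.normal(size=(n, n))
w = rng.normal(size=(n, n + 1))
y = llwnn_output(rng.normal(size=n), a, b, w)
print(np.isfinite(y))
```

Each hidden unit contributes only where its wavelet is active, so the local linear models $W_i$ partition the input space exactly as described above.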
Figure 1: Local Linear Wavelet Neural Network architecture.
Initially, the translation and scale parameters of the local linear model are randomly initialized; these parameters are then optimized using a learning algorithm. For example, in an LLWNN with $p$ inputs and $l$ hidden units, there are $(2p + p + 1) \cdot l$ parameters ($2p$ scale and translation parameters plus $p + 1$ local linear weights per hidden unit). Assume the parameters are collected into a single vector $\theta$. For the training of the model, the objective function to minimize is:

$$C = \frac{1}{2}\, E\left\{ (y - \hat{y}(\theta))^2 \right\} \qquad (17)$$

These parameters are initialized randomly; a learning algorithm is then applied to obtain the optimal values of $\theta$ (see figure 2). If, for example, a gradient algorithm is used, we have:

$$e = \hat{y} - y \qquad (18)$$
$$\frac{\partial C}{\partial w_{j0}} = e\, \varphi_j(x), \qquad (19)$$

$$\frac{\partial C}{\partial w_{ji}} = e\, \varphi_j(x)\, x_i, \qquad (20)$$

$$\frac{\partial C}{\partial a_{ji}} = e\, W_j \left[ -\frac{1}{2}\, |a_{ij}|^{-3/2}\, \varphi(z_i) - |a_{ij}|^{-1/2}\, \frac{z_i}{a_{ij}}\, \varphi'(z_i) \right] \frac{\varphi_j(x)\, |a_{ij}|^{1/2}}{\varphi(z_i)}, \qquad (21)$$

$$\frac{\partial C}{\partial b_{ji}} = e\, W_j\, |a_{ij}|^{-1/2} \left( -\frac{1}{a_{ij}} \right) \varphi'(z_i)\, \frac{\varphi_j(x)\, |a_{ij}|^{1/2}}{\varphi(z_i)}, \qquad (22)$$

with

$$z_i = (x_i - b_{ij})/a_{ij}, \qquad 1 \le i \le p, \quad 1 \le j \le l, \qquad (23)$$
$\theta$ is updated recursively after the arrival of each data element and its output $\{P, y\}$. To control the learning process of the gradient algorithm, two further parameters must be set: the learning rate $r$ and the convergence criterion $c$. The learning rate $r$ determines the convergence speed of the algorithm; the gradient of each parameter in the equations above is multiplied by the learning rate $r$ and then used to adjust that parameter. For example, at each iteration $w_{j0}$ is updated as follows:

$$w_{j0} \leftarrow w_{j0} - r\, \frac{\partial C}{\partial w_{j0}} \qquad (24)$$

The convergence criterion $c$ indicates how many times the training sample should be passed through the training process; it can be an exact number of iterations, or a threshold to be attained (for example, that for two consecutive iterations the prediction error is less than 0.0001).
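A compressed version of this training loop is sketched below. It is our own illustration: to keep it short we substitute finite-difference gradients for the analytic formulas (19)-(22), use a Gaussian-derivative wavelet for a single hidden unit, and stop on a fixed iteration count; all of these are simplifying assumptions.

```python
import numpy as np

def predict(theta, x):
    # One-input, one-hidden-unit LLWNN: theta = (w0, w1, a, b).
    w0, w1, a, b = theta
    z = (x - b) / a
    phi = -z * np.exp(-0.5 * z ** 2) / np.sqrt(abs(a))  # |a|^(-1/2) phi((x-b)/a)
    return (w0 + w1 * x) * phi

def cost(theta, xs, ys):
    return 0.5 * np.mean((ys - predict(theta, xs)) ** 2)  # objective (17)

rng = np.random.default_rng(1)
xs = rng.uniform(-1.0, 1.0, 200)
ys = np.sin(2.0 * xs)                        # toy target

theta = np.array([0.5, 0.5, 1.0, 0.0])       # initial parameters
r, fd = 0.02, 1e-6                           # learning rate, FD step
c0 = cost(theta, xs, ys)
for _ in range(200):                         # convergence criterion: max iterations
    grad = np.zeros_like(theta)
    for k in range(len(theta)):
        tp = theta.copy()
        tp[k] += fd
        grad[k] = (cost(tp, xs, ys) - cost(theta, xs, ys)) / fd
    theta -= r * grad                        # update rule of equation (24)
print(cost(theta, xs, ys) < c0)
```

In the full model, equations (19)-(22) supply the same gradients analytically, which is far cheaper than finite differences when the network has many parameters.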
45
46
47 The difference between WNN model and LLWNN model related on the parameter Wi . For
48
49 WNN model, the parameter Wi is a simple weight and for LLWNN model, this parameter is a
50
51 local linear model. This allows the LLWNN to cover a larger surface in the input space than
52
WNN. Accordingly, the LLWNN can resolve the high dimension problem since it can reduce
53
54 the number of free parameters and hidden units. In addition, the LLWNN model performs a
55
56 good estimation results even if the size of training samples is small.
Hence, for the LLWNN model only a small number of wavelet basis functions is needed for a given problem to obtain sufficient accuracy, because local linear models are more powerful than constant weights. Moreover, the translation and dilation parameters of the LLWNN are randomly generated and then optimized, without any predetermination.

In conclusion, the characteristic of this network is that a local linear model is introduced in place of the simple weights. The network decomposes a complex nonlinear system into a set of locally active sub-models and then integrates these sub-models via their associated wavelet basis functions. An advantage of this model is that fewer wavelets are needed for a given problem than in the case of the WNN model.
Figure 2: Local Linear Wavelet Neural Network model based on the back-propagation algorithm.
In fact, both the k-factor GARMA model, as a powerful statistical method, and the LLWNN model, as an advanced artificial intelligence (AI) method, have demonstrated their performance in the nonlinear parametric and nonparametric domains, respectively. However, neither is a universal model suitable for all circumstances. In other words, a time series is usually complex in nature, and an individual model may not be able to capture all the different features in the data, so no single method is best for all situations. Thus, hybrid models, combining several methods, have frequently been adopted to overcome the limitations of the component models and to enhance prediction accuracy. The fundamental idea of a combined model is to use the strength of each model to estimate different patterns in the data.
3. 5. Hybrid k-factor GARMA-LLWNN model

In the literature, several combination techniques have been presented, such as the traditional hybrid model of Zhang (2003), the artificial neural network of Khashei et al (2010), and the generalized hybrid model of Khashei et al (2011). These methods use the ARIMA model as the linear component and a multi-layer perceptron neural network as the nonlinear component. They have shown their effectiveness empirically, insofar as they can improve the accuracy of predictions made by either of the methods used separately. However, they have also been criticized: they exploit ARIMA modeling to predict the linear trend in the data, but predictions using the ARIMA(p, d, q) model have not always proved very effective. The main criticism concerns the modeling of short-term relationships only (short memory), while ignoring the seasonal effects and the long memory that characterize most financial and economic series.

To overcome this limitation, the k-factor GARMA model offers greater flexibility by simultaneously modeling the short- and long-term behavior of a seasonal time series.

33
34
35 On the other hand, the hybrid methods existing in the literature neglected the modeling of
36
iew

volatility, a phenomenon that characterizes most financial series. In fact, a good forecast must
37
38 take into account the time varying variance. Thus, the LLWNN approach has been proposed
39
40
to consider the time varying of conditional variance. The choice of LLWNN in our hybrid
41 model is motivated by the wavelet decomposition and its local linear modeling ability.
42
43
44
Furthermore, the previous hybrid models assume that the nonlinear relations are exist only in
45 the residuals and the two components (linear and nonlinear) must be modeled separately, they
46
47 assume that there are no nonlinear relations in the averages since they are always estimated
48
using a linear model in the first step. To overcome this limitation, we use the k -factor
49
50 GARMA model to estimate the nonlinear components as well as the seasonal long memory
51
52 behavior existing in the conditional mean, and then we model the residuals from the k -factor
53 GARMA model, which used as a proxy for the corresponding volatility, using an LLWNN
54
55 model. In other words, the first step consists in modeling the conditional mean using a non-
56
57
58 19
59
60 http://mc.manuscriptcentral.com/for
Journal of Forecasting Page 20 of 38

1
2
3 linear parametric model ( k -factor GARMA). However, residuals are important in forecasting
4
time series; they may contain some information that is able to improve forecasting
5
6 performance. Thus, in the second step, the residuals resulting from the first step will be
7
8 treated according to a local linear wavelet neural network LLWNN (see figure 3).
In our hybrid method, we combine two models with different characteristics in order to estimate the different features existing in the data; we thus adopt a combination of parametric and nonparametric models. In fact, it is reasonable to consider a time series to be composed of two components. The first has a parametric form with unknown parameters, for which a parametric method seems appropriate. The second component relates to the residuals; this part usually follows no specific process, so it is difficult to determine an appropriate model for it. For this reason, a nonparametric model seems appropriate for modelling the residuals. This choice is motivated by the fact that nonparametric models reduce modelling bias by imposing no specific model structure beyond certain smoothness assumptions, and they are therefore particularly useful when we have little information or want to be flexible about the underlying model. To illustrate, a time series can be written as:

$$y_t = \mu_t + \varepsilon_t \qquad (25)$$
where $\mu_t$ denotes the conditional mean of the time series and $\varepsilon_t$ the residuals. In the first step, the aim is the parametric modelling; the k-factor GARMA model is therefore used to reproduce the conditional mean. In the second step, the residuals from the parametric model are used as a proxy for the corresponding volatility and modeled using the LLWNN model. Let $\varepsilon_t$ denote the residual at time $t$ from the k-factor GARMA model; then

$$\varepsilon_t = y_t - \hat{\mu}_t \qquad (26)$$

where $\hat{\mu}_t$ is the fitted value from the estimated relationship (equation 2). Residuals are also important in diagnosing the sufficiency of parametric models: a parametric model is not sufficient if an autocorrelation structure is still left in the residuals, in addition to a time-varying variance.

The fitted values and the residuals of the parametric modelling are the results of the first stage and are used in the next stage. In the second stage, the aim is to model the conditional variance using the LLWNN with $n$ input nodes; the LLWNN for the residuals is:

$$\varepsilon_t = f(\varepsilon_{t-1}, \varepsilon_{t-2}, \ldots, \varepsilon_{t-n}) \qquad (27)$$

where $f$ is a nonlinear, nonparametric function determined by the neural network, with reference to the current state of the data, during training. The output layer of the network (equation 15) gives the forecasting results; hence:

$$\hat{y}_t = \hat{\mu}_t + \hat{\varepsilon}_t \qquad (28)$$
Hence, this global prediction represents the result of forecasting both the conditional mean and the conditional variance of the time series.
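The two-step logic of equations (26)-(28) can be sketched end to end. The code below is a schematic of our own in which deliberately simple stand-ins replace each stage: an OLS AR(1) fit plays the role of the k-factor GARMA conditional mean, and a one-lag kernel regression plays the role of the trained LLWNN residual model.

```python
import numpy as np

rng = np.random.default_rng(2)
y = np.zeros(300)
for t in range(1, 300):                      # toy autocorrelated series
    y[t] = 0.7 * y[t - 1] + rng.normal(scale=0.1)

# Step 1: parametric conditional mean mu_hat_t (stand-in: OLS AR(1)).
phi = np.dot(y[1:], y[:-1]) / np.dot(y[:-1], y[:-1])
mu_hat = phi * y[:-1]
eps = y[1:] - mu_hat                         # residuals, equation (26)

# Step 2: nonparametric model of eps_t from eps_{t-1} (equation (27));
# a crude Gaussian-kernel average stands in for the trained LLWNN.
def f_hat(e_prev, E_prev, E_next, h=0.05):
    wts = np.exp(-0.5 * ((E_prev - e_prev) / h) ** 2)
    return np.dot(wts, E_next) / wts.sum()

eps_hat = f_hat(eps[-1], eps[:-1], eps[1:])
y_hat = phi * y[-1] + eps_hat                # combined forecast, equation (28)
print(np.isfinite(y_hat))
```

The point of the sketch is the data flow, not the stand-in models: stage one produces fitted means and residuals, and stage two forecasts the residual, the two being summed as in equation (28).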
Figure 3: The hybrid k-factor GARMA-LLWNN model.
To conclude, the proposed hybrid model exploits the originality and strength of both the k-factor GARMA model and the LLWNN model to detect the different features existing in the data, benefiting from the complementary characteristics of the models of which it is composed. Thus, the proposed hybrid model can provide a more general and accurate method than other hybrid models. For comparison purposes, and to assess the effectiveness of our proposed hybrid k-factor GARMA-LLWNN model, we also estimate the individual LLWNN model, the ARFIMA-LLWNN model, and the k-factor GARMA-FIGARCH model (see figure 4).
Figure 4: A schematic representation of the adopted econometric methodology.
4. Estimation results

4. 1. The Nord Pool Electricity Market and Preliminary study

The Nordic electricity market, known as the Nord Pool market, is an exchange dedicated to electricity. Created in 1992, it includes the three Scandinavian countries (Norway, Sweden and Denmark) plus Finland. The market began functioning officially in 1993 with Norway as the only area; Sweden joined in January 1996, and Finland was fully integrated in March 1999. The spot price is the equilibrium price, computed as the point of equilibrium for each of the 24 hours. It is the single price throughout the Nordic region and is determined where the supply and demand curves intersect. If the data do not define an equilibrium point, the transactions do not take place.
In the electricity markets, a dynamic study of the data is carried out without interruption. Contrary to most other statistical studies in finance or economics, time series analysis assumes that the data represent consecutive measurements taken at equidistant time intervals (24h/24h). Although this assumption is violated for the large majority of financial data sets, due to market close, weekends and holidays, it is satisfied in electricity exchange markets, notably for spot prices, transaction volumes, production, etc. This allows statistical techniques to be applied appropriately and in agreement with the way they were designed to be used.

The methodology proposed in this research is tested on hourly log-return spot prices for the Nord Pool electricity market, covering the period from 1 January 2015 to 31 December 2015, i.e. N = 8761 hourly observations, illustrated in figure 5. The data were extracted from the official website of the Nord Pool market. In this section, we analyze the electricity spot price series on the Nord Pool market in order to study its statistical and econometric features. In most cases, econometricians consider the logarithm of their series, because the logarithmic difference often makes the series stationary and allows returns to be modeled. For this reason, we use the series of returns.
Figure 5: Hourly log-return of the spot price for the Nord Pool electricity market.
Figure 5 shows the time evolution of the log-return electricity spot price (denoted RSP) and indicates that this series seems stationary. This hypothesis is also supported by the unit root tests (ADF, PP and KPSS). In addition, the series presents volatility clustering, since periods of low volatility are followed by periods of high volatility. This is a sign of the presence of an ARCH effect in the series.
Figure 6: Nord Pool ACF and periodogram.
As shown in figure 6, for the log-return electricity spot price series the spectral density, traced by the periodogram, shows peaks at equidistant frequencies, which indicates the presence of several seasonalities.

Table 1: Descriptive statistics of the log-return spot price time series (log-returns).

                        The Nord Pool log-returns
Mean                    -6.2434 × 10⁻⁵
Standard deviation      0.0835
Skewness                0.8838***
Kurtosis                20.2854***
Jarque-Bera             1101.9457 (0.0000)***

Note: levels of significance indicated in brackets. *** denotes significance at the 1% level.

The summary descriptive statistics of the Nord Pool log-returns are reported in table 1. The standard deviation is quite small (0.0835), while the estimated skewness indicates a non-symmetric distribution. Furthermore, the large value of the kurtosis statistic indicates that this time series is leptokurtic. This significant departure from normality is also confirmed by the high value of the Jarque-Bera (JB) test. Hence, the log-return electricity series (RSP) is not normally distributed.
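The moment statistics of Table 1 are straightforward to compute. The sketch below (our own illustration, applied to simulated Gaussian data rather than the Nord Pool series) shows sample skewness, kurtosis, and the Jarque-Bera statistic $JB = n(S^2/6 + (K-3)^2/24)$:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=10_000)                  # Gaussian stand-in data

m = x.mean()
s2 = ((x - m) ** 2).mean()
skew = ((x - m) ** 3).mean() / s2 ** 1.5
kurt = ((x - m) ** 4).mean() / s2 ** 2       # = 3 for a normal distribution
jb = len(x) * (skew ** 2 / 6.0 + (kurt - 3.0) ** 2 / 24.0)

# Under normality JB ~ chi-squared(2), so it stays small here; a value
# like the paper's 1101.9 would overwhelmingly reject normality.
print(jb < 15.0)
```

Comparing this near-zero JB value for Gaussian data with the reported 1101.9457 makes the rejection of normality for the RSP series concrete.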
Table 2: ADF, PP and KPSS unit root test results for Nord Pool log-returns.

             Model (3): with intercept and trend    Model (2): with intercept    Model (1): without intercept
ADF Test     -50.2201***                            -50.2230***                  -101.3755***
PP Test      -101.8959***                           -101.8214***                 -
KPSS Test    0.0363                                 0.0414                       -

Note: *** indicates rejection of the null hypothesis at the 1-percent level. ADF and PP: critical values in model (3): -3.95 (1%), -3.41 (5%), -3.12 (10%); in model (2): -3.43 (1%), -2.86 (5%), -2.56 (10%); in model (1): -2.56 (1%), -1.94 (5%), -1.62 (10%). KPSS: critical values in model (3): 0.216 (1%), 0.146 (5%), 0.119 (10%); in model (2): 0.739 (1%), 0.463 (5%), 0.347 (10%).

We tested for stationarity by performing unit root tests, in particular the augmented Dickey-Fuller (ADF), Phillips-Perron (PP) and Kwiatkowski, Phillips, Schmidt and Shin (KPSS) tests. These tests differ in their null hypothesis: the null of the ADF and PP tests is that the time series contains a unit root, while the KPSS test takes stationarity as its null. The results, reported in table 2, show that the ADF and PP statistics reject the null hypothesis of non-stationarity for the Nord Pool log-return series. Thus, the series is stationary and appropriate for the subsequent tests of this study. In addition, the KPSS statistics support acceptance of the null hypothesis of stationarity. Hence, the series is stationary and suitable for long-memory tests.

Thereafter, we tested for long-memory behavior in the conditional mean using the GPH (Geweke and Porter-Hudak, 1983) and LW (Robinson, 1995) statistics. The corresponding results, presented in Table 3, indicate evidence of long-range dependence.

Table 3: Results of the GPH and LW long-range dependence tests in the conditional mean.

                                 GPH                               LW
                 Bandwidth       d̂        Std-error   p-value     d̂        Std-error   p-value
RSP              T^0.5 = 94      0.2347    0.0718      0.0010      0.2362    0.0515      0.0000
(T = 8760)       T^0.6 = 232     0.3632    0.0440      0.0000      0.4171    0.0328      0.0000
                 T^0.7 = 575     0.3389    0.0273      0.0000      0.3284    0.0208      0.0000
                 T^0.8 = 1425    0.3810    0.0172      0.0000      0.4343    0.0132      0.0000
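The GPH estimator behind Table 3 is a log-periodogram regression: $\log I(\lambda_j)$ is regressed on $-\log(4\sin^2(\lambda_j/2))$ over the first $m$ Fourier frequencies (e.g. $m = T^{0.5}$), and the slope estimates $d$. The sketch below is our own illustration on a simulated ARFIMA(0, d, 0) series with known $d = 0.3$, not on the Nord Pool data:

```python
import numpy as np

rng = np.random.default_rng(4)
T, d_true = 4096, 0.3

# Simulate fractional noise: (1 - L)^(-d) applied to white noise,
# with MA weights psi_j = psi_{j-1} * (j - 1 + d) / j.
j = np.arange(1, T)
psi = np.concatenate(([1.0], np.cumprod((j - 1.0 + d_true) / j)))
x = np.convolve(rng.normal(size=T), psi)[:T]

m = int(T ** 0.5)                                       # GPH bandwidth
lam = 2.0 * np.pi * np.arange(1, m + 1) / T             # Fourier frequencies
I = np.abs(np.fft.fft(x)[1:m + 1]) ** 2 / (2.0 * np.pi * T)   # periodogram
X = -np.log(4.0 * np.sin(lam / 2.0) ** 2)               # GPH regressor
Xc, Yc = X - X.mean(), np.log(I) - np.log(I).mean()
d_hat = np.dot(Xc, Yc) / np.dot(Xc, Xc)                 # OLS slope = estimate of d
print(d_hat)
```

With this bandwidth the estimate lands near the true $d = 0.3$, up to the sampling error of order $\pi/\sqrt{24m}$ that the Std-error column of Table 3 reflects.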
4. 2. The estimation results

To identify the appropriate ARFIMA model, we estimate several specifications of the ARFIMA(p, d, q) model with different orders (p, q) and compare their performance to define the adequate orders for detecting the long-memory property of the series, taking into consideration all possible combinations for the ARMA(p, q) part with p = 0, 1, 2 and q = 0, 1, 2. To select the models that best describe the data, we use criteria based on the Akaike information criterion (AIC) and the maximum log-likelihood (Ln(L)). On this basis, an ARFIMA(1, d, 1) seems appropriate for the log-return Nord Pool prices (see table 4). In addition, the estimate of the long-memory parameter d is statistically significant. Thus, there is clearly significant evidence of long memory in the Nord Pool log-return time series.

Table 4: Estimation results of the ARFIMA model.

ARFIMA        φ̂₁          θ̂₁           d̂            Ln(L)          AIC
(1, d, 1)     0.7191**     0.2759***     0.4635***     11126.6963     -2.5394

Note: ** and *** indicate rejection of the null hypothesis at the 5-percent and 1-percent levels, respectively.

Table 5: Estimation of the k-factor GARMA model: a wavelet-based approach.

Parameters    k-factor GARMA model estimation
φ̂₁           0.0357*** (0.0000)
θ̂₁           -
d̂₁           0.2657*** (0.0000)
d̂₂           0.1238*** (0.0000)
d̂₃           0.0873*** (0.0000)
λ̂₁           0.1295*** (0.0000)
λ̂₂           0.0882*** (0.0000)
λ̂₃           0.0479*** (0.0000)

Note: *** indicates rejection of the null hypothesis at the 1-percent level; p-values in parentheses.
10
11
It is well known that electricity spot prices often exhibit seasonal fluctuations (at the annual, weekly, and daily levels). The seasonal character of the prices is a direct consequence of demand fluctuations, which mostly arise from changing climate conditions such as temperature or the number of daylight hours. These seasonal fluctuations in demand and supply therefore translate into seasonal behavior of electricity prices, and of spot prices in particular.

In the frequency domain, seasonality can be observed by means of the parameter λ = 1/T, where λ is the frequency of the seasonality and T is its period. As shown by the periodogram (Figure 6), the spectral density is unbounded at equidistant frequencies, which proves the presence of several seasonalities. As reported in Table 5, pronounced peaks appear at the frequencies λ̂1 = 0.1295 (T = 7.72 ≈ 8 hours ≈ 1/3 day), λ̂2 = 0.0882 (T = 11.34 ≈ 12 hours ≈ 1/2 day), and λ̂3 = 0.0479 (T = 20.87 hours ≈ 1 day), corresponding to cycles with third-daily, semi-daily, and daily periods, respectively.
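The mapping from an estimated Gegenbauer frequency to a cycle period is simply T = 1/λ. The periods quoted above can be checked directly (frequency values taken from Table 5):

```python
# Convert the estimated Gegenbauer frequencies (Table 5) into cycle
# periods in hours via T = 1 / lambda.
frequencies = [0.1295, 0.0882, 0.0479]        # lambda_1, lambda_2, lambda_3
labels = ["third-daily", "semi-daily", "daily"]
periods = [1.0 / lam for lam in frequencies]
for lam, T, lab in zip(frequencies, periods, labels):
    print(f"lambda = {lam:.4f}  ->  T = {T:.2f} hours  ({lab})")
```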
Thereafter, the residuals resulting from the k-factor GARMA modeling are adopted as a proxy for the corresponding volatility, and we test for the existence of long-memory behavior in this series. As reported in Table 6, the results of the GPH and LW tests indicate the presence of long memory in the conditional variance, which calls for a fractionally integrated (FIGARCH-type) method to model such processes.
Table 6: Results of GPH and LW long-range dependence tests in the conditional variance.

Series            Bandwidth            GPH                                    LW
                                  δ̂        Std-error   p-value        δ̂        Std-error   p-value
RSP (T = 8760)    T^0.5 = 94      0.1701    0.0718      0.3293        0.2164    0.0515      0.0240
                  T^0.6 = 232     0.2439    0.0440      0.0000        0.2971    0.0328      0.0000
                  T^0.7 = 575     0.2848    0.02737     0.0000        0.3834    0.0205      0.0000
                  T^0.8 = 1425    0.3481    0.01723     0.0000        0.4457    0.0134      0.0000
The second step consists of modeling the conditional variance. The residuals of the k-factor GARMA estimation are first treated with a FIGARCH model (the FIGARCH estimation results are reported in Table 7), and then shaped through a LLWNN, in order to select the more adequate method.
Table 7: Estimation results of the FIGARCH model.

Parameters    Estimate
ω̂            0.0016***
ψ̂            0.0392***
β̂            0.0002***
δ̂            0.5634***
LL            7327.7813

Note: *** indicates rejection of the null hypothesis at the 1% level.
In the second approach, the residuals resulting from the k-factor GARMA modeling are used as the input of the LLWNN to estimate the conditional variance. To avoid coupling among the different inputs and to accelerate convergence, all inputs are normalized to the range [0, 1] before being fed to the network, using the following formula, which is the most commonly used data-scaling method:

y_norm = (y_org − y_min) / (y_max − y_min),    (29)

where y_norm is the normalized value, y_org is the original value, and y_min and y_max are the minimum and maximum values of the corresponding residual data.
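The min-max scaling of Eq. (29), together with its inverse (needed to map network outputs back to the original scale), can be sketched as follows; the function names are illustrative:

```python
import numpy as np

def minmax_normalize(y):
    """Scale a series to [0, 1] as in Eq. (29): (y - min) / (max - min).

    Returns the scaled series together with (y_min, y_max) so that
    network outputs can later be mapped back to the original scale.
    """
    y = np.asarray(y, dtype=float)
    y_min, y_max = y.min(), y.max()
    return (y - y_min) / (y_max - y_min), (y_min, y_max)

def minmax_denormalize(y_norm, bounds):
    """Invert Eq. (29) to recover values on the original scale."""
    y_min, y_max = bounds
    return np.asarray(y_norm) * (y_max - y_min) + y_min
```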
The dataset is divided into three successive parts: (a) a sample of 200 observations to initialize the network training, (b) a training set, and (c) a test set. The forecasting experiments are carried out over the test set by means of an iterative forecasting scheme, in which the model forecasts 6, 12, 24, 48, and 72 hours ahead. Details of the datasets are given in Figure 7.
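The iterative scheme can be sketched as follows, where `one_step` is a hypothetical stand-in for any fitted one-step-ahead predictor: each forecast is appended to the input window and fed back to produce the next step, up to 72 hours ahead.

```python
def iterative_forecast(history, one_step, horizon):
    """Recursive multi-step forecasting: apply a one-step-ahead model
    repeatedly, feeding each prediction back into the input window."""
    window = list(history)
    preds = []
    for _ in range(horizon):
        yhat = one_step(window)
        preds.append(yhat)
        window.append(yhat)      # feed the forecast back as the next input
    return preds
```

With a persistence predictor (`one_step = lambda w: w[-1]`), for example, every horizon simply repeats the last observed value.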
Figure 7: Details of datasets.
To find the best neural network architecture, the parameters are first randomly initialized; thereafter, using the back-propagation algorithm, they are updated during training so as to minimize the error between the network outputs and the real values. Table 8 summarizes the network architecture.
Table 8: LLWNN architecture.

Number of hidden units              10
Learning rate                       0.5000
Hidden-layer activation function    Wavelet function
Training algorithm                  Back-propagation (BP) learning algorithm
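A compact sketch of the LLWNN described above, using the local linear output weights of Chen et al. (2004) and one back-propagation update step. The hidden-layer size and learning rate follow Table 8; the Mexican-hat mother wavelet, the random initialization, and the restriction of the update to the local linear weights are illustrative simplifications, not the authors' exact implementation:

```python
import numpy as np

def mexican_hat(u):
    # Mexican-hat mother wavelet (second derivative of a Gaussian)
    return (1.0 - u ** 2) * np.exp(-0.5 * u ** 2)

class LLWNN:
    """Local linear wavelet neural network: each hidden unit i computes
    (w_i0 + w_i1*x_1 + ... + w_in*x_n) * psi((x - b_i) / a_i)."""

    def __init__(self, n_inputs, n_hidden=10, lr=0.5, seed=0):
        rng = np.random.default_rng(seed)
        self.lr = lr
        self.b = rng.uniform(0, 1, (n_hidden, n_inputs))       # translations
        self.a = np.ones((n_hidden, n_inputs))                 # dilations
        self.W = rng.normal(0, 0.1, (n_hidden, n_inputs + 1))  # local linear weights

    def forward(self, x):
        u = (x - self.b) / self.a                   # (n_hidden, n_inputs)
        self.psi = np.prod(mexican_hat(u), axis=1)  # multidimensional wavelet
        self.lin = self.W[:, 0] + self.W[:, 1:] @ x # local linear parts
        return np.sum(self.lin * self.psi)

    def train_step(self, x, y):
        # One back-propagation update of the local linear weights only
        # (translations and dilations could be updated analogously).
        err = self.forward(x) - y
        grad = np.concatenate([[1.0], x])           # d(lin_i)/dW_i
        self.W -= self.lr * err * self.psi[:, None] * grad[None, :]
        return err ** 2
```

Because the wavelet activations depend only on the inputs, the update above is ordinary stochastic gradient descent on a linear-in-weights model, so the training error decreases over epochs for a small enough learning rate.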
5. Forecasting results: A comparative approach

This section is devoted to the evaluation of the estimated models in a multi-step-ahead forecasting task. Since forecasting is fundamentally an out-of-sample problem, we prefer to apply out-of-sample criteria. Accordingly, five forecast horizons (6 hours, 12 hours, one day, two days, and three days) were selected to ensure the quality and robustness of the modeling and forecasting results. To evaluate forecasting accuracy, we apply three evaluation criteria, namely the out-of-sample R² of Campbell and Thompson (2008), the mean absolute percentage error (MAPE), and the logarithmic loss function (LL), given respectively by:
R² = 1 − [ Σ_{t=t1}^{N} (y_{t+h} − ŷ_{t,t+h})² ] / [ Σ_{t=t1}^{N} (y_{t+h} − ȳ_{t,t+h})² ],    (30)

MAPE = [ 1/(N − t1) ] Σ_{t=t1}^{N} | (y_{t+h} − ŷ_{t,t+h}) / y_{t+h} | × 100,    (31)

LL = [ 1/(N − t1) ] Σ_{t=t1}^{N} [ Log( ŷ_{t,t+h} / y_{t+h} ) ]²,    (32)

where N is the number of observations, N − t1 is the number of observations in the predictive period, y_{t+h} is the log-return series at period t + h, ŷ_{t,t+h} is the forecast of the log-return at horizon h made at time t, and ȳ_{t,t+h} is the historical average forecast.
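The three criteria of Eqs. (30)-(32) can be computed directly; a minimal sketch, assuming strictly positive actual values for the MAPE and positive actual and predicted values for the LL:

```python
import numpy as np

def oos_r2(y_true, y_pred, y_bench):
    """Out-of-sample R^2 (Eq. 30): 1 - SSE(model) / SSE(benchmark),
    with the historical-average forecast as the benchmark."""
    num = np.sum((y_true - y_pred) ** 2)
    den = np.sum((y_true - y_bench) ** 2)
    return 1.0 - num / den

def mape(y_true, y_pred):
    """Mean absolute percentage error (Eq. 31), in percent."""
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

def log_loss(y_true, y_pred):
    """Logarithmic loss (Eq. 32): mean squared log forecast ratio."""
    return np.mean(np.log(y_pred / y_true) ** 2)
```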
To evaluate the prediction performance of the proposed hybrid methodology, this paper considers four models: the individual LLWNN model, the hybrid ARFIMA-LLWNN model, the k-factor GARMA-FIGARCH model, and the proposed hybrid k-factor GARMA-LLWNN model. Moreover, we consider five forecast horizons (6 hours, 12 hours, one day, two days, and three days ahead) and use the out-of-sample R², LL, and MAPE criteria. The forecast evaluation results, reported in Table 9, show that the proposed hybrid k-factor GARMA-LLWNN model outperforms all other competing methods in terms of forecasting accuracy. Indeed, its forecast errors are the lowest for all evaluation criteria and all forecast horizons.
Table 9: Out-of-sample forecast results.

Model                     Criterion   h = 6            h = 12           h = 24           h = 48           h = 72
LLWNN model               R²          0.14812          0.1987           0.2031           0.2176           0.2869
                          LL          0.0983           0.1221           0.0849           0.0687           0.0432
                          MAPE        7.4592           6.5877           3.9474           4.6398           5.5632
ARFIMA-LLWNN model        R²          0.2278           0.2901           0.4092           0.5658           0.6531
                          LL          0.0839           0.0725           0.0642           0.0411           0.0496
                          MAPE        5.7662           4.7210           3.3546           2.4431           3.1924
k-factor GARMA-FIGARCH    R²          0.5472           0.6451           0.6849           0.7209           0.8628
model                     LL          0.0632           0.0584           0.0426           0.0387           0.0310
                          MAPE        1.3689           2.3254           3.2351           2.0321           2.7247
k-factor GARMA-LLWNN      R²          0.9428           0.8940           0.9977           0.9527           0.9889
model                     LL          1.1352 × 10^-4   1.3426 × 10^-3   5.7213 × 10^-5   5.8243 × 10^-4   1.5432 × 10^-4
                          MAPE        0.9278           1.4437           0.7319           1.0084           0.9902
As shown in Figures 9, 10, 11, 12, and 13, the hybrid k-factor GARMA-LLWNN model demonstrates its performance in terms of forecast accuracy, as the predictions for all five horizons are very close to the real values. This result can be explained by the fact that the proposed model accounts for the features that characterize the electricity spot price, in particular the seasonal long memory in both the conditional mean and the conditional variance, the volatility, and the non-linearity, making it a robust tool that can deal with the features of electricity prices and thus provide the best forecasting results. In the case of the ARFIMA-LLWNN model, by contrast, the ARFIMA model in the first step is unable to capture the seasonality, which results in large errors; these biased results are then transmitted to the second step of the conditional variance modeling via the LLWNN model, which cannot correct the seasonality left unmodeled in the first step. Indeed, with the same conditional variance model (LLWNN), the predictions based on the k-factor GARMA model are more accurate than those based on the ARFIMA model. On the other hand, despite its capacity as a nonlinear, nonparametric model, characterized by a wavelet activation function and local linearity, the individual LLWNN model is unable to detect, model, and predict the features present in electricity prices, since the proposed hybrid model provides more accurate predictions when the two are compared. The network thus needs an external filter to better estimate the data, since adopting the k-factor GARMA model in the first step enhances the LLWNN forecasting results. This finding contradicts the existing literature (Pany, 2011; Pany and Ghoshal, 2013; Chakravarty et al., 2012), which argues that individual LLWNN models provide the best electricity price forecasts compared with ordinary networks such as multi-layer perceptrons (MLP), on the grounds that the features of electricity prices can be detected by the wavelet-basis activation function of the hidden-layer neurons, without any external decomposer/composer and without too many hidden units. In addition, our result confirms the effectiveness of the hybrid model in improving forecasting accuracy. Moreover, the k-factor GARMA-LLWNN model outperforms the k-factor GARMA-FIGARCH model; in this regard, the nonparametric model is more accurate than the parametric model in forecasting the conditional variance.
Figure 8: LLWNN training results (residuals of k-factor GARMA modeling).
Figure 9: Three days (72 hours) ahead prediction during testing (residuals of k-factor GARMA modeling).
Figure 10: Two days (48 hours) ahead prediction during testing (residuals of k-factor GARMA modeling).
Figure 11: One day (24 hours) ahead prediction during testing (residuals of k-factor GARMA modeling).
Figure 12: Semi-daily (12 hours) ahead prediction during testing (residuals of k-factor GARMA modeling).
Figure 13: Six hours ahead prediction during testing (residuals of k-factor GARMA modeling).
6. Conclusion

Electricity market price forecasting has become a challenge for all market participants, and accurate electricity price forecasts represent an advantage for market players facing competition. Despite the variety of existing time series methods, research and experience aimed at improving the accuracy of electricity spot price forecasts have never stopped. In order to overcome the shortcomings of existing methods and produce more accurate results, the present study has presented a framework that jointly uses a parametric model, the k-factor GARMA model, and a nonparametric model, the local linear wavelet neural network (LLWNN), to capture several features of the electricity price, notably the seasonal component, long-range dependence, high volatility, and non-linearity. The proposed method was applied to the Nord Pool market, one of the most promising and fastest-rising power markets in the world. Our hybrid methodology consists of two steps. In the first step, we focus on modeling the conditional mean: we adopt a generalized fractional model with k Gegenbauer factors (k-factor GARMA) and use an estimation approach based on the discrete wavelet packet transform (DWPT) developed by Whitcher (2004). In the second step, in order to model and predict the conditional variance, we adopt two different approaches. First, the FIGARCH model is applied to the residuals of the k-factor GARMA, yielding a k-factor GARMA-FIGARCH model. Second, the LLWNN model is applied to the same residuals to model and predict the conditional variance. To evaluate the prediction capability of each model, we used the out-of-sample R² of Campbell and Thompson (2008), the mean absolute percentage error (MAPE), and the logarithmic loss function (LL) as performance indexes. Comparative results indicated that the predictive performance of the proposed hybrid k-factor GARMA-LLWNN model dominates that of the hybrid ARFIMA-LLWNN model, the individual LLWNN model, and the k-factor GARMA-FIGARCH model. The proposed model therefore leads to improved performance and can be an effective tool for forecasting tasks, especially when high forecasting accuracy is needed. The results obtained are notable given how difficult it has been to achieve such precision when forecasting electricity spot prices, and they highlight the k-factor GARMA-LLWNN methodology as a robust forecasting method.
References

[1]. Baillie, R.T., C.F. Chung, and M.A. Tieslau. 1996. "Analyzing Industrialized Countries Inflation by the Fractionally Integrated ARFIMA-GARCH Model." Journal of Applied Econometrics 11: 23-40.
[2]. Baillie, R.T., T. Bollerslev, and H.O. Mikkelsen. 1996. "Fractionally integrated generalized autoregressive conditional heteroscedasticity." Journal of Econometrics 74 (1): 3-30.
[3]. Bashir, Z., and M.E. El-Hawary. 2000. "Short term load forecasting by using wavelet neural networks." Electrical and Computer Engineering, IEEE: 163-166.
[4]. Benaouda, D., G. Murtagh, J.L. Starck, and O. Renaud. 2006. "Wavelet-based nonlinear multiscale decomposition model for electricity load forecasting." Neurocomputing 70: 139-154.
[5]. Beran, J. 1999. "SEMIFAR Models-a Semiparametric Framework for Modelling Trends, Long-Range Dependence and Nonstationarity." Center of Finance and Econometrics, University of Konstanz.
[6]. Bernard, C., S. Mallat, and J.J. Slotine. 1998. "Wavelet interpolation networks." ESANN: 49-52.
[7]. Bollerslev, T. 1986. "Generalized autoregressive conditional heteroscedasticity." Journal of Econometrics 31: 307-327.
[8]. Bollerslev, T., and H.O. Mikkelsen. 1996. "Modeling and pricing long memory in stock market volatility." Journal of Econometrics 73: 151-184.
[9]. Boubaker, H. 2015. "Wavelet Estimation of Gegenbauer Processes: Simulation and Empirical Application." Computational Economics 46: 551-574.
[10]. Boubaker, H., and N. Sghaier. 2015. "Semiparametric generalized long-memory modeling of some MENA stock market returns: A wavelet approach." Economic Modelling 50: 254-265.
[11]. Boubaker, H., and M. Boutahar. 2011. "A wavelet-based approach for modeling exchange rates." Statistical Methods & Applications 20: 201-220.
[12]. Caporale, G.M., J. Cuñado, and L.A. Gil-Alana. 2012. "Modelling long-run trends and cycles in financial time series data." Journal of Time Series Analysis 34 (2): 405-421.
[13]. Caporale, G.M., and L.A. Gil-Alana. 2014. "Long-run and cyclical dynamics in the US stock market." Journal of Forecasting 33 (2): 147-161.
[14]. Chakravarty, S., M. Nayak, and M. Bisoi. 2012. "Particle Swarm Optimization Based Local Linear Wavelet Neural Network for Forecasting Electricity Prices." Energy, Automation, and Signal (ICEAS), IEEE: 1-6.
[15]. Chen, Y., J. Dong, B. Yang, and Y. Zhang. 2004. "A local linear wavelet neural network." Intelligent Control and Automation, IEEE: 1954-1957.
[16]. Cheung, Y.W. 1993. "Long memory in foreign-exchange rates." Journal of Business & Economic Statistics 11: 93-101.
[17]. Contreras, J., R. Espinola, F.J. Nogales, and A.J. Conejo. 2003. "ARIMA models to predict next day electricity prices." IEEE Transactions on Power Systems 18 (3): 1014-1020.
[18]. Diongue, A.K., D. Guégan, and B. Vignal. 2009. "Forecasting electricity spot market prices with a k-factor GIGARCH process." Applied Energy 86: 505-510.
[19]. Diongue, A.K., D. Guégan, and B. Vignal. 2004. "A k-factor GIGARCH process: estimation and application on electricity market spot prices." Probabilistic Methods Applied to Power Systems: 12-16.
[20]. Engle, R.F. 1982. "Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation." Econometrica 50: 987-1008.
[21]. Erdogdu, E. 2002. "Electricity demand analysis using cointegration and ARIMA modelling: a case study of Turkey." Energy Policy 35: 1129-1146.
[22]. Ferrara, L., and D. Guégan. 2001. "Forecasting with k-factor Gegenbauer processes: theory and applications." Journal of Forecasting 20: 581-601.
[23]. Gao, R., and H.I. Tsoukalas. 2001. "Neural-wavelet methodology for load forecasting." Journal of Intelligent & Robotic Systems 31: 149-157.
[24]. Gençay, R., F. Selçuk, and B. Whitcher. 2001. "Scaling properties of foreign exchange volatility." Physica A 289: 89-106.
[25]. Gençay, R., F. Selçuk, and B. Whitcher. 2001a. "Differentiating intraday seasonalities through wavelet multi-scaling." Physica A 289: 543-556.
[26]. Gençay, R., F. Selçuk, and B. Whitcher. 2001b. "An Introduction to Wavelets and Other Filtering Methods in Finance and Economics." Academic Press.
[27]. Gençay, R., G. Ballocchi, M. Dacorogna, R.B. Olsen, and O.V. Pictet. 2002. "Real-time trading models and the statistical properties of foreign exchange rates." International Economic Review 43: 463-491.
[28]. Geweke, J., and S. Porter-Hudak. 1983. "The Estimation and Application of Long Memory Time Series Models." Journal of Time Series Analysis 4: 221-238.
[29]. Granger, C.W.J., and R. Joyeux. 1980. "An introduction to long memory time series models and fractional differencing." Journal of Time Series Analysis 1: 15-29.
[30]. Gray, H.L., N.F. Zhang, and W.A. Woodward. 1989. "On generalized fractional processes." Journal of Time Series Analysis 10: 233-257.
[31]. Hosking, J.R.M. 1981. "Fractional differencing." Biometrika 68: 165-176.
[32]. Khashei, M., and M. Bijari. 2010. "An artificial neural network model for time series forecasting." Expert Systems with Applications 37: 479-489.
[33]. Khashei, M., and M. Bijari. 2011. "A novel hybridization of artificial neural networks and ARIMA models for time series forecasting." Applied Soft Computing 11: 2664-2675.
[34]. Koopman, S.J., M. Ooms, and M.A. Carnero. 2007. "Periodic seasonal Reg-ARFIMA-GARCH models of daily electricity spot prices." Journal of the American Statistical Association 102 (477): 16-27.
[35]. Mallat, S. 1999. "A Wavelet Tour of Signal Processing." Academic Press.
[36]. Mallat, S., and Z. Zhang. 1993. "Matching pursuits with time-frequency dictionaries." IEEE Transactions on Signal Processing 41: 3397-3415.
[37]. Pany, P.K., and S.P. Ghoshal. 2013. "Day-ahead Electricity Price Forecasting Using PSO-Based LLWNN Model." International Journal of Energy Engineering (IJEE) 3: 99-106.
[38]. Pany, P.K. 2011. "Short-Term Load Forecasting using PSO Based Local Linear Wavelet Neural Network." International Journal of Instrumentation 1 (2).
[39]. Robinson, P.M. 1995. "Log-Periodogram Regression of Time Series with Long-Range Dependence." Annals of Statistics 23: 1048-1072.
[40]. Soares, L.J., and L.R. Souza. 2006. "Forecasting electricity demand using generalized long memory." International Journal of Forecasting 22: 17-28.
[41]. Szkuta, B., L. Sanabria, and T. Dillon. 1999. "Electricity price short-term forecasting using artificial neural networks." IEEE Transactions on Power Systems 14: 851-857.
[42]. Tseng, F.M., H.C. Yu, and G.H. Tzeng. 2002. "Combining neural network model with seasonal time series ARIMA model." Technological Forecasting & Social Change 69: 71-87.
[43]. Ulugammai, M., P. Venkatesh, P.S. Kannan, and N.P. Padhy. 2007. "Application of bacterial foraging technique trained artificial and wavelet neural networks in load forecasting." Neurocomputing 70: 2659-2667.
[44]. Valenzuela, O., I. Rojas, F. Rojas, H. Pomares, L. Herrera, A. Guillén, L. Marquez, and M. Pasadas. 2008. "Hybridization of intelligent techniques and ARIMA models for time series prediction." Fuzzy Sets and Systems 159: 821-845.
[45]. Wang, A., and B. Ramsay. 1998. "A neural network-based estimator for electricity spot-pricing with particular reference to weekend and public holidays." Neurocomputing 23: 47-57.
[46]. Whitcher, B. 2004. "Wavelet-based estimation for seasonal long-memory processes." Technometrics 46.
[47]. Woodward, W.A., Q.C. Cheng, and H.L. Gray. 1998. "A k-factor GARMA long-memory model." Journal of Time Series Analysis 19: 485-504.
[48]. Yao, S.J., Y.H. Song, L.Z. Zhang, and X.Y. Cheng. 2000. "Wavelet transform and neural networks for short-term electrical load forecasting." Energy Conversion & Management 41: 1975-1988.
[49]. Zhang, G.P. 2003. "Time series forecasting using a hybrid ARIMA and neural network model." Neurocomputing 50: 159-175.
[50]. Zhang, Q. 1997. "Using wavelet network in nonparametric estimation." IEEE Transactions on Neural Networks 8 (2): 227-236.
[51]. Zhang, Q., and A. Benveniste. 1992. "Wavelet networks." IEEE Transactions on Neural Networks 3 (6): 889-898.
