Comparison of Trend Forecast Using ARIMA and ETS Models For S&P500 Close Price
ICEBI 2020, October 09–11, 2020, Singapore, Singapore Zhanao Sun
MA, the ARIMA models can be used when the input data are stationary.
One prominent feature of stock prices is their upward trend. A series
with a trend is not stationary and therefore cannot be used directly
as input for ARIMA models. The Augmented Dickey-Fuller (ADF) unit root
test can help determine whether a time series is stationary. To
achieve stationarity, the S&P500 close price is transformed into a
stationary time series by taking differences.
Apart from stationarity, seasonality is another feature we need to
examine. Some time series, temperature records being a typical
example, have seasonal patterns. For seasonal time series, SARIMA
models are more appropriate. In this paper, ARIMA is the model used of
the two, because the seasonal pattern observed after overlaying the
seasonal trend is not statistically significant.
Figure 2: Plot of the differenced time series data
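The differencing step mentioned above can be illustrated with a minimal Python sketch. The function name `difference` and the price values are hypothetical and chosen only for illustration; they are not the S&P500 data or the R code used in this paper.

```python
def difference(series, lag=1):
    """Return the lag-`lag` differences series[t] - series[t - lag]."""
    if lag < 1:
        raise ValueError("lag must be a positive integer")
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Hypothetical upward-trending prices (illustrative values only).
prices = [100.0, 102.0, 105.0, 107.0, 110.0, 112.0]

# The level series trends upward; its first differences fluctuate
# around a roughly constant value instead.
diffs = difference(prices)
print(diffs)
```

The differenced series is one observation shorter than the original, which is why a model fitted on it needs the last observed level to turn forecasts back into prices.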
The ARIMA(p,d,q) model can be written as follows:

$$Y_t = \phi_0 + \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \dots + \phi_p Y_{t-p} + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2} - \dots - \theta_q \varepsilon_{t-q} \tag{1}$$

where $Y_t$ represents a stationary stochastic process with non-zero
mean, $\phi_0$ is the constant term, $\varepsilon_t$ is the white noise
disturbance, $\phi_1, \phi_2, \dots$ are the autoregressive coefficients,
and $\theta_1, \theta_2, \dots$ are the moving average coefficients.
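To make Equation (1) concrete, the sketch below evaluates its right-hand side for one step ahead, with the unknown future disturbance $\varepsilon_t$ replaced by its expected value of zero. The function name `arma_one_step` and all coefficient values are assumptions made for illustration; they are not the fitted ARIMA(2,1,3) parameters.

```python
def arma_one_step(history, residuals, phi0, phi, theta):
    """One-step-ahead conditional mean of Equation (1).

    history   -- past values of the stationary series, most recent last
    residuals -- past disturbances (epsilon), most recent last
    phi       -- autoregressive coefficients [phi_1, ..., phi_p]
    theta     -- moving average coefficients [theta_1, ..., theta_q]
    The future disturbance epsilon_t is set to its expected value, 0.
    """
    p, q = len(phi), len(theta)
    if len(history) < p or len(residuals) < q:
        raise ValueError("need at least p past values and q past residuals")
    ar_part = sum(phi[i] * history[-1 - i] for i in range(p))
    ma_part = sum(theta[j] * residuals[-1 - j] for j in range(q))
    return phi0 + ar_part - ma_part


# Hypothetical ARMA(2,1) coefficients and data (illustrative only).
prediction = arma_one_step(
    history=[1.0, 2.0],      # Y_{t-2} = 1.0, Y_{t-1} = 2.0
    residuals=[0.5],         # epsilon_{t-1} = 0.5
    phi0=0.5,
    phi=[0.5, 0.25],
    theta=[0.5],
)
print(prediction)  # 0.5 + (0.5*2.0 + 0.25*1.0) - 0.5*0.5 = 1.5
```

Note the minus sign on the moving average part, which follows the sign convention of Equation (1); some software packages use the opposite sign for $\theta$.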
Figure 3: ADF test details for differenced time series data
2.2 ETS MODEL
Building the forecast on weighted averages of past observations,
exponential smoothing methods have been developed and modified by
various researchers since their inception in the 1950s [11] [12]. To
date, fifteen exponential smoothing methods have been proposed. The
methods used here belong to a branch of exponential smoothing
developed for automatic forecasting, comprising thirty or so models.
These methods are called ETS methods, where E represents error, T
represents trend, and S represents seasonality. The state space
equations can be written as:

$$y_t = w(x_{t-1}) + r(x_{t-1})\,\varepsilon_t \tag{2}$$

$$x_t = f(x_{t-1}) + g(x_{t-1})\,\varepsilon_t \tag{3}$$

where $w$, $r$, $f$, and $g$ are functions of the state vector
$x_{t-1}$, and $\varepsilon_t$ is a Gaussian white noise series.
Equation (2) is known as the observation equation, describing the
relationship between the state $x_{t-1}$ and the observation $y_t$.
Equation (3) is the transition equation, describing the evolution of
the states over time.

3 METHODOLOGIES

3.1 Augmented Dickey-Fuller Test (ADF Test)
As a common form of unit root test, the Augmented Dickey-Fuller test
[13] is used to determine the stationarity of the time series. The
ADF test can be written as:

$$y_t = c + \beta t + \alpha y_{t-1} + \phi_1 \Delta y_{t-1} + \phi_2 \Delta y_{t-2} + \dots + \phi_p \Delta y_{t-p} + e_t \tag{4}$$

where $c$ is a constant, $\beta$ is the coefficient on the time trend,
$\Delta y_{t-1}$ is the first difference of the time series at $t-1$,
and $y_{t-1}$ is the lag-1 value of the time series.
As shown in figure 1, close is the univariate time series of S&P500
close prices extracted from the multivariate time series data from
Yahoo Finance. The close data are not stationary: the ADF test yields
a p-value larger than the typical 0.05 threshold, meaning that the
null hypothesis of a unit root cannot be rejected in favor of the
alternative hypothesis that the series is stationary.
Specific models, like ARIMA, require stationary input data. To make
the time series stationary, we can take a difference, forming a new
time series variable that stores the differenced original data. The
differencing formula is as follows:

$$y'_t = y_t - y_{t-1} \tag{5}$$

where $y'_t$ is the difference between $y_t$ and its previous value
$y_{t-1}$.
In figure 2, the stationarity of DY, the differenced time series of
the original close data, can be observed. It shows no trend and a
consistent mean value. Beyond this graphical assessment, the ADF test
can be applied again to confirm the stationarity of the time series
after differencing.
According to the p-value shown in figure 3, which is lower than the
0.05 threshold, the stationarity of the S&P500 close price
data is confirmed. Once the time series is stationary, a seasonality
check is another pre-model-fitting step that is essential to the
accuracy of the forecast.

3.2 Seasonality
As shown in figure 4, no seasonal pattern can be derived from the
seasonal plot of the differenced S&P500 data. Seasonality is not
present in this time series, so there is no need to apply seasonal
models like Seasonal ARIMA. For the methods discussed in the next
section, the ARIMA and ETS models, this time series has been
pre-processed adequately to serve as input data.

3.3 Determining the model with the best fit for forecasting
3.3.1 ARIMA analysis. Taking the stationary differenced data as input,
the ARIMA model analysis is carried out with the fpp2 package in R.
For the S&P500 stock close price time series from 2005 to 2015, the
chosen model is ARIMA(2,1,3), and the output results are shown in
figure 5, figure 6, and figure 7.
Figure 5: Residual graphs for ARIMA(2,1,3) with drift

3.3.2 ETS analysis. ETS models do not require the time series to be
stationary and are fitted with the forecast package in R. For the
S&P500 stock close price time series from 2005 to 2015, the chosen
model is ETS(M,A,N), and the output results are shown in figure 8,
figure 9, and figure 10.
Figure 8: Residual graph for ETS(M,A,N)
As shown in figure 6 and figure 9, the ME, RMSE, MAE, MPE, MAPE, MASE,
and ACF1 values from the ARIMA model are smaller than those from the
ETS model, indicating that the ARIMA model fits the data better. In
the next section, the forecast is therefore based on the ARIMA model.

4 RESULTS
Because it fits the S&P500 closing price univariate time series
better, the ARIMA model is used to forecast. The chosen model is
ARIMA(2,1,3), which demonstrated the best goodness of fit. The
forecast period is 200 days, from 2015-1-1 to 2015-7-20, and the real
data overlaid on the forecast are from Yahoo Finance, with days on
which the market was closed filled in by the average of the close on
the last trading day before the closure and the close on the first
trading day after it. The real data serve to validate the forecast,
but for non-continuous time
5 CONCLUSION
Presenting the full process of building these stock price forecasting
models, this paper establishes ARIMA(2,1,3) as the model that best
fits the ten-year S&P500 time series data from 2005 to 2014. The
ARIMA forecast produced satisfactory short-term predictions. This
ARIMA model can help approximate future close prices of the S&P500
and could serve as a guide for potential investors. To improve on the
performance of the models discussed in this paper, one may use hybrid
models or deep learning methods [14].
Figure 9: Summary of ETS models
Figure 10: Residual check for ETS models
REFERENCES
[1] Yaser, S. A. M., & Atiya, A. F. (1996). Introduction to financial forecasting. Applied
Intelligence, 6, 205–213.
[2] L.Y. Wei, “A hybrid model based on ANFIS and adaptive expectation genetic
algorithm to forecast TAIEX,” Economic Modelling, vol. 33, pp. 893–899, 2013.
[3] S. Bekiros, R. Gupta, and C. Kyei, “On economic uncertainty, stock market
predictability and nonlinear spillover effects,” North American Journal of Economics
and Finance, vol. 36, pp. 184–191, 2016.
[4] J. Contreras, R. Espinola, F.J. Nogales, and A.J. Conejo, “ARIMA models to
predict next-day electricity prices,” IEEE Transactions on Power Systems, vol. 18,
no. 3, pp. 1014–1020, 2003.
[5] E.V. Tsitsika, C.D. Maravelias, and J. Haralabous, “Modelling and forecasting
pelagic fish production using univariate and multivariate ARIMA models,”
Fisheries Science, vol. 73, pp. 979–988, 2007.
[6] N. Merh, V.P. Saxena, and K.R. Pardasani, “A Comparison Between Hybrid
Approaches of ANN and ARIMA For Indian Stock Trend Forecasting,” Journal of
Business Intelligence, vol. 3, no. 2, pp. 23–43, 2010.
[7] L.C. Kyungjoo, Y. Sehwan and J. John, “Neural Network Model vs. SARIMA Model
In Forecasting Korean Stock Price Index (KOSPI),” Issues in Information Systems,
vol. 8, no. 2, pp. 372–378, 2007.
[8] A. A. Adebiyi, A. O. Adewumi, and C. K. Ayo, “Comparison of ARIMA and
artificial neural networks models for stock price prediction,” Journal of Applied
Mathematics, vol. 2014, Article ID 614342, 7 pages, 2014.
[9] A. A. Ariyo, A. O. Adewumi, and C. K. Ayo, “Stock Price Prediction Using the
ARIMA Model,” 2014 UKSim-AMSS 16th International Conference on Computer
Modelling and Simulation, 2014.
[10] Box, G.E.P., Jenkins, G.M., 1976. Time Series Analysis: Forecasting and Control.
Holden Day, San Francisco.
[11] Pegels, C.C., 1969. Exponential forecasting: some new variations. Manage. Sci. 15
(5), 311–315.
[12] Hyndman, R.J., Koehler, A.B., Snyder, R.D., Grose, S., 2002. A state space frame-
work for automatic forecasting using exponential smoothing methods. Int. J.
Forecast. 18 (3), 439–454.
[13] Dickey, D.A., Fuller, W.A., 1981. Likelihood ratio statistics for autoregressive time
series with a unit root. Econometrica 49, 1057–1072.
[14] Lu, C.-J. (2010). Integrating independent component analysis-based denoising
scheme with neural network for stock price prediction. Expert Systems with
Applications, 37(10), 7056–7064.
series, additional processes on real data are not necessary steps that
would contribute to better accuracy.
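The substitution of non-trading days described for the real data can be sketched as follows. This is one possible reading of that rule, not the author's actual procedure: the function name `fill_closed_days`, the use of `None` to mark closed days, and the sample values are all assumptions made for illustration.

```python
def fill_closed_days(closes):
    """Fill closed-market days (None) with the mean of the nearest
    trading-day closes before and after each gap.

    A gap at the very start or end of the list has only one neighbour,
    so that neighbour's value is used as is.
    """
    filled = list(closes)
    n = len(filled)
    i = 0
    while i < n:
        if filled[i] is None:
            start = i
            while i < n and filled[i] is None:
                i += 1  # advance to the first trading day after the gap
            before = closes[start - 1] if start > 0 else None
            after = closes[i] if i < n else None
            if before is not None and after is not None:
                value = (before + after) / 2
            else:
                value = before if before is not None else after
            for k in range(start, i):
                filled[k] = value
        else:
            i += 1
    return filled


# Hypothetical week: Friday close, a closed weekend, then Monday and Tuesday.
week = [10.0, None, None, 12.0, 13.0]
print(fill_closed_days(week))  # [10.0, 11.0, 11.0, 12.0, 13.0]
```

Filling gaps this way yields one value per calendar day, which makes it straightforward to align the real data with a 200-day forecast horizon.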
In figure 11, the dark blue area represents the 85% confidence interval
and the light blue area represents the 95% confidence interval. The
ARIMA forecast is based on the S&P500 stock close price time series
input data, shown as the black line. The red line is the real close
price of the S&P500 during that period.
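A simple numerical complement to reading such a plot is to compute the share of real observations that lie inside a forecast band. The sketch below is a hedged illustration only: `interval_coverage` is a hypothetical helper, and the numbers are invented rather than taken from figure 11.

```python
def interval_coverage(actual, lower, upper):
    """Fraction of actual values lying within [lower, upper] pointwise."""
    if not (len(actual) == len(lower) == len(upper)):
        raise ValueError("all three sequences must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    inside = sum(1 for a, lo, hi in zip(actual, lower, upper) if lo <= a <= hi)
    return inside / len(actual)


# Hypothetical real prices and interval bounds (illustrative only).
actual = [2050.0, 2080.0, 2110.0, 2040.0]
lower = [2000.0, 2010.0, 2020.0, 2050.0]
upper = [2100.0, 2120.0, 2140.0, 2160.0]

coverage = interval_coverage(actual, lower, upper)
print(coverage)  # 3 of the 4 points lie inside the band: 0.75
```

A coverage close to the nominal level of the band suggests the interval width is reasonable for the period examined.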
As can be seen from figure 11, the real result falls within the 85%
confidence interval. The real data resemble the forecast in that the
real data first experienced a dip and then a