Comparison of Trend Forecast Using ARIMA and ETS Models For S&P500 Close Price




Zhanao Sun
Maple Leaf International School—Zhenjiang, China
[email protected]

ABSTRACT
Stock price forecasting is pivotal for financial and economic institutions and for individuals. The aim of this study is to present viable and general approaches that improve the understanding of forecasting the market close price of an individual stock. This paper explains the process of applying two methods, Autoregressive Integrated Moving Average (ARIMA) and Exponential Smoothing (ETS), to the close price data of the S&P500 index; the ticker can be swapped to forecast other stocks. To determine the accuracy of the models, we center on the simplest methodology: the two models involved in this study are compared on the basis of standard deviation. Stock data are obtained from Yahoo Finance using the quantmod package in RStudio. The forecasting results show that the ARIMA model has a better fit with the data and can give a promising general trend prediction compared with existing methods.

CCS CONCEPTS
• Mathematics of computing → Probability and statistics; Statistical paradigms; Time series analysis; Probability and statistics; Statistical paradigms; Exploratory data analysis; • Theory of computation → Theory and algorithms for application domains; Database theory; Data modeling.

KEYWORDS
R, ARIMA, ETS, time series, forecast, univariate, stock market, S&P500

ACM Reference Format:
Zhanao Sun. 2020. Comparison of Trend Forecast Using ARIMA and ETS Models for S&P500 Close Price. In 2020 The 4th International Conference on E-Business and Internet (ICEBI 2020), October 09–11, 2020, Singapore, Singapore. ACM, New York, NY, USA, 4 pages. https://doi.org/10.1145/3436209.3436894

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
ICEBI 2020, October 09–11, 2020, Singapore, Singapore
© 2020 Association for Computing Machinery.
ACM ISBN 978-1-4503-8857-3/20/10. . . $15.00
https://doi.org/10.1145/3436209.3436894

1 INTRODUCTION
Playing a substantial role in the financial decision-making of investors, stock price prediction is an intriguing problem for researchers looking for desirable and practical methods that do not rely on human intuition. Despite the considerable attention it currently receives, the challenge remains of how to predict the market accurately, given the unstable and non-stationary nature of stock prices [1], [2], [3]. Notwithstanding the obstacles, researchers are making considerable progress in the general application of predictive models, partly prompted by the persistent profit-maximizing or risk-avoiding motivations of investors.

ARIMA models are known for their efficient forecasts among researchers interested in time series [4], [5], particularly those studying financial markets, since their short-term predictions outperform many well-recognized methods, including ANN methods [6], [7], [8]. Previous studies demonstrate that ARIMA is a reliable forecasting methodology [9], particularly for the stock market, where analysts perceive it as a method close to the gold standard of stock market forecasting.

This paper compares three popular forecasting methods for time series: the aforementioned ARIMA (Autoregressive Integrated Moving Average), the Naïve model, and ETS (Exponential Smoothing), using S&P 500 index close prices from 2005-1-3 to 2014-12-31, a period of approximately ten years. Data from 2005-1-1 to 2005-1-2 are not used because the stock exchanges were closed on those days. This paper presents a detailed process of implementing both models and their pre-processing procedures on the S&P close price time series.

The statistical analyses were carried out with RStudio Version 1.3.1056.

Both the ARIMA and ETS forecasts rely on R packages such as fpp2, quantmod, tseries, timeSeries, and forecast. To determine which model has a better fit with our univariate time series, the mean error (ME), root mean squared error (RMSE), mean absolute error (MAE), mean percentage error (MPE), mean absolute percentage error (MAPE), mean absolute scaled error (MASE), and the autocorrelation of errors at lag 1 (ACF1) are compared.

The forecasts of both the method with the smaller error and the method with the larger error are used for comparison and prediction.

The rest of the paper is organized as follows. Section 2 describes the forecasting models used in this study. Section 3 presents the procedures for setting up the forecasts. Section 4 gives the prediction results in comparison with the real data. Conclusions and future work are described in Section 5.

2 MODELS

2.1 ARIMA MODEL
In this study, we used the ARIMA model, introduced by Box and Jenkins [10] and a popular model for time series forecasting, to estimate the stock price over the 200-day period after the last day of 2014. Developed on the basis of autoregressive (AR) models, moving average (MA) models, and ARMA models comprising both AR and
MA, the ARIMA models can be used when the input data are stationary. One prominent feature of stock prices is that they have an upward trend. A series with a trend is not stationary and is therefore not a suitable input for the ARIMA models as it stands. The Augmented Dickey-Fuller (ADF) unit root test can help in determining the stationarity of the time series. To achieve stationarity, the S&P500 close price is transformed into a stationary time series by taking differences. Apart from stationarity, seasonality is another feature we need to look at. Time series, not least those of temperatures, may have seasonal patterns. For seasonal time series, SARIMA models are the more appropriate choice. In this paper, ARIMA is the model used of the two, because the seasonal pattern observed after overlaying the seasonal trend is not statistically significant.

Figure 1: ADF test details for raw time series data

Figure 2: Plot of the differenced time series data
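
The differencing transformation mentioned above can be illustrated with a short sketch. It is written in plain Python rather than the R workflow used in this paper, and the helper names (`difference`, `mean`) and the toy series are assumptions made only for this example. It shows how taking first differences removes a steady upward drift from the level of a series.

```python
def difference(series, lag=1):
    """Return the lag-`lag` differences y[t] - y[t-lag] of a sequence."""
    if lag < 1:
        raise ValueError("lag must be a positive integer")
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


# Toy series with a steady upward trend plus a small wiggle
# (hypothetical numbers, not S&P500 data).
prices = [100 + 2 * t + (1 if t % 2 else -1) for t in range(20)]
diffed = difference(prices)

# The mean of the raw series shifts strongly between its two halves,
# while the mean of the differenced series barely moves.
raw_shift = mean(prices[10:]) - mean(prices[:10])
diff_shift = mean(diffed[10:]) - mean(diffed[:10])
print(raw_shift, diff_shift)
```

Comparing half-sample means is only an informal check. A formal decision on stationarity would still rest on a unit root test such as the ADF test.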
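The ARIMA(p,d,q) model can be written as follows:

$$Y_t = \phi_0 + \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \dots + \phi_p Y_{t-p} + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2} - \dots - \theta_q \varepsilon_{t-q} \tag{1}$$

where $Y_t$ represents a stationary stochastic process with non-zero mean (that is, the series after it has been differenced $d$ times), $\phi_0$ is the constant coefficient, $\varepsilon_t$ represents the white noise disturbance, $\phi_1, \phi_2, \dots, \phi_p$ are the autoregressive coefficients, and $\theta_1, \theta_2, \dots, \theta_q$ are the moving average coefficients.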
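
As an illustration of how Eq. (1) generates a series, the sketch below simulates the recursion directly in plain Python. It is not the fitting procedure of the R forecast package. The coefficient values are hypothetical (they are not the fitted ARIMA(2,1,3) estimates), and treating all values before the start of the series as zero is an assumption of this example. The `integrate` helper undoes one round of differencing, which corresponds to d = 1.

```python
import random


def simulate_arma(phi0, phi, theta, n, sigma=1.0, seed=0):
    """Simulate n values of Y_t following Eq. (1).

    phi   -- autoregressive coefficients [phi_1, ..., phi_p]
    theta -- moving average coefficients [theta_1, ..., theta_q]
    Values before the start of the series are taken as zero (assumption).
    """
    rng = random.Random(seed)
    y, eps = [], []
    for t in range(n):
        e_t = rng.gauss(0.0, sigma)  # white-noise disturbance
        ar = sum(phi[i] * y[t - 1 - i]
                 for i in range(len(phi)) if t - 1 - i >= 0)
        ma = sum(theta[j] * eps[t - 1 - j]
                 for j in range(len(theta)) if t - 1 - j >= 0)
        y.append(phi0 + ar + e_t - ma)  # MA terms enter with a minus sign
        eps.append(e_t)
    return y


def integrate(diffs, start):
    """Undo first differencing: rebuild levels from differences."""
    levels = [start]
    for d in diffs:
        levels.append(levels[-1] + d)
    return levels


# Hypothetical coefficients, chosen only so that the AR part is stable.
diffs = simulate_arma(phi0=0.5, phi=[0.4, -0.2], theta=[0.3], n=250)
levels = integrate(diffs, start=1200.0)
print(len(diffs), len(levels))
```

Fixing the seed makes the simulated path reproducible. Setting `sigma=0.0` reduces the recursion to its deterministic autoregressive part, which is convenient for checking the arithmetic by hand.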
2.2 ETS MODEL
Building the forecast on weighted averages of past observations, the exponential smoothing methods have, since their inception in the 1950s, been developed and modified by various researchers [11], [12]. To date, fifteen exponential smoothing methods have been proposed. The methods used here belong to a different branch of exponential smoothing, developed as automatic forecasting methods and comprising thirty or so methods. These methods are called ETS methods, as the E represents error, T represents trend, and S represents seasonality. The state space equations can be written as:

$$y_t = w(x_{t-1}) + r(x_{t-1})\,\varepsilon_t \tag{2}$$

$$x_t = f(x_{t-1}) + g(x_{t-1})\,\varepsilon_t \tag{3}$$

where $w$, $r$, $f$, and $g$ are coefficient functions of the state $x_{t-1}$, and $\varepsilon_t$ stands for the Gaussian white noise series. Equation (2) is known as the observation equation, describing the relationship between the state $x_{t-1}$ and the observation $y_t$. Equation (3) is the transition equation, describing the evolution of the states over time.

3 METHODOLOGIES

3.1 Augmented Dickey-Fuller test (ADF Test)
As a common form of unit root test, the Augmented Dickey-Fuller test [13] is used to determine the stationarity of the time series. The ADF test can be written as:

$$y_t = c + \beta t + \alpha y_{t-1} + \phi_1 \Delta y_{t-1} + \phi_2 \Delta y_{t-2} + \dots + \phi_p \Delta y_{t-p} + e_t \tag{4}$$

where $c$ is a constant, $\beta$ is the coefficient on the time trend, $\Delta y_{t-1}$ is the first difference of the time series at $t-1$, $y_{t-1}$ is the time series at lag 1, and $e_t$ is the error term.

As shown in figure 1, close is the univariate time series of S&P500 close prices extracted from the multivariate time series data from Yahoo Finance. The close data are not stationary: the ADF test yields a p-value larger than the typical 0.05 threshold, meaning that the test fails to reject the null hypothesis of a unit root in favor of the alternative hypothesis that the series is stationary.

Specific models, like ARIMA, require stationary input data. To make the time series stationary, we can take a difference, forming a time series variable that stores the differenced original data. The differencing formula is as follows:

$$y'_t = y_t - y_{t-1} \tag{5}$$

where $y'_t$ is the difference between $y_t$ and its previous value $y_{t-1}$.

In figure 2, the stationarity of DY, the differenced time series of the original close data, can be observed. It shows no trend and a consistent mean value. Apart from this graphical assessment, the ADF test can be applied again to confirm the stationarity of the time series after differencing.

Figure 3: ADF test details for differenced time series data

According to the p-value shown in figure 3, which is lower than the 0.05 threshold, the stationarity of the S&P500 close price
data is confirmed. With a stationary time series in hand, a seasonality check is another pre-model-fitting step that is essential to the accuracy of the forecast.

3.2 Seasonality
As shown in figure 4, no seasonal pattern can be derived from the seasonal plot of the differenced S&P500 data. Seasonality is not present in this time series, so there is no need to apply seasonal models like Seasonal ARIMA. For the methods discussed in the next section, the ARIMA and ETS models, this time series has been pre-processed adequately to serve as input data.

Figure 4: Seasonal plot with no clear trend, produced by overlaying the differenced data in one-year periods.

3.3 Determining the model with the best fit for forecasting
3.3.1 ARIMA analysis. Taking in the stationary differenced data, the ARIMA model analysis is carried out with the fpp2 package in R. For the S&P500 stock close price time series from 2005 to 2015, the chosen model is ARIMA(2,1,3), and the output results, in figure 5, figure 6, and figure 7, are as follows:

Figure 5: Residual graphs for ARIMA(2,1,3) with drift

Figure 6: Summary of ARIMA model

Figure 7: Residual check for ARIMA models

3.3.2 ETS analysis. ETS models do not require the time series to be stationary and are fitted with the forecast package in R. For the S&P500 stock close price time series from 2005 to 2015, the chosen model is ETS(M,A,N), and the output results, in figure 8, figure 9, and figure 10, are as follows:

Figure 8: Residual graph for ETS(M,A,N)

As shown in figure 6 and figure 9, the ME, RMSE, MAE, MPE, MAPE, MASE, and ACF1 of the ARIMA model are smaller than those of the ETS model, demonstrating a higher goodness of fit for the ARIMA model. In the next section, the forecast is conducted on the basis of the ARIMA model.

4 RESULTS
Having a better fit with the univariate time series of S&P500 closing prices, the ARIMA model is used to forecast. The chosen ARIMA model is ARIMA(2,1,3), which demonstrated the best goodness of fit. The forecast period is 200 days, from 2015-1-1 to 2015-7-20, and the real data overlaid on the forecast result come from Yahoo Finance, with non-trading days filled in by the average of the last trading day before the closure and the first trading day after it. The real data serve to validate the forecast, but for non-continuous time
series, additional processing of the real data is not a necessary step that would contribute to better accuracy.

Figure 9: Summary of ETS models

Figure 10: Residual check for ETS models

In figure 11, the dark blue area represents the 85% confidence interval, and the light blue area represents the 95% confidence interval. The ARIMA forecast is based on the S&P500 stock close price time series input data, shown as the black line. The red line is the real close price of the S&P500 during that period.

Figure 11: Forecast result

As can be seen from figure 11, the real result falls within the 85% confidence interval. The real data resemble the forecast in that they first experienced a dip and then a big uptick, though they are not as smooth as the forecasted upward-facing curve, which ends with a higher close price than the original point.

5 CONCLUSION
Presenting the extensive process of building these stock price forecasting models, this paper establishes ARIMA(2,1,3) as the model best fitted to the ten-year S&P500 time series data from 2005 to 2014. The ARIMA forecast produced satisfactory short-term predictions. This ARIMA model can help approximate the future close prices of the S&P500 and could serve as a guide for potential investors. To improve the performance of the models discussed in this paper, one may use hybrid models or deep learning methods [14].

REFERENCES
[1] Yaser, S. A. M., & Atiya, A. F. (1996). Introduction to financial forecasting. Applied Intelligence, 6, 205–213.
[2] L.Y. Wei, “A hybrid model based on ANFIS and adaptive expectation genetic algorithm to forecast TAIEX”, Economic Modelling, vol. 33, pp. 893-899, 2013.
[3] S. Bekiros, R. Gupta, and C. Kyei, “On economic uncertainty, stock market predictability and nonlinear spillover effects,” North American Journal of Economics and Finance, vol. 36, pp. 184–191, 2016.
[4] Contreras, J., Espinola, R., Nogales, F.J., and Conejo, A.J. (2003) "ARIMA models to predict next day electricity prices", IEEE Transactions on Power Systems, vol. 18, no. 3, pp. 1014-1020.
[5] Tsitsika, E.V., Maravelias, C.D., and Haralabous, J. (2007) "Modelling and forecasting pelagic fish production using univariate and multivariate ARIMA models", Fisheries Science, vol. 73, pp. 979-988.
[6] N. Merh, V.P. Saxena, and K.R. Pardasani, “A Comparison Between Hybrid Approaches of ANN and ARIMA For Indian Stock Trend Forecasting”, Journal of Business Intelligence, vol. 3, no. 2, pp. 23-43, 2010.
[7] L.C. Kyungjoo, Y. Sehwan and J. John, “Neural Network Model vs. SARIMA Model In Forecasting Korean Stock Price Index (KOSPI)”, Issues in Information System, vol. 8, no. 2, pp. 372-378, 2007.
[8] A. A. Adebiyi, A. O. Adewumi, and C. K. Ayo, “Comparison of ARIMA and artificial neural networks models for stock price prediction,” Journal of Applied Mathematics, vol. 2014, Article ID 614342, 7 pages, 2014.
[9] A. A. Ariyo, A. O. Adewumi, and C. K. Ayo, “Stock Price Prediction Using the ARIMA Model,” 2014 UKSim-AMSS 16th International Conference on Computer Modelling and Simulation, 2014.
[10] Box, G.E.P., Jenkins, G.M., 1976. Time Series Analysis: Forecasting and Control. Holden Day, San Francisco.
[11] Pegels, C.C., 1969. Exponential forecasting: some new variations. Manage. Sci. 15 (5), 311–315.
[12] Hyndman, R.J., Koehler, A.B., Snyder, R.D., Grose, S., 2002. A state space framework for automatic forecasting using exponential smoothing methods. Int. J. Forecast. 18 (3), 439–454.
[13] Dickey, D.A., Fuller, W.A., 1981. Likelihood ratio statistics for autoregressive time series with a unit root. Econometrica 49, 1057–1072.
[14] Lu, C.-J. (2010). Integrating independent component analysis-based denoising scheme with neural network for stock price prediction. Expert Systems with Applications, 37(10), 7056–7064.
