Stock Market Analysis and Prediction Using Time Series
Article info

Article history:
Received 3 November 2020
Accepted 12 November 2020
Available online xxxx

Keywords: ARIMA; Monte-Carlo; Fbprophet; Cumulative return; Volatility; CAGR; Sharpe ratio

Abstract

Over the years the stock market has been considered a very risky investment by people around the globe. This project aims to understand the historical data of the stock market and derive analysis from it to reduce the gap of knowledge between market behavior and the investor. Stock data comprises many statistical terms that are difficult to understand for a layperson who wants to step into stock market investment, and this project aims to close that gap. The study also aims to describe the future market scenario and to support that description with statistical evidence. Stock market volatility, daily returns, cumulative returns, correlations between different stocks, the Sharpe Ratio of the stocks, the CAGR value, and the Simple Moving Average are some important statistical terms for understanding the risk of investing in stocks. To predict the future behavior of stocks, ARIMA models, the Monte Carlo method, and forecasting with Facebook's Prophet library have been used here.

© 2020 Elsevier Ltd. All rights reserved.
Selection and peer-review under responsibility of the scientific committee of the Emerging Trends in Materials Science, Technology and Engineering.
⇑ Corresponding author.
E-mail address: [email protected] (J. Christy Jackson).
https://doi.org/10.1016/j.matpr.2020.11.364
2214-7853/© 2020 Elsevier Ltd. All rights reserved.
Please cite this article as: J. Christy Jackson, J. Prassanna, Md. Abdul Quadir et al., Stock market analysis and prediction using time series analysis, Materials Today: Proceedings, https://doi.org/10.1016/j.matpr.2020.11.364
J. Christy Jackson, J. Prassanna, Md. Abdul Quadir et al. Materials Today: Proceedings xxx (xxxx) xxx

1. Introduction

A time series is a sequence of data points indexed in order of date and time. Time series data is analyzed to extract meaningful statistics and other characteristics of the data. After statistical calculation and analysis of time series data, one can understand the behavior of a stock and estimate the risk involved before making any investment.

Time series forecasting goes a step further, toward understanding future behavior. It refers to the use of models to predict future values based on previously observed values. The methods used in this study are the Auto-Regressive Integrated Moving Average (ARIMA) model; the Augmented Dickey-Fuller test, which checks the stationarity of time series data; the Monte Carlo model, which generates possible future paths of a stock over some period; and the Prophet library by Facebook, which is robust in processing time series data and gives future predictions based on the daily, weekly, and yearly trends in the data.

Stock market data has long been considered very unpredictable, and investing in the stock market was considered very difficult for newcomers. Moreover, data and analytics on this scale were not easily available to the public. Data scientists and statisticians have tried to reduce this gap bit by bit over the years. Nowadays, mutual fund applications use AI models and certain statistical benchmarks to make the market easier for a newcomer to understand, at least to some extent.

Many studies around the globe have tried to predict the future prices of stocks, but the stock market does not depend solely on historical data. It is also affected by the sentiments of people, which depend on future events, and so future prices cannot be predicted with 100% accuracy.

The objective of this study is to understand and predict stock behavior through statistical calculations and visualizations of historical data. The specific objectives are:

a) Analysis of change in prices of the stock over time.
b) Comparative analysis of the daily and cumulative return of the stocks.
c) Analysis using the Simple Moving Average of various stocks.
d) To find the correlation between different stocks' closing prices and daily returns.
e) To find the Sharpe Ratio of the stocks and to learn how it can be a helpful parameter while making investments.
f) To find the compounded annual growth rate of the stocks over the last 10 years.
g) To predict future stock behavior and future prices using algorithms.

2. Related work

The motivation behind this study is the gap in knowledge that most people have before they start investing in the stock market. Every year, smart investors make a good deal of money out of the stock market. Self-made billionaire Warren Buffett is one such example, having earned most of his wealth through smart investing. Prediction of future stock behavior is another motivation. This gap in investment knowledge needs to be reduced for the common investor.

Banerjee [1] studied forecasting of National Stock Exchange data using the ARIMA model, compared different values of p, d, and q, and validated the forecasted stock prices against the actual stock prices. In that study, ARIMA (1,0,1) gave the best fit compared to the other models. Viswam and Reddy [2] studied historical stock market data, predicted future prices using the ARIMA model, and used the MACD model to better analyze the data.

Angadi and Kulkarni [3] used the ARIMA model to predict stock prices and used auto ARIMA to choose the p, d, and q values that give the best fit. Their results show that the ARIMA model performs well for short-term prediction. Devi, Sundar, and Alli [4] used NIFTY MIDCAP 50 as the index and selected the top four MIDCAP companies. They used ARMA and ARIMA models to predict future stock prices and used the AIC and BIC criteria to get the best fit for the model. Varghese, Tarhen, Shaikh, Banik, and Ramadasi [5] used ARMA, moving average, and ANN models to predict future stock prices and used the maximum likelihood estimator and Yule-Walker estimation to validate their models.

Sharma, Modak, and Sridhar [6] used an LSTM model, a type of RNN, to predict the random nature of future stock prices. The MSE value of the model was reported as significant and improved.

Selvin, Vinayakumar, Gopalakrishnan, Menon, and Soman [7] used NSE-listed companies as the stock price data. They applied a sliding window approach to the non-linear models RNN, LSTM, and CNN and to the linear model ARIMA. The non-linear models outperformed the linear model in the error calculation. Mondal, Shit, and Goswami [8] studied the effectiveness of the ARIMA model in forecasting security values. They used Indian stock market data from the NSE for the analysis and used AIC to select the best ARIMA model. Wang and Wang [9] used principal component analysis and stochastic time effective neural networks for forecasting and used MAE, MAPE, MSE, and RMSE to measure the performance of the model.

3. Dataset description

The data used for this project is day-wise historical time series data of stocks over the past 10 years, in numerical form.

Dataset size: 10 business years
Data source: Yahoo Finance

The dataset imported from Yahoo Finance consists of 7 columns: Open, High, Low, Close, Adj. Close, Volume, and Date as the index.

Date (index of the dataset): Dates of all business days in a year.
Open: The opening price of the security.
Close: The closing price of the security.
High: The highest value reached by the stock on a particular day.
Low: The lowest value reached by the stock on a particular day.
Adj. Close: The adjusted closing price is calculated after accounting for the stock's dividends, stock splits, and new stock offerings, which together determine a new value of the stock known as the adjusted price.
Volume: Volume, or trading volume, is the amount of a security that was traded during a given time.

4. Statistical parameters

4.1. Daily stock return

This parameter tells us how much the stock gained or lost per share per day. It is calculated by subtracting the previous day's closing price from today's closing price.

4.2. Stock volatility

Volatility is a measure of the dispersion of returns for a given stock or market index. In most cases, the higher the volatility, the riskier the stock. It is often measured as either the standard deviation or the variance of returns from that stock or market index.

In the stock markets, volatility is often associated with big swings in price in either direction. For example, when the market rises and falls by more than one percent over a specific period, it is called a volatile market.

\[ \text{Daily Volatility} = \sqrt{\text{Variance}} \tag{1} \]

\[ \text{Annual Volatility} = \sqrt{252} \cdot \sqrt{\text{Variance}} \tag{2} \]

4.3. Cumulative stock return

This parameter tells us how much the stock gained or lost per share over a period, independent of the length of that period. Cumulative return is equal to:

\[ \frac{\text{Current Price} - \text{Original Price}}{\text{Original Price}} \tag{3} \]

4.4. Compounded Annual Growth Rate (CAGR)

CAGR is the rate of return that would be required for an investment to grow from its starting value to its ending value, assuming that the profits are reinvested at the end of each business year.

\[ \text{CAGR} = \left( \frac{V_{final}}{V_{begin}} \right)^{1/t} - 1 \tag{4} \]

Where:
CAGR = compound annual growth rate
V_begin = beginning value
V_final = final value
t = time in years

4.5. Correlation of stocks

Correlation (r) is a statistical measure of the degree to which two variables move in relation to each other. In finance, correlation can measure the movement of a stock price relative to the stock market's benchmark index.
\[ r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2} \, \sqrt{\sum (Y - \bar{Y})^2}} \tag{5} \]

The Sharpe Ratio is computed as:

\[ \text{Sharpe Ratio} = \frac{R_p - R_f}{\sigma_p} \]

Where:
R_p = return of portfolio
R_f = risk-free return rate
\sigma_p = standard deviation of the portfolio excess return
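The parameters defined in Section 4 can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the code used in the study. The function names, the choice of the sample variance, the use of fractional daily returns as input to the volatility and Sharpe calculations, the default risk-free rate of zero, and the sample prices are all assumptions made for the example.

```python
import math
import statistics

TRADING_DAYS = 252  # trading days per year used to annualize volatility


def daily_returns(closes):
    """Section 4.1: today's close minus the previous day's close."""
    return [today - prev for prev, today in zip(closes, closes[1:])]


def pct_returns(closes):
    """Fractional daily returns (assumed input for volatility and Sharpe)."""
    return [(today - prev) / prev for prev, today in zip(closes, closes[1:])]


def daily_volatility(returns):
    """Section 4.2: square root of the (sample) variance of returns."""
    return math.sqrt(statistics.variance(returns))


def annual_volatility(returns):
    """Section 4.2: daily volatility scaled by sqrt(252)."""
    return math.sqrt(TRADING_DAYS) * daily_volatility(returns)


def cumulative_return(closes):
    """Section 4.3: (current price - original price) / original price."""
    return (closes[-1] - closes[0]) / closes[0]


def cagr(v_begin, v_final, years):
    """Section 4.4: (V_final / V_begin) ** (1 / t) - 1."""
    return (v_final / v_begin) ** (1 / years) - 1


def correlation(xs, ys):
    """Section 4.5: Pearson correlation coefficient r."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mean_x) ** 2 for x in xs)) * math.sqrt(
        sum((y - mean_y) ** 2 for y in ys)
    )
    return num / den


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by the std. deviation of excess returns."""
    excess = [r - risk_free for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess)


# Made-up closing prices, purely for illustration.
closes = [100.0, 102.0, 101.0, 105.0, 110.0]
rets = pct_returns(closes)
print("daily changes:", daily_returns(closes))
print("annual volatility:", annual_volatility(rets))
print("cumulative return:", cumulative_return(closes))
print("CAGR over 2 years:", cagr(100.0, 121.0, 2))
print("per-period Sharpe ratio:", sharpe_ratio(rets))
```

The Sharpe ratio here is per period (per day for daily returns); annualizing it would need an extra scaling step that this sketch leaves out.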
Table 1
Daily returns of stocks.
5. Analysis
Table 2
Cumulative return of the stocks over the years.
Fig. 9. Simple moving average for 10, 50, and 200 days.
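As a rough illustration of how the moving averages plotted in Fig. 9 could be computed, the following sketch implements a simple moving average with a running sum. The function name, the convention of returning None until a full window is available, and the synthetic price series are assumptions made for this example, not taken from the study.

```python
def simple_moving_average(values, window):
    """Return a list aligned with `values`; entries before a full window are None."""
    if window <= 0:
        raise ValueError("window must be positive")
    out = [None] * len(values)
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            # Drop the value that just left the window.
            running -= values[i - window]
        if i >= window - 1:
            out[i] = running / window
    return out


# Synthetic, steadily rising closing prices (hypothetical data).
closes = [100.0 + 0.1 * day for day in range(250)]
for window in (10, 50, 200):
    sma = simple_moving_average(closes, window)
    print(f"SMA-{window} on the last day:", sma[-1])
```

Shorter windows track recent prices closely, while longer windows react more slowly, which is why the three curves are usually read together.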
Table 3
Annual Sharpe Ratio of last 9 years from 2011 to 2019.
Fig. 14. Seasonal decomposition of reliance stock into trend, seasonal, and residual components.
Fig. 15. ARIMA forecasting plot of future values with 95% confidence.
6.1. Augmented Dickey-Fuller test

In the Augmented Dickey-Fuller (ADF) test, time-series data is tested against a null hypothesis, which states that a unit root is present in the data. The alternative hypothesis depends on which version of the test is used and is usually stationarity or trend-stationarity. The ADF test is used for larger and more complex sets of time series models.

The ADF statistic used in the test is a negative number; the more negative it is, the stronger the rejection of the null hypothesis of a unit root.
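The idea behind the test can be illustrated with a minimal sketch of the basic Dickey-Fuller regression, written with the standard library only. This is a simplification and not the study's code: it omits the lagged difference terms that make the test "augmented", the function names are hypothetical, and the 5% critical value of about -2.86 (constant, no trend, large sample) is hard-coded as an assumption. In practice a library routine such as `adfuller` from `statsmodels.tsa.stattools` would normally be used instead.

```python
import math
import random

# Approximate large-sample 5% critical value for a regression with a
# constant and no trend (an assumption hard-coded for this sketch).
CRITICAL_5PCT = -2.86


def dickey_fuller_statistic(series):
    """t-statistic of gamma in: diff(y)[t] = alpha + gamma * y[t-1] + error."""
    lagged = series[:-1]
    diffs = [b - a for a, b in zip(series, series[1:])]
    n = len(diffs)
    mean_x = sum(lagged) / n
    mean_d = sum(diffs) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxd = sum((x - mean_x) * (d - mean_d) for x, d in zip(lagged, diffs))
    gamma = sxd / sxx                      # OLS slope on the lagged level
    alpha = mean_d - gamma * mean_x        # OLS intercept
    rss = sum((d - alpha - gamma * x) ** 2 for x, d in zip(lagged, diffs))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return gamma / std_err


def looks_stationary(series, critical=CRITICAL_5PCT):
    """Reject the unit-root null when the statistic is below the critical value."""
    return dickey_fuller_statistic(series) < critical


# Synthetic data: white noise (stationary) and its running sum (a random walk).
rng = random.Random(42)
noise = [rng.gauss(0.0, 1.0) for _ in range(500)]
walk = []
level = 0.0
for shock in noise:
    level += shock
    walk.append(level)

print("white noise statistic:", dickey_fuller_statistic(noise))
print("random walk statistic:", dickey_fuller_statistic(walk))
```

A strongly negative statistic, as expected for the white-noise series, points toward stationarity; a statistic near zero, as is typical for a random walk, gives no grounds to reject the unit root.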
impact of risk and uncertainty of prediction and forecasting models during analysis. The Monte Carlo model can also be used to solve various problems in fields such as engineering, supply chain, science, and finance. This model is also referred to as a multiple probability simulation model.

Fbprophet is a very handy library made by Facebook for time series data. Here, the blue line in the center indicates the future trend of average stock prices, while the actual value can vary within the blue band. The blue band shows the range in which future stock prices may lie. The scattered black points are the actual stock data over the period. Using Prophet, the data can be decomposed into an overall trend, a weekly trend, and a yearly trend. Here we can observe a yearly trend that has some seasonality (Figs. 17 and 18).

7. Conclusion

In this study, we analyzed and understood the risk involved in stock prices, which is well accounted for by statistical measures such as CAGR, Sharpe Ratio, volatility, and cumulative return. The study was successful in comparing stocks and in identifying stocks with lower risk and good return.

We worked with a linear model for forecasting future prices. However, there is no evidence that stock data is linear, which motivates us to work on non-linear forecasting in the near future. With some statistical measurements, we may be able to develop a model that helps in choosing the right stock.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

[1] D. Banerjee, Forecasting of Indian stock market using the time-series ARIMA model, in: 2014 2nd International Conference on Business and Information Management (ICBIM), IEEE, 2014, pp. 131–135.
[2] N. Viswam, G.S. Reddy, Stock market prediction using time series analysis, 2018.
[3] M.C. Angadi, A.P. Kulkarni, Time series data analysis for stock market prediction using data mining techniques with R, Int. J. Adv. Res. Comput. Sci. 6 (6) (2015).
[4] B.U. Devi, D. Sundar, P. Alli, An effective time series analysis for stock trend prediction using the ARIMA model for nifty midcap-50, Int. J. Data Min. Knowl. Manage. Process 3 (1) (2013) 65.
[5] A. Varghese, H. Tarhen, A. Shaikh, P. Banik, A. Ramadasi, Stock market prediction using time series, Int. J. Recent Innov. Trends Comput. Commun. 4 (5) (2016) 427–430.
[6] A. Sharma, S. Modak, E. Sridhar, Data Visualization and Stock Market and Prediction, 2019.
[7] S. Selvin, R. Vinayakumar, E.A. Gopalakrishnan, V.K. Menon, K.P. Soman, Stock price prediction using LSTM, RNN, and CNN-sliding window model, IEEE, 2017, pp. 1643–1647.
[8] P. Mondal, L. Shit, S. Goswami, Study of effectiveness of time series modeling (ARIMA) in forecasting stock prices, Int. J. Comput. Sci. Eng. Appl. 4 (2) (2014) 13–29.
[9] J. Wang, J. Wang, Forecasting stock market indexes using principle component analysis and stochastic time effective neural networks, Neurocomputing 156 (2015) 68–78.
[10] How to Use the Sharpe Ratio to Analyze Portfolio Risk and Return, 2020. Retrieved 22 May 2020, from https://www.investopedia.com/terms/s/sharperatio.asp.
[11] J. Brownlee, What Is Time Series Forecasting?, 2020. Retrieved 22 May 2020, from https://machinelearningmastery.com/time-series-forecasting/.