Stock Price Prediction Using Machine Learning Algorithms: ARIMA, LSTM & Linear Regression
Krushali Sohil Patel1, Udit Rajesh Prahladka1, Jaykumar Babulal Patel1, Yogita Shelar2
autocorrelation functions (ACFs) and partial autocorrelation functions (PACFs) using Q statistics and integration sites. In addition, non-stationary data is made stationary with the help of various techniques. It was concluded near the end of the study that the ARIMA model was very useful for short-term prediction [4].

3. METHODOLOGY

d = In ARIMA, we difference the series to convert a non-stationary time series into a stationary one. We use d to denote the number of times the series is differenced.

q = q denotes the order of the error term. The error term is the part of the historical data that is not explained by the general price trend.

Autoregressive component: A pure AR model relies only on a combination of historical values. This dependence is similar to linear regression, in that the number of autoregressive terms corresponds directly to the number of previous periods taken into account.
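The roles of d and of the autoregressive component can be illustrated with a minimal sketch in plain Python. The price list is made up, and `difference` and `fit_ar1` are hypothetical helper names, not part of the study's code. A library ARIMA implementation estimates its parameters differently, so this only shows the idea of differencing d times and then regressing each value on one lagged value.

```python
def difference(series, d=1):
    """Apply first-order differencing d times (the 'I' part of ARIMA)."""
    out = list(series)
    for _ in range(d):
        out = [curr - prev for prev, curr in zip(out, out[1:])]
    return out


def fit_ar1(series):
    """Least-squares fit of x_t = c + phi * x_(t-1), an AR model with one lag."""
    lagged, current = series[:-1], series[1:]
    n = len(lagged)
    mean_lag = sum(lagged) / n
    mean_cur = sum(current) / n
    cov = sum((a - mean_lag) * (b - mean_cur) for a, b in zip(lagged, current))
    var = sum((a - mean_lag) ** 2 for a in lagged)
    phi = cov / var
    c = mean_cur - phi * mean_lag
    return c, phi


# Made-up closing prices, for illustration only.
prices = [100.0, 102.0, 105.0, 109.0, 114.0, 120.0]

diffs = difference(prices, d=1)      # day-to-day changes
c, phi = fit_ar1(diffs)              # AR(1) fitted on the differenced series
next_diff = c + phi * diffs[-1]      # one-step forecast of the next change
next_price = prices[-1] + next_diff  # undo the differencing
print(diffs, round(next_price, 2))
```

With d = 1 the forecast is made on the price changes and then added back to the last observed price, which is why the differencing step has to be undone at the end.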
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2153
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 03 | Mar 2022 www.irjet.net p-ISSN: 2395-0072
B. Long Short Term Memory (LSTM)

The Long Short-Term Memory network is a recurrent neural network (RNN) that is trained using backpropagation. It addresses the vanishing gradient problem encountered in earlier RNNs. LSTM networks have their own memory, so they are effective for building large RNNs and for handling time-dependent sequence problems. Instead of neurons, an LSTM network has memory blocks that are connected through recurrent layers.

A block has components that make it more capable than a standard neuron. It consists of several gates that regulate the block's input and output. Whenever a block receives an input, a gate is triggered and decides whether or not to pass the information forward for further processing.

The standard LSTM block, in its simplest form, consists of an input gate, an output gate, a cell and a forget gate.

1) Cell: It is used to remember the values over arbitrary time intervals.

2) Input Gate: It decides which information to keep in the cell.

3) Output Gate: It is used to decide which part of the cell state should be given as an output.

4) Forget Gate: It is used to decide which information to throw away from the cell.

C. Linear Regression

In the Linear Regression model, a linear equation is used to combine a set of input values (x) into a predicted output value (y). Both the input and output values are considered numeric. The scale factor that the Linear Regression equation assigns to each input is represented by the Greek capital letter Beta (B) and is commonly known as a coefficient. In addition, another coefficient is added to give the line an additional degree of freedom. This additional term is often referred to as the bias coefficient. Typically, the coefficients are estimated by measuring the distance of the data points from the best-fit line. This distance can be drawn as a straight line from each point to the fitted line, and it measures how close the point lies to the regression line.

A simple Linear Regression model is given as follows:

y = B0 + Bt * xt + Et

This same line is called a plane or hyperplane when we are dealing with more than one input. This is often the case with high-dimensional data. The Linear Regression model is therefore represented by the form of the equation and the specific values of its coefficients. However, before using this linear equation, we are faced with a number of issues. These issues often increase the complexity of the model, which makes accurate estimates difficult. This complexity is often discussed in terms of the number of dependent and independent variables.

When a coefficient becomes zero, the effect of the corresponding input variable on the model is effectively removed, and it no longer contributes to the estimates made from the model (0 * x = 0). This becomes important when we consider regularization techniques, which change the learning algorithm to reduce the complexity of models by putting pressure on the absolute size of the coefficients, driving some of them to zero.

Fig-4: Linear Regression [7]

Twitter Sentiment Analysis

Social media data has a higher impact today than ever, and it can aid in predicting the trend of the stock market. The method involves collecting news and social media data and extracting the sentiments expressed by individuals. The correlation between the sentiments and the stock values is then analyzed. The learned model can then be used to make future predictions about stock values.

In regression, a curve is plotted on a graph according to the given input. The curve represents the variations in the stock prices over the years. Here, the X-axis contains the date and the Y-axis contains the closing price of the stock.
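A regression of closing price on date can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions: the closing prices are made up, dates are encoded as consecutive day indices, and `fit_line` is a hypothetical helper that applies the ordinary least-squares formulas for a single input. It is not the code used in the study.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x with a single input."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b1 = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    b0 = mean_y - b1 * mean_x  # bias coefficient (intercept)
    return b0, b1


# Dates encoded as day indices (an assumption; any numeric encoding works).
days = [0, 1, 2, 3, 4]
closes = [150.0, 152.0, 154.0, 156.0, 158.0]  # made-up closing prices

b0, b1 = fit_line(days, closes)
predicted = b0 + b1 * 5  # estimate for the next day index
print(b0, b1, predicted)
```

Here `b0` plays the role of the bias coefficient and `b1` the coefficient of the date input. Real price series are rarely this linear, so the fitted line should be read as a trend estimate rather than an exact forecast.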
Fig 5.10: Real Time Stock Data for NSE (HDFCBANK) stock

F. ARIMA forecast for NSE stock

The ARIMA model was applied to the test set (20% of all data). The predicted values are compared with the real values, and the results are displayed in Python.

Fig 5.14: LSTM prediction and Root Mean Squared Error (RMSE) for NSE (HDFCBANK) stock

H. Linear Regression forecast for NSE stock

The Linear Regression model was applied to the test set (20% of all data). The predicted values are compared with the real values, and the results are displayed in Python.
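The evaluation step described above, holding out 20% of the data and comparing predictions with real values, can be sketched as follows. The split is assumed to be chronological (the last 20% of the series), which the text does not state explicitly. The prices are made up, and the forecast is a deliberately naive placeholder rather than any of the three models.

```python
import math


def chronological_split(series, test_fraction=0.2):
    """Keep the last test_fraction of the series as the test set."""
    cut = len(series) - int(round(len(series) * test_fraction))
    return series[:cut], series[cut:]


def rmse(actual, predicted):
    """Root Mean Squared Error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up closing prices, for illustration only.
prices = [170.0, 171.0, 173.0, 172.0, 174.0, 175.0, 177.0, 176.0, 178.0, 180.0]
train, test = chronological_split(prices)

# Placeholder forecast: repeat the last training value for every test point.
naive_forecast = [train[-1]] * len(test)
print(len(train), len(test), round(rmse(test, naive_forecast), 3))
```

Any model's predictions for the test period can be passed to `rmse` in place of `naive_forecast`, which gives a like-for-like error figure across models.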
C. CLASSIFICATION MODELS COMPARISON

The three models under consideration, namely ARIMA, LSTM and Linear Regression, were then compared on several criteria. The number of parameters, including a comparison of trainable and non-trainable parameters, is given in Table-3. The RMSE value and loss graphs of each model were investigated, and the ease of learning was taken into account. Finally, the classification results were examined individually for all evaluation parameters in Table-1.

Table-1: Classification Results of All Models

Algorithms          RMSE Value   Predicted value   Present value
ARIMA               3.06         175.57            177.77
LSTM                6.66         166.61            177.77
LINEAR REGRESSION   9.11         173.71            177.77

8. CONCLUSION

9. REFERENCES

[2] M. İ. Y. Kaya and M. E. Karsligil, "Stock price prediction using financial news articles," 2010 2nd IEEE International Conference on Information and Financial Engineering, Chongqing, 2010, pp. 478-482.
[3] Hedayati, Amin & Moghaddam, Moein & Esfandyari, Morteza. (2016). Stock market index prediction using artificial neural network. Journal of Economics, Finance and Administrative Science. 10.1016/j.jefas.2016.07.002.
[4] Ayodele A. Adebiyi, Aderemi O. Adewumi, "Stock Price Prediction Using the ARIMA Model", IJSST, Volume-15, Issue-4. [Online]. Available: https://ijssst.info/Vol-15/No-4/data/4923a105.pdf
[5] Chen, Peiyuan. (2020). Stochastic Modeling and Analysis of Power System with Renewable Generation, ResearchGate Publication.
[6] Angle Qian (2018), Structure of LSTM RNNs, Stack Exchange. [Online]. Available: https://ai.stackexchange.com/questions/6961/structure-of-lstm-rnns
[7] Rob J Hyndman and George Athanasopoulos, Forecasting: Principles and Practice, OTexts, Kindle Edition. [Online]. Available: https://otexts.com/fpp2/