IJCSNS International Journal of Computer Science and Network Security, VOL.22 No.2, February 2022

Stock Forecasting Using Prophet vs. LSTM Model Applying Time-Series Prediction

Mohammed Ali Alshara [email protected]


College of Computer and Information Sciences
Imam Mohammad Ibn Saud Islamic University (IMSIU)
Riyadh, Saudi Arabia

Abstract
Forecasting and time-series modelling play a vital role in the data analysis process, and time series are widely used in analytics and data science. Forecasting stock prices is a popular and important topic in financial and academic studies. The stock market is a difficult setting for forecasting because there are no established rules for estimating or predicting a stock price. Predicting stock prices is therefore a challenging time-series problem. Many methods and applications are instrumental in stock price forecasting, including machine learning, technical analysis, fundamental analysis, time series analysis, and statistical analysis. This paper discusses the implementation of stock price forecasting using the Prophet and LSTM models. The task is very complex and involves uncertainty. Although stock prices cannot be predicted with certainty, this paper aims to apply the concepts of forecasting and data analysis to predict stock prices.

Keywords: Predicting; Modelling; Analysis; Machine Learning; Time-series; Stock price; data analysis; Long Short-Term Memory (LSTM); forecasting.

1. Introduction

Many aspects of people's lives rely on the arithmetic analysis of historical data. For example, illness, changes in stock market activity, and the weather can be forecast if the historical data show a pattern over time. The pattern may be, for example, daily, weekly, monthly, or annual. This form of forecasting is commonly called time series forecasting. Observations taken sequentially in time are usually called a time series [13].

The increasing availability of historical data, together with the need for production forecasting, has drawn attention to Time Series Forecasting (TSF), which predicts a sequence of future values, especially given the limitations of traditional forecasting, such as complexity and time consumption [12].

Investment firms and even individuals use financial models to better understand market behaviour and make profitable investments [1]. Stock price prediction attempts to determine the future value of a company's stock or other publicly traded financial instruments. A successful forecast of the future price of a stock may yield a significant profit [3]. Predicting how the stock markets will perform is difficult [4]. Countless factors move the stock price [2]. Many factors are involved in forecasting: physical versus psychological factors, and rational versus irrational behaviour. These aspects make stock prices volatile and difficult to predict with a high degree of accuracy. One of the prevailing theories says that stock prices are completely random and that their value cannot be predicted. This theory raises the question of why large companies employ quantitative analysts to build predictive models [1].

Is machine learning effective at predicting stock prices? Investors make calculated guesses by analysing data [1, 12]. They use features such as the latest announcements about the organization, quarterly revenue results, news, company history, industry trends, and many other variables. Machine learning technologies can discover patterns and insights that have not been seen before, and these can be used to make more accurate predictions.

This paper seeks to use two machine learning models, Prophet and Long Short-Term Memory (LSTM), to predict prices. The work uses a historical dataset of the stock price of a listed company (Google Inc.). One machine-learning algorithm for predicting the company's future stock price is implemented using an advanced and popular technique named Prophet.

The company may be vulnerable to market fluctuations outside of its control, including market sentiment, economic conditions, or developments in the sector.

The hypothesis for this experiment is that LSTMs will demonstrably outperform other techniques such as Prophet and provide more in-depth insight into the validity of technical analysis.

Manuscript received February 5, 2022
Manuscript revised February 20, 2022
https://doi.org/10.22937/IJCSNS.2022.22.2.24

2. Literature Review

This section provides foundations and basic working definitions. It gives an overview of fundamental and technical analysis, which are non-machine-learning methods for stock valuation, and of machine learning approaches.

Forecasting financial time series has always been an important topic and an exciting research area with many applications in business, economics, finance, and computer science.

Time series analysis aims to study the observed path of a time series and to build a model that describes the structure of the data and predicts future values. Because time series prediction is important in many branches of applied science, it is necessary to build a useful model to improve prediction [5]. The prevailing traditional methods consist mainly of fundamental analysis and technical analysis. In recent years, there have also been more experiments introducing advanced techniques such as machine learning for forecasting [4].

Data visualization for pattern creation is discussed in [15]. A student performance prediction model is introduced in [16], which applies a data mining regression model and derives its outcomes from study factors in the dataset. Simulation-based prediction of the effects of COVID-19 in Saudi Arabia is discussed in [17].

2.1 Fundamental Analysis

Traditional methods of analysing the stock market and forecasting stock prices include fundamental analysis, which looks at the stock's performance and the company's general credibility, and statistical analysis, which is concerned only with crunching numbers and identifying patterns in stock price variation [1].

In general, fundamental analysis attempts to analyse some of the macro features that the company shows. It is based on the principle that the market value tends to move towards the real or intrinsic value [4]. Fundamental analysis values the stock by considering the company's information, related news, the general economy, and the specific economy of the company's sector, among other factors [2].

2.2 Technical Analysis

Technical analysis is almost the opposite of fundamental analysis. It is a methodology for making predictions by studying past market data, price, and volume.

In general, investors who use this approach formulate their trading strategy based on technical indicators calculated from price, volume, and time [4]. The only input to technical analysis is past stock price data. The technical analyst believes that previous patterns in the stock indicate future patterns and prices [2].

2.3 Analysis Based on Models

There are many different machine learning algorithms and approaches, and finding the right method has proven challenging [2]. Time series models and machine learning models are independent of technical and fundamental analysis. They rely on mathematical theory and derive useful models from training data. The derived model can then be used to make predictions on new data [4].

This paper proposes using machine-learning models to predict the price of a given stock.

The challenge of this project is to accurately predict the future closing value of a given stock over a given future period. The project uses a Prophet model and a Long Short-Term Memory network, usually called an "LSTM," to predict Google's price from a data set of past prices.

3. The Research Method

In this paper, quantitative methods are applied using models and Python code to analyse and visualize the data.

3.1 Analytic models:

The Prophet model was tried first to predict the stock prices from historical closing prices and to visualize both the predicted and actual values over time. The model predicts five years of data points based on the test data set. Long Short-Term Memory networks were then used to predict Google's closing price from a data set of past prices.

This project used the Root Mean Squared Error (RMSE) as the performance measure: the difference between the predicted and actual adjusted close prices is calculated for each model, and the RMSE values of the Prophet and LSTM models are compared.
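
As a minimal sketch of the RMSE measure used here, the following Python function takes the square root of the mean squared difference between actual and predicted values. The function name `rmse` and the sample prices are illustrative assumptions, not values from this study.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical adjusted close prices, for illustration only.
actual_prices = [100.0, 102.0, 101.0, 105.0]
predicted_prices = [98.0, 103.0, 104.0, 105.0]
print(rmse(actual_prices, predicted_prices))
```

Because the errors are squared before averaging, a single large miss raises the RMSE more than several small ones, which suits a comparison where large forecasting errors matter most.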

3.2 Exploring the Stock Prices Dataset

The dataset used in this paper is the Google stock price from October 7, 2015, to October 7, 2020. This type of data is a series of data points indexed in time order, that is, a time series. The goal is to predict the price for any given date after training. For ease of reproduction and reusability, all data was pulled from the Yahoo Finance Python API.

There are multiple variables in the dataset: date, open, high, low, close, adj close, and volume.
● The columns Open and Close represent the starting and final price at which the stock is traded on a particular day.
● High, Low, and Close represent the maximum, minimum, and last price of the share for the day.
● The adjusted closing price amends a stock's closing price to reflect its value after accounting for any corporate actions [3].
● Volume is the amount of an asset or security that changes hands during a specific period, often over a day [3].

The prediction must be made for the adjusted closing price. Yahoo Finance already adjusts the closing prices, so predictions only need to be made for the "CLOSE" price [1].

The "Adjusted Close" variable is the only feature of the financial time series that is fed into the Prophet and LSTM models [10].

The setup starts by importing all necessary libraries (NumPy, pandas, matplotlib). The dataset is loaded and the target variable for the problem is defined. The CSV file is imported into Python using read_csv() from pandas. The dataset has the following form (Table 1):

Table 1: Head of the data set

The mean, standard deviation, maximum, and minimum of the data are shown in Table 2:

Table 2: Mean, SD, Max, and Min of the dataset.

From the data set it can be inferred that the date, high, and low values are not essential features of the data. The High, Low, and Volume features are informative, but it was observed that the Open and Close prices have a direct relation. What matters is the opening price and the closing price of the stock: a closing price higher than the opening price means a profit; otherwise, there is a loss.

The volume of stocks is also essential. A rising market should see rising volume; an increasing price with decreasing volume shows a lack of interest and warns of a potential reversal. A price drop (or rise) on large volume is a stronger signal that something in the stock has fundamentally changed [1].

The upcoming sections explore these variables and use techniques such as Prophet to predict the stock's daily closing price. Hence, the high, low, close, and volume features were removed from the data set during the processing step (Fig. 1).

Fig.1: data after removing High and low features

The mean, standard deviation, maximum, and minimum of the processed data were found to be as follows (Fig. 2).
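
The loading and summary steps can be sketched as follows. This sketch uses only the Python standard library and a small inline CSV so that it runs without a data file. The rows are made-up placeholder values in the Yahoo Finance column layout, not the Google data used in this paper, and the helper names `load_adj_close` and `summarize` are assumptions. With pandas, `read_csv()` and `describe()` cover the same steps.

```python
import csv
import io
import statistics

# Made-up rows in the Yahoo Finance column layout (illustration only).
SAMPLE_CSV = """Date,Open,High,Low,Close,Adj Close,Volume
2020-01-02,10.0,11.0,9.5,10.5,10.5,1000
2020-01-03,10.5,12.0,10.0,11.5,11.5,1500
2020-01-06,11.5,12.5,11.0,12.0,12.0,1200
2020-01-07,12.0,12.2,11.0,11.0,11.0,900
"""


def load_adj_close(text):
    """Return (date, adjusted close) pairs, dropping the other columns."""
    reader = csv.DictReader(io.StringIO(text))
    return [(row["Date"], float(row["Adj Close"])) for row in reader]


def summarize(values):
    """Mean, sample standard deviation, maximum, and minimum."""
    return {
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),
        "max": max(values),
        "min": min(values),
    }


series = load_adj_close(SAMPLE_CSV)
summary = summarize([price for _, price in series])
print(series[0])
print(summary)
```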

Fig.2: mean SD, Max, and Min of the dataset.

3.3 Exploratory Visualization of the Data:

This paper used the Matplotlib Python package for the initial graphing of the data set. Fig. 3 shows the historical data plotted to scale.

Fig.3: Visualization of processed historical data fetched from the API

The closing price of a stock usually determines the profit or loss calculation for the day; hence, the adjusted closing price is taken as the target variable. The target variable is plotted to understand how it develops over the data (Fig. 2).

Fig.2: Representing of Google Stocks Adjusted Closing Values.

Correlation is a measure of the relationship between two features: how much Y varies with a variation in X. The method used here is the Pearson correlation. Its coefficient is a popular measure of correlation, with values ranging from -1 to 1. In mathematical terms, if two features are positively correlated, they tend to increase together, and if they share a negative correlation, one tends to decrease as the other increases [6]. Because the numeric values alone are hard to read, the correlation coefficients are visualized (Fig. 3).

Fig.3 Correlation Map

The dark maroon zone denotes the highly correlated features.

4. Algorithms and Techniques Used

This paper aims to study time-series data and explore as many options as possible to accurately predict the stock price.

4.1 Prophet model:

Several time series techniques can be applied to a stock prediction dataset, but most of them require much pre-processing of the data before the model is constructed [3].

Prophet is an open-source library designed for forecasting time-series datasets. It is easy to use and is designed to automatically find a good set of hyperparameters for the model, so that by default it makes skilful forecasts for data with trends and seasonal structure [8].

Prophet is an additive model with the following components: y(t) = g(t) + s(t) + h(t) + ϵₜ, where g(t) is the trend, s(t) is the seasonality, h(t) is the effect of holidays, and ϵₜ is the error term.

Prophet is an algorithm for building forecasting models for time series data. Unlike the traditional approach, it tries to fit additive regression models. Moreover, it is very flexible with respect to the data that is fed to the algorithm [9].

Prophet only takes data as a data frame with a "ds" column (date stamp) and a "y" column (the value to forecast). Therefore, the data was converted to the appropriate format by adding the dates and values to the new attributes "ds" and "y". The ds (date stamp) column should be in a format expected by pandas, such as YYYY-MM-DD HH:MM:SS
for a timestamp and YYYY-MM-DD for a date. The y column must be numeric, and it should represent the measurement or attribute to be forecast.

The data frame is then created using the data.frame() function, and the Prophet class, prophet(), is fitted as a new instance named "m". Prophet follows the sklearn model API: an instance of the Prophet class is created, and then its fit and predict methods are called [11]. The functions in the list below, which are part of the Prophet library, were used in the model:
● cross_validation () to apply a cross-validation test for testing the accuracy of the Prophet model before use.
● performance_metrics () to compute the MAPE performance metric on the output of the cross-validation.
● prophet () to apply the Prophet forecast.
● prophet_plot_components () to plot the components of a Prophet forecast, which shows the trend and the weekly and yearly components.

Using cross_validation () requires the period, which is the spacing between cut-off dates; the horizon, which is the number of days to forecast; and the initial, which is the first training period. The result is a data frame with the forecast "yhat," the actual value "y," and the cut-off date.

Using performance_metrics () gives a table with various prediction performance metrics, as shown in Figure 5.

Fig.5 prophet Accuracy Results.

Use plot_cross_validation_metric () to plot the RMSE, as shown in Figure 6.

Fig. 6. RMSE plot

The forecast is applied to the dataset using make_future_dataframe (). The predict () function was called to make the prediction and store it in the forecast data frame. The forecast data frame is then inspected to see the predictions and print the predicted values. The Prophet forecast predicted that the stock would go up, as shown in Fig. 7.

Fig.7 forecasting prophet

The plot_components () function was called to inspect the forecast components, as shown in Figure 8.

Fig. 8. Forecast components.

The value considered here is the RMSE. The model's accuracy and result are computed using the root mean square error function.
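
The roles of the initial, period, and horizon settings in cross-validation can be illustrated with a short sketch. The code below is a simplified, standard-library illustration of rolling-origin evaluation and is not Prophet's own implementation. The function name `rolling_cutoffs` and the day counts in the usage example are assumptions chosen for illustration.

```python
from datetime import date, timedelta


def rolling_cutoffs(first_day, last_day, initial_days, period_days, horizon_days):
    """Cut-off dates for rolling-origin evaluation.

    Each cut-off keeps at least `initial_days` of history for training and
    leaves `horizon_days` of later data for testing; consecutive cut-offs
    are `period_days` apart.
    """
    if min(initial_days, period_days, horizon_days) <= 0:
        raise ValueError("initial, period, and horizon must be positive")
    earliest = first_day + timedelta(days=initial_days)
    cutoff = last_day - timedelta(days=horizon_days)
    cutoffs = []
    # Walk backwards from the last usable cut-off in steps of `period_days`.
    while cutoff >= earliest:
        cutoffs.append(cutoff)
        cutoff -= timedelta(days=period_days)
    return sorted(cutoffs)


# Date range from the data set description; the day counts are hypothetical.
cutoffs = rolling_cutoffs(
    first_day=date(2015, 10, 7),
    last_day=date(2020, 10, 7),
    initial_days=730,
    period_days=180,
    horizon_days=365,
)
for c in cutoffs:
    print("train up to", c, "then forecast through", c + timedelta(days=365))
```

For each cut-off, a model would be fitted on the data up to that date and scored on the following horizon, so every forecast is compared only with values the model has not seen.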

import numpy as np
# RMSE between the held-out actual values and the Prophet forecast
forecast_valid = forecast['yhat'][987:]
rms = np.sqrt(np.mean(np.power(np.array(valid['y']) - np.array(forecast_valid), 2)))

The Root Mean Square Error (RMSE) is a measure frequently used for assessing the accuracy of the predictions obtained by a model [10]. The accuracy result obtained by calculating the RMSE is shown in Fig. 9.

Fig. 9. Prophet Model RMSE result.

Prophet (like most time series forecasting techniques) tries to capture the trend and seasonality from past data. This model usually performs well on time-series datasets but fails to live up to its reputation in this case (Fig. 10).

Fig.10 RMSE

Prophet offered promising results. There was a moment when the prediction (in yellow) intersected with the actual price (in orange).

4.2 LSTM model

A neural network is an architecture for distributed and parallel information processing. It consists of processing elements called neurons, which are interconnected by unidirectional signal channels called connections. Each processing element branches into as many output connections as desired and carries a signal known as the neuron output signal. The neuron output signal can be of any mathematical type desired [14].

Recurrent neural networks suffer from the vanishing gradient problem, which prevents them from learning from data far in the past. This problem has been addressed by long short-term memory networks, usually referred to as LSTMs [1].

Long short-term memory was first introduced by Hochreiter and Schmidhuber in 1997 to address these problems. Long short-term memory tackles the task of learning to remember information over a time interval by introducing memory cells and gate units into the neural network architecture. A typical formulation involves the use of memory cells, each of which has a cell state that stores previously encountered information. Every time an input is passed into the memory cell, the output is determined by a combination of the input and the cell state (representing the previous information), and the cell state is updated. When another input is passed into the memory cell, the updated cell state and the new input are used to compute the new output [7].

The algorithm is implemented with the Keras library along with Theano, which were installed on a cluster of a high-performance computing centre [10]. The algorithm defines a function called "fit" to build the LSTM model. It takes the training dataset, the number of epochs (the number of times the dataset is fit), and the number of neurons (the number of memory units or blocks). Once the network is built, it must be compiled to comply with the mathematical notation used in Theano. When compiling a model, the loss function must be defined together with the optimization algorithm.

The "mean squared error" and "ADAM" are used as the loss function and the optimization algorithm, respectively.

After compilation, the model is fitted to the training dataset. Since the network model is stateful, the resetting of the network's state must be managed and controlled, especially when there is more than one epoch.

Furthermore, since the objective is to train an optimized model using earlier stages, it is necessary to set the shuffling parameter to false to improve the learning mechanism.

A small function is created to call the LSTM model and predict the next step in the dataset. The active part of the algorithm starts where an LSTM model is built with a given training dataset, number of epochs, and number of neurons. The forecast is first made for the training data. The built LSTM model is then used to forecast the test dataset, and the obtained RMSE values are reported (Fig. 11).
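
The memory-cell update described in this section can be sketched for a single scalar cell. This is an illustration of the commonly used LSTM formulation with forget, input, and output gates, written in plain Python. It is not the Keras model used in this paper, and the weight values and the names `lstm_step` and `params` are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a scalar LSTM cell.

    `params` maps each gate name to (input weight, recurrent weight, bias).
    Returns the new output h and the updated cell state c.
    """
    def pre_activation(name):
        w_x, w_h, b = params[name]
        return w_x * x + w_h * h_prev + b

    forget_gate = sigmoid(pre_activation("forget"))      # share of old state kept
    input_gate = sigmoid(pre_activation("input"))        # share of new information stored
    candidate = math.tanh(pre_activation("candidate"))   # proposed new information
    output_gate = sigmoid(pre_activation("output"))      # share of state exposed as output

    c_new = forget_gate * c_prev + input_gate * candidate
    h_new = output_gate * math.tanh(c_new)
    return h_new, c_new


# Hypothetical weights, chosen only to make the example concrete.
params = {
    "forget": (0.5, 0.1, 0.0),
    "input": (0.6, 0.2, 0.0),
    "candidate": (0.9, 0.3, 0.0),
    "output": (0.4, 0.1, 0.0),
}

h, c = 0.0, 0.0
for x in [0.1, 0.2, 0.3]:  # a short, made-up input sequence
    h, c = lstm_step(x, h, c, params)
    print(round(h, 4), round(c, 4))
```

The cell state `c` is carried from one step to the next, which is how information from earlier inputs can influence later outputs. In a trained network the weights are learned, and each layer holds many such cells.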

Fig. 11 forecast LSTM

The data for the Google stock show that the average Root Mean Squared Error (RMSE) using the LSTM model is 78.831 (Fig. 12).

Fig.12. LSTM Accuracy Result.

5. Results and Discussion

Recalling the ideas of technical analysis of stock prices for pattern prediction [15], the results show that an LSTM can predict a future stock price nearly correctly. These results are considered very favourable and can serve as a baseline for future work. The RMSE calculation was used to compare the forecasting accuracy of the two models. The LSTM model showed better accuracy than Prophet. The LSTM prediction for Google stock showed continuity in value and indicates a significant increase in the value of the stock over the following year, 2021/22.

The Prophet algorithm was not as robust as the LSTM implementation. Considering that the only input was previous stock prices used as training data to predict the next year of stock price movement, the accuracy achieved shows the prowess of LSTMs and recurrent neural networks.

6. Conclusion

This research used Google stock historical data for the past five years, from October 7, 2015, to October 7, 2020, to compare the results of the Prophet and LSTM models. After several tests, the LSTM showed accurate results in its calculated values. This shows the potential of using the LSTM model on time series data to predict stock data accurately, which will help stock investors in their investment decisions. The LSTM forecast for Google stock showed continuity in value. This research compared the results and calculated the accuracy based on two models. Future work will compare more than two models and calculate their accuracy to find the most accurate one.

References
[1]. Ashutosh Sharma, Sanket Modak, Eashwaran Sridhar. Data Visualization and Stock Market and Prediction. International Research Journal of Engineering and Technology (IRJET). Volume: 06 Issue: 09, 2019
[2]. Frank Saldivar, Mauricio Ortiz. Stock Market Price Prediction Using Various Machine Learning Approaches. 2019
[3]. Stock market prediction. https://en.wikipedia.org/wiki/Stock_market_prediction.
[4]. Xin-Yao Qian. Financial Series Prediction: Comparison Between Precision of Time Series Models and Machine Learning Methods. 2017
[5]. Sima Siami Namin, Akbar Siami Namin. Forecasting Economic and Financial Time Series: ARIMA vs. LSTM. 2018
[6]. Peter Foy. Machine Learning for Finance: Price Prediction with Linear Regression. 2019. https://www.mlq.ai/price-prediction-with-linear-regression/
[7]. Chau Tsun Man, Suen Heung Ping, To Cheuk Lam, Wong Cheuk Kin. Stock Price Prediction App using Machine Learning Models Optimized by Evolution. 2019.
[8]. Jason Brownlee. Time Series Forecasting with Prophet in Python. 2020. https://machinelearningmastery.com/time-series-forecasting-with-prophet-in-python/
[9]. Kan Nishida. An Introduction to Time Series Forecasting with Prophet in Exploratory. 2017. https://blog.exploratory.io/an-introduction-to-time-series-forecasting-with-prophet-package-in-exploratory-129ed0c12112
[10]. Sima Siami Namin, Akbar Siami Namin. Forecasting Economic and Financial Time Series: ARIMA vs. LSTM. 2018
[11]. Ashish Vishwakarma, Alok Singh, Avantika Mahadik, and Rashmita Pradhan. Stock Price Prediction Using Sarima and Prophet Machine Learning Model. International Journal of Advanced Research in Science, Communication, and Technology (IJARSCT). Volume 9, Issue 1, September 2020
[12]. Shakir Khan and Hela Alghulaiakh, "ARIMA Model for Accurate Time Series Stocks Forecasting." International Journal of Advanced Computer Science and Applications (IJACSA), 11(7), 2020. http://dx.doi.org/10.14569/IJACSA.2020.0110765
[13]. Shakir Khan and Amani Alfaifi, "Modelling of Coronavirus Behaviour to Predict it is Spread" International Journal of Advanced Computer Science and Applications (IJACSA), 11(5), 2020. http://dx.doi.org/10.14569/IJACSA.2020.0110552
[14]. Abu Sarwar Zamani, Nasser Saad Al-Arifi and Shakir Khan. Response Prediction of Earthquake motion using Artificial Neural Networks. International Journal of Applied Research in Computer Science and Information Technology. 2012. Vol. 1, No. 2, pp. 50-57.

[15]. S. Khan, "Data Visualization to Explore the Countries Dataset for Pattern Creation", Int. J. Onl. Eng., vol. 17, no. 13, pp. 4–19, Dec. 2021.
[16]. Shakir Khan (2021). "Study Factors for Student Performance Applying Data Mining Regression Model Approach", IJCSNS International Journal of Computer Science and Network Security, Vol. 21 No. 2, pp. 188-192. https://doi.org/10.22937/IJCSNS.2021.21.2.21
[17]. S. Khan, "Visual Data Analysis and Simulation Prediction for COVID-19 in Saudi Arabia Using SEIR Prediction Model", Int. J. Onl. Eng., vol. 17, no. 08, pp. 154–167, Aug. 2021.

Mohammed Ali Alshara received his BSc from Imam Mohammad Ibn Saud Islamic University in 2002, and his MSc in 2008 and PhD in 2016 from the University of North Texas. He is currently working as Assistant Professor and Vice Dean for Quality and Development in the College of Computer and Information Sciences (CCIS), Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh, Saudi Arabia. His research interests include Data Analytics, Knowledge Management Systems, and Information Retrieval.
