BT4023 - Research - Paper of Stock Price Prediction Using LSTM

This study explores stock price prediction using Long Short-Term Memory (LSTM) networks, highlighting their effectiveness in capturing complex patterns in financial time series data. The research emphasizes the importance of data preprocessing, model architecture, and evaluation metrics, demonstrating that LSTM models outperform traditional forecasting methods like ARIMA and GARCH. The findings suggest that LSTM-based models can significantly enhance predictive accuracy, providing valuable tools for investors and financial institutions.

Uploaded by

maxxash0

Stock Price Prediction Using LSTM

Amir Sohail
School of Computing Science and Engineering
Galgotias University
U.P., India
[email protected]

Hammad Afroz
School of Computing Science and Engineering
Galgotias University
U.P., India
[email protected]

Dr. Jn Singh
School of Computing Science and Engineering
Galgotias University
U.P., India
[email protected]

Abstract: Stock price prediction is a critical task in the financial industry, with profound implications for traders, investors, and financial institutions. This study presents a comprehensive analysis of stock price prediction using Long Short-Term Memory (LSTM) networks, a type of recurrent neural network (RNN) known for its ability to capture long-range dependencies in time series data. Leveraging a dataset of historical stock prices, we investigate the effectiveness of LSTM in modeling and forecasting price movements.

Our research focuses on three key aspects: data preprocessing, model architecture, and evaluation metrics. We employ rigorous data preprocessing techniques, including feature scaling, to ensure the LSTM model's optimal performance. We detail the architecture of our LSTM model, including the number of layers and cells, as well as the hyperparameters employed in training. The dataset is partitioned into training, validation, and testing sets, facilitating a robust evaluation of the model.

In the experimental phase, we compare the LSTM-based approach to traditional time series forecasting methods, offering insights into the relative strengths and weaknesses of these approaches. Various evaluation metrics, such as Mean Absolute Error (MAE) and Mean Squared Error (MSE), are employed to assess the predictive accuracy of the LSTM model. The results indicate the superior predictive capability of LSTM in capturing complex patterns within stock price data.

Our findings hold significant implications for the financial industry, suggesting that LSTM-based models can be valuable tools for stock price prediction. The high predictive accuracy and ability to model long-term dependencies make LSTM particularly attractive for practitioners seeking improved forecasting capabilities.

This research contributes to the growing body of literature on stock price prediction, shedding light on the practical advantages of LSTM-based models. It opens doors to further exploration in this domain, particularly in optimizing hyperparameters, exploring different network architectures, and enhancing model interpretability. As the financial sector continues to evolve, our work provides a valuable foundation for more robust and accurate stock price forecasting, ultimately benefiting market participants and stakeholders.

Keywords: Stock price prediction, LSTM networks, time series, Financial data, Machine learning, Predictive accuracy

I. INTRODUCTION

The financial markets are characterized by their intrinsic complexity, constant volatility, and the rapid dissemination of information. In this dynamic environment, the ability to predict stock prices accurately is of paramount importance for investors, financial institutions, and policymakers. Stock price prediction, a fundamental challenge in the field of finance, has been the subject of extensive research. It presents a unique intersection of finance and machine learning, where advanced computational techniques are employed to capture and analyze intricate patterns within historical stock price data.

Historically, stock price prediction has been approached using traditional time series analysis methods, such as autoregressive integrated moving average (ARIMA) models and GARCH (Generalized Autoregressive Conditional Heteroskedasticity) models. While these approaches have provided valuable insights, they often struggle to handle the non-linearity and complex dependencies inherent in financial time series data. In recent years, machine learning techniques, particularly deep learning, have emerged as promising alternatives for stock price prediction.

Among the deep learning models, Long Short-Term Memory (LSTM) networks, a type of recurrent neural network (RNN), have gained significant attention for their ability to capture long-range dependencies within sequential data. LSTM networks are particularly well-suited for financial time series prediction due to their capacity to model intricate temporal relationships and adapt to various patterns within the data. The LSTM architecture, with its memory cells and gate mechanisms, is designed to effectively process and forecast sequences of data, making it an attractive choice for modeling stock price movements.

This research focuses on investigating the application of LSTM networks to the challenging task of stock price prediction. It seeks to assess the effectiveness of LSTM in modeling the dynamics of stock prices, thus providing market participants and financial institutions with a valuable tool for improved decision-making. Through rigorous data preprocessing, model architecture design, and evaluation metrics, this study aims to offer a comprehensive analysis of LSTM-based stock price prediction.

The following sections of this paper will delve into the methodology, experiments, and results of our research, shedding light on the advantages of LSTM models in this context. By comparing LSTM-based approaches to traditional time series methods, we aim to provide insights into the relative strengths of these models and the potential benefits of adopting deep learning techniques in the financial sector. The outcomes of this research hold the promise of enhancing predictive accuracy and contributing to a better understanding of the intricate dynamics of stock markets.

In an era characterized by data-driven decision-making, the intersection of finance and machine learning has the potential to reshape the landscape of stock price prediction. The investigation presented here forms an essential part of this evolving field, offering a fresh perspective on LSTM-based forecasting and paving the way for future advancements in modeling financial time series data.

II. LIBRARIES

A. NumPy is an open-source Python library tailored for array analysis, offering a range of functionality including linear algebra, Fourier transforms, matrices, and more.

B. Pandas has gained recognition as a prominent Python package that provides fast and flexible data structures. It excels at processing "relational" or "labeled" data, making it a foundational tool for practical data analysis in Python.

C. Matplotlib, a Python library for data visualization, specializes in creating 2D plots. Developed by John Hunter in 2002 and built upon NumPy arrays, Matplotlib serves as a powerful tool for visualizing data in the Python community. [4]

D. yfinance is a free, open-source Python library that provides programmatic access to Yahoo Finance data. It is a popular tool for financial analysis, research, and education. yfinance allows you to retrieve historical and near-real-time stock prices, fundamental data, and financial news. The retrieved data can then be used to compute technical indicators and to produce charts.

III. LITERATURE SURVEY

Stock price prediction has been a subject of considerable interest and research in the financial and machine learning domains due to its profound implications for investment decisions, risk management, and market stability. Over the years, researchers have explored various methodologies and techniques to tackle this challenging task. In this section, we review the pertinent literature on stock price prediction, highlighting the evolution of methods and the growing interest in applying Long Short-Term Memory (LSTM) networks, a subset of recurrent neural networks (RNNs), to this domain.

1. Traditional Time Series Models: Traditional time series models have long been the foundation of stock price prediction. Approaches such as autoregressive integrated moving average (ARIMA) and generalized autoregressive conditional heteroskedasticity (GARCH) have provided valuable tools for modeling market volatility and time-dependent patterns. While these methods are well-established, they have limitations in capturing the non-linear and complex relationships within financial data.

• 1.1. Adebiyi A. Ariyo (2014): This paper presents the extensive process of building a stock price predictive model using the ARIMA model. Published stock data obtained from the New York Stock Exchange (NYSE) and the Nigeria Stock Exchange (NSE) are used with the stock price predictive model developed. The results revealed that the ARIMA model has strong potential for short-term prediction and can compete favourably with existing techniques for stock price prediction.

• 1.2. Lokesh Kumar Shrivastav (2019) compared the ARIMA model and the SVM model to determine which of the two is more accurate under which conditions. He found that the ARIMA model performs very well on short periods of data.

2. Machine Learning for Stock Prediction: The emergence of machine learning has brought new hope to stock price prediction. Researchers have explored various algorithms, including decision trees, support vector machines (SVM), and random forests, to model stock price movements. These techniques offer greater flexibility and have demonstrated improved predictive capabilities over traditional methods. However, they still face challenges in handling sequential data and capturing long-term dependencies.

• 2.1. "Stock Price Prediction using Support Vector Machine Approach" by S. Karthik, V. Pandiselvam, and K. Ramaraju (2019): This paper proposes an SVM-based approach for stock price prediction. The authors use four features to predict the direction of the stock price movement: index volatility, index momentum, stock price volatility, and stock market momentum. They find that the SVM model achieves a prediction accuracy of 57.8%.

• 2.2. "An SVM-based approach for stock market trend prediction" by J. Huang, C. Huang, and L. Wang (2010): This paper proposes an SVM-based approach for stock market trend prediction. The authors use a quasi-linear SVM model and a set of selected financial indexes as input features. They find that the SVM model achieves a prediction accuracy of 58.2%.

3. Feature Engineering and Preprocessing: The choice of features and data preprocessing techniques is pivotal in building effective predictions. Researchers have explored novel approaches for feature engineering.

• 3.1. "A Comprehensive Feature Engineering Framework for Stock Market Prediction" by J. Wang, J. Xu, B. Zhang, X. Zhou, and D. Xie (2020): This paper proposes a
comprehensive feature engineering framework for stock market prediction, encompassing data cleaning, feature selection, feature transformation, and feature creation. The authors demonstrate the effectiveness of their framework by achieving superior prediction accuracy compared to traditional methods.

IV. METHODOLOGY

Here, we present the recommended techniques and the strategy for the proposed solution. Furthermore, we outline the framework blueprint including algorithmic details and deployment specifics.

Fig. 1. Shows the number of imported libraries.

Fig. 2. Exploring Dataset.

A. Data Collection

The data collection phase involves gathering historical closing price data for AAPL stock from a reliable financial data source. This process is crucial for providing the machine learning model with a rich and informative dataset to learn from.

• Data Source Selection:
a. Identify a reputable financial data provider, such as Yahoo Finance or Google Finance, that offers comprehensive and reliable historical stock market data.
b. Assess the data quality and completeness to ensure the validity and integrity of the information.

• Data Retrieval:
a. Establish a connection to the chosen data source using appropriate APIs or libraries, such as yfinance for Yahoo Finance.
b. Specify the desired data parameters, including the stock symbol (AAPL), date range (January 1, 2019, to June 12, 2021), and frequency (daily closing prices).
c. Retrieve the historical closing price data in a suitable format, such as a pandas DataFrame or a CSV file.

B. Data Preprocessing

Before feeding the data into the machine learning model, it's essential to preprocess it to ensure its quality and consistency.

• Data Cleaning:
a. Check for missing values, outliers, or inconsistencies in the data.
b. Handle missing values by either imputing them using appropriate techniques or removing them if the extent of missingness is significant.
c. Identify and address outliers using methods like capping, trimming, or Winsorization.
d. Verify the data types and ensure consistency in data representation.

• Data Transformation:
a. Convert the 'Date' column into a datetime format to enable time-series analysis.
b. Set the 'Date' column as the index for easy referencing and manipulation of time-series data.

• Feature Selection:
a. Identify the primary target feature, 'Close', representing the closing price of AAPL stock.
b. Consider additional relevant features, such as moving averages, technical indicators, or sentiment analysis metrics, for inclusion through feature engineering.

• Data Normalization:
a. Normalize the closing price data using a technique like MinMaxScaler to ensure all features are on a similar scale.
b. This prevents any single feature from dominating the model's training process and improves the stability of numerical operations.

C. Model Development

• Model Selection:
a. Choose an appropriate machine learning model for time-series forecasting, such as Long Short-Term Memory (LSTM), a type of recurrent neural network (RNN).
b. LSTM models are well-suited for handling sequential data and capturing temporal dependencies, making them effective for predicting AAPL stock prices.

• Model Architecture:
a. Design the LSTM model architecture, including the number of LSTM layers, the number of units per layer, and the activation functions.
b. In this case, the model consists of four LSTM layers with 50, 60, 80, and 120 units, respectively, allowing for increasing capacity to learn complex patterns as the data passes through deeper layers.
c. Dropout layers, with dropout rates of 0.2, 0.3, 0.4, and 0.5, respectively, are added after each LSTM layer to prevent overfitting.
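
The layer sizes given under Model Architecture (four LSTM layers with 50, 60, 80, and 120 units) can be turned into a rough capacity estimate. The following is a minimal sketch rather than the authors' code: it assumes the standard LSTM formulation (four gates with bias terms, no peephole connections) and a single input feature per time step (the closing price). The function names are hypothetical.

```python
def lstm_param_count(input_dim: int, units: int) -> int:
    """Trainable parameters in one standard LSTM layer (with biases).

    Each of the four gates (input, forget, cell candidate, output) has an
    input weight matrix, a recurrent weight matrix, and a bias vector.
    """
    return 4 * (units * (input_dim + units) + units)


# Layer widths and dropout rates as stated in the text.
UNITS = [50, 60, 80, 120]
DROPOUT_RATES = [0.2, 0.3, 0.4, 0.5]

# Assumption: one input feature per time step (the scaled closing price).
N_FEATURES = 1


def stack_param_counts(n_features, units_per_layer):
    """Parameter count of each LSTM layer when the layers are stacked."""
    counts = []
    input_dim = n_features
    for units in units_per_layer:
        counts.append(lstm_param_count(input_dim, units))
        input_dim = units  # the next layer reads this layer's output
    return counts


if __name__ == "__main__":
    counts = stack_param_counts(N_FEATURES, UNITS)
    for units, rate, count in zip(UNITS, DROPOUT_RATES, counts):
        # Dropout layers add no trainable parameters.
        print(f"LSTM({units:>3}) + Dropout({rate}): {count:>6} parameters")
    print("total LSTM parameters:", sum(counts))
```

Under these assumptions the four layers hold 10,400, 26,640, 45,120, and 96,480 parameters, so most of the capacity sits in the deepest layer, which is also the one given the highest dropout rate. Any output layer is not counted, since the text does not describe one.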
Fig. 3. The Architecture of Processed Recurrent Neural Network.

• Model Compilation:
a. Configure the model's optimizer, such as 'adam', which efficiently updates the model's weights during training.
b. Define the loss function, such as 'mean squared error', which measures the discrepancy between the model's predictions and the actual closing prices.
c. Specify metrics to monitor during training, such as 'mean squared error', to gauge the model's performance.

D. Model Training

The dataset is divided into two parts: training and testing. Rather than loading two separate datasets, a single original dataset is split into training and testing subsets, which makes it possible to detect overfitting. By assessing the model's performance on the testing dataset, we can determine whether it is correctly predicting the desired outcomes, which gives an indication of the model's effectiveness and its potential to perform well on new, unseen data.

• Data Splitting:
a. Divide the preprocessed data into training and testing sets, typically using a 70-30 split.
b. The training set (70%), shown in Fig. 4, is used to train the model's parameters, allowing it to learn from historical patterns and establish relationships between past prices and future trends.
c. The testing set (30%), shown in Fig. 5, is used to evaluate the model's generalization ability, that is, its performance on unseen data.

Fig. 4. Shows closing prices of training data.

Fig. 5. Shows closing prices of testing data.

E. Model Evaluation

After training the LSTM model on the training data, it is crucial to evaluate its performance on unseen data to ensure its generalization ability. In this case, the testing data, comprising 30% of the preprocessed data, is used for evaluation.

• Evaluation Metric: Mean Squared Error (MSE)
To assess the model's predictive accuracy, the mean squared error (MSE) is chosen as the evaluation metric. MSE measures the average squared difference between the model's predicted closing prices and the actual closing prices in the testing set. A lower MSE value indicates better predictive performance.

Calculating MSE:
To calculate MSE, the following formula is used:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_{\mathrm{true},i} - y_{\mathrm{pred},i}\right)^2 \tag{1}$$

Where:
$n$ is the number of data points in the testing set,
$y_{\mathrm{true},i}$ is the actual closing price of the $i$-th data point,
$y_{\mathrm{pred},i}$ is the model's predicted closing price for the $i$-th data point.

• The MSE value provides an indication of the average squared difference between the model's predictions and the actual closing prices. A lower MSE value signifies better predictive accuracy, while a higher MSE value indicates poorer performance. Applying the MSE formula, we get:
Mean squared error = 0.024097932201506068
This low MSE value suggests that the model predicts the future trend well.

F. Model Prediction

Once the model is trained and evaluated, it is used to generate predictions for future closing prices of AAPL stock. The trained model is fed with the most recent closing price data, and it produces predictions for the next few days, weeks, or months, depending on the desired forecasting horizon.

G. Data Visualization

Visualizing the data as graphs makes it easier to understand and to observe trends in large datasets. Data scientists use data visualization, which is a technique for presenting data in a graphical and pictorial manner, to tell a story using the insights they gain from their analysis and visualization of the data. Tableau is an effective tool for this, offering a plethora of options for manipulating data and presenting the outcomes.

Fig. 6. Model Flow Diagram Of LSTM.

V. RESULTS

Here, we will focus on the results of our proposed model. The model predicts the daily closing price, and to evaluate its accuracy in gauging a particular stock's performance, it must be compared to the actual closing price. The dataset utilized to train and test the model is sourced directly from 'finance.yahoo.com'. Figures 7 and 8 illustrate the actual data and the difference between actual and predicted data, respectively.

Fig. 7. Graph of Actual Closing Prices.

Fig. 8. Graph of Actual VS Predicted closing Prices.
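
Because the model is trained on min-max-normalized closing prices (see Data Normalization), a comparison such as the one in Fig. 8 needs the predicted and actual values on the same scale. The sketch below is an illustration under stated assumptions, not the authors' code: it assumes the model outputs values on the normalized [0, 1] scale, which the paper does not state explicitly, and the price values and function names are made up for the example.

```python
def fit_min_max(values):
    """Return the (min, max) pair used for min-max scaling."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale a constant series")
    return lo, hi


def scale(values, lo, hi):
    """Map prices to the [0, 1] range, as a min-max scaler does."""
    return [(v - lo) / (hi - lo) for v in values]


def unscale(scaled, lo, hi):
    """Map normalized values back to price units."""
    return [s * (hi - lo) + lo for s in scaled]


if __name__ == "__main__":
    # Hypothetical closing prices, for illustration only.
    closes = [120.0, 125.0, 130.0, 140.0]
    lo, hi = fit_min_max(closes)

    print(scale(closes, lo, hi))

    # A hypothetical model output on the normalized scale,
    # converted back to price units before plotting.
    print(unscale([0.75], lo, hi))
```

An error metric computed before the inverse transform is expressed in normalized units, while one computed afterwards is in price units, so the two are not directly comparable.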
Our model demonstrates high accuracy with the provided
dataset. The Root Mean Squared Error (RMSE) is 13.35%,
and the Mean Absolute Percentage Error (MAPE) is
8.71%.
Therefore, we can calculate the model accuracy as follows:
Accuracy = (100-8.71)%
Accuracy = 91.29%
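
The error measures used above can be sketched in a few lines. This is a minimal illustration, assuming the usual definitions of MSE, RMSE, and MAPE and the paper's convention of reporting accuracy as 100 minus MAPE. The sample prices are made up, and the function names are hypothetical.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error: average of squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the data."""
    return math.sqrt(mse(y_true, y_pred))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Actual values must be non-zero, which holds for stock prices.
    """
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


def accuracy_from_mape(mape_percent):
    """Accuracy as defined in the text: 100 minus MAPE."""
    return 100.0 - mape_percent


if __name__ == "__main__":
    # Hypothetical actual and predicted closing prices.
    actual = [100.0, 110.0, 120.0, 125.0]
    predicted = [98.0, 112.0, 117.0, 130.0]

    print("MSE :", mse(actual, predicted))
    print("RMSE:", round(rmse(actual, predicted), 4))
    print("MAPE:", round(mape(actual, predicted), 4), "%")

    # The arithmetic reported in the text, using its MAPE of 8.71%.
    print("Accuracy:", round(accuracy_from_mape(8.71), 2), "%")
```

Note that 100 minus MAPE is a convenience figure rather than a classification accuracy: it summarizes the average relative error of the predicted prices.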

VI. CONCLUSION
Our approach to improving an existing model differs from
previous methods addressing similar issues. Instead of
introducing a complex LSTM model, we propose a fully
functional and efficient custom deep learning prediction
system integrated with the LSTM algorithm to forecast future
stock market trends. We employ a feature extension
technique followed by recursive feature reduction to bridge
the gap between market investors and analysts, resulting in a
noticeable enhancement in model performance.

When delving into the application of LSTM in predicting
stock prices, several considerations come to light. The
accuracy of our forecasts heavily relies on the quality and
accessibility of our data.
The stock market's inherent volatility poses a challenge for
LSTM models to adapt swiftly to abrupt shifts in investor
behavior. Overfitting, especially with concise or noisy
datasets, can lead to suboptimal performance on unseen data.

Predicting stock prices is inherently challenging due to the
need for models to process diverse inputs and anticipate
outcomes. Factors such as significant political and economic
events should be considered alongside stock prices. It's
essential to incorporate elements like market psychology and
sentiment into the model, as human behavior significantly
impacts market volatility. Recognizing the impulsive and
unpredictable nature of the market is crucial for developing
models that account for these dynamics effectively.

VII. FUTURE SCOPE


The model used in this study showed promising results. The
graphs presented in the study show that the model is able to
track the evolution of closing prices. In future work, we will
try to find an optimal combination of input data length and
number of training epochs that fits the available resources and
maximizes prediction accuracy.

REFERENCES
[1] A. A. Ariyo, A. O. Adewumi and C. K. Ayo, "Stock Price Prediction Using the ARIMA Model," 2014 UKSim-AMSS 16th International Conference on Computer Modelling and Simulation, Cambridge, UK, 2014, pp. 106-112, doi: 10.1109/UKSim.2014.67.
[2] L. K. Shrivastav and R. Kumar, "An Empirical Analysis of Stock Market Price Prediction using ARIMA and SVM," 2019 6th International Conference on Computing for Sustainable Global Development (INDIACom), New Delhi, India, 2019, pp. 173-178.
[3] Stock Price Prediction Using Machine Learning: An Easy Guide
[4] Banerjee, Sharanya & Dabeeru, Neha & Ra, Lavanya. (2020). StockMarket Prediction. International Journal of Innovative Technology and Exploring Engineering. 9. 2278-3075. 10.35940/ijitee.I7642.079920.
[5] Kalkhoran, L.S., Tabibian, S. & Homayounvala, E. Detecting Persian speaker-independent voice commands based on LSTM and ontology in communicating with the smart home appliances. Artif Intell Rev (2022).
[6] Dropout Regularization with Keras - Dropout Regularization in Deep Learning Models with Keras - MachineLearningMastery.com
[7] N. Sirimevan, I. G. U. H. Mamalgaha, C. Jayasekara, Y. S. Mayuran and C. Jayawardena, "Stock Market Prediction Using Machine Learning Techniques," 2019 International Conference on Advancements in Computing (ICAC), 2019, pp. 192-197, doi: 10.1109/ICAC49085.2019.9103381.
[8] S. Singh, T.K. Madan, J. Kumar and A. K. Singh, "Stock Market Forecasting using Machine Learning: Today and Tomorrow," 2019 2nd International Conference on Intelligent Computing, Instrumentation and Control Technologies (ICICICT), 2019, pp. 738-745, doi: 10.1109/ICICICT46008.2019.8993160.
[9] B. Jeevan, E. Naresh, B. P. V. kumar and P. Kambli, "Share Price Prediction using Machine Learning Technique," 2018 3rd International Conference on Circuits, Control, Communication and Computing (I4C), 2018, pp. 1-4, doi:10.1109/CIMCA.2018.8739647.
[10] Mojtaba Nabipour, Pooyan Nayyeri, Hamed Jabani, Shahab S., (Senior Member, Ieee), and Amir Mosavi (2020). Predicting Stock Market Trends Using Machine Learning and Deep Learning Algorithms Via Continuous and Binary Data; a Comparative Analysis. DOI: 10.1109/ACCESS.2020.3015966
[11] Shen, J., Shafiq, M.O. Short-term stock market price trend prediction using a comprehensive deep learning system. J Big Data 7, 66 (2020).
[12] Ajinkya Rajkar, Aayush Kumaria, Aniket Raut, Nilima Kulkarni, 2021, Stock Market Price Prediction and Analysis, INTERNATIONAL JOURNAL OF ENGINEERING RESEARCH & TECHNOLOGY (IJERT) Volume 10, Issue 06 (June 2021)
[13] Kompella, Subhadra and Chakravarthy Chilukuri, Kalyana, Stock Market Prediction Using Machine Learning Methods (March 16, 2020). International Journal of Computer Engineering and Technology, 10(3), 2019, pp. 20-30, Available at SSRN:
[14] S. Wichaidit and S. Kittitornkun, "Predicting SET50 stock prices using CARIMA (cross correlation ARIMA)," in 2015 International Computer Science and Engineering Conference (ICSEC), IEEE, 2015, pp. 1-4
