Stock Price Prediction Based on CNN-LSTM Model in the PyTorch Environment


Weidong Xu¹,*
¹ Shanghai University of Electric Power
* Corresponding author. Email: [email protected]

ABSTRACT
The stock market, as the main financing channel for listed companies and the most accessible wealth-creation opportunity for investors, has always attracted attention from all walks of life. As technology has evolved, deep learning has come to play an important role in forecasting stock prices. Building on in-depth research on CNN and LSTM, this paper constructs a CNN-LSTM stock price prediction model in the PyTorch environment. Using ten years of Shanghai Composite Index data from the A-share market, from January 2012 to December 2021, it verifies the feasibility of this joint model for stock price forecasting and compares its predictions with those obtained using CNN and LSTM alone. The results confirm that the CNN-LSTM joint model performs well.

Keywords: Stock price prediction, PyTorch, CNN, LSTM.

1. INTRODUCTION

As the most important financing channel for listed companies and the most accessible opportunity for investors to create wealth, the stock market has always attracted attention from all walks of life. Nowadays, the majority of investors in the stock market are "retail investors". In the investment process, the biggest problems that retail investors face are their lack of relevant investment knowledge and the existence of "information asymmetry". To obtain valuable information, retail investors often have to pay a high price, so they tend to rely on cost-free "grass news" rather than on scientific technical analysis, which results in "blind investment" and "following suit". Stock price prediction has therefore become particularly important. It can help retail investors obtain effective data at low cost in a market that lacks transparency, discover stocks and industries that may rise, and to some extent even reduce noise in the stock market.

In recent years, as we have gradually entered the era of big data, neural network models have begun to be used in the field of stock price prediction. Because stock prices are time series data, time series models have become the mainstream tools for predicting them. As a special type of recurrent neural network model, LSTM can handle the vanishing gradient problem well because of its selectivity, its memory, and its ability to capture the internal dependencies of time series. Since the CNN model was proposed, it has mostly been used in image feature extraction and face recognition. However, the ability of CNN to extract abstract feature vectors that reflect the information in a given training set also serves it well in processing and learning from data in general, so CNN may also perform well in the field of stock prediction.

Building on an in-depth study of CNN and LSTM, and in order to further improve stock prediction accuracy, this paper builds a joint CNN-LSTM stock price prediction model in the PyTorch environment. It is hoped that the empirical study of this model in the field of stock price forecasting can broaden the research perspective and enrich the body of work on stock price forecasting based on neural network technology. At the same time, I also hope that it can provide a more accurate prediction reference for "retail investors" in real life.

2. LITERATURE REVIEW

Before neural network technology became mainstream, scholars all over the world regarded historical stock data as time series and were keen to build time series models that fit stock price changes in order to predict future share price trends, using linear regression methods such as ARIMA, ARCH and GARCH. Adebiyi et al. extracted the influencing factors of stock

© The Author(s) 2022


Y. Jiang et al. (Eds.): ICEDBC 2022, AEBMR 225, pp. 1272–1276, 2022.
https://doi.org/10.2991/978-94-6463-036-7_188

prices and used the ARIMA model to forecast future stock prices; the outcome confirmed that the ARIMA model performed well in the short term.[1] Xu Feng selected China Southern Airlines and China Eastern Airlines as research objects and built a complete prediction model for the two stocks through an autoregressive process. The study found that the GARCH model had short-term memory in predicting stock prices.[2] It was concluded that linear regression methods were suitable only for short-term stock price forecasting, while their long-term forecasting performance was poor.

With the advancement of science and technology, machine learning and deep learning have come to be favored by scholars. Compared with linear regression models, neural network models have strong potential to further improve the accuracy of stock price forecasting because they can extract nonlinear features. Most current research is devoted to exploring how to form hybrid models that exploit the different characteristics of each component model, looking for the combination in which the models best complement each other. Xu Yuemei et al. combined the CNN-BiLSTM model with financial news sentiment analysis to predict stock prices, confirming that it performs well in long-term forecasting.[3] Yu Z et al. first applied LLE to the data, then fed the processed data into a BP neural network, and found that the prediction accuracy of LLE-BP was better than that of single BP, PCA-BP and ARIMA.[4]

PyTorch, a concise, efficient and fast open-source machine learning library for Python, is seldom used in existing research on price forecasting. Therefore, this experiment attempts to use the CNN-LSTM model in the PyTorch environment to realize stock price prediction.

3. STOCK PRICE PREDICTION BASED ON CNN-LSTM

3.1. Technical Route And Research Method

This experiment used PyTorch as the neural network framework and Python as the implementation language. Data on the Shanghai Composite Index from January 2012 to December 2021 were obtained from "NetEase Finance", one of the most influential websites in China. A total of 2431 samples were chosen to build the dataset. The samples were fed into the CNN, and the CNN output was then passed to the LSTM model to obtain predicted values. This process constituted the empirical study in the field of stock price prediction.

In this experiment, the comparative analysis method was adopted: the trend plot of the CNN-LSTM forecasts was compared with the actual values and, at the same time, with the results predicted by the CNN and LSTM models alone, which demonstrated the superiority and efficiency of the hybrid model.

3.2. Stock Data Feature Extraction And Preprocessing

The purpose of this experiment was to predict the closing price of the Shanghai Composite Index, so it was particularly important to decide which characteristics of the stock should be selected as input factors. In this experiment, the basic trading indicators most closely related to stock price fluctuations were selected, namely opening price, closing price, high price, low price and volume.

After the dataset was obtained, the first 70% was assigned to the training set and the last 30% to the test set. Because the trading indicators of a stock have different dimensions and units, the raw values cannot be compared directly, which would ultimately affect the experimental results. It is therefore necessary to normalize the data to eliminate the dimensional influence among indicators. In this experiment, the Z-score standardization method was used to make each feature factor dimensionless, which helped accelerate training and increase the prediction accuracy of the model.

In the PyTorch environment, DataLoader was used in this experiment as the data loading tool. It converts the preprocessed dataset into an iterator that outputs a predefined batch_size number of samples per iteration, and its shuffle option randomly reorders the samples to help avoid overfitting.

3.3. Construction And Training Of The CNN

After preliminary tuning of the convolutional neural network in this experiment, and because the convolution kernel extracted features in two dimensions, the two-dimensional convolution function (Conv2d in PyTorch) was used to construct the convolution layer. The first layer of the CNN was the input layer, with a 7-day window and five input features (opening price, closing price, highest price, lowest price and trading volume); the number of channels was set to 1. The second layer was the convolution layer. Since the convolution kernel acted on the stock factor data, it was essentially performing factor synthesis, so only one convolution layer was used in this experiment. The number of convolution kernels was set to 64, and the kernel size was set to 3×3. At the same time, padding was applied to keep the data size unchanged. After the convolution layer, ReLU was adopted as the activation function. Because ReLU is piecewise linear and does not saturate for positive inputs, it typically converges faster than saturating activation functions. In addition, adding a BatchNormalization layer after the convolutional layer could alleviate the over-fitting phenomenon and speed up
the training and convergence of the neural network. The third layer was the pooling layer; the pooling area was 2×2 and the stride was 1. The function of pooling was to further downsample the convolved features. The fourth layer was the Dropout layer, with the dropout rate set to 0.3. The function of this layer was to alleviate overfitting. This completed the construction of the CNN part of the joint model.

3.4. Construction And Training Of The LSTM

The LSTM model is a special RNN model. When the effective features extracted by the CNN are fed into the LSTM, it can not only find the interdependence of the data in the time series, but also automatically detect the patterns best suited to the relevant data. This is mainly thanks to the three gates of the LSTM: the input gate, the forget gate and the output gate. In practical applications, the parameters of the LSTM need to be adjusted to the situation at hand to obtain the best results.

In this experiment, the output of the CNN was fed into the LSTM layers. After repeated debugging, a total of two LSTM layers were used. The number of neuron nodes was set to 128 in the first LSTM layer and 64 in the second. In terms of hyperparameters, the learning rate was set to 0.0001, the number of iterations (epochs) was 30, and the ReLU function was again adopted for activation before the data were finally output.

3.5. Loss Value Assessment

This experiment used MSE as the final loss for evaluation. Adam was chosen as the optimizer, with the learning rate set to 0.001 to compute the optimized loss value. The loss curve of the CNN-LSTM joint model is shown in Figure 1.

Figure 1 MSE loss value of CNN-LSTM model.

In the process of building the joint model, this experiment also carried out stock price prediction with the CNN and LSTM models separately in the PyTorch environment. The fitted MSE loss curve for the CNN prediction is shown in Figure 2, and the fitted MSE loss curve for the LSTM prediction is shown in Figure 3. The results showed that the loss value of the CNN-LSTM joint model was significantly lower than the loss value when CNN and LSTM were used alone.

Figure 2 MSE loss value of CNN model.

Figure 3 MSE loss value of LSTM model.

3.6. Analysis Of Research Results

The following conclusions can be drawn from the comparison between the predicted results and the real values in Figure 4. Firstly, the stock price trend predicted by the CNN-LSTM joint model is almost consistent with the real price trend, and the errors are relatively small. Secondly, although the real stock price fluctuates sharply and over a wide range, the CNN-LSTM joint model captures a large amount of the abrupt-change (mutation) information in the real price fluctuations and predicts the price trend around these abrupt changes well, which reflects the robustness of the CNN-LSTM joint model.

Although the CNN-LSTM joint model gives better prediction results, it still has some flaws. The trend chart shows a certain lag between the predicted trend and the real trend, and it is difficult for the predicted value to respond immediately. This is also evidence that the Chinese stock market is a weakly efficient market.

Figure 4 Trend comparison between predicted value of CNN-LSTM and true value.
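The pipeline described in Sections 3.2 to 3.5 can be sketched in PyTorch roughly as follows. This is a minimal illustration under stated assumptions, not the author's code. The layer sizes, dropout rate, 7-day window, 70/30 split, MSE loss and Adam optimizer come from the text. The following are assumptions made only for this example: max pooling as the pooling type, the reshape between the CNN and the LSTM, the final linear output layer, normalization statistics taken from the training split only, the column order, the batch size of 32, and the synthetic data. The text gives both 0.0001 and 0.001 as the learning rate; the sketch uses 0.001, the value quoted for Adam.

```python
import numpy as np
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

WINDOW, N_FEATURES = 7, 5  # 7-day window and 5 trading indicators (from the text)
CLOSE_IDX = 1              # assumed column order: open, close, high, low, volume


def make_windows(data, window=WINDOW):
    """Turn a (days, features) array into 7-day windows and next-day close targets."""
    xs = np.stack([data[i:i + window] for i in range(len(data) - window)])
    ys = data[window:, CLOSE_IDX]
    x = torch.tensor(xs, dtype=torch.float32).unsqueeze(1)  # (N, 1, 7, 5)
    y = torch.tensor(ys, dtype=torch.float32).unsqueeze(1)  # (N, 1)
    return x, y


class CNNLSTM(nn.Module):
    def __init__(self):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=3, padding=1),  # 64 kernels, 3x3, size kept
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=1),       # 2x2 area, stride 1 (max: assumed)
            nn.Dropout(0.3),
        )
        # The CNN output is (N, 64, 6, 4). Assumption: the 6 remaining time steps
        # become the LSTM sequence and channels x features (64 * 4) are flattened.
        self.lstm1 = nn.LSTM(64 * (N_FEATURES - 1), 128, batch_first=True)
        self.lstm2 = nn.LSTM(128, 64, batch_first=True)
        self.head = nn.Linear(64, 1)                     # assumed output layer

    def forward(self, x):
        f = self.cnn(x)                                  # (N, 64, 6, 4)
        f = f.permute(0, 2, 1, 3).flatten(2)             # (N, 6, 256)
        h, _ = self.lstm1(f)
        h, _ = self.lstm2(h)
        return self.head(torch.relu(h[:, -1]))           # last time step -> (N, 1)


# Synthetic random-walk data standing in for the index data (illustration only).
rng = np.random.default_rng(0)
raw = rng.normal(size=(200, N_FEATURES)).cumsum(axis=0) + 100.0

split = len(raw) * 7 // 10                               # first 70% train, last 30% test
mean, std = raw[:split].mean(axis=0), raw[:split].std(axis=0)
scaled = (raw - mean) / std                              # Z-score per feature
x_train, y_train = make_windows(scaled[:split])
x_test, y_test = make_windows(scaled[split:])

torch.manual_seed(0)
loader = DataLoader(TensorDataset(x_train, y_train), batch_size=32, shuffle=True)
model = CNNLSTM()
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

model.train()
for epoch in range(2):  # the text reports 30 epochs; 2 keeps this sketch fast
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)
        loss.backward()
        optimizer.step()

model.eval()
with torch.no_grad():
    test_mse = criterion(model(x_test), y_test).item()
print(f"test MSE on synthetic data: {test_mse:.4f}")
```

Because the data here are synthetic, the printed loss says nothing about real index behavior; the sketch only shows how the tensor shapes flow from the windowed input through the convolution, pooling and two LSTM layers to a single predicted closing price.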

In order to further verify the prediction ability of the CNN-LSTM joint model, a comparison experiment was conducted against the forecasts of the CNN and LSTM models alone. The comparison between the CNN predictions and the real values is shown in Figure 5, and the comparison between the LSTM predictions and the real values is shown in Figure 6.

Figure 5 Trend comparison between predicted value of CNN and true value.

Figure 6 Trend comparison between predicted value of LSTM and true value.

From the comparison of Figure 4, Figure 5, and Figure 6, it can be seen that the prediction result of the CNN-LSTM joint model is the best, indicating that CNN-LSTM can achieve good results in stock price prediction. The LSTM model ranks second: it can roughly predict the trend, but its ability to capture abrupt changes is not as strong as that of the CNN-LSTM model. The CNN model gives the worst prediction result: its trend curve changes only slightly and the predicted curve is too smooth, so it is almost impossible for it to predict the stock price accurately, and its predicted price deviates greatly from the real price.

4. CONCLUSION

This experiment realized the prediction of stock prices with the CNN-LSTM joint model in the PyTorch environment. Its performance on the Shanghai Composite Index dataset shows that, when making stock predictions, the model can capture not only the overall trend of stock prices but also subtle changes, achieving good prediction results. At the same time, the experiment still has some shortcomings. Firstly, this paper uses only five basic trading indicators of stocks as feature input. Subsequent experiments could add more stock features, such as technical indicators, to achieve better prediction. Secondly, because many factors influence the stock market, numerical data alone cannot fully capture its fluctuations. In future experiments, stock public opinion analysis or financial news sentiment analysis will also be considered to extract emotional factors and combine them with the CNN-LSTM model for prediction; I believe this will give better results.

REFERENCES
[1] Adebiyi, A. A., Adewumi, A. O., Ayo, C. K. (2014) Stock price prediction using the ARIMA model. 2014 UKSim-AMSS 16th International Conference on Computer Modelling and Simulation, Cambridge: IEEE, 106-112.
[2] Xu, F. (2006) GARCH Model for Stock Price Prediction. Statistics and Decision, 18: 107-109.
[3] Xu, Y. M., Wang, Z. H., Wu, Z. X. (2021) A CNN-BiLSTM based Multi-feature Integration Model for Stock Trend Prediction. Data Analysis and Knowledge Discovery, 5(7): 126-137.
[4] Yu, Z., Qin, L., Chen, Y. (2020) Stock price forecasting based on LLE-BP neural network model. Physica A: Statistical Mechanics and its Applications, 553: 124197.

Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International
License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution
and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a
link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter’s Creative Commons license, unless indicated
otherwise in a credit line to the material. If material is not included in the chapter’s Creative Commons license and your intended use
is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright
holder.
