Stock Price Forecast: Comparison of LSTM, HMM, and Transformer
Abstract. With the development of deep learning, many kinds of neural network
models have been applied to the analysis and prediction of time series data. In
finance, such models are widely used to forecast the stock market, integrating
technical data in a way that can directly inform investors. We chose three models
that have been very popular in the last decade: Long Short-Term Memory (LSTM),
the Hidden Markov Model (HMM), and the Transformer. We use data from the new
energy vehicle sector of the A-share market to build and evaluate the models and
compare their predictive performance. The results show that the Transformer has the
best predictive capability for stocks in the new energy vehicle sector of the A-share
market. Model performance was quantified using the Mean Absolute Percentage
Error (MAPE) and the Matthews Correlation Coefficient (MCC).
1 Introduction
The stock market is an important part of today's financial markets and has always been
one of the most popular investment targets because of its high returns. However, because
the stock market is hard to predict, investing in it carries a high level of risk. A large
body of academic research on stock price forecasting is dedicated to finding more suitable
prediction models for the stock market. In recent years, new energy vehicles have become
a key industry worldwide. In 2021, electric vehicle sales were 3.5 million units in China
and 3 million units overseas, up 158% and 57% year-on-year, respectively. On the one hand,
in 2021 the domestic new energy vehicle market continued the boom that began in the
second half of 2020: annual sales of new energy vehicles reached 3.5 million, a
year-on-year increase of nearly 160%, and the penetration rate exceeded 13%. The boom
continued in 2022, with sales of 2 million units from January to May, double the figure
for the same period a year earlier. As the domestic epidemic recedes in 2022, its marginal
impact is expected to weaken gradually.
Q. Wang and Y. Yuan—These authors contributed equally to this work and should be considered
co-first authors.
Much forecasting research has employed statistical time series analysis techniques
such as HMM. Hassan and Nath [1] showed that HMM is explainable and has a solid
statistical foundation. In recent years, an increasing number of stock prediction
systems have been based on AI techniques, including artificial neural networks (ANN)
and the Transformer [2]. Chen et al. [3] used LSTM to substantially improve the accuracy
of stock prediction in the Chinese stock market. Malibari et al. [4] used the
Transformer [2] to predict closing prices and reported a prediction probability above
90% (compared with at most 72% for other approaches).
There are many stock price forecasting models based on financial time series analysis
and artificial intelligence algorithms, and each has its advantages and disadvantages.
LSTM is suitable for processing and predicting important events separated by long
intervals and delays in a time series. However, for LSTM the choice of sequence learning
features is critical. A limitation is that neither economic fundamentals nor technical
analysis data are included, in order to avoid confounding pitfalls. HMM has been shown
to be efficient at extracting information from the dataset. The Transformer [2] employs a
multi-head self-attention mechanism to learn the relationships among different positions
globally, which enhances its capacity to learn long-term dependencies. The specific
differences between these models in stock price prediction have not been shown in
previous studies, which makes model selection difficult for investors.
While the new energy vehicle industry maintains high growth, the supply and demand
sides of the industry are expected to recover gradually with the resumption of work
and production and the introduction of land subsidy policies. In the context of carbon
neutrality, favorable overseas policies are issued frequently, the electrification of
mainstream car companies is accelerating, and high-quality supply keeps arriving. The
demand for new energy vehicles is expected to continue to rise. On the other hand, the
development of electric vehicles worldwide is still greatly affected by policy. If
follow-up stimulus policies fall short of expectations or policy continuity is weak,
the promotion of electric vehicles will suffer. In this context, stock price changes in
the new energy vehicle industry are uncertain, and research on forecasting them has
great investment significance.
To address the uncertainty about future stock prices of new energy vehicles and the
difficulty of choosing between models, this paper applies the HMM, LSTM, and
Transformer [2] models to stock prices in China's new energy vehicle industry and
concludes that the Transformer [2] outperforms the other models in this industry,
offering some guidance to investors. Furthermore, to better show investors the
differences between the models' stock price forecasts, this paper uses two indicators
to evaluate prediction performance: MAPE and MCC.
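The two indicators can be illustrated with a minimal Python sketch. The paper does not state how MCC is applied to price forecasts, so this sketch assumes it is computed on the up/down direction of day-to-day price changes; the function names and the sample prices are hypothetical and use only the standard library.

```python
import math

def mape(forecasts, observed):
    """Mean Absolute Percentage Error in percent (observed values must be non-zero)."""
    total = sum(abs((o - f) / o) for f, o in zip(forecasts, observed))
    return 100.0 * total / len(observed)

def directions(prices):
    """1 if the price rose from one day to the next, else 0 (assumed labelling)."""
    return [1 if later > earlier else 0 for earlier, later in zip(prices, prices[1:])]

def mcc(predicted, actual):
    """Matthews Correlation Coefficient for binary labels; 0.0 if undefined."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == 1 and a == 1)
    tn = sum(1 for p, a in pairs if p == 0 and a == 0)
    fp = sum(1 for p, a in pairs if p == 1 and a == 0)
    fn = sum(1 for p, a in pairs if p == 0 and a == 1)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical index values, for illustration only.
observed = [100.0, 102.0, 101.0, 105.0, 104.0]
forecasts = [100.0, 103.0, 102.0, 103.0, 105.0]

error_pct = mape(forecasts, observed)
direction_score = mcc(directions(forecasts), directions(observed))
print(f"MAPE = {error_pct:.2f}%, MCC = {direction_score:.3f}")
```

MAPE measures how far the forecast values are from the observed ones, while MCC under this assumption measures whether the forecast moves in the right direction, so the two indicators capture different aspects of prediction quality.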
128 Q. Wang and Y. Yuan
2 Related Works
3 Method
This article studies stocks in the new energy vehicle market. We used three different
architectures, HMM, LSTM, and Transformer [2], to predict the stock price. All three
models are implemented in Python using PyTorch and related packages.
To better describe stock price changes in the new energy vehicle market, this paper
selects the data of NEW ENERGY VEHICLES (399976.SZ) from 2019-6-17 to 2022-6-17.
The CSI New Energy Vehicle Index, which covers companies in lithium batteries,
charging piles, new energy vehicles, and related fields from the Shanghai and Shenzhen
markets, is intended to reflect the overall performance of the securities of listed
companies related to new energy vehicles. For a fair comparison, we use the same index
data as inputs for each model: the daily opening, closing, highest, and lowest points
and the daily trading volume of the index.
The data range from 1000 to 7000. The first step is to standardize the data. When
using price and volume data, all values should fall within a common range. In general,
machine learning algorithms converge faster or perform better when the features are
close to normally distributed and/or on a similar scale. We use the MinMaxScaler
function to scale the data to the range [−1, 1] to reduce error in the following
steps.
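$$
X_{\mathrm{std}} = \frac{X - X.\mathrm{min}(\mathrm{axis}=0)}{X.\mathrm{max}(\mathrm{axis}=0) - X.\mathrm{min}(\mathrm{axis}=0)},
\qquad
X_{\mathrm{scaled}} = X_{\mathrm{std}} \cdot (\mathrm{max} - \mathrm{min}) + \mathrm{min} \tag{1}
$$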
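As a rough illustration of Eq. (1), the following standard-library sketch scales each column of a small table to [−1, 1]. It is not the authors' preprocessing code: the function name `min_max_scale`, the handling of constant columns, and the sample rows are assumptions made for the example.

```python
def min_max_scale(rows, feature_range=(-1.0, 1.0)):
    """Scale each column of a row-major table to feature_range, following Eq. (1)."""
    lo, hi = feature_range
    n_cols = len(rows[0])
    # Column-wise minimum and maximum, i.e. X.min(axis=0) and X.max(axis=0).
    col_min = [min(row[j] for row in rows) for j in range(n_cols)]
    col_max = [max(row[j] for row in rows) for j in range(n_cols)]
    scaled = []
    for row in rows:
        scaled_row = []
        for j, value in enumerate(row):
            span = col_max[j] - col_min[j]
            # Assumption: a constant column maps to the lower bound.
            x_std = (value - col_min[j]) / span if span else 0.0
            scaled_row.append(x_std * (hi - lo) + lo)
        scaled.append(scaled_row)
    return scaled

# Hypothetical rows of [closing point, trading volume].
sample = [[1000.0, 10.0], [4000.0, 20.0], [7000.0, 30.0]]
scaled_sample = min_max_scale(sample)
print(scaled_sample)
```

In practice the column minima and maxima would usually be computed on the training set only and then reused for the test set, so that no information from the test period leaks into the scaling.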
After the three trained models produce their test results, we use the Root Mean Squared
Error (RMSE) to measure how well each model fits the actual data. RMSE is the standard
deviation of the residuals (prediction errors). Residuals measure how far data points
lie from the regression line, and RMSE measures how spread out these residuals are.
The formula for RMSE is as follows.
$$
\mathrm{RMSE}_{fo} = \left[ \frac{1}{N} \sum_{i=1}^{N} \left( z_{f_i} - z_{o_i} \right)^{2} \right]^{1/2} \tag{2}
$$
where f = forecasts and o = observed values.
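Eq. (2) can be written as a short function. This is a minimal sketch using only the standard library; the function name `rmse` and the sample values are hypothetical.

```python
import math

def rmse(forecasts, observed):
    """Root Mean Squared Error between forecasts z_f and observed values z_o."""
    if len(forecasts) != len(observed) or not forecasts:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(f - o) ** 2 for f, o in zip(forecasts, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical values: the errors are 3 and -4, so the mean squared error is 12.5.
example_rmse = rmse([103.0, 96.0], [100.0, 100.0])
print(example_rmse)
```

Because the errors are squared before averaging, a single large miss raises RMSE more than several small ones of the same total size.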
Fig. 1. Good results of the training set and test set in LSTM
Hidden Markov models are based on a set of unobserved latent states, and each state
is associated with possible transitions. Transitions between the underlying states are
usually not obvious to investors. They are driven by the company's policies, decisions,
economic conditions, etc.
An HMM (denoted by λ) can be written as follows:
$$
\lambda = (\pi, A, B) \tag{10}
$$
where A is the transition matrix whose elements give the probability of a transition from
one state to another, B is the emission matrix b_j(O_t) giving the probability of observing
O_t when in state j, and π gives the initial probabilities of the states at t = 1.
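To make the roles of π, A, and B concrete, the sketch below evaluates the likelihood of an observation sequence under λ with the standard forward algorithm. It assumes a discrete emission matrix with two observation symbols (0 = "down", 1 = "up"); the paper does not specify its emission model, and all probabilities here are made up for illustration.

```python
from itertools import product

def forward_likelihood(pi, A, B, observations):
    """P(O | lambda) for a discrete-emission HMM with lambda = (pi, A, B)."""
    n_states = len(pi)
    # Initialisation: alpha_1(j) = pi_j * b_j(O_1)
    alpha = [pi[j] * B[j][observations[0]] for j in range(n_states)]
    # Induction: alpha_t(j) = (sum_i alpha_{t-1}(i) * a_ij) * b_j(O_t)
    for obs in observations[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n_states)) * B[j][obs]
            for j in range(n_states)
        ]
    # Termination: sum over the final hidden states.
    return sum(alpha)

# Hypothetical two-state model.
pi = [0.6, 0.4]                      # initial state probabilities at t = 1
A = [[0.7, 0.3], [0.4, 0.6]]         # A[i][j] = P(state j at t+1 | state i at t)
B = [[0.8, 0.2], [0.3, 0.7]]         # B[j][o] = P(observing o | state j)

likelihood = forward_likelihood(pi, A, B, [1, 1, 0])
# Sanity check: the likelihoods of all length-3 sequences should sum to 1.
total = sum(forward_likelihood(pi, A, B, list(seq)) for seq in product(range(2), repeat=3))
print(likelihood, total)
```

Training an HMM means choosing π, A, and B so that likelihoods of this kind are high on the training sequence; the forward pass above is the building block for that evaluation.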
The predictions on the test set and the actual values using a 3-year training set
are plotted in Fig. 2(a). From the figure, we can see that the prediction of the HMM
model trained with 3 years of data is not satisfactory: it differs greatly from the
actual data in both the basic trend and the specific values. This suggests that the
HMM model is not suitable for long-term forecasting. To test the short-term prediction
ability of the HMM model, we selected the CSI stock index data from 2021-1-25 to
2022-6-17 as the training set of the short-term HMM model and obtained its stock price
predictions, as shown in Fig. 2(b).
Fig. 2. The predictions of HMM using a 3-year training set and a 1-year training set, neither
of which is satisfactory
Fig. 3. The prediction of Transformer, which is the best result we get from the three methods
By comparison, we can see that the prediction effect of the short-term HMM model
has been greatly improved in stock price prediction compared with the HMM model
trained with 3 years of data. The short-term HMM model can fit the trend of stock price
changes, but there is still a gap between the specific data and the actual stock price.
3.3 Transformer
In 2017, the Transformer [2], a well-known sequence-to-sequence model, achieved great
success on machine translation tasks. The Transformer [2] employs a multi-head
self-attention mechanism to learn the relationships between different positions
globally, enhancing the ability to learn long-term dependencies.
Transformer [2] proposes a self-attention mechanism, the core formula of which is
as follows.
$$
\mathrm{Attention}(Q, K, V) = \mathrm{Softmax}\left( \frac{Q K^{T}}{\sqrt{d_k}} \right) V \tag{11}
$$
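Eq. (11) can be traced step by step with a small single-head example. The sketch below uses plain Python lists instead of PyTorch tensors so that it runs without dependencies; the helper names and the toy matrices are assumptions, and it is not the model used in the experiments.

```python
import math

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    """Matrix product of two row-major matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def attention(Q, K, V):
    """Scaled dot-product attention, Eq. (11); returns (output, weights)."""
    d_k = len(K[0])
    scores = matmul(Q, transpose(K))                      # Q K^T
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, V), weights                    # Softmax(...) V

# Toy example: a zero query scores every key equally, so the output is
# the plain average of the value rows.
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
output, weights = attention([[0.0, 0.0]], K, V)
print(output, weights)
```

Each output row is a weighted average of the rows of V, with weights determined by how strongly the query matches each key. This is what allows a position in the price sequence to draw on any other position, however far away.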
The results of stock price prediction with the Transformer [2] are shown in Fig. 3.
From the figure, we can see that the predictions of the Transformer [2] model fit the
actual stock price closely. This is the best result we obtained from the three methods.
4 Discussion
The new energy vehicle market has become a hot topic in recent years and has attracted
the attention of the capital market. At the same time, stock prices in the new energy
vehicle market are difficult to predict because the industry is strongly affected by
national policies and industry guidance, which leaves investors confused. Under these
circumstances, applying stock price forecasting research to the new energy vehicle
market is urgent. However, according to our observation, existing stock price forecasting
research mainly focuses on the A-share market as a whole and has not examined a
particular industry in depth. Stock price research in the field of new energy vehicles
is relatively lacking.

Accuracy    Model
            HMM            LSTM           Transformer
R2          0.9553         0.8961         0.9275
MSE         24,489.8730    17,434.1378    0.0053
MAE         125.8156       118.8938       0.0603
We extracted the stock price data of CSI New energy in the past year to train and test
HMM, LSTM and Transformer [2] models respectively. To compare the fit of the three
model test results with the actual stock price, we use RMSE to measure the distribution
of the residuals of each model. The RMSE of the test results obtained by each model is
shown in Table 1.
From Table 1 we can see that the RMSE of the Transformer [2] model is much smaller
than that of HMM and LSTM. This represents the best fit between the Transformer [2]
test results and the actual data.
To further compare the three models, we also selected three indicators, R2, MSE, and
MAE, to measure the fit, as shown in Table 2. We can see that the Transformer [2] has
the smallest MSE and MAE. The Transformer [2] captures the trends closely and gives
more accurate predictions than HMM and LSTM.
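For reference, the three indicators can be computed as in the following sketch. It uses only the standard library and made-up numbers; the function name `regression_metrics` is hypothetical, and the code is an illustration rather than the evaluation script behind Table 2.

```python
def regression_metrics(forecasts, observed):
    """Return R2, MSE, and MAE of forecasts against observed values."""
    n = len(observed)
    errors = [f - o for f, o in zip(forecasts, observed)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_observed = sum(observed) / n
    ss_total = sum((o - mean_observed) ** 2 for o in observed)
    r2 = 1.0 - (mse * n) / ss_total      # 1 - SS_residual / SS_total
    return {"R2": r2, "MSE": mse, "MAE": mae}

# Hypothetical index values.
observed = [100.0, 110.0, 120.0, 130.0]
forecasts = [102.0, 108.0, 123.0, 129.0]
raw = regression_metrics(forecasts, observed)

# The same series after multiplying every value by 0.01.
rescaled = regression_metrics([f * 0.01 for f in forecasts],
                              [o * 0.01 for o in observed])
print(raw)
print(rescaled)
```

MSE and MAE depend on the scale of the data, whereas R2 does not, so MSE and MAE are directly comparable across models only when they are computed on values in the same scale.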
5 Conclusion
This article compares stock price prediction models applied to the new energy vehicle
market. We selected three models for comparison: HMM, LSTM, and Transformer. Our
research shows that an HMM trained on short-term data can predict the direction of
stock price movement but is not suitable for long-term prediction. LSTM works better
in long-term prediction. Among the three models, the Transformer gives the best stock
price predictions: it closely tracks the trend of stock prices and makes predictions
in line with the direction of stock price changes.
The sample selected in this paper is the CSI new energy stock index from 2019-6-17
to 2022-6-17, and prediction results may differ for other samples and time periods.
In existing research, the three models have been tuned more carefully, so that their
prediction results are more in line with expectations; this paper's efforts in this
respect are limited. In conclusion, the Transformer has the best stock price prediction
performance among the three models we selected. Analyzing these kinds of trends and
cycles may help investors improve their returns.
References
1. M. R. Hassan and B. Nath, Stock market forecasting using hidden Markov model: a new
approach, 5th International Conference on Intelligent Systems Design and Applications
(ISDA 2005), 2005, pp. 192–196, https://doi.org/10.1109/ISDA.2005.85.
2. A. Vaswani, et al., Attention is all you need, Advances in Neural Information Processing
Systems 30 (2017).
3. K. Chen, Y. Zhou and F. Dai, A LSTM-based method for stock returns prediction: A case
study of China stock market, 2015 IEEE International Conference on Big Data (Big Data),
IEEE, 2015, pp. 2823–2824.
4. N. Malibari, I. Katib and R. Mehmood, Predicting Stock Closing Prices in Emerging Markets
with Transformer Neural Networks: The Saudi Stock Exchange Case, International Journal of
Advanced Computer Science and Applications, 2021, 12(12).
5. G. E. P. Box, G. M. Jenkins, G. C. Reinsel, et al., Time series analysis: forecasting and
control, John Wiley & Sons, 2015.
6. A. A. Ariyo, A. O. Adewumi and C. K. Ayo, Stock Price Prediction Using the ARIMA Model,
2014 UKSim-AMSS 16th International Conference on Computer Modelling and Simulation,
2014, pp. 106–112, https://doi.org/10.1109/UKSim.2014.67.
7. S. Hochreiter and J. Schmidhuber, Long short-term memory, Neural Computation, 1997, 9(8):
1735–1780.
8. Ge Liu and Wenping Ma, A quantum artificial neural network for stock closing price
prediction, Information Sciences, Volume 598, 2022, pp. 75–85, ISSN 0020-0255.
9. Renjun Li, The Applications of Hidden Markov Model in Quantitative Investment, Tsinghua
University, 2019, https://doi.org/10.27266/d.cnki.gqhau.2019.000565.
10. Li Wen, Deng Sheng, Duan Yan, et al., Time Series Prediction and Deep Learning: Literature
Review and Application Examples, Computer Applications and Software, 2020, 37(10): 64–70.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-
NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/),
which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any
medium or format, as long as you give appropriate credit to the original author(s) and the source,
provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter’s Creative
Commons license, unless indicated otherwise in a credit line to the material. If material is not
included in the chapter’s Creative Commons license and your intended use is not permitted by
statutory regulation or exceeds the permitted use, you will need to obtain permission directly from
the copyright holder.