Applications of Deep Learning in Stock Market Prediction: Recent Progress
Review
A R T I C L E  I N F O

Keywords:
Stock market prediction
Deep learning
Machine learning
Feedforward neural network
Convolutional neural network
Recurrent neural network

A B S T R A C T

Stock market prediction has been a classical yet challenging problem, attracting the attention of both economists and computer scientists. With the purpose of building an effective prediction model, both linear and machine learning tools have been explored for the past couple of decades. Lately, deep learning models have been introduced as new frontiers for this topic, and the development is rapid. Hence, our motivation for this survey is to give a timely review of recent works on deep learning models for stock market prediction. We not only categorize the different data sources, various neural network structures, and commonly used evaluation metrics, but also cover implementation and reproducibility. Our goal is to help interested researchers stay synchronized with the latest progress and to help them easily reproduce previous studies as baselines. Based on this summary, we also highlight some future research directions for this topic.
* Corresponding author.
E-mail address: [email protected].
https://doi.org/10.1016/j.eswa.2021.115537
Received 5 March 2020; Received in revised form 11 June 2021; Accepted 29 June 2021
Available online 12 July 2021
0957-4174/© 2021 Elsevier Ltd. All rights reserved.
W. Jiang Expert Systems With Applications 184 (2021) 115537
alleviate this problem, we summarize the latest progress of deep learning techniques for stock market prediction, especially those which only appeared in the past three years. We also present the trend of each step in the prediction workflow over these three years, which would help newcomers stay on the right track without wasting time on obsolete technologies.

We focus on the application to the stock market; however, machine learning and deep learning methods have been applied to many financial problems, and it would be beyond the scope of this survey to cover all of them. Nevertheless, the findings presented in this survey would also be insightful for other time series prediction problems in the finance area, e.g., exchange rate or cryptocurrency price prediction.

We also pay special attention to the implementation and reproducibility of previous studies, which is often neglected in similar surveys. The list of open data and code from published papers would not only help readers check the validity of the findings, but also let them implement these models as baselines and make fair comparisons on the same datasets. Based on our summary of the surveyed papers, we point out some future research directions, which would help readers choose their next move.

Our main contributions in this survey are summarized as follows:

1. We summarize the latest progress of applying deep learning techniques to stock market prediction, especially those which only appeared in the past three years.
2. We give a general workflow for stock market prediction, based on which the previous studies can be easily classified and summarized, and future studies can refer to the previous work in each step of the workflow.
3. We pay special attention to implementation and reproducibility, which is often neglected in similar surveys.
4. We point out several future directions, some of which are ongoing, to help readers catch up with the research frontiers.
5. Last but not least, an open GitHub repository on this topic is created (https://github.com/jwwthu/DL4Stock), where relevant studies will be collected and updated continuously.

The rest of this survey is organized as follows: Section 2 presents related work; Section 3 gives an overview of the papers we cover; Section 4 describes the major findings in each step of the prediction workflow; Section 5 discusses implementation and reproducibility; Section 6 points out some possible future research directions; we conclude this survey in Section 7.

2. Related Work

Stock market prediction has been a research topic for a long time, and there are some review papers that accompanied the development and flourishing of deep learning methods prior to our work. While their focus could also be applications of deep learning methods, stock market prediction is only one example of the many financial problems covered in these previous surveys. In this section, we list some of them in chronological order and discuss our motivation and unique perspectives.

Back in 2009, Atsalakis and Valavanis (2009) surveys more than 100 related published articles that focus on neural and neuro-fuzzy techniques derived and applied to forecast stock markets, with a discussion of classifications of input data, forecasting methodology, performance evaluation and performance measures used. Li and Ma (2010) gives a survey on the application of artificial neural networks in forecasting financial market prices, including the forecast of stock prices, option pricing, exchange rates, banking and financial crises. Nikfarjam, Emadzadeh, and Muthaiyah (2010) surveys some primary studies which implement text mining techniques to extract qualitative information about companies and use this information to predict the future behavior of stock prices based on how good or bad the news about these companies is.

Aguilar-Rivera, Valenzuela-Rendón, and Rodríguez-Ortiz (2015) presents a review of the application of evolutionary computation methods to solving financial problems, including the techniques of genetic algorithms, genetic programming, multi-objective evolutionary algorithms, learning classifier systems, co-evolutionary approaches, and estimation of distribution algorithms. Cavalcante, Brasileiro, Souza, Nobrega, and Oliveira (2016) gives an overview of the most important primary studies published from 2009 to 2015, which cover techniques for preprocessing and clustering of financial data, for forecasting future market movements, and for mining financial text information, among others. Tkáč and Verner (2016) provides a systematic overview of neural network applications in business between 1994 and 2015 and reveals that most of the research has aimed at financial distress and bankruptcy problems, stock price forecasting, and decision support, with special attention to classification tasks. Besides the conventional multilayer feedforward network with gradient descent backpropagation, various hybrid networks have been developed in order to improve the performance of standard models.

More recently, Xing, Cambria, and Welsch (2018) reviews the application of cutting-edge NLP techniques for financial forecasting, which is relevant when text such as financial news or tweets is used as input for stock market prediction. Rundo, Trenta, di Stallo, and Battiato (2019) covers a wider topic, both in machine learning techniques, which include deep learning, and in the field of quantitative finance, from HFT trading systems to financial portfolio allocation and optimization systems. Nti, Adekoya, and Weyori (2019) focuses on fundamental and technical analysis, and finds that support vector machines and artificial neural networks are the most used machine learning techniques for stock market prediction. Based on its review of stock analysis, Shah, Isah, and Zulkernine (2019) points out some challenges and research opportunities, including issues of live testing, algorithmic trading, self-defeating, long-term predictions, and sentiment analysis on company filings.

Different from other related works that cover more papers from the computer science community, Reschenhofer, Mangat, Zwatz, and Guzmics (2019) reviews articles covered by the Social Sciences Citation Index in the category "Business, Finance" and gives more insight on economic significance. It also points out some problems in the existing literature, including unsuitable benchmarks, short evaluation periods, and nonoperational trading strategies.

Some of the latest reviews try to cover a wider range, e.g., Shah et al. (2019) covers machine learning techniques applied to the prediction of financial market prices, and Sezer, Gudelek, and Ozbayoglu (2019) covers more financial instruments. However, our motivation is to catch up with the research trend of applying deep learning techniques, which have been shown to outperform traditional machine learning techniques, e.g., support vector machines, in most of the publications, with only a few exceptions: e.g., Ballings, den Poel, Hespeels, and Gryp (2015) finds that Random Forest is the top algorithm, followed by Support Vector Machines, Kernel Factory, AdaBoost, Neural Networks, K-Nearest Neighbors and Logistic Regression, and Ersan, Nishioka, and Scherp (2019) finds that K-Nearest Neighbors and Artificial Neural Networks both outperform Support Vector Machines, but with no obvious pros and cons between their performances. With the accumulation of historical prices and diverse input data types, e.g., financial news and Twitter, we think the advantages of deep learning techniques will continue, and it is necessary to keep updated with this trend for future research.

Compared with Sezer et al. (2019), whose focus is deep learning for financial time series forecasting over a much longer time period (from 2005 to 2019, exactly), we focus on the recent progress in the past three years (2017–2019) and a narrower scope of stock price and market index prediction. For readers who are also interested in other financial instruments, e.g., commodity price, bond price, cryptocurrency price, etc.,
Table 1
List of top source journals and the number of papers we cover in this study.

Journal Name                              Paper Count
Expert Systems with Applications          12
IEEE Access                               5
Neurocomputing                            3
Complexity                                2
Journal of Forecasting                    2
Knowledge-Based Systems                   2
Applied Soft Computing                    2
Mathematical Problems in Engineering      2
PLOS ONE                                  2
Others in total                           24

Fig. 1. The paper count of different problem types.

Table 2
List of surveyed markets and stock indexes.

Country     Index                           Description
US          S&P 500                         Index of 505 common stocks issued by 500 large-cap companies
US          Dow Jones Industrial Average    Index of 30 major companies
US          NASDAQ Composite                Index of common companies in the NASDAQ stock market
US          NYSE Composite                  Index of common companies in the New York Stock Exchange
US          RUSSELL 2000                    Index of the bottom 2,000 stocks in the Russell 3000 Index
China       SSE Composite                   Index of common companies in the Shanghai Stock Exchange
China       CSI 300                         Index of the top 300 stocks in the Shanghai and Shenzhen stock exchanges
Hong Kong   HSI                             Hang Seng Index of the largest companies in the Hong Kong Exchange
Japan       Nikkei 225                      Index of 225 large companies in the Tokyo Stock Exchange
Korea       Korea Composite                 Index of common companies in the Korea Stock Exchange
India       BSE 30                          Index of 30 companies listed on the Bombay Stock Exchange
India       NIFTY 50                        Index of 50 companies listed on the National Stock Exchange
England     FTSE 100                        Index of 100 companies in the London Stock Exchange
Brazil      IBOV                            Bovespa Index of 60 stocks
France      CAC 40                          Index of the 40 most significant stocks on Euronext Paris
Germany     DAX                             Index of 30 major German companies in the Frankfurt Stock Exchange
Turkey      BIST 100                        Index of 100 stocks in the Borsa Istanbul Stock Exchange
Argentina   MER                             Merval Index in the Buenos Aires Stock Exchange
Bahrain     BAX                             Bahrain All Share Index of 42 stocks
Chile       IPSA                            Ipsa Index of the 40 most liquid stocks
Australia   All Ordinaries                  Index of the 500 largest companies in the Australian Securities Exchange
we would refer them to this work. We also care more about the implementation workflow and result reproducibility of previous studies, e.g., dataset and code availability, which is a problem that has drawn the attention of AI researchers (Gundersen & Kjensmo, 2018). We would also pay more attention to what distinguishes stock market prediction (or financial time series forecasting) from general time series prediction problems, e.g., the evaluation of profitability besides prediction accuracy.

3. Overview

In this section, we give an overview of the papers we are going to review in this study. All the works are searched and collected from Google Scholar, with search keywords such as deep learning, stock prediction, stock forecasting, etc. Most of the covered papers (115 out of 124) are published in the past three years (2017–2019). In total, we cover 56 journal papers, 58 conference papers and 10 preprint papers. These preprint papers are all from arXiv.org, which is a well-known e-print archive, and we cover them to keep updated with the latest progress. The top source journals, sorted by the number of papers we cover in this study, are shown in Table 1.

In this study, the major focus is the prediction of the close prices of individual stocks and market indexes. Some financial instruments whose prices are bound to a market index are also covered, e.g., exchange-traded funds (ETFs) or equity index futures that track the underlying market index. For intraday prediction, we also cover mid-price prediction for limit order books. Other financial instruments are not mentioned in this study, e.g., bond price and cryptocurrency price. More specifically, if the target is to predict the specific value of the prices, we classify it as a regression problem, and if the target is to predict the price movement direction, e.g., going up or down, we classify it as a classification problem. Most studies consider daily prediction (105 of 124) and only a few consider intraday prediction (18 of 124), e.g., 5-min or hourly prediction. Only one of the 124 papers considers both the daily and intraday situations (Liu, Wang, & Wang, 2019).

Based on the target output and frequency, the prediction problems can be classified into four types: daily classification (52 of 124), daily regression (54 of 124), intraday classification (8 of 124) and intraday regression (11 of 124). A detailed paper count of the different prediction problem types is shown in Fig. 1. The reason behind this could be partially justified by the difficulty of collecting the corresponding data. Daily historical prices and news titles are easier to collect and process for research, while intraday data is very limited in academia. We further discuss data availability in Section 5.

The surveyed markets, as well as the most famous stock market index in each of these markets, are shown in Table 2. Most of the studies focus on one market, while some of them evaluate their models on multiple markets. Both mature markets (e.g., US) and emerging markets (e.g., China) have gained a lot of attention from the research community in the past three years.

4. Prediction Workflow

Given different combinations of data sources, previous studies explored the use of deep learning models to predict stock market price/movement. In this section, we summarize the previous studies in a general workflow with four steps that most of the studies follow: Raw
data can be used for prediction, e.g., graph neural networks for knowledge graph data.

4.1.2. Data Length

To evaluate the performance of different models, historical data is necessary. However, there is a tradeoff in choosing the data length. A short time period of data is not sufficient to show the effectiveness and has a higher risk of overfitting, while a long time period takes the risk of traversing different market styles and presenting out-of-date results. Besides, data availability and cost are factors that need to be taken into consideration when choosing the data length. The distribution of time periods of data used in the surveyed papers is shown in Fig. 3. It is more expensive to get intraday data of good quality, and most of the previous studies involving intraday prediction use a time period of less than one year.

For a single prediction, lag denotes the time length of the input data used by the model; e.g., in daily prediction, a lag of 30 days means the data in the past 30 days are used to build the input features. For technical indicators, lag is often set as an input parameter and varies a lot in previous studies, from 2 to 252 time periods. Correspondingly, horizon denotes the time length of the future to be predicted by the model. Most of the studies focus on a short-term prediction horizon, e.g., one day or five minutes, with only a few exceptions for a longer horizon such as five days or ten days.

4.2. Data Processing

4.2.1. Missing Data Imputation

The problem of missing data is not as severe as in other domains, e.g., sensor data, because the market data is more reliable and well supported and maintained by the trading markets. However, to align multiple types of data with different sampling frequencies, e.g., market data and fundamental data, the data with a lower sampling frequency should be filled in a forward way, by propagating the last valid observation forward to the next valid one, to avoid leakage of future information.

4.2.2. Denoising

With many irrational behaviors in the stock trading process, the market data is filled with noise, which may misrepresent the trend of price change and misguide the prediction. As a signal processing technique, wavelet transform has been used to eliminate noise in stock price time series (Bao, Yue, & Rao, 2017; Liang, Ge, Sun, He, & Chen, 2019). Another approach to eliminating noisy data, in Sun, Rong, Zhang, Liang, and Xiong (2017), is the use of kNN-classifiers, based on two training sets with different labels, in a data preparation layer.

4.2.3. Feature Extraction

For machine learning models, feature engineering is the process of extracting input features from raw data based on domain knowledge. Combined with raw data, these handcrafted features are used as input for the prediction models and can substantially boost machine learning model performance.

For market data, technical analysis is a feature extraction approach that builds various indicators for forecasting the direction of prices based on historical prices and volumes, e.g., the moving average, or moving average convergence/divergence (MACD). To extract these technical indicators, chart pattern recognition techniques are widely used (Leigh, Modani, Purvis, & Roberts, 2002; Cervelló-Royo, Guijarro, & Michniuk, 2015; Arévalo, García, Guijarro, & Peris, 2017), e.g., those based on template matching. These technical indicators can be further used to design simple trading strategies. Technical indicators are also used to build image inputs, e.g., 15 different technical indicators with 15-day periods are used to construct 15 × 15 sized 2D images in Sezer and Ozbayoglu (2018).

While the feature extraction techniques represented by technical analysis for market data have been used and validated for many years, the tools for extracting features from text data have made greater progress in the past few years, owing to the various deep learning models developed for natural language processing. Before the popularity of machine learning models, the bag-of-words (BoW) model (Harris, 1954) was used as a representation of text that describes the occurrence of words within a document. In recent years, machine learning and deep learning models have shown improved performance for word embedding. Given a sequence of words, the word2vec model (Mikolov, Chen, Corrado, & Dean, 2013), a shallow, two-layer neural network, can be used to embed each of these words in a vector of real numbers, and has been used in Liu et al. (2019), Liu, Zeng, Ordieres Meré, and Yang (2019), and Lee and Soo (2017). Global Vectors for Word Representation (GloVe) (Pennington, Socher, & Manning, 2014) is another word embedding method, proposed by Stanford University, in which each of the word vectors has a dimension of 50; it has been used in Tang and Chen (2018).

Stock markets are highly affected by some public events, which can be extracted from online news data and used as input features. Ding,
Table 3
Article lists of combinations of input features.

Historical Prices: (Chen et al., 2019; Ding & Qin, 2019; Zhou et al., 2019; Nguyen et al., 2019; Yang et al., 2017; Li & Tam, 2017; Li et al., 2017; Sachdeva et al., 2019; Mansourfar & Bagherzadeh, xxxx; Tsantekidis et al., 2017; Liu & Chen, 2019; Qin et al., 2017; Siami-Namini et al., 2019; Guang et al., 2019; Zhao et al., 2017; Althelaya et al., 2018; Baek & Kim, 2018; Liang et al., 2019; Fischer & Krauss, 2018; Pang et al., 2018; Tran et al., 2018; Tsantekidis et al., 2017; Chen et al., 2018; Wang et al., 2019; Zhang et al., 2017; Karathanasopoulos & Osman, 2019; Hollis et al., 2018; Kim & Kang, 2019; Cao & Wang, 2019; Selvin et al., 2017; de A. Araújo et al., 2019; Wang & Wang, 2015; Zhang et al., 2019; Zhang et al., 2017; Wu & Gao, 2018; Long et al., 2019; Cao et al., 2019; Hossain et al., 2018; Eapen et al., 2019; Zhan et al., 2018; Lei, 2018; Chong et al., 2017; Siami-Namini et al., 2018)

Historical Prices + Technical Indicators: (Assis et al., 2018; Cheng et al., 2018; Nelson et al., 2017; Gunduz et al., 2017; Sanboon et al., 2019; Chen & Ge, 2019; Stoean et al., 2019; Al-Thelaya et al., 2019; Yu & Wu, 2019; Li et al., 2019; Gao & Chai, 2018; Chung & Shin, 2018; Yan & Ouyang, 2018; Sethia & Raut, 2019; Sun et al., 2019; Zhang et al., 2019; Chen et al., 2017; Zhou et al., 2018; Borovkova & Tsiamas, 2019; Chen et al., 2019; Feng et al., 2019; Sun et al., 2017; Yang et al., 2018; Ticknor, 2013; Song et al., 2019; Göçken et al., 2016; Singh & Srivastava, 2017; Patel et al., 2015; Liu & Song, 2017; Merello et al., 2019)

Historical Prices + Text: (Jin et al., 2019; Li et al., 2019; Liu et al., 2019; Liu, Wang, & Wang, 2019; Wang et al., 2019; Xu & Cohen, 2018; Matsubara et al., 2018; Tang & Chen, 2018; Huang et al., 2018; Wu et al., 2018; Kumar et al., 2019; Li et al., 2017; Tang et al., 2019; Huynh et al., 2017; Mohan et al., 2019; Hu et al., 2018)

Historical Prices + Technical Indicators + Macroeconomics: (Dingli & Fournier, 2017; de Oliveira et al., 2013; Zhong & Enke, 2017; Bao et al., 2017; Tsang et al., 2018; Hoseinzade & Haratizadeh, 2019; Hoseinzade et al., 2019)

Historical Prices + Technical Indicators + Text: (Vargas et al., 2017; Liu et al., 2019; Oncharoen & Vateekul, 2018; Minh et al., 2018; Lee & Soo, 2017; Chen et al., 2018)

Text: (Liu et al., 2018; Hu et al., 2018; Ding et al., 2015; Ding et al., 2014)

Image: (Sezer & Ozbayoglu, 2018; Sim et al., 2019; Lee et al., 2019; Sezer & Ozbayoglu, 2019)

Historical Prices + Technical Indicators + Fundamental + Macroeconomics: (Ballings et al., 2015; Niaki & Hoseinzade, 2013)

Historical Prices + Knowledge Graph: (Kim et al., 2019; Chen et al., 2018)

Historical Prices + Knowledge Graph + Text: (Deng et al., 2019)

Historical Prices + Knowledge Graph + Technical Indicators: (Matsunaga et al., 2019)

Historical Prices + Fundamental + Text: (Tan et al., 2019)

Historical Prices + Fundamental + Knowledge Graph + Text: (Liu et al., 2019)

Historical Prices + Macroeconomics: (Jiang et al., 2018)

Historical Prices + Image: (Kim & Kim, 2019)
Zhang, Liu, and Duan (2015) uses a neural tensor network to learn event embeddings for representing news documents. Hu, Liu, Bian, Liu, and Liu (2018) uses a news embedding layer to encode each news item into a news vector. Wang, Li, Huang, and Li (2019) uses a convolutional neural network (CNN) layer to extract salient features from transformed event representations.

From the sentiment aspect, the text data can be further analyzed, and a sentiment vector can depict each word, which may present positive or negative opinions on the future direction of stock prices. For sentiment analysis, Jin, Yang, and Liu (2019) uses CNN, and Mohan, Mullapudi, Sammeta, Vijayvergia, and Anastasiu (2019) uses the Natural Language Toolkit (NLTK) (Loper & Bird, 2002). Lien Minh, Sadeghi-Niaraki, Huy, Min, and Moon (2018) even proposes a sentiment Stock2Vec embedding model trained on both the stock news and the Harvard IV-4 psychological dictionary, which may not be directly related to stock market prediction.

Off-the-shelf commercial software is also available for linguistic features and sentiment analysis. For example, Kumar, Ravi, and Miglani (2019) employs Linguistic Inquiry and Word Count (LIWC) (Tausczik & Pennebaker, 2010) to find the linguistic features in the news articles; it includes a text analysis module along with a group of built-in dictionaries to count the percentage of words reflecting different emotions, thinking styles, social concerns, and even parts of speech.

For knowledge graph data, used more recently, the TransE model (Bordes, Usunier, Garcia-Duran, Weston, & Yakhnenko, 2013) is a computationally efficient predictive model that satisfactorily represents a one-to-one type of relationship, and it has been used in Liu et al. (2019).

Based on the input raw data and extracted features, we show the distribution of different combinations of input features in Fig. 4 and the detailed article lists in Table 3. From Fig. 4, historical prices and technical indicators are the most commonly used input features, followed by text and macroeconomic data. This could be explained by the easier access and processing of market data compared with other data types.

4.2.4. Dimensionality Reduction

It is possible that many features are highly correlated with each other, e.g., the technical indicators, which are all calculated from historical open/high/low/close prices and volume. To alleviate the corresponding problem of deep learning model overfitting, dimensionality reduction of the input features has been adopted as a preprocessing technique for stock market prediction.

Principal component analysis (PCA) is a commonly used transformation technique that uses the singular value decomposition of the input data to project it to a lower dimensional space, and it has been used in Gao and Chai (2018), Zhang, Zohren, and Roberts (2019), Zhong and Enke (2017), Wang and Wang (2015), Singh and Srivastava (2017), and Chong, Han, and Park (2017). Zhong and Enke (2017) even gives a comparison between different versions of PCA, and finds that the PCA-ANN model gives a slightly higher prediction accuracy for the next-day direction of SPY, compared to the use of fuzzy robust principal component analysis (FRPCA) and kernel-based principal component analysis (KPCA).

For dimensionality reduction, the other options include independent component analysis (ICA) (Sethia & Raut, 2019), autoencoders (Chong et al., 2017), restricted Boltzmann machines (Chong et al., 2017), empirical mode decomposition (EMD) (Cao, Li, & Li, 2019; Zhou, Zhou, Yang, & Yang, 2019), and the sub-mode coordinate algorithm (SMC) (Huang, Zhang, Zhang, & Zhang, 2018). Huang et al. (2018) first utilizes a tensor to integrate the multi-sourced data and further proposes an improved SMC model to reduce the variance of their subspace in each dimension produced by the tensor decomposition.

Feature selection is another way of dimensionality reduction, by choosing only a subset of input features. The chi-square method (Zheng, Wu, & Srihari, 2004) and maximum relevance and minimum redundancy (MRMR) (Peng, Long, & Ding, 2005) are two commonly used feature selection techniques. The chi-square method decides whether a categorical predictor variable and the target class variable are independent or not.
Fig. 5. The example of rolling training-validation-test data splits. (a) Rolling training set; (b) Successive training set.
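The two schemes in Fig. 5 can be sketched as simple index generators; the window sizes in the usage line are illustrative assumptions, not values taken from the surveyed papers:

```python
def rolling_splits(n_samples, train, val, test, successive=False):
    """Return (train, validation, test) index lists for time-ordered data.

    successive=False gives the rolling scheme of Fig. 5(a), where each
    round keeps only the most recent `train` samples; successive=True
    gives the scheme of Fig. 5(b), where each new training set is the
    union of all rolling training sets before it."""
    splits = []
    start = 0
    while start + train + val + test <= n_samples:
        train_begin = 0 if successive else start
        splits.append((
            list(range(train_begin, start + train)),
            list(range(start + train, start + train + val)),
            list(range(start + train + val, start + train + val + test)),
        ))
        start += test  # slide forward by one test window per round
    return splits

# Illustrative sizes: 10 samples, 4 for training, 2 each for validation/test.
for tr, va, te in rolling_splits(10, 4, 2, 2):
    print(tr, va, te)
```

Because the splits always move forward in time, no future sample ever appears in a training set, which is the property that distinguishes these schemes from a random train/test split.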
High chi-squared values indicate the dependence of the target variable on the predictor variable. Minimum redundancy maximum relevance uses a heuristic to minimize redundancy while maximizing relevance, to select promising features for both continuous and discrete inputs through F-statistic values. The chi-square method is used in Gunduz, Yaslan, and Cataltepe (2017) and Kumar et al. (2019), and maximum relevance and minimum redundancy is used in Kumar et al. (2019).

Other options for feature selection include rough set attribute reduction (RSAR) (Lei, 2018), the autocorrelation function (ACF) and partial correlation function (PCF) (Wu & Gao, 2018), the analysis of variance (ANOVA) (Niaki & Hoseinzade, 2013), and maximal information coefficient feature selection (MICFS) (Yang, Zhu, & Huang, 2018).

4.2.5. Feature Normalization & Standardization

Given different input features with varying scales, feature normalization and standardization are used to guarantee that some machine learning models can work, and they also help to improve the model's training speed and performance. Feature normalization refers to the process of rescaling the input feature by the minimum and range, to make all the values lie between 0 and 1 (Wang & Wang, 2015; Gunduz et al., 2017; Li, Yang, Xue, & Zhou, 2017; Singh & Srivastava, 2017; Althelaya, El-Alfy, & Mohammed, 2018; Althelaya, El-Alfy, & Mohammed, 2018; Baek & Kim, 2018; Chen et al., 2018; Chung & Shin, 2018; Gao & Chai, 2018; Hossain, Karim, Thulasiram, Bruce, & Wang, 2018; Hu, Tang, Zhang, & Wang, 2018; Minh et al., 2018; Pang, Zhou, Wang, Lin, & Chang, 2018; Tang & Chen, 2018; Tsang, Deng, & Xie, 2018; Yang et al., 2018; Zhan et al., 2018; Al-Thelaya, El-Alfy, & Mohammed, 2019; de A. Araújo et al., 2019; Cao et al., 2019; Cao & Wang, 2019; Ding & Qin, 2019; Lee, Kim, Koh, & Kang, 2019; Li, Li, Yang, Yang, & Liu, 2019; Sachdeva, Jethwani, Manjunath, Balamurugan, & Krishna, 2019; Sethia & Raut, 2019; Tang, Shen, & Yao, 2019), or between −1 and 1 (Ticknor, 2013; Zhang, Aggarwal, & Qi, 2017; Sezer & Ozbayoglu, 2018). Feature standardization means subtracting a measure of location and dividing by a measure of scale, e.g., the z-score method that subtracts the mean and divides by the standard deviation (Tsantekidis et al., 2017; Tsantekidis et al., 2017; Zhang et al., 2019; Li, Song, & Tao, 2019).

4.2.6. Data Split

For the evaluation of different prediction models, an in-sample/out-of-sample split or a train/validation/test split of data samples is commonly used in the machine learning and deep learning fields. The model is trained with the training or in-sample data set, the hyper-parameters are optionally fine-tuned on the validation data set, and the final performance is evaluated on the test or out-of-sample data set. k-fold cross validation is further used to split the dataset into k consecutive folds: k−1 folds are used as the training set, while the last fold is used as the test set.

As a special case, a train-validation-test split with a rolling (or sliding, moving) window is widely used in time series prediction tasks, including stock prediction (Bao et al., 2017; Nelson, Pereira, & de Oliveira, 2017; Fischer & Krauss, 2018; Gao & Chai, 2018; Zhou, Pan, Hu, Tang, & Zhao, 2018; Nguyen & Yoon, 2019; Kim et al., 2019; Sun, Xiao, Liu, Zhou, & Xiong, 2019; Wang, Sun, Liu, Cao, & Zhu, 2019). The process of a rolling train-validation-test split is shown in Fig. 5(a), where only the latest part of the data samples is used for a new training round of the prediction models. Another variant is to use successive training sets, which are the union of the rolling training sets that come before them, as shown in Fig. 5(b).

4.2.7. Data Augmentation

Data augmentation techniques have been widely used for image classification and object detection tasks and have proved to effectively enhance classification and detection performance. However, they are less used for time series tasks, including stock prediction, even though the size of a stock price time series is not comparable to the size of public image datasets, which usually have millions of samples, and even more in recent years.

There are still a few works that explore the usage of data augmentation. Zhang, Rong, Liang, Sun, and Xiong (2017) first clusters different stocks based on their retracement probability density function and combines all the day-wise information of the same stock cluster as enlarged training data. In the ModAugNet framework proposed in Baek and Kim (2018), the authors choose 10 companies' stocks that are highly correlated to the stock index and augment the data samples by using the combinations of 10 companies taken 5 at a time in an overfitting prevention LSTM module, before feeding the data samples to a prediction LSTM module for stock market index prediction.

4.3. Prediction Model

Most of the prediction models follow a supervised learning approach, where a training set is used for training and a test set is used for evaluation. Only a few of the studies use semi-supervised learning, when labels are not available in the feature extraction step. We further classify the various prediction models into three types: standard models and their variants, hybrid models, and other models. For standard models, three families of deep learning models, namely, the feedforward neural network, convolutional neural network and recurrent neural network, are used a lot. And we categorize the use of generative adversarial networks, transfer learning, and reinforcement learning into other models. These models only appear in recent years and are still at an early stage of being applied to stock market prediction.

In this part, we focus on the usage of different types of deep learning models, instead of diving into the details of each model. For a more detailed introduction to deep learning models, we refer the readers to Goodfellow et al. (2016). The abbreviations of machine learning and
moving, walk-forward) window is also often used for time series tasks deep learning methods are shown in Table 12 in the appendix.
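The normalization, standardization, and rolling-window split described in Section 4.2 can be sketched as follows. This is a minimal illustration in plain Python; the function names are ours, not taken from any surveyed paper.

```python
# Sketch of the preprocessing and data-split steps described above.
# Function and variable names are ours, not from any surveyed paper.

def min_max_normalize(xs):
    """Rescale values to [0, 1] using the minimum and the range."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]

def z_score_standardize(xs):
    """Subtract the mean and divide by the (population) standard deviation."""
    mean = sum(xs) / len(xs)
    std = (sum((x - mean) ** 2 for x in xs) / len(xs)) ** 0.5
    return [(x - mean) / std for x in xs]

def walk_forward_splits(n_samples, train_size, test_size):
    """Yield (train_indices, test_indices) for a rolling-window evaluation."""
    start = 0
    while start + train_size + test_size <= n_samples:
        train = list(range(start, start + train_size))
        test = list(range(start + train_size, start + train_size + test_size))
        yield train, test
        start += test_size  # slide the window forward by one test block

prices = [10.0, 11.0, 12.5, 12.0, 13.5, 14.0, 13.0, 15.0]
scaled = min_max_normalize(prices)     # all values now lie in [0, 1]
zscored = z_score_standardize(prices)  # zero mean, unit variance
splits = list(walk_forward_splits(len(prices), train_size=4, test_size=2))
```

In practice the scaling statistics should be computed on the training window only and then applied to the test window, to avoid look-ahead bias.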
Some of the earlier work use ANNs as their prediction models and
Table 4
List of standard models or their variants.
Article | Prediction Model | Baselines
Sanboon et al. (2019) | LSTM | SVM, MLP, DT, RF, Logit, kNN
Sethia and Raut (2019) | LSTM | GRU, ANN, SVM
Siami-Namini et al. (2019) | BiLSTM | LSTM, ARIMA
Tan et al. (2019) | eLSTM | SVM, DT, ANN, LSTM, AZFin Text (Schumaker & Chen, 2009), TeSIA (Li et al., 2016)
Tran et al. (2018) | TABL | Ridge, FFNN, LDA, MDA, MTR, WMTR (Tran et al., 2017), MCSDA (Tran et al., 2017), BoF, N-BoF (Passalis et al., 2017), SVM, MLP, CNN (Tsantekidis et al., 2017), LSTM (Tsantekidis et al., 2017)
Wang et al. (2019) | BiLSTM | Random guess, ARIMA, SVM, MLP, HAN (Hu et al., 2018)
Other Types
Li et al. (2017) | DBN | N/A
Matsubara et al. (2018) | DGM | SVM, MLP
Liu and Wang (2019) | seq2seq model with attention | AZFin Text (Schumaker & Chen, 2009), DL4S (Akita et al., 2016), DA-RNN (Qin et al., 2017), MKL, ELM (Li et al., 2016)
Karathanasopoulos and Osman (2019) | DBN | MACD, ARMA
study the effect of different combinations of input features (Niaki & Hoseinzade, 2013; de Oliveira, Nobre, & Zarate, 2013; Zhong & Enke, 2017). In this survey, we use ANN to refer to neural networks which only have one or zero hidden layers, and DNN to refer to those which have two or more hidden layers. The list of standard models or their variants is shown in Table 4. We organize the standard models into three major types:

• Feedforward neural network (FFNN). It is the simplest type of artificial neural network, wherein connections between the nodes do not form a cycle. An artificial neural network (ANN) is a learning model inspired by biological neural networks; a neuron in an ANN consists of an aggregation function, which calculates the sum of the inputs, and an activation function, which generates the outputs. An autoencoder (AE) is a subset of ANN which has the same number of nodes in the input and output layers. When an ANN has two or more hidden layers, we denote it as a deep neural network (DNN) in this survey. We also categorize the following models into this family because they share a similar structure: backpropagation neural network (BPNN), multilayer perceptron (MLP), extreme learning machines (ELM), where the parameters of hidden nodes need not be tuned, deep increasing–decreasing-linear neural network (IDLNN), where each layer is composed of a set of increasing–decreasing-linear processing units (de A. Araújo et al., 2019), stochastic time effective function neural network (STEFNN) (Wang & Wang, 2015), and radial basis function network (RBFN), which uses radial basis functions as activation functions.
• Convolutional neural network (CNN). Designed for processing two-dimensional images, each group of neurons, also called a filter, performs a convolution operation on a different region of the input image, and the neurons share the same weights, which reduces the number of parameters compared to the densely connected feedforward neural network. Pooling operations, e.g., max pooling, are used to reduce the original size and can be applied multiple times, until the final output is concatenated to a dense layer. Powered by the parallel processing ability of graphics processing units (GPUs), the training of CNNs has been shortened, and CNNs have achieved astonishing performance in image-related tasks and competitions. By reducing the convolutional and pooling operations to a single temporal dimension, 1D CNN is proposed for time series classification and prediction, e.g., Deng et al. (2019) uses a 1-D fully-convolutional network (FCN) architecture, where each hidden layer has the same length as the input layer, and zero padding is added to keep subsequent layers the same length as previous ones.
• Recurrent neural network (RNN). Compared with the feedforward neural network, the recurrent neural network is an artificial neural network wherein connections between the nodes form a cycle along a temporal sequence, which helps it to exhibit temporal dynamic behavior. However, normal RNNs suffer from the vanishing gradient problem in practice, where the gradients of some of the weights start to shrink or enlarge if the network is unfolded too many times. Long short-term memory (LSTM) networks are RNNs that solve the vanishing gradient problem, where the hidden layer is replaced by recurrent gates called forget gates. The gated recurrent unit (GRU) is another RNN that uses forget gates, but has fewer parameters than LSTM. Bi-directional RNNs are RNNs that connect two hidden layers of opposite directions to the same output. Both bi-directional LSTM (BiLSTM) and bi-directional GRU (BGRU) have been used for stock market prediction.

While standard models perform well at early stages of research, their variants are further developed to improve the prediction performance. One approach is to use stacked models, where neural network sub-models are embedded in a larger stacking ensemble model for training and prediction. Another approach is to introduce the attention mechanism (Treisman & Gelade, 1980) into recurrent neural network models, in which attention is a generalized pooling method with bias alignment over inputs.

There are also some types that we list separately:

• Restricted Boltzmann machine (RBM) is a generative stochastic artificial neural network that can learn a probability distribution over its set of inputs, and a deep belief network (DBN) can be defined as a stack of RBMs. DBNs have been used in Li et al. (2017) and Karathanasopoulos and Osman (2019) for stock prediction.
• Sequence to sequence (seq2seq) model is based on the encoder-decoder architecture to generate a sequence output for a sequence input, in which both the encoder and the decoder use recurrent neural networks. The seq2seq model has been used in Liu and Wang (2019) for stock prediction.

While our focus in this survey is not the linear models or the traditional machine learning models, they are often used as baselines for comparison with deep learning models.

Some often used linear prediction models are as follows:

• Linear regression (LR). Linear regression is a classical linear model that tries to fit the relationship between the predicted target and the input variables with a linear model, in which the parameters can be learned in the least squares approach.
• Autoregressive integrated moving average (ARIMA). ARIMA is a generalization of the autoregressive moving average (ARMA) model, which describes a weakly stationary stochastic process with two parts, namely, the autoregression (AR) and the moving average (MA). Compared with ARMA, ARIMA is capable of dealing with non-stationary time series by introducing an initial differencing step, which is referred to as the integrated part in the model.
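As a minimal, self-contained illustration of the differencing ("integrated") and autoregressive parts just described (the MA term is omitted for brevity, and the helper names are ours, not a library API), an AR(1) model can be fit on first differences by ordinary least squares:

```python
# Toy sketch of the AR and "integrated" parts of ARIMA; the MA term is
# omitted, and all names here are ours rather than a library API.

def difference(series):
    """First differencing turns a trending series into period-to-period changes."""
    return [b - a for a, b in zip(series, series[1:])]

def fit_ar1(series):
    """Least-squares fit of x_t = c + phi * x_{t-1}."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    phi = cov / var
    c = my - phi * mx
    return c, phi

prices = [100.0, 101.0, 103.0, 104.0, 106.0, 107.0, 109.0, 110.0]
diffs = difference(prices)           # [1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0]
c, phi = fit_ar1(diffs)              # the differenced series alternates, so phi = -1
next_diff = c + phi * diffs[-1]      # predicted next change
next_price = prices[-1] + next_diff  # undo the differencing: 110 + 2 = 112
```

A real ARIMA baseline would of course be fit with an established statistics library; this sketch only shows why differencing makes a trending (non-stationary) series tractable for an autoregressive fit.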
Table 5
List of hybrid models between deep learning models and traditional models.
Article | Prediction Model | Baselines
Patel et al. (2015) | SVR-ANN | SVR-RF, SVR-SVR
Liu and Song (2017) | Bagging + ANN | SVM, ANN, GA-ANN, RF
Yang et al. (2017) | Bagging + ANN | N/A
Assis et al. (2018) | RBM + SVM | SVM
Chen et al. (2018) | RNN + AdaBoost | MLP, SVR, RNN
Lei (2018) | 2RS-WNN | BP-NN, RBF-NN, ANFIS-NN, SVM, WNN, RS-WNN
Wu and Gao (2018) | AB-LSTM | ARIMA, MLP, SVR, ELM, LSTM, AB-MLP, AB-SVR, AB-ELM
Chen et al. (2019) | PLR + CNN + Dual Attention Mechanism based Encoder-Decoder | SVR, LSTM, CNN, LSTM_CNN (Lin et al., 2017), TPM_NC
Li et al. (2019) | LSTM + ARIMA | LSTM
Li et al. (2019) | RNN with high-order MRFs | LSTM, attention based LSTM Encoder-Decoder (Bahdanau et al., 2014), DA-RNN (Qin et al., 2017)
Sun et al. (2019) | ARMA-GARCH-NN | DNN, LSTM
Zhou et al. (2019) | EMD2FNN | ANN, FNN, EMD2NN, WDBPNN (Wang et al., 2011), Long-short Strategy

Table 6
List of hybrid models between different deep learning models.
Article | Prediction Model | Baselines
Bao et al. (2017) | WT + SAEs + LSTM | WT + LSTM, LSTM, RNN
Lee and Soo (2017) | RNN + CNN | LSTM
Qin et al. (2017) | DA-RNN | ARIMA, NARX RNN (Diaconescu, 2008), Encoder-Decoder (Cho et al., 2014), attention based LSTM Encoder-Decoder (Bahdanau et al., 2014)
Vargas et al. (2017) | CNN + LSTM | DNN (Ding et al., 2014; Ding et al., 2015), CNN (Ding et al., 2015)
Chen et al. (2018) | DNN + AE + RBM | ANN, ELM, RBFNN
Hossain et al. (2018) | LSTM + GRU | MLP, RNN, CNN
Hu et al. (2018) | Hybrid Attention Networks | RF, MLP, GRU, BGRU, Temporal-Attention-RNN, News-Attention-RNN
Oncharoen and Vateekul (2018) | CNN + LSTM | N/A
Pang et al. (2018) | AE + LSTM | DBN, MLP, DBN + MLP
Tang and Chen (2018) | CNN + LSTM | LSTM, FFNN, CNN
Wu et al. (2018) | CH-RNN | DA-RNN (Qin et al., 2017)
Zhan et al. (2018) | 1D CNN + LSTM | LSTM, GRU
Al-Thelaya et al. (2019) | LSTM AE + stacked LSTM | LSTM, MLP
Eapen et al. (2019) | CNN + BiLSTM | SVR
Guang et al. (2019) | MSTD-RCNN | SVM, RF, FDNN, TreNet (Lin et al., 2017), SFM RNN (Hu & Qi, 2017), MS-CNN (Cui et al., 2016)
Jin et al. (2019) | CNN + LSTM + Attention | LSTM
Kim and Kim (2019) | CNN + LSTM | CNN, LSTM
Liu and Chen (2019) | SRCGUs | MGUs, GRUs and LSTMs
Long et al. (2019) | CNN + RNN | RNN, LSTM, CNN, SVM, Logit, RF, LR
Tang et al. (2019) | MAFN | MA, RF, XGBoost, SVR, Adversarial Attentive LSTM (Feng et al., 2019), HAN (Hu et al., 2018), StockNet (Xu & Cohen, 2018)
Wang et al. (2019) | A Convolutional LSTM Based Variational Seq2seq Model with Attention | CNN, LSTM, seq2seq model with attention
Yu and Wu (2019) | CEAM + DA-RNN | DA-RNN (Qin et al., 2017), CF-DA-RNN
Zhang et al. (2019) | CNN + LSTM | SVM, MLP, CNN (Tsantekidis et al., 2017)

• Generalized autoregressive conditional heteroskedasticity (GARCH). GARCH is a generalization of the autoregressive conditional heteroscedasticity (ARCH) model, which describes the error variance as a function of the actual sizes of the previous time periods' error terms. Instead of using an AR model as in ARCH, GARCH assumes an ARMA model for the error variance, which generalizes ARCH.

Similarly, some often used machine learning models are as follows:

• Logistic regression (Logit). Logistic regression can be seen as a generalized linear model, in which a logistic function is used to model the probability of a binary target being 0 or 1. It is suitable for the classification of price movements, e.g., going up or down.
• Support vector machine/regression (SVM/SVR). The support vector machine is a classical and powerful tool for classification with a good theoretical performance guarantee, and it was widely adopted before the popularity of deep learning models. SVM tries to learn a hyperplane that maximizes the distance of the decision boundary from the training samples. Combined with the kernel trick, which maps the input training samples into high-dimensional feature spaces, SVM can efficiently perform non-linear classification tasks. Support vector regression is the regression version of SVM.
• k-nearest neighbor (kNN). kNN is a non-parametric model for both classification and regression, in which the output is the most common class or the average of the values among the k nearest neighbors. A useful technique is to assign weights to the neighbors when combining their contributions.

Given the predicted movement direction or prices, a long-short strategy can be further designed to perform trading based on the prediction model, e.g., if the predicted direction is going up, long it, otherwise short it. A simple baseline is the Buy&Hold strategy, which buys the asset at the beginning and holds it until the end of the testing period, without any further buying or selling operations (Niaki & Hoseinzade, 2013; Sezer & Ozbayoglu, 2018).

Technical indicators are also often used for designing baseline trading strategies, e.g., the momentum strategy, introduced in Jegadeesh and Titman (1993), is a simple strategy of buying winners and selling losers. MACD (Appel & Dobson, 2007) consists of the MACD line and the signal line, and the most common MACD strategy buys the stock when the MACD line crosses above the signal line and sells the stock when the MACD line crosses below the signal line (Lee et al., 2019). In our surveyed papers, RSI (14 days, 70–30) and SMA (50 days) are used to design baseline trading strategies (Sezer & Ozbayoglu, 2018).

We further categorize the hybrids into two classes, namely, the hybrid models between deep learning models and traditional models, and the hybrid models between different deep learning models.

The list of hybrid models between deep learning models and traditional models is shown in Table 5. Li et al. (2019) formulates a sentiment-ARMA model to incorporate the news articles as hidden
Table 7
List of other types of models.
Article | Prediction Model | Baselines
Capsule Network
Liu et al. (2019) | CapTE | TSLDA (Nguyen & Shirai, 2015), HAN (Hu et al., 2018), HCAN (Xu & Cohen, 2018), CH-RNN (Wu et al., 2018)
Reinforcement Learning
Lee et al. (2019) | Deep Q-Network + CNN | FC network, CNN, and LSTM, momentum, MACD
Transfer Learning
Hoseinzade et al. (2019) | Transfer Learning + CNN | PCA + ANN (Zhong & Enke, 2017), ANN (Kara et al., 2011), CNN (Gunduz et al., 2017), CNN (Hoseinzade & Haratizadeh, 2019)
Nguyen and Yoon (2019) | Transfer Learning + LSTM | SVM, RF, KNN, LSTM (Fischer & Krauss, 2018)

Fig. 6. The usage of different models.
Fig. 7. The usage of different optimizers.

information and designs a LSTM-based DNN, which consists of three components, namely, LSTM, the VADER model and a differential privacy mechanism that integrates different news sources. To deal with strong noise, Liu and Song (2017) uses weak ANNs to get some information without over-fitting and gets better results by combining the weak results together using optimized bagging. Wu and Gao (2018) uses the AdaBoost algorithm to generate both training samples and ensemble weights for each LSTM predictor, and the final prediction results are the combination of all the LSTM predictors with ensemble weights.

The list of hybrid models between different deep learning models is shown in Table 6. Two popular combinations are the combination of CNN and RNN structures and the combination of different RNNs.

For the former case, TreNet (Lin, Guo, & Aberer, 2017) hybridizes LSTM and CNN for stock trend classification. Zhang et al. (2019) proposes a deep learning model comprising three main building blocks: a standard convolutional layer, an Inception Module and an LSTM layer. Guang, Xiaojie, and Ruifan (2019) uses convolutional units to extract multi-scale features that precisely describe the state of the financial market, and uses a recurrent neural network to capture the temporal dependency and complementarity across different scales.

For the latter case, Al-Thelaya et al. (2019) proposes a forecasting model using a combination of an LSTM autoencoder and a stacked LSTM network. Wang et al. (2019) proposes a hybrid model consisting of stochastic recurrent networks, the sequence-to-sequence architecture, the self- and inter-attention mechanisms, and convolutional LSTM units. Liu and Chen (2019) proposes selective recurrent neural networks with random connectivity gated units (SRCGUs) that train random-connectivity LSTMs, GRUs and MGUs simultaneously.

We list other types of models in Table 7. We categorize five types of models in this part, which have not been fully explored for stock market prediction but already show some promising results.

• Generative adversarial network (GAN). GAN is introduced by Goodfellow et al. (2014), in which a discriminative net D learns to distinguish whether a given data instance is real or not, and a
Table 8
Lists of articles grouped by different evaluation metrics.
Metrics Article List
Classification Metrics
Accuracy (de Oliveira et al., 2013; Niaki & Hoseinzade, 2013; Ding et al., 2014; Ding et al., 2015; Chen et al., 2017, 2017; Huynh et al., 2017; Li & Tam,
2017; Liu & Song, 2017; Nelson et al., 2017; Selvin et al., 2017; Singh & Srivastava, 2017; Sun et al., 2017; Vargas
et al., 2017; Weng et al., 2017; Yang et al., 2017; Zhang et al., 2017; Zhao et al., 2017; Assis et al., 2018; Chen et al., 2018; Chen et al., 2018;
Chen et al., 2018; Cheng et al., 2018; Fischer & Krauss, 2018; Gao & Chai, 2018; Hu et al., 2018; Huang et al., 2018; Liu et al., 2018; Liu & Wang, 2019;
Matsubara et al., 2018; Minh et al., 2018; Oncharoen & Vateekul, 2018; Pang et al., 2018; Sezer & Ozbayoglu, 2018; Tang & Chen, 2018;
Tran et al., 2018; Wu et al., 2018; Yang et al., 2018; Zhou et al., 2018; de A. Araújo et al., 2019; Chen & Ge, 2019; Deng
et al., 2019; Feng et al., 2019; Guang et al., 2019; Karathanasopoulos & Osman, 2019; Lee et al., 2019; Li et al., 2019; Li et al., 2019; Liu et al.,
2019; Liu et al., 2019; Long et al., 2019; Merello et al., 2019, 2019; Sanboon et al., 2019; Song et al., 2019; Sun et al., 2019; Tan et al., 2019;
Tang et al., 2019; Wang et al., 2019)
Precision (Gunduz et al., 2017; Li et al., 2017; Nelson et al., 2017; Tsantekidis et al., 2017; Tsantekidis et al., 2017; Cheng et al., 2018; Minh et al., 2018;
Tran et al., 2018; Sezer & Ozbayoglu, 2018; Li et al., 2019; Wang et al., 2019; Zhang et al., 2019)
Recall (Gunduz et al., 2017; Li et al., 2017; Nelson et al., 2017; Tsantekidis et al., 2017; Tsantekidis et al., 2017; Cheng et al., 2018; Minh et al., 2018;
Tran et al., 2018; Sezer & Ozbayoglu, 2018; Li et al., 2019; Zhang et al., 2019)
Sensitivity (Sim et al., 2019)
Specificity (Sim et al., 2019)
F1 score (F1) (Gunduz et al., 2017; Li et al., 2017; Nelson et al., 2017; Sun et al., 2017; Tsantekidis et al., 2017; Tsantekidis et al., 2017; Zhang et al., 2017;
Cheng et al., 2018; Jiang et al., 2018; Sezer & Ozbayoglu, 2018; Tran et al., 2018; Deng et al., 2019; Guang et al., 2019; Li et al., 2019; Wang
et al., 2019; Zhang et al., 2019)
Macro-average F-score (MAFS) (Hoseinzade & Haratizadeh, 2019; Hoseinzade et al., 2019)
Matthews Correlation Coefficient (Ding et al., 2014; Ding et al., 2015; SSingh & Srivastavaingh & Srivastava, 2017; Huang et al., 2018; Matsubara et al., 2018; Tsang et al., 2018;
(MCC) Cao & Wang, 2019; Feng et al., 2019; Liu et al., 2019; Merello et al., 2019; Tan et al., 2019; Tang et al., 2019)
Average AUC Score (AUC) (Ballings et al., 2015; Jiang et al., 2018; Borovkova & Tsiamas, 2019; Zhang et al., 2019)
Theil’s U Coefficient (Theil’s U) (de Oliveira et al., 2013; Bao et al., 2017; Yan & Ouyang, 2018; de A. Araújo et al., 2019)
Hit Ratio (Hit) (SSingh & Srivastavaingh & Srivastava, 2017; Hu et al., 2018; Sim et al., 2019)
Average Relative Variance (ARV) (de A. Araújo et al., 2019)
Regression Metrics
Mean Absolute Error (MAE) Patel et al. (2015); Chong et al., 2017; Li et al., 2017; Qin et al., 2017; Althelaya et al., 2018; Althelaya et al., 2018; Baek and Kim, 2018; Chen
et al., 2018; Chung and Shin, 2018; Gao and Chai, 2018; Hossain et al., 2018; Lei, 2018; Al-Thelaya et al., 2019; Cao et al., 2019; Chen et al.,
2019; Ding and Qin, 2019; Jin et al., 2019; Karathanasopoulos and Osman, 2019; Liu and Chen, 2019; Mansourfar and
Bagherzadeh, xxxx; Nguyen et al., 2019; Tang et al., 2019; Yu and Wu, 2019; Zhang et al., 2019; Zhou et al., 2019
Root Mean Absolute Error (RMAE) Kim and Kim (2019)
Mean Squared Error (MSE) Patel et al. (2015); Li et al., 2017; Zhang et al., 2017; Zhong and Enke, 2017; Baek and Kim, 2018; Chung and Shin, 2018; Hollis et al., 2018;
Hossain et al., 2018; Liu and Wang, 2019; Pang et al., 2018; Zhan et al., 2018; de A. Araújo et al., 2019; Eapen et al., 2019; Feng et al.,
2019; Karathanasopoulos and Osman, 2019; Nguyen et al., 2019; Mansourfar and Bagherzadeh, xxxx; Sachdeva et al., 2019; Stoean et al., 2019
Normalized MSE (NMSE) Chong et al. (2017)
Root Mean Squared Error (RMSE) de Oliveira et al. (2013); Chong et al., 2017; Lee and Soo, 2017; Li et al., 2017; Qin et al., 2017; Singh and Srivastava,
2017; Althelaya et al., 2018; Althelaya et al., 2018; Chen et al., 2018; Chen et al., 2018; Gao and Chai, 2018; Lei, 2018; Siami-Namini et al.,
2018; Al-Thelaya et al., 2019; Cao et al., 2019; Cao and Wang, 2019; Chen et al., 2019; Chen et al., 2019; Jin et al., 2019; Karathanasopoulos
and Osman, 2019; Kim and Kim, 2019; Liu and Chen, 2019; Mansourfar and Bagherzadeh, xxxx; Sethia and Raut, 2019; Siami-
Namini et al., 2019; Sun et al., 2019; Tang et al., 2019; Yu and Wu, 2019; Zhang et al., 2019; Zhou et al., 2019
Relative RMSE (rRMSE) Patel et al. (2015); Nguyen et al., 2019
Normalized RMSE (NRMSE) Kumar et al. (2019)
Mean Absolute Percentage Error de Oliveira et al. (2013); Ticknor, 2013; Patel et al., 2015; Bao et al., 2017; Li and Tam, 2017; Qin et al., 2017; Singh and
(MAPE) Srivastava, 2017; Yang et al., 2017; Baek and Kim, 2018; Chen et al., 2018; Chen et al., 2018; Chung and Shin, 2018; Gao and Chai, 2018;
Hossain et al., 2018; Lei, 2018; Wu and Gao, 2018; Tsang et al., 2018; Yan and Ouyang, 2018; de A. Araújo et al., 2019; Cao et al., 2019; Chen
et al., 2019; Jin et al., 2019; Karathanasopoulos and Osman, 2019; Kim and Kim, 2019; Kumar et al., 2019; Mohan et al., 2019; Nguyen et al.,
2019; Sachdeva et al., 2019; Yu and Wu, 2019; Zhang et al., 2019; Zhou et al., 2019
Root Mean Squared Relative Error Zhou et al. (2018)
(RMSRE)
Mutual Information (MUL) Chong et al. (2017)
R2 Bao et al. (2017); Althelaya et al., 2018; Althelaya et al., 2018; Al-Thelaya et al., 2019; Chen et al., 2019; Jin et al., 2019; Liu and
Chen, 2019; Sethia and Raut, 2019
Profit Metrics
Return Niaki and Hoseinzade (2013); Ding et al., 2015; Bao et al., 2017; Chen et al., 2017; Lee and Soo, 2017; Singh and
Srivastava, 2017; Zhong and Enke, 2017; Fischer and Krauss, 2018; Hu et al., 2018; Matsubara et al., 2018; Oncharoen and Vateekul, 2018,
2018; Wu et al., 2018; Tsang et al., 2018; Yang et al., 2018; Chen and Ge, 2019; Feng et al., 2019; Hoseinzade and Haratizadeh, 2019;
Karathanasopoulos and Osman, 2019; Kim and Kim, 2019; Lee et al., 2019; Long et al., 2019; Matsunaga et al., 2019; Merello et al., 2019; Sezer
and Ozbayoglu, 2019; Stoean et al., 2019; Song et al., 2019; Sun et al., 2019; Wang et al., 2019; Zhang et al., 2019; Zhou et al., 2019
Maximum Drawdown Zhou et al. (2019)
Annualized Volatility Karathanasopoulos and Osman (2019)
Sharpe Ratio Chen et al. (2017); Fischer and Krauss, 2018; Sezer and Ozbayoglu, 2018; Hoseinzade and Haratizadeh, 2019; Karathanasopoulos and Osman,
2019; Matsunaga et al., 2019; Merello et al., 2019; Stoean et al., 2019; Wang et al., 2019; Zhou et al., 2019
Significance Analysis
Kruskal-Wallis Test Zhang et al. (2019)
Diebold-Mariano Test Kumar et al. (2019)
generative net G learns to confuse D by generating high-quality fake data. This game between G and D would lead to a Nash equilibrium. Since the introduction of GAN, it has been applied in multiple image-related tasks, especially for image generation and enhancement, and has generated a large family of variants. Inspired by the success of GANs, Zhou et al. (2018) proposes a generic GAN framework employing LSTM and CNN for adversarial training to predict the high-frequency stock market.
• Graph neural network (GNN). GNN is designed to utilize graph-structured data, thus capable of utilizing the network structure to
Fig. 9. The usage of different classification metrics.
Fig. 10. The usage of different regression metrics.

incorporate the interconnectivity of the market and make better predictions, compared to relying solely on the historical stock prices of each individual company or on hand-crafted features (Matsunaga, Suzumura, & Takahashi, 2019). Chen, Wei, and Huang (2018) first constructs a graph including 3,024 listed companies based on investment facts from the real market, then learns a distributed representation for each company via node embedding methods applied on the graph, and applies three-layer graph convolutional networks to predict. Kim et al. (2019) uses LSTM for the individual stock prediction task and GRU for the index movement prediction task, where an additional graph pooling layer is needed.
• Capsule Network. Different from CNNs and RNNs, the capsule network increases the weights of similar information through its dynamic routing, which is proposed by Sabour, Frosst, and Hinton (2017) and replaces the pooling operation used in conventional convolutional neural networks. Liu et al. (2019) is the first to introduce the capsule network for the problem of stock movement prediction based on social media and shows that the capsule network is effective for this task.
• Reinforcement learning. Unlike supervised learning, reinforcement learning trains an agent to choose the optimal action given a current state, with the goal of maximizing cumulative rewards in the training process. Reinforcement learning can be applied for stock prediction with the advantage of using information not only from the next time step but from all subsequent time steps (Lee et al., 2019). Reinforcement learning is also used for building algorithmic trading systems (Deng, Bao, Kong, Ren, & Dai, 2016).
• Transfer learning. Transfer learning can be used to train deep neural networks with a small amount of training data and a reduced training time, by fine-tuning a model pre-trained on a larger training dataset, e.g., Nguyen and Yoon (2019) trains an LSTM base model on 50 stocks and transfers the parameters to the prediction model on KOSPI 200 or S&P 500.

We show the change trend of models used in the past three years in Fig. 6. RNN models are used the most, but their ratio drops with the emerging new models in 2019. We also show the change of common optimizers used in our surveyed papers in Fig. 7, which include Adam (Kingma & Ba, 2014), stochastic gradient descent (SGD), RMSprop (Tieleman & Hinton, 2012), and AdaDelta (Zeiler, 2012). Adam has been used the most for stock prediction; it combines RMSprop and stochastic gradient descent with momentum and presents several benefits, e.g., computational efficiency and a small memory requirement.

For the baselines used in the surveyed studies, linear, machine learning and deep learning models are all covered. The change of baseline models used is shown in Fig. 8. With the further exploration of deep learning models for stock prediction, their ratio as baselines keeps increasing in the past three years.

4.4. Model Evaluation

In this part, we categorize the evaluation metrics for the prediction models mentioned in the last part into four types:

• Classification metrics. Classification metrics are used to measure the model's performance on movement prediction, which is modeled as a classification problem. Commonly used metrics include accuracy (the proportion of correct directional predictions), precision, recall, sensitivity, specificity, F1 score, macro-average F-score, Matthews correlation coefficient (a discrete case of the Pearson correlation coefficient), average AUC score (area under Receiver Operating Characteristic curves (Fawcett, 2006)), Theil's U coefficient, hit ratio, average relative variance, etc. Confusion matrices and boxplots of daily accuracy are also used for classification performance analysis (Guang et al., 2019; Zhong & Enke, 2017; Zhang et al., 2019).
• Regression metrics. Regression metrics are used to measure the model's performance on stock/index price prediction, which is modeled as a regression problem. Commonly used metrics include mean absolute error (MAE), root mean absolute error (RMAE), mean squared error (MSE), normalized MSE (NMSE), root mean squared error (RMSE), relative RMSE, normalized RMSE (NRMSE), mean absolute percentage error (MAPE), root mean squared relative error (RMSRE), mutual information, and R2 (the coefficient of determination).
• Profit Analysis. Profit analysis evaluates whether the prediction-based trading strategy can bring a profit or not. It is usually evaluated from two aspects, the return and the risk. The return is the change in value of the stock portfolio, and the risk can be evaluated by the maximum drawdown (Zhou et al., 2019), which is the largest peak-to-trough decline in the value of a portfolio and represents the maximum possible loss, or by the annualized volatility (Karathanasopoulos & Osman, 2019). The Sharpe ratio is a comprehensive metric taking both the return and the risk into consideration: it is the average return earned in excess of the risk-free rate per unit of volatility (Sharpe, 1994). More detailed analysis of the transactions is given in Sezer and Ozbayoglu (2018); Sezer and Ozbayoglu, 2019.
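The return, maximum drawdown, and Sharpe ratio described above can be computed as in the following sketch (the risk-free rate is assumed to be zero for simplicity, and all names are ours, not taken from any surveyed paper):

```python
# Sketch of the profit metrics above: cumulative return, maximum drawdown,
# and Sharpe ratio (risk-free rate assumed zero). Names are ours.

def returns(values):
    """Per-period simple returns of a portfolio value series."""
    return [(b - a) / a for a, b in zip(values, values[1:])]

def max_drawdown(values):
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    peak, worst = values[0], 0.0
    for v in values:
        peak = max(peak, v)
        worst = max(worst, (peak - v) / peak)
    return worst

def sharpe_ratio(rets):
    """Mean excess return per unit of volatility (risk-free rate = 0)."""
    mean = sum(rets) / len(rets)
    std = (sum((r - mean) ** 2 for r in rets) / len(rets)) ** 0.5
    return mean / std

portfolio = [100.0, 110.0, 99.0, 121.0, 133.1]
rets = returns(portfolio)                        # e.g. first return is 0.10
total_return = portfolio[-1] / portfolio[0] - 1  # 0.331 over the test period
mdd = max_drawdown(portfolio)                    # 0.10: the 110 -> 99 drop
sr = sharpe_ratio(rets)                          # per-period, not annualized
```

In reported results the Sharpe ratio is usually annualized (e.g. by multiplying a daily ratio by the square root of the number of trading days per year), which this sketch leaves out.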
• Significance Analysis. In order to determine whether there is a significant difference in predictions when comparing the deep learning models to the baselines, the Kruskal–Wallis (Kruskal & Wallis, 1952) and Diebold–Mariano (Diebold & Mariano, 2002) tests can be used to test the statistical significance and decide which model is statistically better. They are not used often for stock prediction, with only a few studies in 2019 (Kumar et al., 2019; Zhang et al., 2019).

The detailed list of studies using each metric (as well as the metrics' abbreviations) is shown in Table 8.

We further show the change of classification and regression metrics in Fig. 9 and Fig. 10. For classification metrics, accuracy and F1 score are the most often used, followed by precision, recall, and MCC. For regression metrics, RMSE and MAPE are the most often used, followed by MAE and MSE.

Table 9
List of GPUs used for stock market prediction (all from NVIDIA).

Tesla P100: Zhang et al. (2019)
GeForce GTX 1060: Song et al. (2019)
GeForce GTX 1080 Ti: Eapen et al. (2019)
Quadro P2000: Nguyen et al. (2019)
TITAN X: Chong et al. (2017); Minh et al. (2018); Chen and Ge (2019)
TITAN Xp: Chen and Ge (2019)
TITAN RTX: Wang et al. (2019)
Not specified: Xu and Cohen (2018); Fischer and Krauss (2018); Guang et al. (2019)

Table 10
List of articles with publicly available data links.

Bao et al. (2017): CSI 300, Nifty 50, Hang Seng index, Nikkei 225, S&P 500 and DJIA index from Jul-01-2008 to Sep-30-2016. Link: https://doi.org/10.6084/m9.figshare.5028110
Qin et al. (2017): One-minute stock prices of 104 corporations under NASDAQ 100 and the index value of NASDAQ 100 from Jul-26-2016 to Apr-28-2017. Link: http://cseweb.ucsd.edu/yaq007/NASDAQ100_stock_data.html
Zhang et al. (2017): The daily opening prices of 50 stocks in US among 10 sectors from 2007 to 2016. Link: https://github.com/z331565360/State-Frequency-Memory-stock-prediction/tree/master/dataset (also used in Feng et al., 2019)
Hollis et al. (2018): Historical data and Thomson Reuters news since 2007. Link: https://www.kaggle.com/c/two-sigma-financial-news
Huang et al. (2018): 78 A-share stocks in CSI 100 and 13 popular HK stocks in the years 2015 and 2016. Financial web news dataset: https://pan.baidu.com/s/1mhCLJJi; Guba dataset: https://pan.baidu.com/s/1i5zAWh3
Wu et al. (2018): Prices and Twitter data for 47 stocks listed in S&P 500 from January 2017 to November 2017. Link: https://github.com/wuhuizhe/CHRNN (also used in Liu et al., 2019)
Xu and Cohen (2018): Historical data of 88 high-trade-volume stocks in NASDAQ and NYSE markets from Jan-01-2014 to Jan-01-2016. Link: https://github.com/yumoxu/stocknet-dataset (also used in Feng et al., 2019)
Feng et al. (2019): Data from previous studies (Zhang et al., 2017; Xu & Cohen, 2018). Link: https://github.com/fulifeng/Adv-ALSTM/tree/master/data
Feng et al. (2019): Historical price data, sector-industry relations, and Wiki relations between companies, such as the supplier-consumer relation and the ownership relation, for 1,026 NASDAQ and 1,737 NYSE stocks from Jan-03-2017 to Dec-08-2017. Link: https://github.com/fulifeng/Temporal_Relational_Stock_Ranking/tree/master/data
Liu and Chen (2019): 6 top banks in US from 2008 to 2016. Link: https://www.kaggle.com/rohan8594/stock-data
Kim and Kim (2019): Minute SPY ticker data from Oct-14-2016 to Oct-16-2017. Link: https://dx.doi.org/10.6084/m9.figshare.7471568
Sim et al. (2019): Minute data of the S&P 500 index from 10:30 pm on Apr-03-2017 to 2:15 pm on May-02-2017. Link: https://www.kesci.com/home/dataset/5bbdc2513631bc00109c29a4/files
Stoean et al. (2019): 25 companies listed under the Romanian stock market from Oct-16-1997 to Mar-13-2019. Link: https://doi.org/10.6084/m9.figshare.7976144.v1
Liu et al. (2019): News headlines from Thomson Reuters and Cable News Network for 6 stocks in US markets. Link: https://github.com/linechany/knowledge-graph

5. Implementation and Reproducibility

5.1. Implementation

In this section, we pay special attention to the implementation details of the papers we survey, which are less discussed in previous surveys.

We first investigate the programming language used for the implementation of machine learning and deep learning models. Python has become the dominant choice in the past three years, as it provides a number of packages and frameworks for model implementation, e.g., Keras 3, TensorFlow 4, PyTorch 5, Theano 6, and scikit-learn 7. Other choices include R, Matlab, Java, etc. Keras and TensorFlow are the dominant frameworks for deep learning-based stock market prediction research. For further reference, the readers may refer to Hatcher and Yu (2018) for a comprehensive introduction to deep learning tools.

Deep learning models require a large amount of computation for training, and GPUs have been used to accelerate the convolutional operations involved. With the need to process multiple types of input data, especially text data, the need for GPUs will keep increasing in this research area. We give a list of the different types of GPU used in the surveyed papers in Table 9. Cloud computing is another solution when a GPU is not available locally. There are many commercial cloud computing services, e.g., Amazon Web Services 8, Google Cloud 9, and Microsoft Azure 10. However, they are not widely adopted in current stock market prediction research, and no previous study covered in this survey mentions the usage of a cloud service explicitly.

3 http://keras.io/. Keras has been incorporated in TensorFlow 2.0 and higher versions.
4 https://www.tensorflow.org/
5 https://pytorch.org/
6 http://deeplearning.net/software/theano/. It is a discontinued project and not recommended for further use.
7 https://scikit-learn.org/stable/index.html
8 https://aws.amazon.com/
9 https://cloud.google.com/
10 https://azure.microsoft.com/

5.2. Result Reproducibility

While deep learning techniques have been proved to be effective in many different problems, and most of the previous studies have proven their effectiveness for time series forecasting problems, there are still doubts and concerns. For example, in the M4 open forecasting competition with 100,000 time series, which started on Jan 1, 2018 and ended on May 31, 2018, statistical approaches outperformed pure ML methods (Makridakis, Spiliotis, & Assimakopoulos, 2018), and there are similar results on the dataset of the earlier M3 competition (Makridakis, Spiliotis, & Assimakopoulos, 2018). These studies also question the reproducibility and replicability of the previous papers which use ML methods.

While it is beyond the scope of this study to check the result of each paper, we instead investigate the data and code availability of the surveyed papers, which are two important aspects for the result
reproducibility. Some of the source journals require or recommend that the data and code be submitted as supplementary files for peer review, e.g., PLOS ONE. In other cases, the authors share their data and code proactively, with the consideration that following works can easily use them as baselines, which gains a higher impact for their publications.

5.2.1. Data Availability

There are many free data sources on the Internet for the research purpose of stock market prediction. For historical prices and volumes, the first choice should be the widely used Yahoo! Finance 11, which provides free access to data including stock quotes, up-to-date news, international market data, etc., and has been mentioned in at least 25 out of the 124 surveyed papers. Other similar options include Tushare 12, which can be used to crawl historical data of China stocks. Some stock markets also provide a download service for historical data on their official websites. For macroeconomic indicators, the International Monetary Fund (IMF) 13 and the World Bank 14 are good choices to explore. For financial news, previous studies would crawl some major news sources, e.g., CNBC 15, Reuters 16, Wall Street Journal 17, Fortune 18, etc. Social networking websites, e.g., Twitter 19 and Sina Weibo 20, provide a web Application Programming Interface (API) for access to their data (usually preprocessed and anonymous), and researchers can filter the finance-related tweets using companies' names as keywords. For relational data, Wikidata 21 provides relations between companies such as the supplier-consumer relation and the ownership relation.

There are many commercial choices too, e.g., Bloomberg 22, Wind 23, Quantopian 24, and Investing.com 25. Online brokers such as Interactive Brokers 26 also provide data-related services. There are also some well-established databases for research purposes that contain many different types of financial data, which can be used for stock market prediction and other financial problems. One example is the CSMAR database 27, which provides financial statements and stock trading data for Chinese companies, including balance sheets, income statements, cash flows, stock prices and returns, market returns and indices, and other data on Chinese equities. Another example is Wharton Research Data Services (WRDS) 28, which provides access to many financial, accounting, banking, economics, marketing, and public policy databases through a uniform, web-based interface.

Data competition websites, e.g., Kaggle 29, are also becoming a good choice of data repository for stock market prediction, and quantitative companies can collaborate with these websites to host stock market prediction competitions, e.g., the Two Sigma Financial Modeling Challenge 30, which is organized by a hedge fund named Two Sigma 31.

Even though most of the data sources are available on the Internet, it would be more convenient for replicability if the authors released the exact dataset they use. In Table 10, we list the articles with a data description and link, for data hosted on software hosting websites such as Github 32, cloud services, researchers' own websites, and data competition websites such as Kaggle. For the mid-price prediction of limit order book data, there is a benchmark dataset provided by Ntakaris, Magris, Kanniainen, Gabbouj, and Iosifidis (2017), which has been used in the following studies (Tran, Iosifidis, Kanniainen, & Gabbouj, 2018).

5.2.2. Code Availability

Github has been the mainstream platform for hosting source code in the computer science field. However, only a small number of studies in the area of stock market prediction release their code for now. In Table 11, we list the articles with public code repositories. A short description of each method is mentioned, and the details can be found in Section 4 and the original documents.

Table 11
List of articles with public code links.

Weng et al. (2017): Artificial neural networks (ANN), decision trees (DT), and support vector machines (SVM) in R. Link: https://github.com/binweng/ShinyStock
Zhang et al. (2017): State frequency memory (SFM) recurrent network. Link: https://github.com/z331565360/State-Frequency-Memory-stock-prediction
Hu et al. (2018): Hybrid attention network (HAN). Link: https://github.com/gkeng/Listening-to-Chaotic-Whishpers--Code
Xu and Cohen (2018): A deep generative model named StockNet. Link: https://github.com/yumoxu/stocknet-code
Feng et al. (2019): Adversarial attentive LSTM. Link: https://github.com/fulifeng/Adv-ALSTM
Feng et al. (2019): Relational stock ranking (RSR). Link: https://github.com/fulifeng/Temporal_Relational_Stock_Ranking
Kim et al. (2019): Hierarchical graph attention network (HATS). Link: https://github.com/dmis-lab/hats
Lee et al. (2019): Deep Q-Network. Link: https://github.com/lee-jinho/DQN-global-stock-market-prediction/

11 https://finance.yahoo.com/
12 https://tushare.pro/
13 https://www.imf.org/
14 https://www.worldbank.org/
15 https://www.cnbc.com/
16 https://www.reuters.com/
17 https://www.wsj.com/
18 https://fortune.com/
19 https://twitter.com/
20 https://www.weibo.com/
21 https://www.wikidata.org/wiki/Wikidata:Main_Page
22 https://www.bloomberg.com/
23 https://www.wind.com.cn/
24 https://www.quantopian.com/
25 https://www.investing.com/
26 https://www.interactivebrokers.com/
27 http://us.gtadata.com/
28 https://wrds-web.wharton.upenn.edu/
29 https://www.kaggle.com/
30 https://www.kaggle.com/c/two-sigma-financial-modeling
31 https://www.twosigma.com/
32 https://github.com/

6. Future Directions

Based on our review of recent works, we give some future directions in this section, which aim to bring new insight to interested researchers.

6.1. New Models

Different structures of neural networks have not been fully studied for stock prediction, especially those which only appeared in recent years. There are two steps where deep learning models are involved in stock prediction, namely, the data processing and the prediction model discussed in Section 4. While we have already covered some of the latest efforts of applying new models in this survey, e.g., the attention mechanism and generative adversarial networks, there is still a huge space to explore for new models. For example, for sentiment analysis of text data, the Transformer (Vaswani et al., 2017) and pre-trained BERT (Bidirectional Encoder Representations from Transformers) (Devlin, Chang, Lee, & Toutanova, 2018) are widely used in natural language processing, but are less discussed for financial news analysis.

6.2. Multiple Data Sources

Observed from our discussion in Section 4, it is not wise to design a stock prediction solution based on a single data source, e.g., market
data, as it has been heavily used in previous studies and it would be very challenging to outperform existing solutions. A better idea is to collect and use multiple data sources, especially those which are less explored in the literature (Zhou, Gao, Liu, & Xiao, 2019).

6.3. Cross-market Analysis

Most of the existing studies focus on only one stock market, in the sense that stock markets differ from each other because of their trading rules, while different markets may share some common phenomena that can be leveraged for prediction by approaches such as transfer learning. There are already a few studies showing positive results for cross-market analysis (Hoseinzade & Haratizadeh, 2019; Lee et al., 2019; Merello, Ratto, Oneto, & Cambria, 2019; Nguyen & Yoon, 2019; Hoseinzade, Haratizadeh, & Khoeini, 2019), and it is worth exploring in following studies. In Lee et al. (2019), the model is trained only on US stock market data and tested on the stock market data of 31 different countries over 12 years. Even though the authors do not use the terminology of transfer learning, it is a practice of model transfer.

6.4. Algorithmic Trading

The prediction is not the end of the journey. A good prediction is one factor for making money in the stock market, but not the whole story. Some of the studies have evaluated the profit and risk of trading strategies based on the prediction result, as we discussed in Section 4.4. However, these strategies are simple and intuitive, and may be impractical under real trading rules. The transaction cost is often omitted or simplified, which makes the conclusions less persuasive. Another problem is the adaptation to different market styles, as the training of deep learning models is time-consuming. These studies are not sufficient for building a practical algorithmic trading system. One possible direction is deep reinforcement learning, which has seen recent successes in a variety of applications and has also been used in a few studies for stock prediction and trading (Xiong, Liu, Zhong, Yang, & Walid, 2018; Lee et al., 2019). It has the advantages of simulating more possible cases and making faster and better trading choices than human traders.

7. Conclusion

Inspired by the rapid development and increasing usage of deep learning models for stock market prediction, we give a review of recent progress by surveying more than 100 related published articles in the past three years. We cover each step from raw data collection and data processing to prediction models and model evaluation, and present the research trend from 2017 to 2019. We also pay special attention to the implementation of deep learning models and the reproducibility of published articles, with the hope of accelerating the process of adopting published models as baselines (maybe with new data input). With some future directions pointed out, the insight and summary in this survey would help to boost future research in related topics.

Our contributions in this survey are summarized in both practical and theoretical aspects. As for the practical aspect, a general workflow is given for newcomers in this area, which is easy to follow. The discussion about implementation and reproducibility would be extremely useful when implementing the surveyed papers as baselines. From the theoretical aspect, compared with other relevant studies in expert and intelligent systems, our focus is deep learning, which is proven effective for a wide range of applications. In this survey, the latest progress of applying deep learning techniques to a specific scenario, i.e., stock market prediction, is discussed and summarized, with a basic theoretical introduction given to these deep learning techniques. Furthermore, future research directions are given for interested researchers.

The limitations of this survey are summarized in three points. The first point is that only the recent progress of the deep learning application in the stock market is covered in this survey, without giving a whole picture of the relevant history. For those who are interested in the earlier literature, the relevant discussion can be found in the previous surveys discussed in Section 2. The second point is that the scope of this survey is limited to the stock market, without discussing the application of deep learning in other important financial markets, e.g., the foreign exchange and futures markets. However, some of the techniques covered in this survey are still applicable to these markets. The third point is that even though deep learning is proven as the state-of-the-art technique for predicting the stock market in most of the surveyed studies, this survey does not aim to provide a comprehensive experimental comparison between deep learning and other prediction techniques, which requires a huge amount of computational resources and is left for future studies.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgement

Weiwei Jiang: Conceptualization; Data curation; Formal analysis; Funding acquisition; Investigation; Methodology; Project administration; Resources; Software; Supervision; Validation; Visualization; Roles/Writing - original draft; Writing - review & editing.

Appendix A
Table 12
Abbreviations of machine learning and deep learning methods.
Abbreviation Full Name
Preprocessing Techniques
BoF Bag-of-feature
BoW Bag-of-words
CEAM Cycle Embeddings with Attention Mechanism
CEEMDAN Complete Ensemble Empirical Mode Decomposition with Adaptive Noise
EMD Empirical Mode Decomposition
MACD Moving Average Convergence/Divergence
MOM Momentum
N-BoF Neural Bag-of-Feature
PCA Principal Components Analysis
RS Rough Set
RSI Relative Strength Index
SMA Simple Moving Average
SMC Sub-mode Coordinate Algorithm
WT Wavelet Transform
(2D)2PCA 2-Directional 2-Dimensional Principal Component Analysis
Optimization Algorithms
HM Harmony Memory
GA Genetic Algorithm
ISCA Improved Sine Cosine Algorithm
GWO Grey Wolf Optimizer
PSO Particle Swarm Optimization
WOA Whale Optimization Algorithm
SCA Sine Cosine Algorithm
Linear Models
AR Autoregressive
ARMA Autoregressive Moving Average
ARIMA Autoregressive Integrated Moving Average
EMA Exponential Moving Average
GARCH Generalized Autoregressive Conditional Heteroskedasticity
LDA Linear Discriminant Analysis
LR Linear Regression
PLR Piecewise Linear Regression
MA Moving Average
MR Mean Reversion
MCSDA Multilinear Class-specific Discriminant Analysis
MDA Multilinear Discriminant Analysis
MTR Multilinear Time-series Regression
WMTR Weighted Multilinear Time-series Regression
References

de A. Araújo, R., Nedjah, N., Oliveira, A. L., & de L. Meira, S. R. (2019). A deep increasing–decreasing-linear neural network for financial time series prediction. Neurocomputing, 347, 59–81. https://doi.org/10.1016/j.neucom.2019.03.017
Aguilar-Rivera, R., Valenzuela-Rendón, M., & Rodríguez-Ortiz, J. (2015). Genetic algorithms and darwinian approaches in financial applications: A survey. Expert Systems with Applications, 42, 7684–7697. https://doi.org/10.1016/j.eswa.2015.06.001
Akita, R., Yoshihara, A., Matsubara, T., & Uehara, K. (2016). Deep learning for stock prediction using numerical and textual information. In 2016 IEEE/ACIS 15th International Conference on Computer and Information Science (ICIS) (pp. 1–6). https://doi.org/10.1109/ICIS.2016.7550882
Al-Thelaya, K. A., El-Alfy, E.-S. M., & Mohammed, S. (2019). Forecasting of bahrain stock market with deep learning: Methodology and case study. In 2019 8th International Conference on Modeling Simulation and Applied Optimization (ICMSAO) (pp. 1–5). IEEE.
Alpaydin, E. (2014). Introduction to machine learning. MIT press.
Althelaya, K. A., El-Alfy, E. M., & Mohammed, S. (2018). Stock market forecast using multivariate analysis with bidirectional and stacked (lstm, gru). In 2018 21st Saudi Computer Society National Computer Conference (NCC) (pp. 1–7). https://doi.org/10.1109/NCG.2018.8593076
Althelaya, K. A., El-Alfy, E.-S. M., & Mohammed, S. (2018). Evaluation of bidirectional lstm for short- and long-term stock market prediction. In 2018 9th International Conference on Information and Communication Systems (ICICS) (pp. 151–156). IEEE.
Appel, G., & Dobson, E. (2007). Understanding MACD. Traders Press.
Araújo, R. d. A. (2011). A class of hybrid morphological perceptrons with application in time series forecasting. Knowledge-Based Systems, 24, 513–529.
Arévalo, R., García, J., Guijarro, F., & Peris, A. (2017). A dynamic trading rule based on filtered flag pattern recognition for stock market price forecasting. Expert Systems with Applications, 81, 177–192.
Asadi, S., Hadavandi, E., Mehmanpazir, F., & Nakhostin, M. M. (2012). Hybridization of evolutionary levenberg–marquardt neural networks and data pre-processing for stock market prediction. Knowledge-Based Systems, 35, 245–258.
Assis, C. A., Pereira, A. C., Carrano, E. G., Ramos, R., & Dias, W. (2018). Restricted boltzmann machines for the prediction of trends in financial time series. In 2018 International Joint Conference on Neural Networks (IJCNN) (pp. 1–8). IEEE.
Atsalakis, G. S., & Valavanis, K. P. (2009). Surveying stock market forecasting techniques – part ii: Soft computing methods. Expert Systems with Applications, 36, 5932–5941. https://doi.org/10.1016/j.eswa.2008.07.006
Baek, Y., & Kim, H. Y. (2018). Modaugnet: A new forecasting framework for stock market index value with an overfitting prevention lstm module and a prediction lstm module. Expert Systems with Applications, 113, 457–480. https://doi.org/10.1016/j.eswa.2018.07.019
Bahdanau, D., Cho, K., & Bengio, Y. (2014). Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473.
Ballings, M., den Poel, D. V., Hespeels, N., & Gryp, R. (2015). Evaluating multiple classifiers for stock price direction prediction. Expert Systems with Applications, 42, 7046–7056. https://doi.org/10.1016/j.eswa.2015.05.013
Bao, W., Yue, J., & Rao, Y. (2017). A deep learning framework for financial time series using stacked autoencoders and long-short term memory. PLOS ONE, 12, 1–24. https://doi.org/10.1371/journal.pone.0180944
Bollacker, K., Evans, C., Paritosh, P., Sturge, T., & Taylor, J. (2008). Freebase: a collaboratively created graph database for structuring human knowledge. In Proceedings of the 2008 ACM SIGMOD international conference on Management of data (pp. 1247–1250).
Bollerslev, T. (1986). Generalized autoregressive conditional heteroskedasticity. Journal of econometrics, 31, 307–327.
Bordes, A., Usunier, N., Garcia-Duran, A., Weston, J., & Yakhnenko, O. (2013). Translating embeddings for modeling multi-relational data. In Advances in neural information processing systems (pp. 2787–2795).
Borovkova, S., & Tsiamas, I. (2019). An ensemble of lstm neural networks for high-frequency stock market classification. Journal of Forecasting.
Brownlee, J. (2018). Deep Learning for Time Series Forecasting: Predict the Future with MLPs, CNNs and LSTMs in Python. Machine Learning Mastery.
Cao, J., Li, Z., & Li, J. (2019). Financial time series forecasting model based on ceemdan and lstm. Physica A: Statistical Mechanics and its Applications, 519, 127–139. https://doi.org/10.1016/j.physa.2018.11.061
Cao, J., & Wang, J. (2019). Stock price forecasting model based on modified convolution neural network and financial time series analysis. International Journal of Communication Systems, 32, e3987. https://doi.org/10.1002/dac.3987
Cavalcante, R. C., Brasileiro, R. C., Souza, V. L., Nobrega, J. P., & Oliveira, A. L. (2016). Computational intelligence and financial markets: A survey and future directions. Expert Systems with Applications, 55, 194–211. https://doi.org/10.1016/j.eswa.2016.02.006
Cervelló-Royo, R., Guijarro, F., & Michniuk, K. (2015). Stock market trading rule based on pattern recognition and technical analysis: Forecasting the djia index with intraday data. Expert Systems with Applications, 42, 5963–5975.
Chen, H., Xiao, K., Sun, J., & Wu, S. (2017). A double-layer neural network framework for high-frequency forecasting. ACM Transactions on Management Information Systems (TMIS), 7, 11.
Chen, L., Chi, Y., Guan, Y., & Fan, J. (2019). A hybrid attention-based emd-lstm model for financial time series prediction. In 2019 2nd International Conference on Artificial Intelligence and Big Data (ICAIBD) (pp. 113–118). IEEE.
Chen, L., Qiao, Z., Wang, M., Wang, C., Du, R., & Stanley, H. E. (2018). Which artificial intelligence algorithm better predicts the chinese stock market? IEEE Access, 6, 48625–48633.
Chen, S., & Ge, L. (2019). Exploring the attention mechanism in lstm-based hong kong stock price movement prediction. Quantitative Finance, 19, 1507–1515.
Chen, W., Yeo, C. K., Lau, C. T., & Lee, B. S. (2018). Leveraging social media news to predict stock index movement using rnn-boost. Data & Knowledge Engineering, 118, 14–24.
Chen, Y., Lin, W., & Wang, J. Z. (2019). A dual-attention-based stock price trend prediction model with dual features. IEEE Access, 7, 148047–148058. https://doi.org/10.1109/ACCESS.2019.2946223
Chen, Y., Wei, Z., & Huang, X. (2018). Incorporating corporation relationship via graph convolutional neural networks for stock price prediction. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management (CIKM '18) (pp. 1655–1658). New York, NY, USA: ACM. https://doi.org/10.1145/3269206.3269269
Cheng, L.-C., Huang, Y.-H., & Wu, M.-E. (2018). Applied attention-based lstm neural networks in stock prediction. In 2018 IEEE International Conference on Big Data (Big Data) (pp. 4716–4718). IEEE.
Cho, K., Van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., & Bengio, Y. (2014). Learning phrase representations using rnn encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078.
Chong, E., Han, C., & Park, F. C. (2017). Deep learning networks for stock market analysis and prediction: Methodology, data representations, and case studies. Expert Systems with Applications, 83, 187–205. https://doi.org/10.1016/j.eswa.2017.04.030
Chung, H., & Shin, K.-S. (2018). Genetic algorithm-optimized long short-term memory network for stock market prediction. Sustainability, 10, 3765.
Cui, Z., Chen, W., & Chen, Y. (2016). Multi-scale convolutional neural networks for time series classification. arXiv preprint arXiv:1603.06995.
Deng, S., Zhang, N., Zhang, W., Chen, J., Pan, J. Z., & Chen, H. (2019). Knowledge-driven stock trend prediction and explanation via temporal convolutional network. In Companion Proceedings of The 2019 World Wide Web Conference (WWW '19) (pp. 678–685). New York, NY, USA: ACM. https://doi.org/10.1145/3308560.3317701
Deng, Y., Bao, F., Kong, Y., Ren, Z., & Dai, Q. (2016). Deep direct reinforcement learning for financial signal representation and trading. IEEE Transactions on Neural Networks and Learning Systems, 28, 653–664.
Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
Diaconescu, E. (2008). The use of narx neural networks to predict chaotic time series. Wseas Transactions on computer research, 3, 182–191.
Diebold, F. X., & Mariano, R. S. (2002). Comparing predictive accuracy. Journal of Business & economic statistics, 20, 134–144.
Ding, G., & Qin, L. (2019). Study on the prediction of stock price based on the associated network model of lstm. International Journal of Machine Learning and Cybernetics, 1–11.
Ding, X., Zhang, Y., Liu, T., & Duan, J. (2014). Using structured events to predict stock price movement: An empirical investigation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 1415–1425).
Ding, X., Zhang, Y., Liu, T., & Duan, J. (2015). Deep learning for event-driven stock prediction. In Twenty-fourth international joint conference on artificial intelligence.
Hossain, M. A., Karim, R., Thulasiram, R., Bruce, N. D. B., & Wang, Y. (2018). Hybrid deep learning model for stock price prediction. In 2018 IEEE Symposium Series on Computational Intelligence (SSCI) (pp. 1837–1844). https://doi.org/10.1109/SSCI.2018.8628641
Hu, H., & Qi, G.-J. (2017). State-frequency memory recurrent neural networks. In Proceedings of the 34th International Conference on Machine Learning-Volume 70 (pp. 1568–1577). JMLR.org.
Hu, H., Tang, L., Zhang, S., & Wang, H. (2018). Predicting the direction of stock markets using optimized neural networks with google trends. Neurocomputing, 285, 188–195. https://doi.org/10.1016/j.neucom.2018.01.038
Hu, Z., Liu, W., Bian, J., Liu, X., & Liu, T.-Y. (2018). Listening to chaotic whispers: A deep learning framework for news-oriented stock trend prediction. In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining (WSDM '18) (pp. 261–269). New York, NY, USA: ACM. https://doi.org/10.1145/3159652.3159690
Huang, J., Zhang, Y., Zhang, J., & Zhang, X. (2018). A tensor-based sub-mode coordinate algorithm for stock prediction. In 2018 IEEE Third International Conference on Data Science in Cyberspace (DSC) (pp. 716–721). https://doi.org/10.1109/DSC.2018.00114
Huynh, H. D., Dang, L. M., & Duong, D. (2017). A new model for stock price movements prediction using deep neural network. In Proceedings of the Eighth International Symposium on Information and Communication Technology (pp. 57–62). ACM.
Hyndman, R. J., & Athanasopoulos, G. (2018). Forecasting: principles and practice. OTexts.
Jegadeesh, N., & Titman, S. (1993). Returns to buying winners and selling losers: Implications for stock market efficiency. The Journal of finance, 48, 65–91.
Jiang, W., & Zhang, L. (2018). Geospatial data to images: A deep-learning framework for traffic forecasting. Tsinghua Science and Technology, 24, 52–64.
Jiang, W., & Zhang, L. (2020). Edge-siamnet and edge-triplenet: New deep learning models for handwritten numeral recognition. IEICE Transactions on Information and Systems, 103.
Jiang, X., Pan, S., Jiang, J., & Long, G. (2018). Cross-domain deep learning approach for multiple financial market prediction. In 2018 International Joint Conference on Neural
Dingli, A., & Fournier, K. S. (2017). Financial time series forecasting–a deep learning Networks (IJCNN) (pp. 1–8). https://doi.org/10.1109/IJCNN.2018.8489360
approach. International Journal of Machine Learning and Computing, 7, 118–122. Jin, Z., Yang, Y., & Liu, Y. (2019). Stock closing price prediction based on sentiment
Eapen, J., Bein, D., & Verma, A. (2019). Novel deep learning model with cnn and bi- analysis and lstm. Neural Computing and Applications, 1–17.
directional lstm for improved stock market index prediction. In 2019 IEEE 9th Annual Joachims, T. (1998). In Text categorization with support vector machines: Learning with
Computing and Communication Workshop and Conference (CCWC) (pp. 0264–0270). many relevant features. In European conference on machine learning (pp. 137–142).
https://doi.org/10.1109/CCWC.2019.8666592 Springer.
Ersan, D., Nishioka, C., & Scherp, A. (2019). Comparison of machine learning methods Joulin, A., Grave, É., Bojanowski, P., & Mikolov, T. (2017). Bag of tricks for efficient text
for financial time series forecasting at the examples of over 10 years of daily and classification. In Proceedings of the 15th Conference of the European Chapter of the
hourly data of dax 30 and s&p 500. Journal of Computational Social Science, 1–31. Association for Computational Linguistics: Volume 2, Short Papers (pp. 427–431).
Fama, E. F. (1965). The behavior of stock-market prices. The Journal of Business, 38, Kara, Y., Boyacioglu, M. A., & Baykan, Ö. K. (2011). Predicting direction of stock price
34–105. index movement using artificial neural networks and support vector machines: The
Fawcett, T. (2006). An introduction to roc analysis. Pattern recognition letters, 27, sample of the istanbul stock exchange. Expert Systems with Applications, 38,
861–874. 5311–5319.
Feng, F., Chen, H., He, X., Ding, J., Sun, M., & Chua, T.-S. (2019). Enhancing stock Karathanasopoulos, A., & Osman, M. (2019). Forecasting the dubai financial market with
movement prediction with adversarial training. In Proceedings of the Twenty-Eighth a combination of momentum effect with a deep belief network. Journal of
International Joint Conference on Artificial Intelligence, IJCAI-19 (pp. 5843–5849). Forecasting, 38, 346–353. url:https://onlinelibrary.wiley.com/doi/abs/10.1002/
International Joint Conferences on Artificial Intelligence Organization. url:https:// for.2560. 10.1002/for.2560.
doi.org/10.24963/ijcai.2019/810. doi:10.24963/ijcai.2019/810. Kim, R., So, C.H., Jeong, M., Lee, S., Kim, J., & Kang, J. (2019). Hats: A hierarchical
Feng, F., He, X., Wang, X., Luo, C., Liu, Y., & Chua, T.-S. (2019). Temporal relational graph attention network for stock movement prediction. arXiv preprint arXiv:
ranking for stock prediction. ACM Transactions on Information Systems, 37, 27:1–27: 1908.07999,.
30. https://doi.org/10.1145/3309547 Kim, S., & Kang, M. (2019). Financial series prediction using attention lstm. arXiv
Fischer, T., & Krauss, C. (2018). Deep learning with long short-term memory networks preprint arXiv:1902.10877.
for financial market predictions. European Journal of Operational Research, 270, Kim, T., & Kim, H. Y. (2019). Forecasting stock prices with a feature fusion lstm-cnn
654–669. https://doi.org/10.1016/j.ejor.2017.11.054. url:http://www. model using different representations of the same data. PloS one, 14, Article
sciencedirect.com/science/article/pii/S0377221717310652. e0212320.
Gao, T., & Chai, Y. (2018). Improving stock closing price prediction using recurrent Kingma, D.P., & Ba, J. (2014). Adam: A method for stochastic optimization. arXiv
neural network and technical indicators. Neural Computation, 30, 2833–2854. preprint arXiv:1412.6980,.
https://doi.org/10.1162/neco_a_01124. arXiv:https://doi.org/10.1162/neco_a_ Kruskal, W. H., & Wallis, W. A. (1952). Use of ranks in one-criterion variance analysis.
01124. PMID: 30148707. Journal of the American statistical Association, 47, 583–621.
Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. MIT press. Kumar, B.S., Ravi, V., & Miglani, R. (2019). Predicting indian stock market using the
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., psycho-linguistic features of financial news. arXiv preprint arXiv:1911.06193.
Courville, A., & Bengio, Y. (2014). Generative adversarial nets. In Advances in Lee, C.-Y., & Soo, V.-W. (2017). Predict stock price with financial news based on
neural information processing systems (pp. 2672–2680). recurrent convolutional neural networks. In 2017 Conference on Technologies and
Guang, L., Xiaojie, W., & Ruifan, L. (2019). Multi-scale rcnn model for financial time- Applications of Artificial Intelligence (TAAI) (pp. 160–165). IEEE.
series classification. arXiv preprint arXiv:1911.09359. Lee, J., Kim, R., Koh, Y., & Kang, J. (2019). Global stock market prediction based on
Gundersen, O. E., & Kjensmo, S. (2018). State of the art: Reproducibility in artificial stock chart images using deep q-network. arXiv preprint arXiv:1902.10948.
intelligence. In Thirty-Second AAAI Conference on Artificial Intelligence. Lei, L. (2018). Wavelet neural network prediction method of stock price trend based on
Gunduz, H., Yaslan, Y., & Cataltepe, Z. (2017). Intraday prediction of borsa istanbul rough set attribute reduction. Applied Soft Computing, 62, 923–932. https://doi.org/
using convolutional neural networks and feature correlations. Knowledge-Based 10.1016/j.asoc.2017.09.029. url:http://www.sciencedirect.com/science/article/
Systems, 137, 138–148. pii/S1568494617305689.
Göken, M., Özçalıcı, M., Boru, A., & Dosdoğru, A.T. (2016). Integrating metaheuristics Leigh, W., Modani, N., Purvis, R., & Roberts, T. (2002). Stock market trading rule
and artificial neural networks for improved stock price prediction. Expert Systems discovery using technical charting heuristics. Expert Systems with Applications, 23,
with Applications, 44, 320–331. url:http://www.sciencedirect.com/science/article/ 155–159.
pii/S0957417415006570. doi: 10.1016/j.eswa.2015.09.029. Li, C., Song, D., & Tao, D. (2019). Multi-task recurrent neural networks and higher-order
Harris, Z. S. (1954). Distributional structure. Word, 10, 146–162. markov random fields for stock price movement prediction: Multi-task rnn and
Hatcher, W. G., & Yu, W. (2018). A survey of deep learning: Platforms, applications and higer-order mrfs for stock price classification. In Proceedings of the 25th ACM SIGKDD
emerging research trends. IEEE Access, 6, 24411–24432. https://doi.org/10.1109/ International Conference on Knowledge Discovery & Data Mining KDD ’19 (pp.
ACCESS.2018.2830661 1141–1151). New York, NY, USA: ACM. https://doi.org/10.1145/
Hollis, T., Viscardi, A., & Yi, S.E. (2018). A comparison of lstms and attention 3292500.3330983.
mechanisms for forecasting financial time series. arXiv preprint arXiv:1812.07699. Li, J., Bu, H., & Wu, J. (2017). Sentiment-aware stock market prediction: A deep learning
Hoseinzade, E., & Haratizadeh, S. (2019). Cnnpred: Cnn-based stock market prediction method. In 2017 International Conference on Service Systems and Service Management
using a diverse set of variables. Expert Systems with Applications, 129, 273–285. (pp. 1–6). IEEE.
Hoseinzade, E., Haratizadeh, S., & Khoeini, A. (2019). U-cnnpred: A universal cnn-based
predictor for stock markets. arXiv preprint arXiv:1911.12540.
Li, Q., Chen, Y., Jiang, L. L., Li, P., & Chen, H. (2016). A tensor-based information framework for predicting the stock market. ACM Transactions on Information Systems (TOIS), 34, 1–30.
Li, X., Li, Y., Yang, H., Yang, L., & Liu, X.-Y. (2019). Dp-lstm: Differential privacy-inspired lstm for stock prediction using financial news. arXiv preprint arXiv:1912.10806.
Li, X., Xie, H., Wang, R., Cai, Y., Cao, J., Wang, F., Min, H., & Deng, X. (2016). Empirical analysis: Stock market prediction via extreme learning machine. Neural Computing & Applications, 27, 67–78.
Li, X., Yang, L., Xue, F., & Zhou, H. (2017). Time series prediction of stock price using deep belief networks with intrinsic plasticity. In 2017 29th Chinese Control And Decision Conference (CCDC) (pp. 1237–1242). IEEE.
Li, Y., & Ma, W. (2010). Applications of artificial neural networks in financial economics: A survey. In 2010 International Symposium on Computational Intelligence and Design (Vol. 1, pp. 211–214).
Li, Z., & Tam, V. (2017). Combining the real-time wavelet denoising and long-short-term-memory neural network for predicting stock indexes. In 2017 IEEE Symposium Series on Computational Intelligence (SSCI) (pp. 1–8). IEEE.
Liang, X., Ge, Z., Sun, L., He, M., & Chen, H. (2019). Lstm with wavelet transform based data preprocessing for stock price prediction. Mathematical Problems in Engineering, 2019.
Lien Minh, D., Sadeghi-Niaraki, A., Huy, H. D., Min, K., & Moon, H. (2018). Deep learning approach for short-term stock trends prediction based on two-stream gated recurrent unit network. IEEE Access, 6, 55392–55404. https://doi.org/10.1109/ACCESS.2018.2868970.
Lin, T., Guo, T., & Aberer, K. (2017). Hybrid neural networks for learning the trend in time series. In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (pp. 2273–2279).
Liu, G., & Wang, X. (2019). A numerical-based attention method for stock market prediction with dual information. IEEE Access, 7, 7357–7367. https://doi.org/10.1109/ACCESS.2018.2886367.
Liu, H., & Song, B. (2017). Stock trends forecasting by multi-layer stochastic ann bagging. In 2017 IEEE 29th International Conference on Tools with Artificial Intelligence (ICTAI) (pp. 322–329). https://doi.org/10.1109/ICTAI.2017.00058.
Liu, J., & Chen, S. (2019). Non-stationary multivariate time series prediction with selective recurrent neural networks. In Pacific Rim International Conference on Artificial Intelligence (pp. 636–649). Springer.
Liu, J., Lin, H., Liu, X., Xu, B., Ren, Y., Diao, Y., & Yang, L. (2019). Transformer-based capsule network for stock movement prediction. In Proceedings of the First Workshop on Financial Technology and Natural Language Processing (pp. 66–73).
Liu, J., Lu, Z., & Du, W. (2019). Combining enterprise knowledge graph and news sentiment analysis for stock price prediction. In Proceedings of the 52nd Hawaii International Conference on System Sciences.
Liu, Q., Cheng, X., Su, S., & Zhu, S. (2018). Hierarchical complementary attention network for predicting stock price movements with news. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management CIKM ’18 (pp. 1603–1606). New York, NY, USA: ACM. https://doi.org/10.1145/3269206.3269286.
Liu, Y., Zeng, Q., Ordieres Meré, J., & Yang, H. (2019). Anticipating stock market of the renowned companies: A knowledge graph approach. Complexity, 2019.
Long, W., Lu, Z., & Cui, L. (2019). Deep learning-based feature engineering for stock price movement prediction. Knowledge-Based Systems, 164, 163–173. https://doi.org/10.1016/j.knosys.2018.10.034.
Loper, E., & Bird, S. (2002). Nltk: The natural language toolkit. arXiv preprint cs/0205028.
Ma, D., Li, S., Zhang, X., & Wang, H. (2017). Interactive attention networks for aspect-level sentiment classification. arXiv preprint arXiv:1709.00893.
Makridakis, S., Spiliotis, E., & Assimakopoulos, V. (2018). The m4 competition: Results, findings, conclusion and way forward. International Journal of Forecasting. https://doi.org/10.1016/j.ijforecast.2018.06.001.
Makridakis, S., Spiliotis, E., & Assimakopoulos, V. (2018). Statistical and machine learning forecasting methods: Concerns and ways forward. PLoS ONE, 13, 1–26. https://doi.org/10.1371/journal.pone.0194889.
Malkiel, B. G. (2003). The efficient market hypothesis and its critics. Journal of Economic Perspectives, 17, 59–82.
Matsubara, T., Akita, R., & Uehara, K. (2018). Stock price prediction by deep neural generative model of news articles. IEICE Transactions on Information and Systems, 101, 901–908.
Matsunaga, D., Suzumura, T., & Takahashi, T. (2019). Exploring graph neural networks for stock market predictions with rolling window analysis. arXiv preprint arXiv:1909.10660.
Menezes Jr, J. M. P., & Barreto, G. A. (2008). Long-term time series prediction with the narx network: An empirical evaluation. Neurocomputing, 71, 3335–3343.
Merello, S., Ratto, A. P., Oneto, L., & Cambria, E. (2019). Ensemble application of transfer learning and sample weighting for stock market prediction. In 2019 International Joint Conference on Neural Networks (IJCNN) (pp. 1–8). IEEE.
Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.
Mohan, S., Mullapudi, S., Sammeta, S., Vijayvergia, P., & Anastasiu, D. C. (2019). Stock price prediction using news sentiment analysis. In 2019 IEEE Fifth International Conference on Big Data Computing Service and Applications (BigDataService) (pp. 205–208). IEEE.
Nelson, D. M., Pereira, A. C., & de Oliveira, R. A. (2017). Stock market’s price movement prediction with lstm neural networks. In 2017 International Joint Conference on Neural Networks (IJCNN) (pp. 1419–1426). IEEE.
Nguyen, D. H. D., Tran, L. P., & Nguyen, V. (2019). Predicting stock prices using dynamic lstm models. In International Conference on Applied Informatics (pp. 199–212). Springer.
Nguyen, T. H., & Shirai, K. (2015). Topic modeling based sentiment analysis on social media for stock market prediction. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) (pp. 1354–1364).
Nguyen, T.-T., & Yoon, S. (2019). A novel approach to short-term stock price movement prediction using transfer learning. Applied Sciences, 9, 4745.
Niaki, S. T. A., & Hoseinzade, S. (2013). Forecasting s&p 500 index using artificial neural networks and design of experiments. Journal of Industrial Engineering International, 9, 1.
Nikfarjam, A., Emadzadeh, E., & Muthaiyah, S. (2010). Text mining approaches for stock market prediction. In 2010 The 2nd International Conference on Computer and Automation Engineering (ICCAE) (Vol. 4, pp. 256–260). https://doi.org/10.1109/ICCAE.2010.5451705.
Nikou, M., Mansourfar, G., & Bagherzadeh, J. Stock price prediction using deep learning algorithm and its comparison with machine learning algorithms. Intelligent Systems in Accounting, Finance and Management.
Ntakaris, A., Magris, M., Kanniainen, J., Gabbouj, M., & Iosifidis, A. (2017). Benchmark dataset for mid-price prediction of limit order book data. arXiv preprint arXiv:1705.03233.
Nti, I. K., Adekoya, A. F., & Weyori, B. A. (2019). A systematic review of fundamental and technical analysis of stock market predictions. Artificial Intelligence Review, 1–51.
de Oliveira, F. A., Nobre, C. N., & Zarate, L. E. (2013). Applying artificial neural networks to prediction of stock price and improvement of the directional prediction index–case study of petr4, petrobras, brazil. Expert Systems with Applications, 40, 7596–7606.
Oncharoen, P., & Vateekul, P. (2018). Deep learning for stock market prediction using event embedding and technical indicators. In 2018 5th International Conference on Advanced Informatics: Concept Theory and Applications (ICAICTA) (pp. 19–24). https://doi.org/10.1109/ICAICTA.2018.8541310.
Pang, X., Zhou, Y., Wang, P., Lin, W., & Chang, V. (2018). An innovative neural network approach for stock market prediction. The Journal of Supercomputing. https://doi.org/10.1007/s11227-017-2228-y.
Passalis, N., Tsantekidis, A., Tefas, A., Kanniainen, J., Gabbouj, M., & Iosifidis, A. (2017). Time-series classification using neural bag-of-features. In 2017 25th European Signal Processing Conference (EUSIPCO) (pp. 301–305). IEEE.
Patel, J., Shah, S., Thakkar, P., & Kotecha, K. (2015). Predicting stock market index using fusion of machine learning techniques. Expert Systems with Applications, 42, 2162–2172. https://doi.org/10.1016/j.eswa.2014.10.031.
Peng, H., Long, F., & Ding, C. (2005). Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27, 1226–1238.
Peng, Y., & Jiang, H. (2016). Leverage financial news to predict stock price movements using word embeddings and deep neural networks. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (pp. 374–379). San Diego, California: Association for Computational Linguistics. https://doi.org/10.18653/v1/N16-1041.
Pennington, J., Socher, R., & Manning, C. D. (2014). Glove: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 1532–1543).
Qin, Y., Song, D., Chen, H., Cheng, W., Jiang, G., & Cottrell, G. W. (2017). A dual-stage attention-based recurrent neural network for time series prediction. In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, IJCAI-17 (pp. 2627–2633).
Rawat, W., & Wang, Z. (2017). Deep convolutional neural networks for image classification: A comprehensive review. Neural Computation, 29, 2352–2449.
Reschenhofer, E., Mangat, M. K., Zwatz, C., & Guzmics, S. (2019). Evaluation of current research on stock return predictability. Journal of Forecasting.
Rundo, F., Trenta, F., di Stallo, A. L., & Battiato, S. (2019). Machine learning for quantitative finance applications: A survey. Applied Sciences, 9, 5574.
Sabour, S., Frosst, N., & Hinton, G. E. (2017). Dynamic routing between capsules. In Advances in Neural Information Processing Systems (pp. 3856–3866).
Sachdeva, A., Jethwani, G., Manjunath, C., Balamurugan, M., & Krishna, A. V. (2019). An effective time series analysis for equity market prediction using deep learning model. In 2019 International Conference on Data Science and Communication (IconDSC) (pp. 1–5). IEEE.
Sanboon, T., Keatruangkamala, K., & Jaiyen, S. (2019). A deep learning model for predicting buy and sell recommendations in stock exchange of thailand using long short-term memory. In 2019 IEEE 4th International Conference on Computer and Communication Systems (ICCCS) (pp. 757–760). IEEE.
Schumaker, R. P., & Chen, H. (2009). Textual analysis of stock market prediction using breaking financial news: The azfin text system. ACM Transactions on Information Systems, 27, 1–19.
Selvin, S., Vinayakumar, R., Gopalakrishnan, E. A., Menon, V. K., & Soman, K. P. (2017). Stock price prediction using lstm, rnn and cnn-sliding window model. In 2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI) (pp. 1643–1647). https://doi.org/10.1109/ICACCI.2017.8126078.
Sethia, A., & Raut, P. (2019). Application of lstm, gru and ica for stock price prediction. In S. C. Satapathy, & A. Joshi (Eds.), Information and Communication Technology for Intelligent Systems (pp. 479–487). Singapore: Springer Singapore.
Sezer, O. B., Gudelek, M. U., & Ozbayoglu, A. M. (2019). Financial time series forecasting with deep learning: A systematic literature review: 2005–2019. arXiv preprint arXiv:1911.13288.
Sezer, O. B., & Ozbayoglu, A. M. (2018). Algorithmic financial trading with deep convolutional neural networks: Time series to image conversion approach. Applied Soft Computing, 70, 525–538. https://doi.org/10.1016/j.asoc.2018.04.024.
Sezer, O. B., & Ozbayoglu, A. M. (2019). Financial trading model with stock bar chart image time series with deep convolutional neural networks. arXiv preprint arXiv:1903.04610.
Shah, D., Isah, H., & Zulkernine, F. (2019). Stock market analysis: A review and taxonomy of prediction techniques. International Journal of Financial Studies, 7, 26.
Sharpe, W. F. (1994). The sharpe ratio. Journal of Portfolio Management, 21, 49–58.
Siami-Namini, S., Tavakoli, N., & Namin, A. S. (2019). A comparative analysis of forecasting financial time series using arima, lstm, and bilstm. arXiv preprint arXiv:1911.09512.
Siami-Namini, S., Tavakoli, N., & Siami Namin, A. (2018). A comparison of arima and lstm in forecasting time series. In 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA) (pp. 1394–1401). https://doi.org/10.1109/ICMLA.2018.00227.
Sim, H. S., Kim, H. I., & Ahn, J. J. (2019). Is deep learning for image recognition applicable to stock market prediction? Complexity, 2019.
Singh, R., & Srivastava, S. (2017). Stock prediction using deep learning. Multimedia Tools and Applications, 76, 18569–18584. https://doi.org/10.1007/s11042-016-4159-7.
Song, Y., Lee, J. W., & Lee, J. (2019). A study on novel filtering and relationship between input-features and target-vectors in a deep learning model for stock price prediction. Applied Intelligence, 49, 897–911. https://doi.org/10.1007/s10489-018-1308-x.
Stoean, C., Paja, W., Stoean, R., & Sandita, A. (2019). Deep architectures for long-term stock price prediction with a heuristic-based strategy for trading simulations. PLoS ONE, 14.
Sun, H., Rong, W., Zhang, J., Liang, Q., & Xiong, Z. (2017). Stacked denoising autoencoder based stock market trend prediction via k-nearest neighbour data selection. In D. Liu, S. Xie, Y. Li, D. Zhao, & E.-S. M. El-Alfy (Eds.), Neural Information Processing (pp. 882–892). Cham: Springer International Publishing.
Sun, J., Xiao, K., Liu, C., Zhou, W., & Xiong, H. (2019). Exploiting intra-day patterns for market shock prediction: A machine learning approach. Expert Systems with Applications, 127, 272–281.
Tan, J., Wang, J., Rinprasertmeechai, D., Xing, R., & Li, Q. (2019). A tensor-based elstm model to predict stock price using financial news. In Proceedings of the 52nd Hawaii International Conference on System Sciences.
Tang, J., & Chen, X. (2018). Stock market prediction based on historic prices and news titles. In Proceedings of the 2018 International Conference on Machine Learning Technologies ICMLT ’18 (pp. 29–34). New York, NY, USA: ACM. https://doi.org/10.1145/3231884.3231887.
Tang, N., Shen, Y., & Yao, J. (2019). Learning to fuse multiple semantic aspects from rich texts for stock price prediction. In International Conference on Web Information Systems Engineering (pp. 65–81). Springer.
Tausczik, Y. R., & Pennebaker, J. W. (2010). The psychological meaning of words: Liwc and computerized text analysis methods. Journal of Language and Social Psychology, 29, 24–54.
Ticknor, J. L. (2013). A bayesian regularized artificial neural network for stock market forecasting. Expert Systems with Applications, 40, 5501–5506. https://doi.org/10.1016/j.eswa.2013.04.013.
Tieleman, T., & Hinton, G. (2012). Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude. COURSERA: Neural Networks for Machine Learning, 4, 26–31.
Tkáč, M., & Verner, R. (2016). Artificial neural networks in business: Two decades of research. Applied Soft Computing, 38, 788–804. https://doi.org/10.1016/j.asoc.2015.09.040.
Tran, D. T., Gabbouj, M., & Iosifidis, A. (2017). Multilinear class-specific discriminant analysis. Pattern Recognition Letters, 100, 131–136.
Tran, D. T., Iosifidis, A., Kanniainen, J., & Gabbouj, M. (2018). Temporal attention-augmented bilinear network for financial time-series data analysis. IEEE Transactions on Neural Networks and Learning Systems, 30, 1407–1418.
Tran, D. T., Magris, M., Kanniainen, J., Gabbouj, M., & Iosifidis, A. (2017). Tensor representation in high-frequency financial data for price change prediction. In 2017 IEEE Symposium Series on Computational Intelligence (SSCI) (pp. 1–7). IEEE.
Treisman, A. M., & Gelade, G. (1980). A feature-integration theory of attention. Cognitive Psychology, 12, 97–136.
Tsang, G., Deng, J., & Xie, X. (2018). Recurrent neural networks for financial time-series modelling. In 2018 24th International Conference on Pattern Recognition (ICPR) (pp. 892–897). https://doi.org/10.1109/ICPR.2018.8545666.
Tsantekidis, A., Passalis, N., Tefas, A., Kanniainen, J., Gabbouj, M., & Iosifidis, A. (2017). Forecasting stock prices from the limit order book using convolutional neural networks. In 2017 IEEE 19th Conference on Business Informatics (CBI) (Vol. 1, pp. 7–12). IEEE.
Tsantekidis, A., Passalis, N., Tefas, A., Kanniainen, J., Gabbouj, M., & Iosifidis, A. (2017). Using deep learning to detect price change indications in financial markets. In 2017
on Computational Intelligence and Virtual Environments for Measurement Systems and Applications (CIVEMSA) (pp. 60–65). https://doi.org/10.1109/CIVEMSA.2017.7995302.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., & Polosukhin, I. (2017). Attention is all you need. In Advances in Neural Information Processing Systems (pp. 5998–6008).
Wang, J., Sun, T., Liu, B., Cao, Y., & Zhu, H. (2019). Clvsa: A convolutional lstm based variational sequence-to-sequence model with attention for predicting trends of financial markets. In Proceedings of the 28th International Joint Conference on Artificial Intelligence (pp. 3705–3711). AAAI Press.
Wang, J., & Wang, J. (2015). Forecasting stock market indexes using principle component analysis and stochastic time effective neural networks. Neurocomputing, 156, 68–78. https://doi.org/10.1016/j.neucom.2014.12.084.
Wang, J.-Z., Wang, J.-J., Zhang, Z.-G., & Guo, S.-P. (2011). Forecasting stock indices with back propagation neural network. Expert Systems with Applications, 38, 14346–14355. https://doi.org/10.1016/j.eswa.2011.04.222.
Wang, Y., Li, Q., Huang, Z., & Li, J. (2019). Ean: Event attention network for stock price trend prediction based on sentimental embedding. In Proceedings of the 10th ACM Conference on Web Science WebSci ’19 (pp. 311–320). New York, NY, USA: ACM. https://doi.org/10.1145/3292522.3326014.
Weng, B., Ahmed, M. A., & Megahed, F. M. (2017). Stock market one-day ahead movement prediction using disparate data sources. Expert Systems with Applications, 79, 153–163.
Wu, H., Zhang, W., Shen, W., & Wang, J. (2018). Hybrid deep sequential modeling for social text-driven stock prediction. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management CIKM ’18 (pp. 1627–1630). New York, NY, USA: ACM. https://doi.org/10.1145/3269206.3269290.
Wu, Y., & Gao, J. (2018). Adaboost-based long short-term memory ensemble learning approach for financial time series forecasting. Current Science, 115.
Xing, F. Z., Cambria, E., & Welsch, R. E. (2018). Natural language based financial forecasting: A survey. Artificial Intelligence Review, 50, 49–73.
Xiong, Z., Liu, X.-Y., Zhong, S., Yang, H., & Walid, A. (2018). Practical deep reinforcement learning approach for stock trading. arXiv preprint arXiv:1811.07522.
Xu, Y., & Cohen, S. B. (2018). Stock movement prediction from tweets and historical prices. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 1970–1979). Melbourne, Australia: Association for Computational Linguistics. https://doi.org/10.18653/v1/P18-1183.
Yan, H., & Ouyang, H. (2018). Financial time series prediction based on deep learning. Wireless Personal Communications, 102, 683–700. https://doi.org/10.1007/s11277-017-5086-2.
Yang, B., Gong, Z.-J., & Yang, W. (2017). Stock market index prediction using deep neural network ensemble. In 2017 36th Chinese Control Conference (CCC) (pp. 3882–3887). IEEE.
Yang, H., Zhu, Y., & Huang, Q. (2018). A multi-indicator feature selection for cnn-driven stock index prediction. In L. Cheng, A. C. S. Leung, & S. Ozawa (Eds.), Neural Information Processing (pp. 35–46). Cham: Springer International Publishing.
Yolcu, U., Egrioglu, E., & Aladag, C. H. (2013). A new linear & nonlinear artificial neural network model for time series forecasting. Decision Support Systems, 54, 1340–1347.
Yu, M.-H., & Wu, J.-L. (2019). Ceam: A novel approach using cycle embeddings with attention mechanism for stock price prediction. In 2019 IEEE International Conference on Big Data and Smart Computing (BigComp) (pp. 1–4). IEEE.
Zamora, E., & Sossa, H. (2017). Dendrite morphological neurons trained by stochastic gradient descent. Neurocomputing, 260, 420–431.
Zeiler, M. D. (2012). Adadelta: An adaptive learning rate method. arXiv preprint arXiv:1212.5701.
Zhan, X., Li, Y., Li, R., Gu, X., Habimana, O., & Wang, H. (2018). Stock price prediction using time convolution long short-term memory network. In W. Liu, F. Giunchiglia, & B. Yang (Eds.), Knowledge Science, Engineering and Management (pp. 461–468). Cham: Springer International Publishing.
Zhang, J., Rong, W., Liang, Q., Sun, H., & Xiong, Z. (2017). Data augmentation based stock trend prediction using self-organising map. In D. Liu, S. Xie, Y. Li, D. Zhao, & E.-S. M. El-Alfy (Eds.), Neural Information Processing (pp. 903–912). Cham: Springer International Publishing.
Zhang, K., Zhong, G., Dong, J., Wang, S., & Wang, Y. (2019). Stock market prediction based on generative adversarial network. Procedia Computer Science, 147, 400–406. https://doi.org/10.1016/j.procs.2019.01.256. 2018 International Conference on Identification, Information and Knowledge in the Internet of Things.
Zhang, L., Aggarwal, C., & Qi, G.-J. (2017). Stock price prediction via discovering multi-frequency trading patterns. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining KDD ’17 (pp. 2141–2149). New York, NY, USA: ACM. https://doi.org/10.1145/3097983.3098117.
Zhang, Z., Zohren, S., & Roberts, S. (2019). Deeplob: Deep convolutional neural networks for limit order books. IEEE Transactions on Signal Processing, 67, 3001–3012.
Zhao, Z., Rao, R., Tu, S., & Shi, J. (2017). Time-weighted lstm model with redefined labeling for stock trend prediction. In 2017 IEEE 29th International Conference on Tools with Artificial Intelligence (ICTAI) (pp. 1210–1217). https://doi.org/10.1109/ICTAI.2017.00184.
Zhao, Z.-Q., Zheng, P., Xu, S.-T., & Wu, X. (2019). Object detection with deep learning: A
25th European Signal Processing Conference (EUSIPCO) (pp. 2511–2515). IEEE. review. IEEE Transactions on Neural Networks and Learning Systems, 30, 3212–3232.
Vargas, M. R., de Lima, B. S. L. P., & Evsukoff, A. G. (2017). Deep learning for stock Zheng, Z., Wu, X., & Srihari, R. (2004). Feature selection for text categorization on
market prediction from financial news articles. In 2017 IEEE International Conference imbalanced data. ACM Sigkdd Explorations Newsletter, 6, 80–89.
Zhong, X., & Enke, D. (2017). Forecasting daily stock market return using dimensionality reduction. Expert Systems with Applications, 67, 126–139. https://doi.org/10.1016/j.eswa.2016.09.027. url:http://www.sciencedirect.com/science/article/pii/S0957417416305115.
Zhou, F., Zhou, H.-M., Yang, Z., & Yang, L. (2019). EMD2FNN: A strategy combining empirical mode decomposition and factorization machine based neural network for stock market trend prediction. Expert Systems with Applications, 115, 136–151.
Zhou, X., Pan, Z., Hu, G., Tang, S., & Zhao, C. (2018). Stock market prediction on high-frequency data using generative adversarial nets. Mathematical Problems in Engineering, 2018.
Zhou, Z., Gao, M., Liu, Q., & Xiao, H. (2019). Forecasting stock price movements with multiple data sources: Evidence from stock market in China. Physica A: Statistical Mechanics and its Applications, 123389.