Article
Deep Learning Methods for Modeling Bitcoin Price
Prosper Lamothe-Fernández 1, David Alaminos 2,*, Prosper Lamothe-López 3 and Manuel A. Fernández-Gámez 4
1 Department of Financing and Commercial Research, UDI of Financing, Calle Francisco Tomás y Valiente, 5,
Universidad Autónoma de Madrid, 28049 Madrid, Spain; [email protected]
2 Department of Economic Theory and Economic History, Campus El Ejido s/n, University of Malaga,
29071 Malaga, Spain; [email protected]
3 Rho Finanzas Partner, Calle de Zorrilla, 21, 28014 Madrid, Spain; [email protected]
4 Department of Finance and Accounting, Campus El Ejido s/n, University of Malaga, 29071 Malaga, Spain
* Correspondence: [email protected]
!"#!$%&'(!
Received: 25 June 2020; Accepted: 28 July 2020; Published: 30 July 2020 !"#$%&'
Abstract: A precise prediction of Bitcoin price is an important aspect of digital financial markets because it improves the valuation of an asset belonging to a decentralized control market. Numerous studies have examined the accuracy of models built from a set of factors. The previous literature shows that models for the prediction of Bitcoin suffer from poor predictive capacity and fail to select the most significant variables; therefore, further progress is needed on predictive models. This paper presents a comparison of deep learning methodologies for forecasting the Bitcoin price and, on that basis, a new prediction model with the ability to estimate accurately. A sample of 29 initial factors was used, which made it possible to apply explanatory factors covering different aspects of the formation of the price of Bitcoin. Different methods were applied to the sample under study to achieve a robust model, namely deep recurrent convolutional neural networks, which have shown the importance of transaction costs and mining difficulty in the Bitcoin price, among others. Our results have a great potential impact on the adequacy of asset pricing against the uncertainties derived from digital currencies, providing tools that help to achieve stability in cryptocurrency markets. Our models offer high and stable success results over a future prediction horizon, which is useful for the asset valuation of cryptocurrencies like Bitcoin.
Keywords: bitcoin; deep learning; deep recurrent convolutional neural networks; forecasting;
asset pricing
1. Introduction
Bitcoin is a cryptocurrency built on free software and peer-to-peer networks as an irreversible private payment platform. Bitcoin lacks a physical form and is not backed by any public body, and therefore no intervention by a government agency or other agent is necessary to transact [1]. These transactions are made through the blockchain system. Blockchain is an open ledger that records transactions between two parties efficiently, leaving a permanent mark that is impossible to erase, which makes this tool a decentralized validation protocol that is difficult to manipulate and carries a low risk of fraud. The blockchain system is not subject to any individual entity [2].
The concept of Bitcoin originated from the concept of cryptocurrency, or virtual currency [3]. Cryptocurrencies are a monetary medium that is not affected by public regulation, nor is it subject to a regulatory body; it is affected only by the activity and rules established by its developers. Cryptocurrencies are virtual currencies that can be created and stored only electronically [4]. A cryptocurrency is designed to serve as a medium of exchange and, for this, it uses cryptography systems to secure transactions and to control the subsequent creation of the cryptocurrency.
Forecasting the Bitcoin price is vitally important for both asset managers and independent investors. Although Bitcoin is a currency, it cannot be studied like a traditional currency, where economic theories about uncovered interest rate parity, future cash-flow models, and purchasing power parity matter, since the standard factors of the relationship between supply and demand cannot be applied in a digital currency market like Bitcoin's [5]. Moreover, Bitcoin has different characteristics that make it useful for the agents who invest in it, such as transaction speed, dissemination, decentralization, and the large virtual community of people interested in talking about and providing relevant information on digital currencies, mainly Bitcoin [6].
Velankar and colleagues [7] attempted to predict the daily price change sign as accurately as possible using Bayesian regression and a generalized linear model. To do this, they considered the daily
trends of the Bitcoin market and focused on the characteristics of Bitcoin transactions, reaching an
accuracy of 51% with the generalized linear model. McNally and co-workers [8] studied the precision
with which the direction of the Bitcoin price in United States Dollar (USD) can be predicted. They used
a recurrent neural network (RNN), a long short-term memory (LSTM) network, and the autoregressive
integrated moving average (ARIMA) method. The LSTM network obtains the highest classification
accuracy of 52% and a root mean square error (RMSE) of 8%. As expected, non-linear deep learning methods outperformed the ARIMA method's forecasts. For their part, Yogeshwaran and co-workers [9]
applied convolutional and recurrent neural networks to predict the price of Bitcoin using data from
a time interval of 5 min to 2 h, with convolutional neural networks showing a lower level of error,
at around 5%. Demir and colleagues [10] predicted the price of Bitcoin using methods such as long
short-term memory networks, naïve Bayes, and the nearest neighbor algorithm. These methods
achieved accuracy rates between 81.2% and 97.2%. Rizwan, Narejo, and Javed [11] continued the application of deep learning methods with the RNN and LSTM techniques; their results showed an accuracy of 52% and an 8% RMSE for the LSTM. Linardatos and Kotsiantis [12] reported similar results after using eXtreme Gradient Boosting (XGBoost) and LSTM, concluding that the latter technique yielded the lower RMSE, at 0.999. Despite the superiority of computational techniques, Felizardo and colleagues [13] showed that ARIMA had a lower error rate than methods such as random forest (RF), support vector machine (SVM), LSTM, and WaveNets for predicting the future price of Bitcoin. Finally,
other works showed new deep learning methods, such as Dutta, Kumar, and Basu [14], who applied
both LSTM and the gated recurring unit (GRU) model; the latter showed the best error result, with an
RMSE of 0.019. Ji and co-workers [15] predicted the price of Bitcoin with different methodologies such
as deep neural network (DNN), the LSTM model, and convolutional neural network. They obtained a
precision of 60%, leaving the improvement of precision with deep learning techniques and a greater
definition of significant variables as a future line of research. These authors show the need for stable
prediction models, not only with data in and out of the sample, but also in forecasts of future results.
To contribute to the robustness of Bitcoin price prediction models, the present study develops a comparison of deep learning methodologies to predict and model the Bitcoin price and, as a consequence, a new model that generates better forecasts of the Bitcoin price and its future behavior. This model achieves accuracy levels above 95% and was constructed from a sample of 29 variables. Different methods were applied in the construction of the Bitcoin price prediction model in order to build a reliable model and to check which technique achieves the highest predictive capacity; specifically, deep recurrent convolutional neural networks, deep neural decision trees, and deep support vector machines were used. Furthermore, this work attempts to obtain a model that is not only highly accurate but also robust and stable over the future horizon when predicting new observations, something that has not yet been reported by previous works [7–15], but which some authors demand for the development of these models and their real contribution [9,12].
We make two main contributions to the literature. First, we consider new explanatory variables for modeling the Bitcoin price, testing the importance of variables which have not been considered so far. This has important implications for investors, who will know which indicators provide reliable, accurate, and potential forecasts of the Bitcoin price. Second, we improve the prediction accuracy relative to that obtained in previous studies, using innovative methodologies.
This study is structured as follows: Section 2 explains the theory of the methods applied. Section 3 offers details of the data and the variables used in this study. Section 4 develops the results obtained. Section 5 provides the conclusions of the study and the uses of the models obtained.
$y_t = o(W_{so}\, s_t + b_y)$ (2)

where $W_{xs}$, $W_{ss}$, and $W_{so}$ define the weights from the input layer $x$ to the hidden layer $s$ and onward to the output layer, and $b_y$ denotes the bias of the output layer; $\sigma$ and $o$ are the activation functions.
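To make the recursion concrete, the following is a minimal sketch of the recurrent update implied by the weight definitions above. The hidden-state equation (Eq. (1)) is not visible in the extracted text, so its standard form $s_t = \sigma(W_{xs}x_t + W_{ss}s_{t-1} + b_s)$ is an assumption here, and the tanh/identity activations and dimensions are illustrative choices only.

```python
import numpy as np

def rnn_step(x_t, s_prev, W_xs, W_ss, W_so, b_s, b_y):
    """One step of a simple recurrent cell in the spirit of Eqs. (1)-(2).

    s_t = sigma(W_xs @ x_t + W_ss @ s_prev + b_s)   # hidden state (assumed form)
    y_t = o(W_so @ s_t + b_y)                       # output, Eq. (2)
    """
    s_t = np.tanh(W_xs @ x_t + W_ss @ s_prev + b_s)  # sigma = tanh (illustrative)
    y_t = W_so @ s_t + b_y                           # o = identity (illustrative)
    return s_t, y_t

# Toy dimensions: 29 input factors, 16 hidden units, 1 output (the price).
rng = np.random.default_rng(0)
n_in, n_hid, n_out = 29, 16, 1
W_xs = rng.normal(scale=0.1, size=(n_hid, n_in))
W_ss = rng.normal(scale=0.1, size=(n_hid, n_hid))
W_so = rng.normal(scale=0.1, size=(n_out, n_hid))
b_s, b_y = np.zeros(n_hid), np.zeros(n_out)

s = np.zeros(n_hid)
for x in rng.normal(size=(5, n_in)):   # five dummy time steps
    s, y = rnn_step(x, s, W_xs, W_ss, W_so, b_s, b_y)
```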
$\mathrm{STFT}\{z(t)\}(\tau, \omega) \equiv T(\tau, \omega) = \int_{-\infty}^{+\infty} z(t)\, w(t - \tau)\, e^{-j\omega t}\, dt$ (3)

where $z(t)$ is the vibration signal, $w(t)$ is the Gaussian window function centered around 0, and $T(\tau, \omega)$ is the function that expresses the vibration signal. To calculate the hidden layers with the convolutional operation, Equations (4) and (5) are applied.
$Y_t = o(W_{YS} * S_t + B_y)$ (5)

where $W_h$ is the weight and $b_h$ is the bias. The model calculates the residuals caused by the difference between the predicted and the actual observations in the training stage [20]. Stochastic gradient descent is applied for optimization to learn the parameters. Considering that the data at time $t$ is $r$, the loss function is determined as shown in Equation (7):

$L(r, \hat{r}) = \frac{1}{2} \left\| r - \hat{r} \right\|_2^2$ (7)
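As a minimal illustration of the loss in Eq. (7) and the stochastic gradient descent optimization mentioned above, the sketch below reduces the network to a single linear layer purely for brevity; it is not the paper's DRCNN architecture, and the data are synthetic placeholders.

```python
import numpy as np

def loss(r, r_hat):
    """Squared-error loss of Eq. (7): L = 0.5 * ||r - r_hat||_2^2."""
    return 0.5 * np.sum((r - r_hat) ** 2)

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 8))            # dummy features
w_true = rng.normal(size=8)
r = X @ w_true + 0.1 * rng.normal(size=200)

w = np.zeros(8)
lr = 0.01
for epoch in range(50):
    for i in rng.permutation(200):       # stochastic gradient descent
        r_hat = X[i] @ w
        grad = -(r[i] - r_hat) * X[i]    # d/dw of 0.5 * (r - x.w)^2
        w -= lr * grad
```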
where $w$ is a constant with value $w = [1, 2, \ldots, n+1]$, $\tau > 0$ is a temperature factor, and $b$ is defined in Equation (9):

$b = [0,\; -\beta_1,\; -\beta_1 - \beta_2,\; \ldots,\; -\beta_1 - \beta_2 - \cdots - \beta_n]$ (9)

The coding of the binning function $x$ is given by the NN according to the expression of Equation (9) [24]. The key idea is to build the DT by applying the Kronecker product to the binning functions defined above. Connecting every feature $x_d$ with its NN $f_d(x_d)$, we can determine all the final nodes of the DT as in Equation (10):

$z = f_1(x_1) \otimes f_2(x_2) \otimes \cdots \otimes f_D(x_D)$ (10)

where $z$ expresses, in vector form, the leaf node index reached by instance $x$. The complexity of the model is determined by the number of cut points of each node. There may be inactive points, since the values of the cut points are usually not limited.
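The soft binning of Eq. (9) and the Kronecker construction of Eq. (10) can be sketched as follows. This is a minimal NumPy version following the deep neural decision tree formulation cited above; the cut points and the temperature value are illustrative assumptions.

```python
import numpy as np

def soft_bin(x, cut_points, tau=0.1):
    """Soft binning of a scalar feature: softmax((w*x + b) / tau) with
    w = [1, 2, ..., n+1] and b = [0, -b1, -b1-b2, ..., -b1-...-bn] (Eq. (9))."""
    beta = np.sort(cut_points)                       # n cut points
    n = len(beta)
    w = np.arange(1, n + 2)                          # [1, ..., n+1]
    b = np.concatenate(([0.0], -np.cumsum(beta)))    # Eq. (9)
    z = (w * x + b) / tau
    e = np.exp(z - z.max())
    return e / e.sum()                               # almost one-hot bin indicator

def leaf_vector(x, cuts_per_feature):
    """Eq. (10): Kronecker product of per-feature binning vectors."""
    z = np.array([1.0])
    for x_d, cuts in zip(x, cuts_per_feature):
        z = np.kron(z, soft_bin(x_d, cuts))
    return z                                         # indexes the DT leaf reached by x

z = leaf_vector([0.3, -1.2], [np.array([0.0]), np.array([-1.0, 1.0])])
print(z.argmax())   # index of the (soft) leaf node
```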
$\min_{w}\; \frac{1}{2} W^T W + C \sum_{n=1}^{N} \max\left(1 - W^T x_n t_n,\, 0\right)$ (12)
Usually, the Softmax or 1-of-K encoding method is applied in the classification task of deep learning algorithms. In the case of working with 10 classes, the Softmax layer is composed of 10 nodes expressed by $p_i$, where $i = 1, \ldots, 10$; $p_i$ specifies a discrete probability distribution, so that $\sum_{i=1}^{10} p_i = 1$.

Equation (13) defines $h$ as the activation of the penultimate layer nodes, $W$ as the weight linking the penultimate layer to the Softmax layer, and $a_i$ as the total input into the Softmax layer. The next expressions result:

$a_i = \sum_{k} h_k W_{ki}$ (13)

$p_i = \frac{\exp(a_i)}{\sum_{j=1}^{10} \exp(a_j)}$ (14)
Since the linear SVM is not differentiable, a popular variation, known as the DSVR, minimizes the squared hinge loss indicated in Equation (16):

$\min_{w}\; \frac{1}{2} W^T W + C \sum_{n=1}^{N} \max\left(1 - W^T x_n t_n,\, 0\right)^2$ (16)

The target of the DSVR is to train deep neural networks for prediction [24,25]. Equation (17) expresses the differentiation of the activation with respect to the penultimate layer, where $l(w)$ is said differentiation, changing the input $x$ for the activation $h$:

$\frac{\partial l(w)}{\partial h_n} = -C t_n w \left(\mathbb{I}\{1 > w^T h_n t_n\}\right)$ (17)

where $\mathbb{I}\{\cdot\}$ is the indicator function. Likewise, for the DSVR, we have Equation (18):

$\frac{\partial l(w)}{\partial h_n} = -2 C t_n w \left(\max\left(1 - W^T h_n t_n,\, 0\right)\right)$ (18)
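A minimal sketch of the squared hinge loss of Eq. (16) and the gradient of Eq. (18) with respect to the penultimate-layer activations follows; binary labels $t \in \{-1, +1\}$ are assumed, the data are synthetic placeholders, and the sign convention follows the reconstructed equations above.

```python
import numpy as np

def squared_hinge_loss(w, H, t, C=1.0):
    """Eq. (16): 0.5 * w'w + C * sum_n max(1 - w'h_n t_n, 0)^2."""
    margins = np.maximum(1.0 - (H @ w) * t, 0.0)
    return 0.5 * w @ w + C * np.sum(margins ** 2)

def grad_wrt_activations(w, H, t, C=1.0):
    """Eq. (18): dl/dh_n = -2*C*t_n*w * max(1 - w'h_n t_n, 0), row-wise."""
    margins = np.maximum(1.0 - (H @ w) * t, 0.0)
    return -2.0 * C * (t * margins)[:, None] * w[None, :]

rng = np.random.default_rng(2)
H = rng.normal(size=(10, 4))        # penultimate-layer activations h_n
t = rng.choice([-1.0, 1.0], 10)     # targets
w = rng.normal(size=4)
print(squared_hinge_loss(w, H, t), grad_wrt_activations(w, H, t).shape)
```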
Variables                 Description
(a) Demand and Supply
Transaction value         Value of daily transactions
Number of Bitcoins        Number of mined Bitcoins currently circulating on the network
Bitcoin addresses         Number of unique Bitcoin addresses used per day
Transaction volume        Number of transactions per day
Unspent transactions      Number of valid unspent transactions
Blockchain transactions   Number of transactions on the blockchain
Blockchain addresses      Number of unique addresses used in the blockchain
Block size                Average block size expressed in megabytes
Miners reward             Block rewards paid to miners
Mining commissions        Average transaction fees (in USD)
Cost per transaction      Miners' income divided by the number of transactions
Difficulty                Difficulty of mining a new blockchain block
Hash                      Times a hash function can be calculated per second
Halving                   Process of reducing the emission rate of new units
(b) Attractiveness
Forum posts               New posts in online Bitcoin forums
Forum members             Number of new members in online Bitcoin forums
(c) Macroeconomic and Financial
Texas oil                 Oil price (West Texas)
Brent oil                 Oil price (Brent, London)
Dollar exchange rate      Exchange rate between the US dollar and the euro
Dow Jones                 Dow Jones Index of the New York Stock Exchange
Gold                      Gold price in US dollars per troy ounce
The sample is fragmented into three mutually exclusive parts: one for training (70% of the data), one for validation (10% of the data), and a third group for testing (20% of the data). The training data are used to build the intended models, while the validation data are used to assess whether there is overtraining of those models. As for the test data, they serve to evaluate the built model and measure its predictive capacity. The percentage of correctly classified cases gives the accuracy results, and the RMSE measures the level of error made. Furthermore, for the distribution of the sample data in these three phases, 10-fold cross-validation with 500 iterations was used [28,29].
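The sampling scheme just described can be sketched as follows. This is a minimal outline with scikit-learn; the target variable and the omitted model-fitting step are placeholders, not the paper's exact configuration.

```python
import numpy as np
from sklearn.model_selection import train_test_split, KFold

X = np.random.default_rng(3).normal(size=(1000, 29))   # 29 explanatory factors
y = (X[:, 0] > 0).astype(int)                          # placeholder target

# 70% training, 10% validation, 20% testing.
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.30, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=2/3, random_state=0)

# 10-fold cross-validation over the training data.
for train_idx, val_idx in KFold(n_splits=10, shuffle=True, random_state=0).split(X_train):
    pass  # fit the model on X_train[train_idx], evaluate on X_train[val_idx]
```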
4. Results
variable. Despite these extremes, they do not affect the values of the standard deviations of the respective variables.
Table 3. Results of accuracy evaluation: classification (%).

            DRCNN                     DNDT                      DSVR
Sample      Acc. (%)  RMSE   MAPE     Acc. (%)  RMSE   MAPE     Acc. (%)  RMSE   MAPE
Training    97.34     0.66   0.29     95.86     0.70   0.33     94.49     0.75   0.38
Validation  96.18     0.71   0.34     95.07     0.74   0.37     93.18     0.81   0.43
Testing     95.27     0.77   0.40     94.42     0.79   0.42     92.61     0.84   0.47

DRCNN: deep recurrent convolution neural network; DNDT: deep neural decision trees; DSVR: deep learning linear support vector machines; Acc: accuracy; RMSE: root mean square error; MAPE: mean absolute percentage error.

Table 4. Results of accuracy evaluation: greater sensitivity variables.

DRCNN                  DNDT                      DSVR
Transaction value      Transaction value         Transaction volume
Transaction volume     Block size                Block size
Block size             Blockchain transactions   Blockchain transactions
Cost per transaction   Cost per transaction      Cost per transaction
Difficulty             Difficulty                Difficulty
Dollar exchange rate   Forum posts               Forum posts
Dow Jones              Dow Jones                 Dollar exchange rate
Gold                   Gold                      Dow Jones
                                                 Gold
Figure 1. Results of accuracy evaluation: classification (%).

Figure 2. Results of accuracy evaluation: RMSE.
Figure 3. Results of accuracy evaluation: MAPE.
moment t + 1, and this prediction is used to predict for moment t + 2, and so on. This means that the predicted data for t + 1 are considered real data and are added to the end of the available data [33].

Table 5 and Figures 4–6 show the accuracy and error results for the t + 1 and t + 2 forecasting horizons. For t + 1, the range of precision for the three methods is 88.34–94.19% on average, where the percentage of accuracy is higher in the DRCNN (94.19%). For t + 2, this range of precision is 85.76–91.37%, where the percentage of accuracy is once again higher in the DRCNN (91.37%). These results show the high precision and great robustness of the models.
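The multiple-step-ahead scheme described above, in which each t + 1 prediction is appended to the data and reused to forecast t + 2 [33], can be sketched as follows; the "model" here is a stand-in (a moving average), not any of the paper's methods.

```python
import numpy as np

def recursive_forecast(model_predict, history, steps=2):
    """Recursive multi-step forecasting: feed each prediction back as if
    it were an observed value, as in the t+1 / t+2 horizons above."""
    window = list(history)
    forecasts = []
    for _ in range(steps):
        y_next = model_predict(np.asarray(window))
        forecasts.append(y_next)
        window.append(y_next)        # treat the prediction as real data
    return forecasts

# Toy example: the "model" predicts the mean of the last 5 observations.
prices = [9100.0, 9150.0, 9230.0, 9180.0, 9260.0]
print(recursive_forecast(lambda w: float(w[-5:].mean()), prices, steps=2))
```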
Table 5. Multiple-step ahead forecasts in forecast horizon = t + 1 and t + 2.

          DRCNN                    DNDT                     DSVR
Horizon   Acc. (%)  RMSE  MAPE     Acc. (%)  RMSE  MAPE     Acc. (%)  RMSE  MAPE
t + 1     94.19     0.81  0.52     92.35     0.87  0.59     88.34     0.97  0.65
t + 2     91.37     0.92  0.63     89.41     1.03  0.67     85.76     1.10  0.78

Acc: accuracy.
Figure 4. Multiple-step ahead forecasts in forecast horizon: accuracy.

Figure 5. Multiple-step ahead forecasts in forecast horizon: RMSE.

Figure 6. Multiple-step ahead forecasts in forecast horizon: MAPE.
5. Conclusions

This study developed a comparison of methodologies to predict the Bitcoin price and, therefore, a new model was created to forecast this price. The period selected was from 2011 to 2019. We applied different deep learning methods in the construction of the Bitcoin price prediction model to achieve a robust model, namely deep recurrent convolutional neural networks, deep neural decision trees, and deep support vector machines. The DRCNN model obtained the highest levels of precision. We propose to increase the level of performance of the models to predict the price of Bitcoin compared to the previous literature. This research has shown significantly higher precision results than those shown in previous works, achieving a precision hit range of 92.61–95.27%. Likewise, it was possible to identify a new set of significant variables for the prediction of the price of Bitcoin, offering great stability in the models developed when predicting in the future horizons of one and two years.
This research allows us to extend the results and conclusions on the price of Bitcoin concerning previous works, both in matters of precision and error, but also on significant variables. A set of significant variables for each methodology applied has been selected by analyzing our results, and some of these variables are recurrent across the three methods. This supposes an important addition to the field
of cryptocurrency pricing. The conclusions are relevant to central bankers, investors, asset managers,
private forecasters, and business professionals for the cryptocurrencies market, who are generally
interested in knowing which indicators provide reliable, accurate, and potential forecasts of price
changes. Our study suggests new and significant explanatory variables to allow these agents to predict
the Bitcoin price phenomenon. These results have provided a new Bitcoin price forecasting model
developed using three methods, with the DRCNN model as the most accurate, thus contributing to
existing knowledge in the field of machine learning, and especially, deep learning. This new model
can be used as a reference for setting asset pricing and improved investment decision-making.
In summary, this study provides a significant opportunity to contribute to the field of finance, since the results obtained have significant implications for the future decisions of asset managers, making it possible to avoid large price-change events and the potential associated costs. It also helps these agents send warning signals to financial markets and avoid massive losses derived from an increase in price volatility.
Opportunities for further research in this field include developing predictive models that consider the volatility correlation of other new alternative assets, as well as safe-haven assets such as gold or stable currencies, and that evaluate the different scenarios of portfolio choice and optimization.
Author Contributions: Conceptualization, P.L.-F., D.A., P.L.-L. and M.A.F.-G.; Data curation, D.A. and M.A.F.-G.;
Formal analysis, P.L.-F., D.A. and P.L.-L.; Funding acquisition, P.L.-F., P.L.-L. and M.A.F.-G.; Investigation, D.A.
and M.A.F.-G.; Methodology, D.A.; Project administration, P.L.-F. and M.A.F.-G.; Resources, P.L.-F. and M.A.F.-G.;
Software, D.A.; Supervision, D.A.; Validation, D.A. and P.L.-L.; Visualization, P.L.-F. and D.A.; Writing—original
draft, P.L.-F. and D.A.; Writing—review & editing, P.L.-F., D.A., P.L.-L. and M.A.F.-G. All authors have read and
agreed to the published version of the manuscript.
Funding: This research was funded by Cátedra de Economía y Finanzas Sostenibles, University of Malaga, Spain.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Kristoufek, L. What Are the Main Drivers of the Bitcoin Price? Evidence from Wavelet Coherence Analysis.
PLoS ONE 2015, 10, e0123923. [CrossRef]
2. Wamba, S.F.; Kamdjoug, J.R.K.; Bawack, R.E.; Keogh, J.G. Bitcoin, Blockchain and Fintech: A systematic
review and case studies in the supply chain. Prod. Plan. Control Manag. Oper. 2019, 31, 115–142. [CrossRef]
3. Chen, W.; Zheng, Z.; Ma, M.; Wu, J.; Zhou, Y.; Yao, J. Dependence structure between bitcoin price and its
influence factors. Int. J. Comput. Sci. Eng. 2020, 21, 334–345. [CrossRef]
4. Balcilar, M.; Bouri, E.; Gupta, R.; Roubaud, D. Can volume predict bitcoin returns and volatility?
A quantiles-based approach. Econ. Model. 2017, 64, 74–81. [CrossRef]
5. Ciaian, P.; Rajcaniova, M.; Kancs, d'A. The economics of BitCoin price formation. Appl. Econ. 2016,
48, 1799–1815. [CrossRef]
6. Schmidt, R.; Möhring, M.; Glück, D.; Haerting, R.; Keller, B.; Reichstein, C. Benefits from Using Bitcoin:
Empirical Evidence from a European Country. Int. J. Serv. Sci. Manag. Eng. Technol. 2016, 7, 48–62. [CrossRef]
7. Velankar, S.; Valecha, S.; Maji, S. Bitcoin Price Prediction using Machine Learning. In Proceedings of the
20th International Conference on Advanced Communications Technology (ICACT), Chuncheon-si, Korea,
11–14 February 2018.
8. McNally, S.; Roche, J.; Caton, S. Predicting the Price of Bitcoin Using Machine Learning. In Proceedings
of the 26th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing,
Cambridge, UK, 21–23 March 2018.
9. Yogeshwaran, S.; Kaur, M.J.; Maheshwari, P. Project Based Learning: Predicting Bitcoin Prices using Deep
Learning. In Proceedings of the 2019 IEEE Global Engineering Education Conference (EDUCON), Dubai,
UAE, 9–11 April 2019.
10. Demir, A.; Akılotu, B.N.; Kadiroğlu, Z.; Şengür, A. Bitcoin Price Prediction Using Machine Learning Methods.
In Proceedings of the 2019 1st International Informatics and Software Engineering Conference (UBMYK),
Ankara, Turkey, 6–7 November 2019.
11. Rizwan, M.; Narejo, S.; Javed, M. Bitcoin price prediction using Deep Learning Algorithm. In Proceedings
of the 13th International Conference on Mathematics, Actuarial Science, Computer Science and Statistics
(MACS), Karachi, Pakistan, 14–15 December 2019.
12. Linardatos, P.; Kotsiantis, S. Bitcoin Price Prediction Combining Data and Text Mining. In Advances in
Integrations of Intelligent Methods. Smart Innovation, Systems and Technologies; Hatzilygeroudis, I., Perikos, I.,
Grivokostopoulou, F., Eds.; Springer: Singapore, 2020.
13. Felizardo, L.; Oliveira, R.; Del-Moral-Hernández, E.; Cozman, F. Comparative study of Bitcoin price prediction
using WaveNets, Recurrent Neural Networks and other Machine Learning Methods. In Proceedings of the
6th International Conference on Behavioral, Economic and Socio-Cultural Computing (BESC), Beijing, China,
28–30 October 2019.
14. Dutta, A.; Kumar, S.; Basu, M. A Gated Recurrent Unit Approach to Bitcoin Price Prediction. J. Risk
Financ. Manag. 2020, 13, 23. [CrossRef]
15. Ji, S.; Kim, J.; Im, H. A Comparative Study of Bitcoin Price Prediction Using Deep Learning. Mathematics
2019, 7, 898. [CrossRef]
16. Saltelli, A. Making best use of model evaluations to compute sensitivity indices. Comput. Phys. Commun.
2002, 145, 280–297. [CrossRef]
17. Wang, S.; Chen, X.; Tong, C.; Zhao, Z. Matching Synchrosqueezing Wavelet Transform and Application to
Aeroengine Vibration Monitoring. IEEE Trans. Instrum. Meas. 2017, 66, 360–372. [CrossRef]
18. Huang, C.-W.; Narayanan, S.S. Deep convolutional recurrent neural network with attention mechanism for
robust speech emotion recognition. In Proceedings of the 2017 IEEE International Conference on Multimedia
and Expo, Hong Kong, China, 10–14 July 2017; pp. 583–588.
19. Ran, X.; Xue, L.; Zhang, Y.; Liu, Z.; Sang, X.; Xe, J. Rock Classification from Field Image Patches Analyzed
Using a Deep Convolutional Neural Network. Mathematics 2019, 7, 755. [CrossRef]
20. Ma, M.; Mao, Z. Deep Recurrent Convolutional Neural Network for Remaining Useful Life Prediction.
In Proceedings of the 2019 IEEE International Conference on Prognostics and Health Management (ICPHM),
San Francisco, CA, USA, 17–20 June 2019; pp. 1–4.
21. Yang, Y.; Garcia-Morillo, I.; Hospedales, T.M. Deep Neural Decision Trees. In Proceedings of the 2018 ICML
Workshop on Human Interpretability in Machine Learning (WHI 2018), Stockholm, Sweden, 14 July 2018.
22. Norouzi, M.; Collins, M.D.; Johnson, M.; Fleet, D.J.; Kohli, P. Efficient non-greedy optimization of decision
trees. In Proceedings of the 28th International Conference on Neural Information Processing Systems,
Montreal, QC, Canada, 8–13 December 2015; pp. 1729–1737.
23. Dougherty, J.; Kohavi, R.; Sahami, M. Supervised and unsupervised discretization of continuous features.
In Proceedings of the 12th International Conference on Machine Learning (ICML), Tahoe City, CA, USA,
9–12 July 1995.
24. Jang, E.; Gu, S.; Poole, B. Categorical reparameterization with Gumbel-Softmax. arXiv 2017, arXiv:1611.01144.
25. Tang, Y. Deep Learning using Linear Support Vector Machines. arXiv 2013, arXiv:1306.0239.
26. Delen, D.; Kuzey, C.; Uyar, A. Measuring firm performance using financial ratios: A decision tree approach.
Expert Syst. Appl. 2013, 40, 3970–3983. [CrossRef]
27. Efimov, D.; Sulieman, H. Sobol Sensitivity: A Strategy for Feature Selection. In Mathematics Across
Contemporary Sciences. AUS-ICMS 2015; Springer Proceedings in Mathematics & Statistics: Cham, Switzerland,
2017; Volume 190.
28. Alaminos, D.; Fernández, S.M.; García, F.; Fernández, M.A. Data Mining for Municipal Financial Distress
Prediction, Advances in Data Mining, Applications and Theoretical Aspects. Lect. Notes Comput. Sci. 2018,
10933, 296–308.
29. Zhang, G.P.; Qi, M. Neural network forecasting for seasonal and trend time series. Eur. J. Oper. Res. 2005,
160, 501–514. [CrossRef]
30. Polasik, M.; Piotrowska, A.I.; Wisniewski, T.P.; Kotkowski, R.; Lightfoot, G. Price fluctuations and the use of
Bitcoin: An empirical inquiry. Int. J. Electron. Commer. 2015, 20, 9–49. [CrossRef]
31. Al-Khazali, O.; Bouri, E.; Roubaud, D. The impact of positive and negative macroeconomic news surprises:
Gold versus Bitcoin. Econ. Bull. 2018, 38, 373–382.
32. Koprinska, I.; Rana, M.; Rahman, A. Dynamic ensemble using previous and predicted future performance for
Multi-step-ahead solar power forecasting. In Proceedings of the ICANN 2019: Artificial Neural Networks
and Machine Learning, Munich, Germany, 17–19 September 2019; pp. 436–449.
33. Makridakis, S.; Spiliotis, E.; Assimakopoulos, V. Statistical and Machine Learning forecasting methods:
Concerns and ways forward. PLoS ONE 2018, 13, e0194889. [CrossRef] [PubMed]
Signal Processing 183 (2021) 107994

journal homepage: www.elsevier.com/locate/sigpro
Article history: Received 29 August 2020; Revised 17 December 2020; Accepted 16 January 2021; Available online 19 January 2021.

Keywords: Particle filters; Price diffusion; Price jumps; Stock price prediction; Flight price prediction

Abstract: In stock and flight price time series, diffusion and jumps govern price evolution over time. A jump-diffusion dyadic particle filter is proposed for price prediction. In stock price prediction, the dyad comprises a latent vector modeling each stock and a latent vector modeling the group of companies in the same category. In flight price prediction, the dyad consists of a departure latent vector and an arrival latent vector, respectively. A particle coefficient is introduced to encode both diffusion and jumps. The diffusion process is assumed to be a geometric Brownian motion whose dynamics are modeled by a Kalman filter. The negative log-likelihood of the posterior distribution is approximated by a Taylor expansion around the previously observed drift parameter. Efficient approximations of the first and second-order derivatives of the negative log-likelihood with respect to the previously observed drift parameter are derived. To infer sudden price jumps, a reversible jump Markov chain Monte-Carlo framework is used. Experiments have demonstrated that price jump and diffusion inference mechanisms lead to more accurate predictions compared to state-of-the-art techniques. Performance gains are attested to be statistically significant.

© 2021 Elsevier B.V. All rights reserved.
https://doi.org/10.1016/j.sigpro.2021.107994
portance resampling (SIR). To ensure particle variability, noise is added to the particles, which is known as roughening [16].

In this paper, a pair of latent vectors (i.e., a dyad) constructs a latent space. These latent vectors interact through time as in the Collaborative Kalman filter (CKF) [17]. In stock price prediction, a dyad is formed by a stock latent vector modeling the stock evolution and a market segment latent vector modeling the evolution of the market segment where the respective stock belongs to. For example, the stock latent vector models BP stock price evolution through time and the segment latent vector models the evolution of oil companies through time. In flight price prediction, a dyad is constructed by a departure latent vector, i.e., Amsterdam, and an arrival latent vector, i.e., Athens. This latent space is utilized in a dyadic PF (DPF) [18]. Instead of assuming these vectors fixed as in [18], here they are treated as dynamically evolving entities. Specifically, the stock latent vector is assumed to evolve through time according to a geometric Brownian motion log drift parameter. This parameter records the diffusion process. It is computed through a second-order Taylor expansion of the negative log-likelihood of the posterior distribution of the latent vectors about the previous value of the log drift parameter. The posterior distribution is obtained by Kalman filtering. A Newton-type update of the drift parameter is employed as in [17]. Extending [17], efficient approximations of the first and second-order derivatives of the negative log-likelihood with respect to (w.r.t.) the previous value of the log drift parameter are derived from first principles. The jump process is inferred through a reversible-jump Markov chain Monte Carlo (rjMCMC) scenario, since both the arrival rate of a jump and the times of a jump occurrence are unknown [19], [20]. Here, the jump arrivals follow a Poisson process, where the rate is locally constant within a window of 3 price observations. The model parameters, i.e., jump times and jump rate, are sampled from their full conditional distribution via a Gibbs sampler. Both diffusion and jump components are attached to the main diagonal of the posterior covariance matrix of the latent vectors at the previous time step, forming the prior covariance matrix at the current time step. Then, the product of the diagonal elements of the prior covariance matrices of the latent vectors of the dyad is computed. A particle coefficient vector is introduced, whose components encode the aforementioned product, yielding a jump-diffusion particle filter (JDPF). Particles are generated according to these particle coefficient vectors, which strongly influence their efficacy. As a result, a small number of particles leads to the optimal approximation of the posterior distribution within a Bayesian context. In contrast to [21], the tracking ability of JDPF does not rely on the assumption of a heavy-tailed distributed observation error variance, but it relies on the jump-diffusion scheme, leading to a totally different approximation of the posterior mean vectors, the posterior covariance matrices, and the particles. By doing so, the efficacy of the particle filter is strongly enhanced, as is demonstrated by experiments. Indeed, JDPF tracks closely both the volatility and price jumps through time. JDPF predictions are compared to those of state-of-the-art methods in [17], [20], [22], and [23] as well as our previous results [18] and [21]. In stock price prediction, JDPF outperforms all methods in all cases. In flight price prediction, JDPF performs best in those routes where jumps are present. Summing up, three novel contributions are made in the paper:

• A new analytical approximation is derived for the log drift parameter.
• Jumps are added as a concept to the particle filter, allowing it to track consistently the overall price evolution through time.
• Statistical hypothesis testing demonstrates that there is strong evidence to reject the null hypothesis of equal mean squared prediction error at the 5% significance level.

Table 1 summarizes the notation used throughout the paper. Section 2 reviews related work. Section 3.1 discusses the construction of the latent space of the dyadic particle filter. The diffusion and jump processes are elaborated in Sections 3.2 and 3.3, respectively. The particle coefficient vector generation is detailed in Section 3.4 and PF is described in Section 3.5. Experimental results are presented in Section 4 and conclusions are summarized in Section 5.

Table 1. Notation.

2. Related work

Jump-diffusion models have gained much attention in price time series forecasting. The Merton jump-diffusion model is a popular model to describe price dynamics. In [24], a Stochastic Grid Bundling Method is applied to a Merton jump-diffusion model for pricing multidimensional Bermudan options, which reduces the uncertainty of price estimates by control variates. Other works utilize radial basis function interpolation techniques in the Merton model to compute option prices, which allow the implementation of the boundary conditions efficiently [25,26]. In [27], a tridiagonal system of linear equations is proposed to solve a partial integro-differential equation in Merton and Kou jump-diffusion models. Some works combine the Markov-switching approach and Lévy processes to detect jumps and regime-switches in a price time series [28].

Many non-linear filters (e.g., Extended Kalman filter, Unscented Kalman filter) have been proposed to approximate the non-Gaussian dynamics governing price evolution through time. An extended Kalman filter is applied to estimate the Schwartz model and estimate the price of WTI crude oil [29]. An H-extended Kalman filter is proposed in [30] to bound the influence of uncertainties. Other works combine a square root Unscented Kalman filter and control theory to ensure positive semi-definiteness of the state covariances [31], [32]. PFs also deal with the non-linear dynamics in time series. In [33], an iterated auxiliary PF is proposed to approximate an optimal sequence of positive functions. In [34], a PF with non-Gaussian transition densities is proposed for high-frequency financial data estimation. Although these filters track price volatility through time, they fail to capture any sudden jumps.

3. The jump-diffusion particle filter

The jump-diffusion particle filter builds on the DPF [18]. Here, the dynamic evolution of the prior covariance matrix of the latent state vectors is governed by a jump-diffusion model.

3.1. The latent state of the dyadic particle filter

The latent state of the DPF is constructed through a probabilistic approach, by utilizing the KF. That is, the latent state consists of a dyad $(i, j)$ of two latent vectors $s_i[t] \in \mathbb{R}^n$ and $m_j[t] \in \mathbb{R}^n$, which interact through time [17]. At every time step, the prior probability distributions of these latent vectors are computed and, when an observation occurs, their posterior probability distributions should be inferred. The prior distributions of the latent vectors at time $t$ form the state model

$s_i[t] \sim \mathcal{N}\left(\mu_{s_i}[t], \Sigma_{s_i}[t]\right), \quad m_j[t] \sim \mathcal{N}\left(\mu_{m_j}[t], \Sigma_{m_j}[t]\right)$ (1)

which are multivariate Gaussians with prior mean vectors $\mu_{s_i}, \mu_{m_j} \in \mathbb{R}^n$ and prior covariance matrices $\Sigma_{s_i}, \Sigma_{m_j} \in \mathbb{R}^{n \times n}$. In Eq. (1), instead of assuming that the prior covariance matrix $\Sigma_{s_i} \in \mathbb{R}^{n \times n}$ is the posterior one at $t - 1$, i.e., $\Sigma_{s_i}[t] = \Sigma'_{s_i}[t-1]$ as in [18], the prior covariance matrix is allowed to evolve according to a jump-diffusion framework. Specifically, the
prior mean vector and the prior covariance matrix are defined as

$\mu_{s_i}[t] = \mu'_{s_i}[t-1], \quad \Sigma_{s_i}[t] = \Sigma'_{s_i}[t-1] + \left(\alpha_{s_i}[t] + J[t]\right) I$ (2)

where $\mu'[t-1]$ denotes the posterior mean vector at the previous time step, $\Sigma'[t-1]$ is the posterior covariance matrix, and $I \in \mathbb{R}^{n \times n}$ is the identity matrix. In Eq. (2), $\alpha_{s_i}[t]$ is the drift value of the Brownian motion of $s_i$ and $J[t]$ is the jump value at time $t$. These entities represent the diffusion and jump components of JDPF. Let the state equation be $s_i[t] = s'_i[t - \Delta[t]] + g[t]$. Assuming that the posterior latent vector at $t - \Delta[t]$ and the noise vector $g$ are Gaussian, the expression for the covariance matrix of the prior latent vector
at $t$ in Eq. (2) results from

$\Sigma_{s_i}[t] = \Sigma'_{s_i}[t - \Delta[t]] + \left(\int_{t-\Delta[t]}^{t} e^{a[\tilde{t}]}\, d\tilde{t} + J[t]\right) I$ (3)

where the stochastic integral is associated to geometric Brownian motion. Since typically the paths of a Brownian motion are nowhere differentiable, a solution can be provided by extending the differential calculus to Itô calculus. By doing so, the stochastic integral of Brownian motion is approximated as $\int_{t-\Delta[t]}^{t} e^{a[\tilde{t}]}\, d\tilde{t} \approx e^{a[t]} \Delta[t] = \alpha[t]\, \Delta[t]$, with $\Delta[t]$ denoting the number of time steps that passed since the last observation of $s_i$ (here, $\Delta[t] = 1$) [17]. Accordingly, the main diagonal of the prior covariance matrix of $s_i[t]$ encodes the jump-diffusion model to be exploited by the particle coefficient vector to improve price estimation. The drift is responsible for the constant motion of $s_i$, capturing the price volatility through time, while the jump component captures discontinuities, i.e., price jumps. Eq. (2) is also applied to compute the parameters of the prior distribution of the latent vector $m_j[t]$, with the difference that there is no drift component. The evolution of $m_j[t]$ relies on the motion of $s_i[t]$, a procedure accomplished through the posterior probability density function (pdf) calculation, which will be discussed next.

The posterior pdfs of the latent vectors, taking into account the price observation $y[t]$, are also multivariate Gaussians

$s'_i[t] \sim \mathcal{N}\left(\mu'_{s_i}[t], \Sigma'_{s_i}[t]\right), \quad m'_j[t] \sim \mathcal{N}\left(\mu'_{m_j}[t], \Sigma'_{m_j}[t]\right)$ (4)

with $\mu'$ and $\Sigma'$ denoting the posterior mean vectors and the posterior covariance matrices, respectively. Since the calculation of the posterior mean vector and covariance matrix is an intractable problem, a variational inference technique is utilized to obtain approximate solutions [35]. That is, a factorized distribution is assumed to approximate the true posterior distribution $p\left(s_i[t], m_j[t]\right)$ at time $t$, i.e.,

$q\left(s_i[t], m_j[t]\right) \approx q\left(s_i[t]\right) q\left(m_j[t]\right)$ (5)

where $q(\cdot)$ is the approximate distribution. The price observation at time $t$ is normally distributed with mean value equal to the expectation of the inner product of $s_i[t]$ and $m_j[t]$ w.r.t. the approximate posterior distribution and fixed variance $\sigma^2_{ij}$, so that

with $y[t-1]$ being the price at the previous time step. The optimal parameter updates for the approximate distribution of $m_j$ are obtained in a similar manner. As can be seen in Eq. (8), both the posterior mean vector $\mu'_{s_i}[t]$ and the posterior covariance matrix $\Sigma'_{s_i}[t]$ contain information about $m_j[t]$. In the corresponding computation of $\mu'_{m_j}[t]$ and $\Sigma'_{m_j}[t]$, the dynamically evolving $s_i[t]$ drifts $m_j$ in its path in the joint latent space. The posterior parameters at time $t$ constitute the prior parameters at time $t + 1$. Next, the drift value $\alpha_{s_i}[t]$ in Eq. (2) is calculated.

3.2. Diffusion process

The diffusion dynamics of the model lie in the conditional pdf of the prior latent vector $s_i[t]$ given the posterior latent vector at the previous time step

$p\left(s_i[t] \mid s'_i[t-1]\right) \sim \mathcal{N}\left(\mu'_{s_i}[t-1],\, \Sigma'_{s_i}[t-1] + \left(\alpha_{s_i}[t] + J[t]\right) I\right)$ (9)

where $\alpha$ is a geometric Brownian motion, $\alpha_{s_i}[t] = e^{a_{s_i}[t]}$, with $a_{s_i}[t]$ representing the Brownian log drift value distributed as

$a_{s_i}[t] \sim \mathcal{N}\left(a_{s_i}[t-1],\, \epsilon\, \Delta_a[t]\right)$ (10)

where $\Delta_a[t]$ is the time elapsed since the last observation of $a_{s_i}$, assuming values equal to 1 or 2 (days), and $\epsilon$ is an extra drift parameter [17]. The geometric Brownian motion is also inferred at every time step via variational inference. That is, to compute the Brownian log drift value, a second-order Taylor expansion about its previous occurrence value at $t - 1$, $a_{s_i}[t-1]$, is applied [17]:

$f\left(a_{s_i}[t]\right) \approx f\left(a_{s_i}[t-1]\right) + \left(a_{s_i}[t] - a_{s_i}[t-1]\right) f'\left(a_{s_i}[t-1]\right) + \frac{1}{2}\left(a_{s_i}[t] - a_{s_i}[t-1]\right)^2 f''\left(a_{s_i}[t-1]\right)$ (11)

where $f(\cdot) = -\ln p\left(s_i[t], a_{s_i}[t]\right)$ is the negative log-likelihood and $f'(\cdot)$ and $f''(\cdot)$ denote its first-order and second-order derivatives w.r.t. $a_{s_i}[t-1]$, respectively. The optimal solution w.r.t. $a_{s_i}[t]$ is obtained by solving $f'\left(a_{s_i}[t]\right) = \frac{d f\left(a_{s_i}[t]\right)}{d a_{s_i}[t]} = 0$, i.e., given by:
Table 2. First and second-order derivatives here and in [17].

Here:
$f'\left(a_{s_i}[t-1]\right) = -\frac{1}{\epsilon \Delta_a[t]}\left(a_{s_i}[t] - a_{s_i}[t-1]\right) + \frac{1}{2}\sum_n x_n\left[1 - \left(1 + \frac{b_n^2}{M_{nn} - \tilde{\lambda}_n}\right)\frac{2 M_{nn}}{\tilde{\lambda}_n}\right]$

$f''\left(a_{s_i}[t-1]\right) = \frac{1}{\epsilon \Delta_a[t]} + \frac{1}{2}\sum_n x_n (1 - x_n) + \frac{1}{2}\sum_n x_n\left[1 - 2(1 - x_n)\frac{M_{nn}}{M_{nn} - \tilde{\lambda}_n}\right]\frac{b_n^2}{M_{nn} - \tilde{\lambda}_n} + \sum_n x_n \left(\frac{b_n M_{nn}}{M_{nn} - \tilde{\lambda}_n}\right)^2 \left(1 - x_n \frac{M_{nn}}{M_{nn} - \tilde{\lambda}_n}\right)$

In [17]:
$f'\left(a_{s_i}[t-1]\right) = -\frac{1}{\epsilon \Delta_a[t]}\left(a_{s_i}[t] - a_{s_i}[t-1]\right) - \frac{1}{2}\sum_n x_n\left[1 - \frac{\tilde{b}_n^2 + M_{nn}}{\tilde{\lambda}_n}\right]$

$f''\left(a_{s_i}[t-1]\right) = -\frac{1}{\epsilon \Delta_a[t]} - \frac{1}{2}\sum_n x_n (1 - x_n) + \frac{1}{2}\sum_n x_n (1 - 2 x_n)\frac{\tilde{b}_n^2 + M_{nn}}{\tilde{\lambda}_n}$
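A minimal sketch of the Newton-type drift update implied by Eq. (11) follows: setting the derivative of the second-order Taylor expansion to zero yields the generic step a[t] = a[t-1] - f'(a[t-1]) / f''(a[t-1]). The closed forms of f' and f'' are those of Table 2, abbreviated here to user-supplied callables, so this illustrates the update rule rather than the paper's full computation; the quadratic toy likelihood is an assumption.

```python
def newton_drift_update(a_prev, f_prime, f_double_prime):
    """One Newton-type update of the Brownian log drift value a_{s_i}[t],
    obtained by zeroing the derivative of the Taylor expansion in Eq. (11)."""
    return a_prev - f_prime(a_prev) / f_double_prime(a_prev)

# Toy negative log-likelihood f(a) = (a + 6.5)**2 with its minimum at -6.5,
# mimicking the initialization a_{s_i}[t0] = -6.5 used in Section 4.1.
a = -6.0
a = newton_drift_update(a, lambda x: 2 * (x + 6.5), lambda x: 2.0)
print(a)   # -6.5 after a single step on this quadratic toy
```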
$\xi[t] = \xi[t-1] + \sum_{t \in H} y[t]$ (27)

where $|H| = 3$ is a window containing 3 price observations, which shifts over the price time series. This window size provides the optimal trade-off between computational complexity and prediction performance. The prior distributions of the parameters $p\left(\theta_l[t] \mid \theta_l^{-}[t]\right)$ are assumed to be Gamma distributions [20].

3.4. Particle coefficient vector

PF is applied to estimate the distribution in Eq. (6) at time $t$. Let $c^{(r)}[t] = s^{(r)}[t] * m^{(r)}[t]$, where $*$ is the Hadamard product [21]. The particles $o^{(r)} \in \mathbb{R}^n$ are generated according to

$o^{(r)}[t] = o^{(r)}[t-1] + c^{(r)}[t-1], \quad r = 1, \ldots, d$ (28)

Let $\sigma^2_{s_i,r}[t]$ denote the $r$th diagonal element of the prior covariance matrix $\Sigma_{s_i}[t] \approx \mathrm{diag}\left(\sigma^2_{s_i,1}[t], \sigma^2_{s_i,2}[t], \ldots, \sigma^2_{s_i,n}[t]\right)$. Similarly, let $\sigma^2_{m_j,r}[t]$ denote the $r$th diagonal element of the prior covariance matrix $\Sigma_{m_j}[t]$. The elements of the coefficient vector $c^{(r)}[t] \in \mathbb{R}^n$ follow a Gaussian distribution $\mathcal{N}\left(0, \sigma^2_{s_i,l}[t]\, \sigma^2_{m_j,l}[t]\right)$ [21]. Specifically, they are computed through:

$c^{(r)}_l[t] = \sigma_{s_i,l}[t] \cdot \sigma_{m_j,l}[t] \cdot u^{(r)}[t], \quad l = 1, 2, \ldots, n$ (29)

with $u^{(r)}[t] \sim \mathcal{N}(0, 1)$.

The prior covariance matrix $\Sigma_{s_i}[t]$ of the latent vector $s_i[t]$ contains the diffusion and jump modeling factors $\alpha_{s_i}[t]$ and $J[t]$, respectively. The particle coefficient generation $c^{(r)}_l[t]$ relies also on $\Sigma_{s_i}[t]$. As a result, the dynamic information regarding diffusion and jump modelling is embedded into the particle coefficients through PF.
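The particle coefficient generation of Eqs. (28)-(29) can be sketched as follows, assuming diagonal prior covariances whose entries already include the drift and jump terms of Eq. (2); the variance values are illustrative placeholders.

```python
import numpy as np

rng = np.random.default_rng(4)
n, d = 5, 20                       # latent dimension and number of particles

var_s = np.full(n, 0.02)           # diag of Sigma_{s_i}[t] (includes alpha + J terms)
var_m = np.full(n, 0.01)           # diag of Sigma_{m_j}[t]

# Eq. (29): c_l = sigma_{s,l} * sigma_{m,l} * u, with u ~ N(0, 1).
u = rng.standard_normal((d, n))
c = np.sqrt(var_s) * np.sqrt(var_m) * u

# Eq. (28): particles evolve by adding the coefficient vector.
o_prev = rng.standard_normal((d, n))
o = o_prev + c
```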
3.5. Particle filtering

At time $t$ a predefined number of particles $d$ is sampled from an importance distribution [43]. Here, the prior distribution $p\left(o^{(r)}[t] \mid o^{(r)}[t-1]\right)$ is the importance distribution. This is the SIS step. The prior covariance matrices $\Sigma_{s_i}[t]$, $\Sigma_{m_j}[t]$ are utilized to compute the coefficient vector $c^{(r)}[t]$, as explained in Section 3.4. Particle generation relies on Eq. (28). The observation model of JDPF is given by:

$y^{(r)}[t] = o^{(r)}[t] + c^{(r)}[t], \quad r = 1, \ldots, d$ (30)

where $y^{(r)}[t] \in \mathbb{R}^n$, $r = 1, \ldots, d$ are observation vectors. This is the update step in PF. In order to eliminate degeneracy, systematic resampling is applied on the particles [44]. For this reason, $d$ uniform random variables are generated

$\rho^{(r)} = \frac{(r-1) + \tilde{\rho}}{d}, \quad \tilde{\rho} \sim \mathcal{U}(0, 1), \quad r = 1, \ldots, d$ (31)

where $\rho^{(r)} \in \left[\frac{r-1}{d}, \frac{r}{d}\right)$. These $d$ random variables can be considered as a "comb" of $d$ regularly spaced points [45]. Then, particle weights are drawn from a Gaussian distribution, i.e.,

$\delta^{(r)}[t] \sim \mathcal{N}\left(y[t-1], v^{(r)}[t]\right), \quad r = 1, \ldots, d$ (32)

with $v^{(r)}[t]$ denoting a truncated Gaussian random variable distributed as $v^{(r)}[t] \sim \mathcal{TN}\left(\bar{y}^{(r)}, \sigma^2_\delta\right)$, where

$\bar{y}^{(r)}[t] = \frac{1}{n} \mathbf{1}^T_{n \times 1}\, y^{(r)}[t]$ (33)

is the average of the components of the $r$th observation vector defined in Eq. (30), and $\sigma^2_\delta \sim \mathcal{TN}(0, 0.03)$. Let us introduce $\psi^{(r)} = \exp\left(-\delta^{(r)} / \max\{|\delta^{(l)}|\}\right)$, which undergoes normalization yielding $\tilde{\psi}^{(r)} = \psi^{(r)} / \sum_{l=1}^{d} \psi^{(l)}$. The threshold of the systematic resampling is $\phi = \sum_{r=1}^{d} \tilde{\psi}^{(r)}$. If $\rho^{(r)} < \phi$, a new observation vector arises according to Eq. (30). Otherwise, the current observation vector $y^{(r)}[t]$ is utilized. Systematic resampling is applied iteratively, within $\mathcal{L}$ resampling loops. This process ensures that only the particles with significant weights are maintained.

Gaussian noise is attached to the observation vectors of Eq. (30) to deal with impoverishment. This mechanism, known as roughening, ensures their variability and it is conducted according to:

$y^{(r)}[t] \leftarrow y^{(r)}[t] + \kappa[t]$ (34)

where $\kappa[t] \sim \mathcal{N}\left(0, \sigma^2_\kappa I\right)$. Eventually, the price estimation at time step $t$ is derived through the computation of the average across the $d$ observation vectors $y^{(r)}[t]$ and their $n$ components at every resampling loop, i.e.,

$\hat{y}[t] = \frac{1}{n \cdot d} \sum_{l=1}^{n} \sum_{r=1}^{d} y^{(r)}_l[t].$ (35)

4. Experimental results

4.1. Stock price prediction

Experiments are conducted on the opening prices of Pepsi, Coca-Cola, Pfizer, Roche, Novartis, Shell, BP and Posco stocks in order to compare the prediction performance with that of [17], [18], [21] as well as [20] and [23], which also model jumps and diffusion in financial time series. A time series is constructed for each stock containing the first 1,000 daily historical prices from 1961 to 2003, which are selected as being the most erratic ones. Augmented Dickey-Fuller (ADF) tests are applied to these time series in order to check stationarity [46]. The null hypothesis H0 assumes the presence of a unit root in the time series, meaning that the series is non-stationary. H0 is rejected if the ADF statistic is less than the critical values $\beta_{1\%} = -3.437$, $\beta_{5\%} = -2.864$, and $\beta_{10\%} = -2.568$ for a significance level of 99%, 95%, or 90%, respectively. From the results of the test gathered in Table 3, it is attested that there is non-significant evidence against H0 at any level of significance. As a consequence, all stock price time series are non-stationary.

Table 3. ADF test results for the stock price time series.

Stock        ADF Statistic    Non-Stationary
BP           -1.055574        yes
Shell        -1.095411        yes
Coca-Cola    -0.442442        yes
Pepsi        -1.530511        yes
Pfizer       -1.554863        yes
Novartis     -2.414171        yes
Roche        -1.640908        yes
Posco        -1.637748        yes

JDPF is an on-line system and predicts the opening stock prices of the next day based on the current price. The latent vector dimension is $n = 5$, since prediction performance has not been enhanced further with $n = 10$, 15, and 30. Multiple trials showed that the optimal number of particles is $d = 20$. The resampling loops are $\mathcal{L} = 100$, the variance of the random vector $\kappa[t]$ used in roughening is $\sigma^2_\kappa = 0.01$, the extra drift parameter is $\epsilon = 8 \times 10^{-3}$, and the initial drift value of the Brownian motion of $s_i$ is $a_{s_i}[t_0] = -6.5$, in order to be compatible with [21]. The variance of the jump size is $\sigma^2_J = 0.05$. The initial parameters of the Gamma distribution of jump
arrivals $\gamma_J$ are set to $\zeta[t_0] = 1$, $\xi[t_0] = 10$. The length of the Markov chain is set to $L_T = 20$. The probabilities of acceptance are $P_b = 0.8$, $P_d = 0.1$, and $P_m = 0.1$, as in [20].
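Using parameter values like those listed above, the systematic resampling and roughening steps of Section 3.5 (Eqs. (31)-(35)) can be sketched as follows. The model-specific generation of the weights is simplified to a Gaussian draw around the previous price, and the threshold comparison is implemented in the standard cumulative form of systematic resampling; both simplifications are assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(5)
n, d, sigma_kappa2, loops = 5, 20, 0.01, 100
y_prev = 100.0
Y = y_prev + rng.standard_normal((d, n))        # observation vectors y^{(r)}[t]

for _ in range(loops):                          # L = 100 resampling loops
    rho = (np.arange(d) + rng.uniform()) / d    # Eq. (31): the resampling "comb"
    delta = rng.normal(y_prev, 1.0, size=d)     # Eq. (32), variance simplified here
    psi = np.exp(-delta / np.max(np.abs(delta)))
    psi /= psi.sum()                            # normalized weights (psi-tilde)
    idx = np.searchsorted(np.cumsum(psi), rho)  # cumulative systematic resampling
    Y = Y[np.clip(idx, 0, d - 1)]
    Y = Y + rng.normal(0.0, np.sqrt(sigma_kappa2), Y.shape)  # Eq. (34): roughening

y_hat = Y.mean()                                # Eq. (35): average over r and l
print(y_hat)
```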
The historical daily prices are illustrated in Figs. 1, 3, 5, 7, 9,
11, and 13 in black color. Figs. 2, 4, 6, 8, 10, 12, and 14 depict the
respective predictions in red color.
Figs. 15 and 16 illustrate a zoom into 180 samples of the Novartis and Shell stock price time series, respectively. The historical prices are those in black color and the predicted prices are those in red color. Clearly, JDPF conducts remarkable price predictions. The plots of predicted and historical prices are considerably similar. The evaluation of the prediction performance is conducted w.r.t. the root-mean-square error (RMSE) in USD. The results are summarized in Table 4. The first column indicates the stock, the second column refers to the performance of the proposed JDPF, the third column shows the performance of the DDPF [21], the fourth column represents the performance of the DPF [18], the fifth column summarizes the performance of the Collaborative Kalman Filter [17], the sixth column refers to [20], the seventh column refers to [23], and the price range of each stock in USD is summarized in the eighth column.

Fig. 5. Real Pfizer prices.
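The stationarity pre-check and the RMSE evaluation used in this subsection can be reproduced along these lines; statsmodels' adfuller returns the ADF statistic as its first element, and the synthetic random-walk series and placeholder predictions stand in for the historical prices and model output.

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(6)
prices = 100 + np.cumsum(rng.normal(0, 1, 1000))     # random-walk stand-in series

adf_stat = adfuller(prices)[0]                       # ADF test statistic
print(f"ADF statistic: {adf_stat:.4f}")              # > -2.864 => H0 not rejected at 5%

predictions = prices + rng.normal(0, 0.5, prices.shape)   # placeholder predictions
rmse = np.sqrt(np.mean((predictions - prices) ** 2))      # RMSE in USD
print(f"RMSE: {rmse:.4f}")
```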
Table 4. Stock price prediction performance.
Fig. 12. Shell predictions.

Fig. 15. Zoom in 180 Novartis predicted and real prices.
Table 5. F-test JDPF against DDPF, DPF, CKF, [20], [23] (N = 1,000, F_{beta/2} = 0.9011, F_{1-beta/2} = 1.1097, beta = 5%).

Stock       F12      F13      F14      F15      F16
BP          0.5601   0.3021   0.0008   0.4327   0.4082
Coca-Cola   0.6639   0.3947   0.0235   0.5024   0.5380
Novartis    0.6842   0.5601   0.0054   0.6099   0.6991
Pfizer      0.2998   0.1093   0.0014   0.4986   0.4894
Pepsi       0.7230   0.4922   0.0109   0.5928   0.6013
Posco       0.6987   0.2191   0.0015   0.4651   0.4429
Shell       0.5999   0.1922   0.0145   0.5767   0.5806
Roche       0.6063   0.4005   0.0081   0.5982   0.6102

Table 6. JDPF-LSTM comparison of the prediction performance (N = 1,000, F0.9775 = 0.9011, F0.025 = 1.1097).

Stock       LSTM train/test RMSE [22]   JDPF test RMSE (USD)   F-test JDPF-LSTM F17
BP          0.91/1.25                   0.3091                 0.4003
Coca-Cola   1.38/2.09                   0.7998                 0.04521
Novartis    1.2/4.75                    0.7006                 0.0399
Pfizer      1.49/1.25                   0.2192                 0.3095
Pepsi       1.31/2.78                   0.6711                 0.0113
Posco       2.94/2.63                   0.9153                 0.0106
Shell       1.92/3.07                   0.9041                 0.0981
Roche       0.78/1.81                   0.5069                 0.4432
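The F-tests of Tables 5 and 6 compare the mean squared prediction errors of two methods as a variance-style ratio; a minimal sketch follows. The two-sided critical values for N = 1,000 are taken from the table captions, while the error vectors are synthetic placeholders.

```python
import numpy as np

def f_ratio(errors_a, errors_b):
    """Ratio of mean squared prediction errors; under the null hypothesis of
    equal MSE it follows an F distribution with (N, N) degrees of freedom."""
    return np.mean(np.square(errors_a)) / np.mean(np.square(errors_b))

F_LO, F_HI = 0.9011, 1.1097     # two-sided 5% critical values for N = 1,000

rng = np.random.default_rng(7)
e_jdpf = rng.normal(0, 0.3, 1000)    # placeholder prediction errors (JDPF)
e_other = rng.normal(0, 0.5, 1000)   # placeholder prediction errors (competitor)

F = f_ratio(e_jdpf, e_other)
print(F, "reject equal-MSE null" if (F < F_LO or F > F_HI) else "fail to reject")
```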
Table 7. Flight ticket price prediction performance.
Amsterdam Athens 185.37 / 83.67 48.5826 47.4356 49.1134 51.0031 47.2006 64 - 646
Amsterdam Thessaloniki 128.83/58.82 56.7311 28.1487 32.8597 31.4864 27.6051 109 - 711
Berlin Athens 57.24/56.76 16.8384 12.8372 15.0561 16.0118 8.1201 59 - 258
Berlin Thessaloniki 10.34/9.85 20.4419 2.1951 24.3284 22.1639 2.2143 89 - 227
Brussels Athens 111.90/182.42 41.0444 40.1591 48.6628 50.1339 38.9566 88 - 638
Brussels Thessaloniki 95.18/79.19 91.9626 42.6099 39.7812 41.6271 20.1681 120 - 1358
Eindhoven Athens 60.37/25.77 14.446 12.8718 13.1991 13.9932 11.9272 163 - 431
Eindhoven Thessaloniki 233.44/67.68 91.9626 50.5403 54.4557 53.1087 41.9925 120 - 1358
Frankfurt Athens 107.84/361.45 43.3365 42.3067 51.2522 52.1939 41.1193 77 - 658
Frankfurt Thessaloniki 123.08/43.85 115.2531 33.5208 130.3652 100.9986 40.4662 198 - 977
Geneva Athens 86.03/42.55 25.4913 23.0955 24.3644 26.1724 19.9116 53 - 419
Geneva Thessaloniki 14.06/75.41 8.7674 4.5604 6.3966 7.8013 4.0199 133 - 239
London Athens 48.83/157.92 25.0637 22.6694 31.0811 32.1791 22.6546 65 - 393
London Thessaloniki 31.11/11.32 10.84 9.5415 12.7893 14.9965 4.7865 211 - 313
Milan Athens 64.84/75.16 20.7424 17.6877 19.9322 22.1006 17.3024 59 - 323
Milan Thessaloniki 13.80/4.50 7.7521 3.6736 9.0446 12.8012 2.0751 156 - 197
Munich Athens 183.32/438.02 81.3257 80.645 90.8981 93.3648 76.9139 76 - 1250
Munich Thessaloniki 113.32 /63.40 55.1415 24.9361 26.3841 30.0424 22.1319 49 - 612
Paris Athens 71.82/151.05 26.209 24.4464 25.5191 29.2012 19.3779 70 - 508
Paris Thessaloniki 84.15/28.13 23.6168 16.4376 20.2914 21.1986 10.3622 194 - 444
Prague Athens 225.88/39.81 55.6095 54.9637 53.0887 47.0748 49.1495 90 - 1104
Prague Thessaloniki 12.20/17.63 8.3879 2.8646 3.0776 3.5901 2.0197 95 - 151
Stockholm Athens 64.40/63.42 18.4782 16.2179 22.6697 20.3299 12.7036 44 - 296
Stockholm Thessaloniki 9.95/17.26 22.4197 4.8552 36.1517 38.0429 4.8562 79 - 287
Zurich Athens 94.79/63.79 28.8506 26.8774 25.0998 27.9432 20.6916 60 - 466
Zurich Thessaloniki 40.31/64.89 15.4131 10.1105 12.1726 14.1071 6.3143 180 - 332
Table 8. F-tests JDPF against LSTM, DPF, DDPF, [20], and [23].
On routes where no large price jump exists given the price range of the route, the JDPF-predicted prices, which include jumps, lost track of the price. On the contrary, when large price jumps were present w.r.t. the mean flight price, as in the routes Brussels-Thessaloniki, London-Thessaloniki, and Zurich-Thessaloniki, JDPF performs best. The prices of the route Brussels-Thessaloniki are plotted in Fig. 18; there, the JDPF RMSE is almost half of that delivered by the other methods. The combination of jumps and drift enhanced the robustness of the particle coefficients. As a result, more accurate flight price predictions are disclosed.

5. Conclusion

A jump-diffusion particle filter (JDPF) has been proposed. Both the Brownian-motion log-drift parameter and the jump parameter enable the latent vectors to efficiently capture price volatility as well as most of the large price jumps. The particle coefficients constitute the carriers of this dynamic information, since they are drawn from the latent-vector prior distributions. As a consequence, a small number of particles, generated w.r.t. these coefficients, can efficiently approximate the posterior distributions and ensure an effective price prediction performance. In stock price prediction, JDPF outperforms the state-of-the-art methods. In flight price prediction, JDPF performs best whenever the time series exhibits price jumps.
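To make this mechanism concrete, the sketch below propagates particles of a scalar latent price state under an exponentiated log-drift diffusion plus compound-Poisson jumps, weights them against an observed price, and resamples. It is a minimal, generic jump-diffusion particle-filter step under our own parameter names and a Gaussian observation model, not the paper's exact JDPF.

    import numpy as np

    rng = np.random.default_rng(1)

    def propagate(s, a, sigma, jump_rate, jump_scale):
        """One jump-diffusion step for an array s of P particles.

        a: log-drift parameter (e^a scales the diffusion variance),
        sigma: diffusion scale, jump_rate: Poisson jump intensity per step,
        jump_scale: std of a single Gaussian jump.
        """
        diffusion = np.sqrt(np.exp(a)) * sigma * rng.standard_normal(s.shape)
        n_jumps = rng.poisson(jump_rate, size=s.shape)
        # A sum of n_jumps i.i.d. Gaussian jumps has variance n_jumps * jump_scale^2.
        jumps = rng.normal(0.0, jump_scale, size=s.shape) * np.sqrt(n_jumps)
        return s + diffusion + jumps

    def resample(s, w):
        """Multinomial resampling of particles according to their weights."""
        return s[rng.choice(len(s), size=len(s), p=w / w.sum())]

    # One filtering step: propagate, weight by the price likelihood, resample.
    particles = rng.normal(100.0, 1.0, size=500)
    particles = propagate(particles, a=-2.0, sigma=1.0, jump_rate=0.1, jump_scale=5.0)
    observed_price = 101.3
    weights = np.exp(-0.5 * (observed_price - particles) ** 2)  # Gaussian likelihood
    particles = resample(particles, weights)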
Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Appendix A. Appendix

In Eq. (11), f(t) is the negative log-likelihood, i.e.,

f(t) = -\ln p(s_i[t], a_{s_i}[t]) = \underbrace{-\ln p(s_i[t] \mid a_{s_i}[t-1])}_{f_I(\cdot)} \; \underbrace{-\,\ln p(a_{s_i}[t])}_{f_{II}(\cdot)}. \quad (A.1)

Let us omit the subscript s_i from \sigma_a for notational simplicity. The second term f_{II}(\cdot) in Eq. (A.1) can be rewritten as

-\ln p(a_{s_i}[t]) = -\ln\Big[ \frac{1}{\sqrt{2\pi\sigma_a^2[t]}} \exp\Big\{ -\frac{(a_{s_i}[t]-a_{s_i}[t-1])^2}{2\sigma_a^2[t]} \Big\} \Big] = \frac{1}{2}\ln(2\pi\sigma_a^2[t]) + \frac{1}{2\sigma_a^2[t]}(a_{s_i}[t]-a_{s_i}[t-1])^2 \propto \frac{1}{2\sigma_a^2[t]}(a_{s_i}[t]-a_{s_i}[t-1])^2. \quad (A.2)

The first- and second-order derivatives of f_{II}(\cdot) w.r.t. a_{s_i}[t-1] in Eq. (A.1) are

f_{II}'(a_{s_i}[t-1]) = -\frac{1}{\sigma_a^2[t]}(a_{s_i}[t]-a_{s_i}[t-1]), \qquad f_{II}''(a_{s_i}[t-1]) = \frac{1}{\sigma_a^2[t]}. \quad (A.3)

For the first term in Eq. (A.1), the evidence lower bound (ELBO) is utilized by applying Jensen's inequality for concave functions [47]:

f_I(\cdot) = -\ln p(s_i[t] \mid a_{s_i}[t-1]) \le -\mathbb{E}_q[\ln p(s_i[t] \mid a_{s_i}[t-1])] + \mathrm{KL}\{q \,\|\, p\}, \quad (A.4)

where

-\mathbb{E}_q[\ln p(s_i[t] \mid a_{s_i}[t-1])] = -\mathbb{E}_q\Big[ \ln \frac{1}{\sqrt{\det(2\pi\Sigma_{s_i[t]})}} \exp\Big\{ -\frac{1}{2}(s_i[t]-\hat{\mu}_{s_i}[t-1])^T \Sigma_{s_i[t]}^{-1} (s_i[t]-\hat{\mu}_{s_i}[t-1]) \Big\} \Big] \quad (A.5)

can be used to approximate f_I(\cdot). Accordingly, Eq. (A.5) can be rewritten as

f_I(\cdot) = \int \frac{1}{2}\ln\det(2\pi\Sigma_{s_i[t]}) \, \frac{1}{\sqrt{\det(2\pi\hat{\Sigma}_{s_i[t]})}} \exp\Big\{-\frac{1}{2}(s_i[t]-\hat{\mu}_{s_i}[t])^T \hat{\Sigma}_{s_i[t]}^{-1} (s_i[t]-\hat{\mu}_{s_i}[t])\Big\} \, ds_i[t]
+ \int \frac{1}{2}(s_i[t]-\hat{\mu}_{s_i}[t-1])^T \Sigma_{s_i[t]}^{-1} (s_i[t]-\hat{\mu}_{s_i}[t-1]) \, \frac{1}{\sqrt{\det(2\pi\hat{\Sigma}_{s_i[t]})}} \exp\Big\{-\frac{1}{2}(s_i[t]-\hat{\mu}_{s_i}[t])^T \hat{\Sigma}_{s_i[t]}^{-1} (s_i[t]-\hat{\mu}_{s_i}[t])\Big\} \, ds_i[t]. \quad (A.6)

In Eq. (A.6), the first term is equal to \tfrac{1}{2}\ln\det(2\pi\Sigma_{s_i[t]}). Now let

r = \hat{\Sigma}_{s_i[t]}^{-1/2}(s_i[t]-\hat{\mu}_{s_i}[t]) \;\Leftrightarrow\; s_i[t] = \hat{\mu}_{s_i}[t] + \hat{\Sigma}_{s_i[t]}^{1/2} r. \quad (A.7)

Then

dr = \det\big(\hat{\Sigma}_{s_i[t]}^{-1/2}\big) \, ds_i[t] \;\Leftrightarrow\; ds_i[t] = \sqrt{\det(\hat{\Sigma}_{s_i[t]})} \, dr. \quad (A.8)

The combination of Eqs. (A.7) and (A.8) leads to

\int \underbrace{\big(\hat{\mu}_{s_i}[t]-\hat{\mu}_{s_i}[t-1]+\hat{\Sigma}_{s_i[t]}^{1/2} r\big)^T \Sigma_{s_i[t]}^{-1} \big(\hat{\mu}_{s_i}[t]-\hat{\mu}_{s_i}[t-1]+\hat{\Sigma}_{s_i[t]}^{1/2} r\big)}_{y} \, \frac{1}{\sqrt{\det(2\pi I)}} \exp\Big\{-\frac{1}{2} r^T r\Big\} \, dr. \quad (A.9)

Four terms compose the term y in the integrand of Eq. (A.9), i.e.,

\big(\hat{\mu}_{s_i}[t]-\hat{\mu}_{s_i}[t-1]\big)^T \Sigma_{s_i[t]}^{-1} \big(\hat{\mu}_{s_i}[t]-\hat{\mu}_{s_i}[t-1]\big)
+ r^T \hat{\Sigma}_{s_i[t]}^{1/2} \Sigma_{s_i[t]}^{-1} \big(\hat{\mu}_{s_i}[t]-\hat{\mu}_{s_i}[t-1]\big)
+ \big(\hat{\mu}_{s_i}[t]-\hat{\mu}_{s_i}[t-1]\big)^T \Sigma_{s_i[t]}^{-1} \hat{\Sigma}_{s_i[t]}^{1/2} r
+ r^T \hat{\Sigma}_{s_i[t]}^{1/2} \Sigma_{s_i[t]}^{-1} \hat{\Sigma}_{s_i[t]}^{1/2} r. \quad (A.10)

It can be shown that the integral of the first term in Eq. (A.10) yields

\big(\hat{\mu}_{s_i}[t]-\hat{\mu}_{s_i}[t-1]\big)^T \Sigma_{s_i[t]}^{-1} \big(\hat{\mu}_{s_i}[t]-\hat{\mu}_{s_i}[t-1]\big) \int \frac{1}{\sqrt{\det(2\pi I)}} \exp\Big\{-\frac{1}{2} r^T r\Big\} dr = \mathrm{tr}\Big( \Sigma_{s_i[t]}^{-1} \big(\hat{\mu}_{s_i}[t]-\hat{\mu}_{s_i}[t-1]\big)\big(\hat{\mu}_{s_i}[t]-\hat{\mu}_{s_i}[t-1]\big)^T \Big). \quad (A.11)

The integral of the fourth term in Eq. (A.10) reads as

\int r^T \hat{\Sigma}_{s_i[t]}^{1/2} \Sigma_{s_i[t]}^{-1} \hat{\Sigma}_{s_i[t]}^{1/2} r \, \frac{1}{\sqrt{\det(2\pi I)}} \exp\Big\{-\frac{1}{2} r^T r\Big\} dr = \mathrm{tr}\big( \hat{\Sigma}_{s_i[t]}^{-1} \Sigma_{s_i[t]} \big). \quad (A.12)

The integration of the second and the third term of Eq. (A.10) yields zero. Summing up, f_I(\cdot) in Eq. (A.1) becomes

f_I(\cdot) = \frac{1}{2}\ln\det(2\pi\Sigma_{s_i[t]}) + \frac{1}{2}\mathrm{tr}\big( \hat{\Sigma}_{s_i[t]}^{-1} \Sigma_{s_i[t]} \big) + \frac{1}{2}\mathrm{tr}\Big( \Sigma_{s_i[t]}^{-1} \big(\hat{\mu}_{s_i}[t]-\hat{\mu}_{s_i}[t-1]\big)\big(\hat{\mu}_{s_i}[t]-\hat{\mu}_{s_i}[t-1]\big)^T \Big). \quad (A.13)

In Eq. (A.13), the terms which depend on a_{s_i}[t-1] should be identified. As can be seen in Eq. (1), the prior covariance matrix \Sigma_{s_i[t]} clearly depends on a_{s_i}[t-1].
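The trace identities invoked in Eqs. (A.11) and (A.12) both rest on the standard Gaussian integral \int r^T A r \, \mathcal{N}(r; 0, I) \, dr = \mathrm{tr}(A). Before differentiating, this can be verified with a quick Monte Carlo sketch (a standalone check with an arbitrary positive semi-definite A of our choosing, not part of the filter):

    import numpy as np

    rng = np.random.default_rng(2)
    d = 4

    B = rng.standard_normal((d, d))
    A = B @ B.T  # arbitrary symmetric positive semi-definite matrix

    r = rng.standard_normal((200_000, d))            # r ~ N(0, I)
    mc = np.mean(np.einsum('ij,jk,ik->i', r, A, r))  # Monte Carlo E[r^T A r]

    print(mc, np.trace(A))  # the two values agree up to Monte Carlo error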
Let the eigendecomposition of the posterior covariance matrix at the previous time step be \hat{\Sigma}_{s_i}[t-1] = W \Lambda W^T, where \Lambda = \mathrm{diag}(\lambda_n), with \lambda_n being the nth eigenvalue of the posterior covariance matrix at t-1. Then, the eigendecomposition of the a priori covariance matrix \Sigma_{s_i[t]}, taking the drift into account, reads

\Sigma_{s_i[t]} = W \tilde{\Lambda} W^T, \quad (A.14)

where \tilde{\Lambda} = \mathrm{diag}(\tilde{\lambda}_n) with \tilde{\lambda}_n = \lambda_n + e^{a_{s_i}[t-1]} \sigma_a^2[t]. Let us elaborate Eq. (8):

\hat{\Sigma}_{s_i[t]}^{-1} = \Sigma_{s_i[t]}^{-1} + \underbrace{\frac{\hat{\mu}_{m_j}[t]\hat{\mu}_{m_j}^T[t] - \hat{\Sigma}_{m_j}[t]}{\sigma_{ij}^2}}_{X} = W \tilde{\Lambda}^{-1} W^T + X. \quad (A.15)

The second term in Eq. (A.15) is a constant matrix w.r.t. a_{s_i}[t-1]. Then, it can be shown that

\frac{\partial}{\partial a_{s_i}[t-1]} \ln\det(2\pi\Sigma_{s_i[t]}) = \sum_n \frac{e^{a_{s_i}[t-1]}\sigma_a^2[t]}{\lambda_n + e^{a_{s_i}[t-1]}\sigma_a^2[t]} = \sum_n x_n, \quad (A.16)

where x_n = e^{a_{s_i}[t-1]}\sigma_a^2[t] / (\lambda_n + e^{a_{s_i}[t-1]}\sigma_a^2[t]). Let b = W^T(\hat{\mu}_{s_i}[t] - \hat{\mu}_{s_i}[t-1]). The third term in Eq. (A.13) is rewritten as

\frac{\partial}{\partial a_{s_i}[t-1]} \mathrm{tr}\Big( W\tilde{\Lambda}^{-1}W^T \big(\hat{\mu}_{s_i}[t]-\hat{\mu}_{s_i}[t-1]\big)\big(\hat{\mu}_{s_i}[t]-\hat{\mu}_{s_i}[t-1]\big)^T \Big) = \frac{\partial}{\partial a_{s_i}[t-1]} \big( b^T \tilde{\Lambda}^{-1} b \big). \quad (A.17)

Using Eq. (A.17), the term of b which depends on a_{s_i}[t-1] can be approximated as

b \approx \Big( \underbrace{W^T \hat{\Sigma}_{s_i[t]} W}_{M} \, \tilde{\Lambda}^{-1} W^T - W^T \Big) \hat{\mu}_{s_i}[t-1] = (M\tilde{\Lambda}^{-1} - I) \underbrace{W^T \hat{\mu}_{s_i}[t-1]}_{\varphi} = (M\tilde{\Lambda}^{-1} - I)\varphi. \quad (A.18)

The nth element of b is

b_n \approx [(M\tilde{\Lambda}^{-1} - I)\varphi]_n = \Big( \frac{M_{nn}}{\tilde{\lambda}_n} - 1 \Big)\varphi_n + \sum_{l \ne n} \frac{M_{nl}}{\tilde{\lambda}_l}\varphi_l, \quad (A.19)

which depends on a_{s_i}[t-1] as well. Let us elaborate Eq. (A.17):

\frac{\partial}{\partial a_{s_i}[t-1]} \sum_n \frac{b_n^2}{\lambda_n + e^{a_{s_i}[t-1]}\sigma_a^2[t]} = \sum_n \frac{\partial}{\partial a_{s_i}[t-1]} \frac{b_n^2}{\tilde{\lambda}_n} = \sum_n \Big( \frac{2 b_n}{\tilde{\lambda}_n} \frac{\partial b_n}{\partial a_{s_i}[t-1]} - \frac{b_n^2}{\tilde{\lambda}_n} x_n \Big). \quad (A.20)

By utilizing Eq. (A.19), it can be shown that

\frac{\partial b_n}{\partial a_{s_i}[t-1]} = -\frac{M_{nn} e^{a_{s_i}[t-1]}\sigma_a^2[t]}{\tilde{\lambda}_n^2}\varphi_n - \sum_{l \ne n} \frac{M_{nl}\varphi_l e^{a_{s_i}[t-1]}\sigma_a^2[t]}{\tilde{\lambda}_l^2} = -\Big( M_{nn}\frac{x_n}{\tilde{\lambda}_n}\varphi_n + \sum_{l \ne n} M_{nl}\frac{x_l}{\tilde{\lambda}_l}\varphi_l \Big) = -\sum_{l=1}^{n} M_{nl}\frac{x_l}{\tilde{\lambda}_l}\varphi_l. \quad (A.21)

The substitution of Eq. (A.21) into Eq. (A.20) leads to

\frac{\partial}{\partial a_{s_i}[t-1]} \sum_n \frac{b_n^2}{\tilde{\lambda}_n} = -\sum_n \frac{b_n}{\tilde{\lambda}_n} \Big( 2\sum_{l=1}^{n} M_{nl}\frac{x_l}{\tilde{\lambda}_l}\varphi_l + b_n x_n \Big). \quad (A.22)

The differentiation of the second term in Eq. (A.13) gives

\frac{\partial}{\partial a_{s_i}[t-1]} \mathrm{tr}\big[ \hat{\Sigma}_{s_i[t]}^{-1} \Sigma_{s_i[t]} \big] = \frac{\partial}{\partial a_{s_i}[t-1]} \mathrm{tr}\big[ (\Sigma_{s_i[t]}^{-1} + X) \Sigma_{s_i[t]} \big] \approx \frac{\partial}{\partial a_{s_i}[t-1]} \mathrm{tr}(I) = 0. \quad (A.23)

The combination of Eqs. (A.13), (A.16), (A.22), and (A.23) leads to

\frac{\partial}{\partial a_{s_i}[t-1]} f_I(\cdot) = \frac{1}{2}\Big[ \sum_n x_n - \sum_n \frac{b_n}{\tilde{\lambda}_n}\Big( 2\sum_{l=1}^{n} M_{nl}\frac{x_l}{\tilde{\lambda}_l}\varphi_l + b_n x_n \Big) \Big] = \frac{1}{2}\sum_n \Big[ x_n - \frac{b_n^2}{\tilde{\lambda}_n} x_n - \frac{2 b_n}{\tilde{\lambda}_n}\sum_{l=1}^{n} M_{nl}\frac{x_l}{\tilde{\lambda}_l}\varphi_l \Big]. \quad (A.24)

If we omit second-order terms w.r.t. M_{nl} in the second term of Eq. (A.24) and retain only the term for l = n, the first-order derivative of Eq. (11) reads as

f' = \frac{\partial f}{\partial a_{s_i}[t-1]} = -\frac{1}{\sigma_a^2[t]}(a_{s_i}[t]-a_{s_i}[t-1]) + \frac{1}{2}\sum_n x_n \Big( 1 - \frac{b_n^2}{\tilde{\lambda}_n}\Big( 1 + \frac{2 M_{nn}}{M_{nn}-\tilde{\lambda}_n} \Big) \Big), \quad (A.25)

where b_n \approx (M_{nn}/\tilde{\lambda}_n - 1)\varphi_n \Leftrightarrow \varphi_n \approx b_n\tilde{\lambda}_n / (M_{nn}-\tilde{\lambda}_n). Similarly, the second-order derivative of Eq. (11) reads as

f'' = \frac{\partial f'}{\partial a_{s_i}[t-1]} = \frac{1}{\sigma_a^2[t]} + \frac{1}{2}\sum_n \Big[ x_n(1-x_n) + \frac{b_n^2 x_n}{2\tilde{\lambda}_n}\Big( 1 - 2(1-x_n)\frac{M_{nn}}{M_{nn}-\tilde{\lambda}_n} \Big) + x_n b_n \Big( \frac{M_{nn}}{M_{nn}-\tilde{\lambda}_n} \Big)^2 \Big( 1 - x_n \frac{M_{nn}}{M_{nn}-\tilde{\lambda}_n} \Big) \Big]. \quad (A.26)
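The closed-form derivative of the log-determinant term can be sanity-checked numerically. The sketch below verifies Eq. (A.16), \partial/\partial a \, \ln\det(2\pi\Sigma_{s_i[t]}) = \sum_n x_n, by central finite differences; the eigenvalues \lambda_n and the variance \sigma_a^2[t] are made-up illustrative values.

    import numpy as np

    lam = np.array([0.5, 1.2, 3.0, 7.5])  # eigenvalues lambda_n (illustrative)
    sigma2 = 0.8                           # sigma_a^2[t] (illustrative)
    a = -0.3                               # a_{s_i}[t-1]

    def log_det(a):
        lam_tilde = lam + np.exp(a) * sigma2            # Eq. (A.14): tilde lambda_n
        return np.sum(np.log(2.0 * np.pi * lam_tilde))  # ln det(2 pi Sigma[t])

    x = np.exp(a) * sigma2 / (lam + np.exp(a) * sigma2)  # x_n of Eq. (A.16)
    eps = 1e-6
    fd = (log_det(a + eps) - log_det(a - eps)) / (2 * eps)

    print(fd, x.sum())  # the central difference matches sum_n x_n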
CRediT authorship contribution statement

Myrsini Ntemi: Methodology, Validation, Formal analysis, Writing - review & editing, Conceptualization, Software, Investigation, Resources, Data curation, Writing - original draft, Visualization. Constantine Kotropoulos: Methodology, Validation, Formal analysis, Writing - review & editing, Supervision, Project administration, Funding acquisition.

References

[1] T. Bollerslev, P.E. Rossi, Introduction: modelling stock market volatility - bridging the gap to continuous time, in: P.E. Rossi (Ed.), Modelling Stock Market Volatility, Academic Press, San Diego, CA, 1996.
[2] R.S. Tsay, Financial time series, Wiley StatsRef: Statistics Reference Online (2014) 1-23.
[3] P.W. Glynn, Diffusion approximations, Handbooks in Operations Research and Management Science 2 (1990) 145-198.
[4] W. Paul, J. Baschnagel, Stochastic Processes, vol. 1, Springer, 2013.
[5] P. Tankov, Financial Modelling with Jump Processes, Chapman and Hall/CRC, 2003.
[6] D. Hainaut, N. Leonenko, Option pricing in illiquid markets: a fractional jump-diffusion approach, J. Comput. Appl. Math. 381 (2020) 112995.
[7] K.F. Mina, G.H. Cheang, C. Chiarella, Approximate hedging of options under jump-diffusion processes, Int. J. Theor. Appl. Finance 18 (04) (2015) 155-224.
[8] S.G. Kou, A jump-diffusion model for option pricing, Manage. Sci. 48 (8) (2002) 1086-1101.
[9] R.C. Merton, Option pricing when underlying stock returns are discontinuous, J. Financ. Econ. 3 (1-2) (1976) 125-144.
[10] R. Anderson, S. Sundaresan, A comparative study of structural models of corporate bond yields: an exploratory investigation, J. Bank. Financ. 24 (1-2) (2000) 255-269.
[11] E.P. Jones, S.P. Mason, E. Rosenfeld, Contingent claims analysis of corporate capital structures: an empirical investigation, J. Finance 39 (3) (1984) 611-625.
[12] A. Doucet, N. De Freitas, N. Gordon, An introduction to sequential Monte Carlo methods, in: Sequential Monte Carlo Methods in Practice, Springer, 2001, pp. 3-14.
[13] P.M. Djuric, J.H. Kotecha, J. Zhang, Y. Huang, T. Ghirmai, M.F. Bugallo, J. Miguez, Particle filtering, IEEE Signal Process. Mag. 20 (5) (2003) 19-38.
[14] M.S. Arulampalam, S. Maskell, N. Gordon, T. Clapp, A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking, IEEE Trans. Signal Processing 50 (2) (2002) 174-188.
[15] R. Van Der Merwe, A. Doucet, N. De Freitas, E.A. Wan, The unscented particle filter, in: Advances in Neural Information Processing Systems, 2001, pp. 584-590.
[16] N.J. Gordon, D.J. Salmond, A.F.M. Smith, Novel approach to nonlinear/non-Gaussian Bayesian state estimation, IEE Proceedings F - Radar and Signal Processing 140 (2) (1993) 107-113.
[17] S. Gultekin, J. Paisley, A collaborative Kalman filter for time-evolving dyadic processes, in: Proc. 2014 IEEE Int. Conf. Data Mining, 2014, pp. 30-37.
[18] M. Ntemi, C. Kotropoulos, A dyadic particle filter for price prediction, in: Proc. 27th European Signal Processing Conf. (EUSIPCO), 2019.
[19] P.J. Green, Reversible jump Markov chain Monte Carlo computation and Bayesian model determination, Biometrika 82 (4) (1995) 711-732.
[20] J. Murphy, S. Godsill, Bayesian parameter estimation of jump-Langevin systems for trend following in finance, in: Proc. 40th IEEE Int. Conf. Acoustics, Speech and Signal Processing, 2015, pp. 4125-4129.
[21] M. Ntemi, C. Kotropoulos, A dynamic dyadic particle filter for price prediction, Signal Processing 167 (2020), doi:10.1016/j.sigpro.2019.107334.
[22] S. Siami-Namini, A.S. Namin, Forecasting economics and financial time series: ARIMA vs. LSTM, arXiv preprint arXiv:1803.06386, 2018.
[23] T. Wang, Z. Huang, et al., The relationship between volatility and trading volume in the Chinese stock market: a volatility decomposition perspective, Annals of Economics and Finance 13 (1) (2012) 211-236.
[24] F. Cong, C.W. Oosterlee, Pricing Bermudan options under Merton jump-diffusion asset dynamics, Int. J. Comput. Math. 92 (12) (2015) 2406-2432.
[25] R.T.L. Chan, Adaptive radial basis function methods for pricing options under jump-diffusion models, Computational Economics 47 (4) (2016) 623-643.
[26] A.A.E.-F. Saib, D.Y. Tangman, M. Bhuruth, A new radial basis functions method for pricing American options under Merton's jump-diffusion model, Int. J. Comput. Math. 89 (9) (2012) 1164-1185.
[27] K.S. Patel, M. Mehra, Fourth-order compact scheme for option pricing under Merton's and Kou's jump-diffusion models, Int. J. Theor. Appl. Finance 21 (04) (2018) 1079-1098.
[28] J. Chevallier, S. Goutte, Detecting jumps and regime switches in international stock markets returns, Appl. Econ. Lett. 22 (13) (2015) 1011-1019.
[29] C.-O. Ewald, A. Zhang, Z. Zong, On the calibration of the Schwartz two-factor model to WTI crude oil options and the extended Kalman filter, Ann. Oper. Res. 282 (1-2) (2019) 119-130.
[30] J. Zhao, Dynamic state estimation with model uncertainties using H-extended Kalman filter, IEEE Trans. Power Syst. 33 (1) (2018) 1099-1100.
[31] J. Qi, K. Sun, J. Wang, H. Liu, Dynamic state estimation for multi-machine power system by unscented Kalman filter with enhanced numerical stability, IEEE Trans. Smart Grid 9 (2) (2018) 1184-1196.
[32] E. Bradford, L. Imsland, Economic stochastic model predictive control using the unscented Kalman filter, IFAC Proceedings Volumes 51 (18) (2018) 417-422.
[33] P. Guarniero, A.M. Johansen, A. Lee, The iterated auxiliary particle filter, J. Am. Stat. Assoc. 112 (520) (2017) 1636-1647.
[34] P. Wang, L. Li, S.J. Godsill, Particle filtering and inference for limit order books in high frequency finance, in: Proc. 43rd IEEE Int. Conf. Acoustics, Speech and Signal Processing, 2018, pp. 4264-4268.
[35] M.J. Wainwright, M.I. Jordan, Graphical models, exponential families, and variational inference, Foundations and Trends in Machine Learning 1 (1-2) (2008) 1-305.
[36] D. Blei, A. Kucukelbir, J.D. McAuliffe, Variational inference: a review for statisticians, J. Am. Stat. Assoc. 112 (518) (2017) 859-877.
[37] P. Tankov, E. Voltchkova, Jump-diffusion models: a practitioner's guide, Banque et Marchés 99 (1) (2009) 24.
[38] G. Franciszek, Nonhomogeneous compound Poisson process application to modeling of random processes related to accidents in the Baltic Sea waters and ports, J. Polish Safety Reliab. Assoc. Summer Safety Reliab. Semin. 9 (2018) 21-29.
[39] K. Sigman, Notes on Poisson processes and compound Poisson processes, IEOR 4706, Columbia University, 2007.
[40] O.C. Ibe, Markov Processes for Stochastic Modeling, 2nd ed., Elsevier, 2013.
[41] H.L. Christensen, J. Murphy, S.J. Godsill, Forecasting high-frequency futures returns using online Langevin dynamics, IEEE J. Sel. Top. Signal Process. 6 (4) (2012) 366-380.
[42] P. Diaconis, D. Ylvisaker, Conjugate priors for exponential families, The Annals of Statistics (1979) 269-281.
[43] A. Doucet, A.M. Johansen, A tutorial on particle filtering and smoothing: fifteen years later, Handbook of Nonlinear Filtering 12 (2009) 656-704.
[44] V. Elvira, L. Martino, D. Luengo, M.F. Bugallo, Population Monte Carlo schemes with reduced path degeneracy, in: Proc. 7th IEEE Int. Workshop Computational Advances in Multi-Sensor Adaptive Processing, 2017, pp. 1-5.
[45] D. Salmond, N. Gordon, An introduction to particle filters, State Space and Unobserved Component Models: Theory and Applications (2005) 1-19.
[46] Y.-W. Cheung, K.S. Lai, Lag order and critical values of the augmented Dickey-Fuller test, Journal of Business & Economic Statistics 13 (3) (1995) 277-280.
[47] D. Barber, Bayesian Reasoning and Machine Learning, Cambridge University Press, 2012.