Comparative Performance of Machine Learning Algorithms For Cryptocurrency Forecasting
All content following this page was uploaded by Nor Azizah Hitam on 25 December 2018.
Abstract
Machine learning is a branch of artificial intelligence that can forecast future values from
previous experience. Methods have been proposed to construct forecasting models, including machine learning
algorithms such as Neural Networks (NN), Support Vector Machines (SVM) and Deep Learning. This paper
presents a comparative performance study of machine learning algorithms for cryptocurrency forecasting.
Specifically, it concentrates on forecasting time series data. SVM has several advantages over
the other models in forecasting, and previous research has shown that SVM produces results close to
the actual values while also improving accuracy. However, recent research has shown
that, owing to small sample ranges and data manipulation by inadequate evidence and professional
analyzers, the overall status and accuracy rate of forecasting need to be improved in further studies. Thus,
further research on the accuracy rate of the forecasted price is needed.
Keywords: Artificial Intelligence; Machine Learning; Support Vector Machines; Neural Networks; Deep
Learning
1.0 Introduction
Forecasting the future values or prices of a time series plays a vital role in almost all fields
of study, including economics, science and engineering, finance, business, meteorology and
telecommunication [1]. Cryptocurrency is an alternative medium of exchange consisting of over
1441 decentralized crypto coin types (as of January 2018). Applying machine learning algorithms
to cryptocurrency is considered a new field with limited research. In general, such a system
can be applied to any directed (supervised) machine learning problem; in return, the system provides a
description relevant to samples both in and out of the dataset.
There are numerous types of cryptocurrency, including Bitcoin, Litecoin, Ethereum, Nem, Ripple,
Iota, Stellar and others. The cryptographic foundation of each crypto coin is what makes it vital.
Because the exchange rates of cryptocurrencies are notoriously volatile, we attempt
to model an algorithm that can be used in trading numerous cryptocurrencies. To show
the accuracy rate of the price predicted by the proposed methodology, two different datasets are used
as illustrative examples. The cryptocurrencies compared are Litecoin, Ethereum, Bitcoin,
Stellar, Ripple and Nem. This paper uses the mean absolute percentage error (MAPE)
to evaluate the proposed model.
The outline of this paper is as follows. Section 1 introduces some basic notions of
cryptocurrencies and machine learning algorithms. Section 2 discusses the types of cryptocurrency
(Litecoin, Ethereum, Bitcoin, Stellar, Ripple and Nem), including the two largest alternative
blockchain technologies, and the purpose of each development. Section 3 presents machine learning
algorithms and three of the most widely used ones: Artificial Neural Networks (ANN), Support
Vector Machines (SVM) and Deep Learning. Section 4 explains the experiments and the results
obtained with all models.
1.1 Cryptocurrency
Litecoin (LTC) and Ethereum (ETH) are among the largest alternative blockchain technologies,
known as altcoins, and were invented after Bitcoin (BTC). Altcoins may have been developed for
different purposes, but they share a general methodology based on a decentralized P2P network, with the
assumption of no network failure and no Internet interruption [2,3,4,5]. Research on the
cryptocurrency field is still limited. Most research in this field focuses on a single
cryptocurrency rather than on broader areas such as technological advancement, government
participation in market regulation and market development [6]. This section focuses on
six types of cryptocurrency, beginning with Bitcoin, Ethereum, Litecoin, Nem and Ripple, followed by
Stellar. In the succeeding section, we review previous studies on Machine Learning,
Support Vector Machines (SVM), Artificial Neural Networks (ANNs) and Deep Learning applied
in forecasting.
Bitcoin is a peer-to-peer (P2P) cash payment system and an unregulated digital currency, introduced in 2008,
that has no legal tender status. It is classed as a cryptocurrency because cryptographic
functions secure both the creation and the transfer of money. In recent years, Bitcoin
has become the best-known currency in terms of trading volume, which makes it
the most promising financial medium for investors [7]. Transactions are locked, as the
identities of the sender and receiver and the volume of the transaction are all encrypted [6].
Litecoin (LTC) was invented by Charles Lee and released in October 2011, using a technology
similar to Bitcoin. Its block generation time is four times shorter (2.5 minutes per block
instead of 10 minutes), its maximum supply of 84 million coins is four times higher
than Bitcoin's, and it has adopted a different hashing algorithm [9,10]. Litecoin is considered the
‘silver standard’ of crypto coins and has become the second most accepted coin among both miners and
exchanges [9]. It uses the Scrypt hashing algorithm, in contrast to Bitcoin's SHA-256, and was developed to
beat the Bitcoin network's transaction confirmation speed and to use an algorithm that is resilient to
advances in mining hardware.
NEM is a blockchain notarization system, also known as a peer-to-peer platform, that provides services
such as online payment and messaging. Its jointly owned notarization makes
NEM the first public/private blockchain combination [8].
Ripple is an open-source digital currency created by Jed McCaleb and his partner, Chris Larsen. It is a
distributed peer-to-peer payment network that is controlled and managed by a single
organization and offers an alternative security mechanism [6,8]. The development of Ripple
is based on the Byzantine Consensus Protocol, and the maximum supply of Ripple is 100 billion [8].
Stellar, like Ripple, offers an entirely alternative security mechanism and is implemented based on
the Byzantine Consensus Protocol. Stellar has implemented a new technology to process
financial transactions, featuring open-source, distributed and unlimited ownership [6, 11].
(ANNs) and Support Vector Machines (SVM), and both have their own patterns of learning [11, 13]. ANNs
have been widely used for prediction in securities. A number of issues in ANNs have been discussed
by researchers, including the selection of parameters and of the training set [14]. According to [1], the
embedding formulation suggests that when a historical dataset S is available, one-step
forecasting can be treated as supervised learning. Supervised learning is the task of deriving
a function from training data that consist of a set of training examples. Each example pairs a set of input
variables with an output variable that is considered dependent on the inputs. One-step forecasting can be
applied when a mapping model exists [1]. In one-step forecasting, the n previous values of the
series are available, so forecasting can be performed as a generic regression problem, as in
Figure 1 below. The general approach to modelling an input/output relationship relies on the availability of
experimental pairs, denoted as the training set. The training set is derived from the historical series S
by creating the [(N – n – 1) x n] input data matrix.
Figure 1: In one-step forecasting, the approximator f̂ returns the prediction of the value of the time series
at time t + 1 as a function of the n previous values (the rectangular box containing z^{-1} represents
a unit delay operator, i.e., y_{t-1} = z^{-1} y_t) [1].
And the [(N – n – 1) x 1] output vector
(1)
For the sake of simplicity, a lag time of d = 0 is assumed. Henceforth, in this paper we will refer
to the ith row of X, which is essentially a temporal pattern of the series, as the (reconstructed)
state of the series at time t – i + 1.
(2)
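The embedding described above can be illustrated with a short sketch. It is not the authors' implementation: the function name `embed_series` and the sample prices are hypothetical, and a lag time of d = 0 is assumed. Each row of X holds the n previous values (most recent first) and the matching entry of Y is the value that follows them. Under this indexing a series of N values yields N – n rows; the text above states N – n – 1, which may reflect a different indexing convention.

```python
def embed_series(series, n):
    """Build one-step-ahead training pairs from a univariate series.

    Row i of X holds n consecutive past values, most recent first,
    and Y[i] is the value that immediately follows them (lag d = 0).
    """
    if n < 1 or n >= len(series):
        raise ValueError("n must satisfy 1 <= n < len(series)")
    X, Y = [], []
    # Walk backwards so that the first row is the most recent state.
    for t in range(len(series) - 1, n - 1, -1):
        X.append([series[t - k] for k in range(1, n + 1)])
        Y.append(series[t])
    return X, Y


# Hypothetical closing prices, used only to show the shape of X and Y.
prices = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
X, Y = embed_series(prices, n=3)
for row, target in zip(X, Y):
    print(row, "->", target)
```

Any regression model (an SVM or an ANN, for instance) can then be fitted on the pairs (X, Y) to obtain the approximator used for one-step forecasting.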
4 p-ISSN: 2502-4752, e-ISSN: 2502-4760
Another important point of discussion is the choice of SVM type. SVM offers linear
and nonlinear models. Linear SVMs outperform nonlinear ones in terms of speed and
execution time, but underperform on complex datasets that contain many training examples
but few features. Nonlinear SVMs, although they lose explanatory power, seem to perform
steadily across various problems, and have become the preferred choice over linear SVMs
[18].
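The difference between the two SVM types lies in the kernel function that measures the similarity of two samples. The sketch below contrasts a linear kernel with a radial basis function (RBF) kernel, a common nonlinear choice. The source does not state which kernel was used, so the RBF kernel and the value of `gamma` are assumptions made for illustration only.

```python
import math


def linear_kernel(x, z):
    """Linear kernel: the plain dot product of two feature vectors."""
    return sum(a * b for a, b in zip(x, z))


def rbf_kernel(x, z, gamma=0.5):
    """RBF kernel: similarity decays with squared Euclidean distance.

    gamma is an illustrative value, not a setting taken from the paper.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


# Two hypothetical feature vectors (for example, lagged returns).
x = [1.0, 2.0]
z = [3.0, 4.0]
print("linear:", linear_kernel(x, z))
print("rbf:   ", rbf_kernel(x, z))
```

The linear kernel costs one dot product and keeps the model weights interpretable, which matches the speed and explanatory-power trade-off described above, while the RBF kernel lets the decision function bend around nonlinear structure.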
In this paper, we use time series data based on five years of daily history as inputs for all
models; the length may vary with the availability of data from the source. The data consist of
the daily open, close, high and low prices for all six
cryptocurrencies, downloaded from the market capitalization database, and range from
2013 through 2018.
Figure 2: The training and testing datasets in our time series data. The first part is the training set (number of
values as per #Observations), followed by the testing set in the next segment. Several classifiers are then used
to predict the test data in the second segment (the testing set contains 364 values).
Cryptocurrency              Training Data                         Test Data
Name                        From       To         #Observations   From       To         #Observations
Bitcoin, BTC, XBT           28-Mar-13  16-Jan-17  1388            17-Jan-17  16-Jan-18  364
Ether or “Ethereum”, ETH    7-Aug-15   16-Jan-17  526             17-Jan-17  16-Jan-18  364
Litecoin, LTC               28-Apr-13  16-Jan-17  1358            17-Jan-17  16-Jan-18  364
Nem, XEM                    1-Apr-15   16-Jan-17  657             17-Jan-17  16-Jan-18  364
Ripple, XRP                 4-Aug-13   16-Jan-17  1262            17-Jan-17  16-Jan-18  364
Stellar, XLM                5-Aug-14   16-Jan-17  896             17-Jan-17  16-Jan-18  364
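The split in the table is chronological: every observation before 17 January 2017 is used for training and the rest for testing. A minimal sketch of such a split follows. The function name `chronological_split` and the ten-day sample series are hypothetical, and the sketch does not reproduce the authors' data preparation or the observation counts reported above.

```python
from datetime import date, timedelta


def chronological_split(rows, first_test_day):
    """Split (day, price) rows into training and test sets by date.

    Rows dated before first_test_day go to training; rows dated on or
    after it go to testing, so no future value leaks into training.
    """
    train = [row for row in rows if row[0] < first_test_day]
    test = [row for row in rows if row[0] >= first_test_day]
    return train, test


# Hypothetical daily series: ten consecutive days with made-up prices.
start = date(2017, 1, 10)
rows = [(start + timedelta(days=i), 100.0 + i) for i in range(10)]

train, test = chronological_split(rows, first_test_day=date(2017, 1, 17))
print(len(train), "training rows up to", train[-1][0])
print(len(test), "test rows from", test[0][0])
```

Splitting by date rather than at random keeps the evaluation honest for time series, because the model is only ever tested on prices that come after everything it was trained on.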
The results section begins by showing performance measures for each cryptocurrency type
by classifier. These serve as a control for the rest of the discussion.
The analysis is separated into two experiments: i) performance measures by various
classifiers and ii) cryptocurrency values forecasted by machine learning algorithms vs actual values.
Figure 3 shows the performance accuracy of the four classifiers on the
cryptocurrency market capitalization data. The maximum value is 95.5%, which means that any alphas
over 95.5% have a p-value of 0.01 or less.
Several different classifiers were trained with the same set of features. In this case, the classifiers
were evaluated using classification accuracy. All classifiers, regardless of the method that generated
them, are compared on the same dataset, so that training and testing are fair for
all of them.
The results for the classifiers with the best performance on the test set are reported. The results
show that the SVM classifier works well for Ethereum, followed by Litecoin, while ANN works
best for Bitcoin, followed by Nem. Ripple and Stellar have the best performance accuracy with
BoostedNN. However, among all classifiers, SVM performs best,
with a performance accuracy of 95.5%.
Figure 4: SVM value is comparable to actual Bitcoin for the period from 17/1/2017 to
16/1/2018.
Figure 5: SVM value is comparable to actual Litecoin for the period from 17/1/2017 to
16/1/2018.
Figure 6: SVM value is comparable to actual Ripple for the period from 17/1/2017 to
16/1/2018.
Figure 7: SVM value is comparable to actual Ethereum for the period from 17/1/2017
to 16/1/2018.
Figure 8: SVM value is comparable to actual Nem for the period from 17/1/2017 to
16/1/2018.
Figure 9: SVM value is comparable to actual Stellar for the period from 17/1/2017 to
16/1/2018.
For comparability, the same datasets and the same 364-day period were chosen for all classifiers.
Performance can be seen in Figures 4 to 9 above. The SVM significantly outperformed the
other classifiers.
This result is further explored using the mean absolute percentage error (MAPE). The SVM
mean absolute percentage error is 0.31%, the lowest MAPE of all classifiers. Thus, the SVM is considered
a reliable forecasting model for these six selected cryptocurrencies.
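For reference, MAPE is the average of the absolute forecast errors, each expressed as a percentage of the corresponding actual value. The sketch below uses the standard definition; the function name `mape` and the sample values are hypothetical and are not taken from the experiments reported here.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, returned in percent.

    Uses the standard definition: the mean of |actual - forecast| / |actual|,
    multiplied by 100. Actual values must be non-zero.
    """
    if not actual or len(actual) != len(forecast):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Hypothetical actual and forecasted prices.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]
error = mape(actual, forecast)
print(f"MAPE = {error:.2f}%")  # about 5%
```

A lower MAPE means the forecasts sit closer to the actual prices in relative terms, which is why the classifier with the lowest MAPE is treated as the most reliable one.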
4.0 Conclusion
This paper focuses on the comparative performance of machine learning algorithms on
six cryptocurrencies. To begin with, the review of cryptocurrency covered six major
cryptocurrencies: Bitcoin, Ethereum, Litecoin, Nem, Ripple and Stellar. Further, previous
studies on forecasting with Machine Learning, Support Vector Machines (SVM), Artificial Neural Networks (ANNs)
and Deep Learning were explored.
Firstly, performance measures were computed to obtain the accuracy of the classifiers on the selected
cryptocurrencies, with the results shown in Figure 3. The results show that SVM outperformed the other
classifiers with an accuracy of 95.5%. The quality of the training data and the size
of the dataset play an important role in successful prediction.
Secondly, the cryptocurrency values forecasted by machine learning were compared with the actual
values. This comparative analysis shows that the SVM forecasts are
comparable to the actual values for all cryptocurrencies for the period from 17/1/2017 to 16/1/2018.
Moreover, the result was further explored using the mean absolute percentage error (MAPE).
The results show that SVM has the lowest MAPE. Thus, the SVM is
considered a reliable forecasting model for the selected cryptocurrencies.
In future, the algorithm will be improved with respect to the accuracy rate of the forecasted price. In addition,
future work will further optimize the SVM so that its forecasts come as close as possible
to the actual values of the cryptocurrencies.
5.0 References
[1] Bontempi, G., Taieb, S. Ben, & Borgne, L. (2013). Machine Learning Strategies for
Time Series Forecasting, 62–77.
[2] Huckle, S., & White, M. (2016). Socialism and the blockchain. Future Internet, 8(4).
https://doi.org/10.3390/fi8040049
[3] Bitcoin. Bitcoin Developer Guide. Available online:
https://bitcoin.org/en/developer-guide#block-chain (accessed on 24 January 2018).
[4] Ethereum. Ethereum Project. Available online: https://www.ethereum.org/ (accessed on
24 January 2018).
[5] Litecoin. Litecoin—Open Source P2P Digital Currency. Available
online: https://litecoin.org/ (accessed on 24 January 2018).
[6] Farell, R. (2015). An Analysis of the Cryptocurrency Industry. Wharton Research
Scholars Journal. Paper, 130. Retrieved from
http://repository.upenn.edu/wharton_research_scholars/130
[7] Krause, D. (2017). Bitcoin – A Favourable Instrument For Diversification ? A
Quantitative Study On The Relations Between Bitcoin
[8] Lee, D., Chuen, K., Guo, L., Wang, Y., & Chian, L. K. (2017). Cryptocurrency: A New
Investment Opportunity?, 1–54.
[9] Heid, A. (2013). Analysis of the Cryptocurrency Marketplace. Retrieved February, 15,
2014.
[10] Application, F. A., & Guidelines, G. (2013). Ashesi University College. Office, 1–4.
[11] Chaigusin, S. (2014). An Application of Decision Tree for Stock Trading Rules: A Case
of the Stock Exchange of Thailand. Proceedings of Eurasia Business Research
Conference, (June).
[12] Ahmed, N. K., Atiya, A. F., El Gayar, N., & El-Shishiny, H. (2010). An empirical
comparison of machine learning models for time series forecasting. Econometric
Reviews, 29(5), 594–621. https://doi.org/10.1080/07474938.2010.481556
[13] Patel, J., Shah, S., Thakkar, P., & Kotecha, K. (2015). Predicting stock and stock
price index movement using Trend Deterministic Data Preparation and machine
learning techniques. Expert Systems with Applications, 42(1), 259–268.
https://doi.org/10.1016/j.eswa.2014.07.040
[14] Kongsilp, W., Mateus, C., Huang, M., Ting-ting, Z., Wan-yi, C., Maita, A. R. C., …
de Carvalho, A. F. (2015). Prediction of Stock Trading Signal Based on Support
Vector Machine. Engineering Computations, 32(1), 445–463.
https://doi.org/10.1108/02644401311286099
[15] Zhang, L., & Wang, J. (2015). Optimizing parameters of support vector machines using
team-search-based particle swarm optimization. Engineering Computations, 32(5),
1194–1213. https://doi.org/10.1108/EC-12-2013-0310
[16] Basudhar, A. and Missoum, S. (2010), “An improved adaptive sampling scheme for the
construction of explicit boundaries”, Structural and Multidisciplinary Optimization, Vol.
42 No. 4, pp. 1-13.
[17] Lin, K., Basudhar, A., & Missoum, S. (2012). Parallel construction of explicit boundaries
using support vector machines. Engineering Computations, 30(1), 132–148.
https://doi.org/10.1108/02644401311286099
[18] Huerta, R., Corbacho, F., & Elkan, C. (2013). Nonlinear support vector machines can
systematically identify stocks with high and low future returns. Algorithmic Finance, 2(1),
45–58. https://doi.org/10.3233/AF-13016
[19] Baccarini, L.M.R., Rocha e Silva, V.V., de Menezes, B.R. and Caminhas, W.M. (2011),
“SVM practical industrial application for mechanical faults diagnostic”, Expert Systems
with Applications, Vol. 38 No. 6, pp. 6980-6984.
[20] Hacib, T., Acikgoz, H., Bihan, Y. Le, Mekideche, M. R., Meyer, O., & Pichon, L. (2010).
Support vector machines for measuring dielectric properties of materials. COMPEL: The
International Journal for Computation and Mathematics in Electrical and Electronic
Engineering, 29(4), 1081–1089. https://doi.org/10.1108/03321641011044497