
Khedr 2021

This document is a survey on cryptocurrency price prediction, analyzing traditional statistical and machine-learning techniques used from 2010 to 2020. It highlights the challenges of predicting cryptocurrency prices due to their volatility and lack of seasonal effects, suggesting that machine learning offers more robust solutions compared to traditional methods. The article aims to provide insights and a comprehensive overview of existing research, methodologies, and future opportunities in the field of cryptocurrency price prediction.

Received: 27 March 2020 Revised: 2 November 2020 Accepted: 12 January 2021

DOI: 10.1002/isaf.1488

REVIEW

Cryptocurrency price prediction using traditional statistical and machine-learning techniques: A survey

Ahmed M. Khedr1,2, Ifra Arif1, Pravija Raj P V1, Magdi El-Bannany3, Saadat M. Alhashmi4, Meenu Sreedharan1

1 Department of Computer Science, University of Sharjah, Sharjah, 27272, UAE
2 Mathematics Department, Zagazig University, Zagazig, Egypt
3 Department of Accounting, University of Sharjah, Sharjah, 27272, UAE
4 Department of Information Systems, University of Sharjah, Sharjah, 27272, UAE

Correspondence
Ahmed M. Khedr, Department of Computer Science, University of Sharjah, Sharjah, 27272, UAE.
Email: [email protected]

Abstract

Cryptocurrencies are decentralized electronic counterparts of government-issued money. The first and best-known cryptocurrency example is bitcoin. Cryptocurrencies are used to make transactions anonymously and securely over the internet. The decentralization behavior of a cryptocurrency has radically reduced central control over them, thereby influencing international trade and relations. Wide fluctuations in cryptocurrency prices motivate the urgent requirement for an accurate model to predict its price. Cryptocurrency price prediction is one of the trending areas among researchers. Research work in this field uses traditional statistical and machine-learning techniques, such as Bayesian regression, logistic regression, linear regression, support vector machine, artificial neural network, deep learning, and reinforcement learning. No seasonal effects exist in cryptocurrency, making it hard to predict using a statistical approach. Traditional statistical methods, although simple to implement and interpret, require a lot of statistical assumptions that could be unrealistic, leaving machine learning as the best technology in this field, being capable of predicting price based on experience. This article provides a comprehensive summary of the previous studies in the field of cryptocurrency price prediction from 2010 to 2020. The discussion presented in this article will help researchers to fill the gap in existing studies and gain more future insight.

KEYWORDS
cryptocurrency price prediction, bitcoin (BTC), machine learning (ML), reinforcement learning
(RL), deep learning (DL)

1 INTRODUCTION

A cryptocurrency is a digital or virtual currency used as a mode of exchange and transfer of assets digitally. It uses cryptography to transfer assets securely, to control and regulate the addition of cryptocurrencies, and secure their transactions (Garcia, Tessone, Mavrodiev, & Perony, 2014), hence the name cryptocurrency. Cryptocurrencies are established on the principle of decentralized control when compared with standard currencies, which rely on central banking systems. Thus, a cryptocurrency is used to electronically transfer funds without the interference of a central entity, such as a bank. The cryptocurrency market has evolved exponentially in a short period owing to its uncontrolled and untraceable nature. Digital currencies are used for financial transactions worldwide and are becoming popular. It is now an exciting research area, and many researchers are finding ways to analyze cryptocurrency features, such as stock market and market price prediction, and analyze their impact on real life. The increased interest of cryptocurrencies in economics and the financial world has attracted researchers to this area. However, the applications of cryptocurrency and its associated technologies are

Intell Sys Acc Fin Mgmt. 2021;28:3–34. wileyonlinelibrary.com/journal/isaf © 2021 John Wiley & Sons, Ltd. 3
4 KHEDR ET AL.

FIGURE 1 Global search trends in the field of cryptocurrency

not necessarily finance related; substantial computer science literature exists on the supporting cryptocurrency technologies that can contribute toward new and efficient approaches appropriate for handling bitcoin and other cryptocurrencies, their price volatility, and other associated technologies. This paper provides a comprehensive survey of cryptocurrency price prediction research in the period 2010–2020, by covering the significant research studies on various aspects of cryptocurrency price prediction, which include both statistical and machine-learning (ML) approaches. This paper also gives an insight into the data sets, research trends and techniques, and prediction approaches, concluding with some promising opportunities that remain open in cryptocurrency price prediction research.

An integrated overview of the characteristics of cryptocurrencies was provided in the seminal review paper of Corbet, Lucey, Urquhart, and Yarovaya (2019). Kyriazis (2019) introduced survey papers on the efficiencies of cryptocurrency markets in 2019 and the bubble characteristics of cryptocurrencies in 2020 (Kyriazis, Papadamou, & Corbet, 2020). A survey on anticipating the prices of cryptocurrencies using deep learning (DL) was introduced by Akshaya, Eswari, Dharani, and Lalitha (2019). The survey gives a good introduction to the existing ML approaches used in this field, but it does not provide details on each technique and how they were used in the prediction. Moreover, there is no comparison between the ML and DL approaches used in this field. In this survey, we tried to provide a state-of-the-art snapshot of the statistical and ML models developed for cryptocurrency price prediction. To the best of our knowledge, this is the first comprehensive study on cryptocurrency price prediction detailing the different traditional and ML approaches used in this field. All searchable articles on cryptocurrency price prediction related to traditional, ML, DL, and reinforcement-learning (RL) approaches are reviewed in our study. This survey gives an insight into the current status of cryptocurrency price prediction. We have categorized and analyzed the studies according to the different techniques used, which can provide researchers with knowledge related to the data sets, research trends and techniques, and prediction approaches, and also some promising opportunities that remain open in cryptocurrency price prediction research.

1.1 Background of cryptocurrency

The traditional monetary system involves a set of rules, policies, frameworks, and institutions that a government uses to create money in the economy. The main participants involved in this system include the central bank, the national treasury, the mint, and commercial banks. Commodities, commodity-backed assets, and fiat currency are three traditional types of monetary systems. In a commodities-type monetary system, some precious metals or commodities with intrinsic value are exchanged physically as a currency. The obvious examples of this are gold and silver coins that have been used widely throughout history. These types of systems are not divisible, which means that they are inconvenient to use for purchases. They may also suffer from the bandwagon effect, in which prices may fluctuate based on the buying behavior of the general population. In a commodity-backed monetary system, money that has no intrinsic value of its own, such as paper notes, draws its value from a commodity. The gold standard is a famous example of this system. In this system, the more valuable commodity will gradually disappear from circulation and so people will not prefer this for regular transactions. Fiat money is the most common and widespread monetary system in which governments guarantee the value of a currency. In this system, people use notes or bank balances as the mode of exchange and store. In this system, risk arises at the moment the funds are deposited with the bank. Central banks may cause inflation by printing and devaluing money.

From the limitations of the traditional systems, the idea of using currency in digital form has emerged. One of the main reasons for introducing cryptocurrency is to deal with imperfections present in the existing monetary system, which relies on fiat money and leads to inflation. The main aim of such a digital monetary system is to solve the problem of inflation and negative yields for consumers and provide better financial stability. This might improve convenience, speed, and cost, and thereby provide economic benefit. The possible ways of implementing a digital monetary system include central-bank-based systems with or without decentralization, a fully decentralized digital monetary system that replaces the monopolistic power held by the central banks, and blockchain-based digital monetary systems. Among these, the blockchain-based systems are more successful in providing truly decentralized solutions to the public.

The first and most prominent cryptocurrency is bitcoin, introduced by pseudonymous developer Satoshi Nakamoto (Nakamoto & Bitcoin, 2008). In January 2009, Bitcoin was implemented and released to the public as open-source code by Nakamoto. The three main types of cryptocurrency circulating in the market are bitcoin, altcoins, and tokens. Civic and BitDegree are examples of tokens/dApps. Cryptocurrency technology moved the financial market one extra step toward the future by decentralizing the currency and releasing it from the hierarchical power structures. As a substitute, consumers and organizations perform transactions digitally on a peer-to-peer network. Within a short period of its existence, the cryptocurrency market has experienced exponential growth and widespread popularity. In recent years, cryptocurrencies have gained increasing popularity and have received worldwide attention from the media, attracting investors, academia, governments, regulators, and speculators. Figure 1 shows the global search trends in the field of cryptocurrency during the period 2010–2020 (https://trends.google.com). The use of cryptocurrency is growing, and it is necessary to study its impact on countries' monetary systems. The future of bitcoin, or any cryptocurrency, is not limited to any particular discipline; rather, it transcends every field (Holub & Johnson, 2018).

FIGURE 2 Cryptocurrency workflow using blockchain mechanism

FIGURE 3 Volatility dynamics of top cryptocurrencies during 2010–2020

1.2 Features of cryptocurrencies and need for price prediction

Cryptocurrencies are used for many useful purposes, such as online transaction systems, and their usage is increasing very fast. Cryptocurrency-based transactions involve a decentralized and distributed peer-to-peer system that allows the information to be recorded in an open transaction ledger, called the blockchain. It is not under the control of any company or the government (Mittal, Arora, & Bhatia, 2018). The blockchain supports transactional databases and offers greater transparency, which is unfamiliar in the world of classical financial markets. If the block refers to a financial transaction, then each transaction in the blockchain, by definition, includes information about previous transactions, and thus verifies the ownership of the financial asset being transferred. A sample blockchain database is shown in Figure 2, in which there are two clients, A and B, where client A wants to send money to client B. Authentication of each transaction from client A to client B can be verified through this distributed ledger or blockchain that is maintained by all participants. It checks the validity of the transaction at the sender and adds transaction information into the global storage at the receiver end, and then the transaction is closed.

The crypto industry has gained increasing popularity over the past years. Governments are now aiming to consider both taxation and regulation of cryptocurrencies. Accountants and auditors are now seeking business guidance from the standard setters, as they are now accepting cryptocurrencies as a payment form. Cryptocurrencies are primarily volatile and characterized by the number of transactions and the changes in their prices; this makes cryptocurrency price prediction challenging. The popular cryptocurrencies with the largest market capitalizations are bitcoin, ethereum, ripple, and litecoin. Cryptocurrencies operate differently and are distinguished from one another mainly due to their values, transaction speeds, usages, and volatility characteristics. For instance, by the end of 2013 there was no significant price fluctuation with bitcoin, whereas other cryptocurrencies, such as litecoin and ripple, have shown significant instability in price since the end of 2013 (Böhme, Christin, Edelman, & Moore, 2015). Though cryptocurrency prices have soared since 2016, with great fluctuation, people's interest in them has stayed more or less constant. Therefore, in recent years, various techniques have been proposed by

TABLE 1 Volatility trends of bitcoin in the past 10 years

Period           Change ($)    Change (%)      Trend
Last 30 days     −166.86       −1.79           ↘
Last 6 months    +304.31       +3.43           ↗
Last 1 year      −1,427.26     −13.47          ↘
Last 2 years     +1,824.36     +24.84          ↗
Last 4 years     +8,491.00     +1,255.43       ↗
Last 6 years     +8,539.22     +1,359.49       ↗
Last 8 years     +9,158.34     +101,759.33     ↗
Last 10 years    +9,153.30     +65,194.44      ↗

researchers to predict and model the price of cryptocurrencies and to analyze the volatility of the crypto market. The interest in cryptocurrencies increased even more after the great cryptocurrency crash (also called the Bitcoin Crash) at the beginning of 2018. The price reached a peak of nearly $20,000 per bitcoin in late 2017 and has since fluctuated quite a bit (2018–2019), averaging at about $7,000 as of April 2020. Crypto Research Report (2020) presents price predictions for several cryptocurrencies (bitcoin, bitcoin cash, ethereum, litecoin, and stellar); based on a new comprehensive analysis, the price of bitcoin was predicted to reach almost $20,000 in 2020 and to keep rising to almost $400,000 by 2030. Figure 3 shows the graph of cryptocurrency volatility for the top cryptocurrencies during the period 2010–2020 (https://coinmarketcap.com). A short summary on the volatility trends of bitcoin and other top cryptocurrencies currently circulating in the market can be found in Table 1 (https://coin.dance) and Table 2 (https://www.tradingview.com) respectively.

It is difficult to say exactly what drives the price of these cryptocurrencies over time. As with most price fluctuations in the cryptocurrency world, the exact causes are difficult to pinpoint. The entire crypto industry is known for its extreme volatility. The top 10 cryptocurrencies represent approximately 85% of the total market share, with bitcoin dominating with about 64% of the market capitalization (https://coinmarketcap.com). With the ever increasing interest in cryptocurrencies and their significance in the financial world, it is necessary to have a comprehensive analysis on and forecasting of the volatility dynamics of cryptocurrencies.

However, despite the growing interest, acceptance, and integration of cryptocurrencies in global financial markets, there is limited research on modeling the volatility dynamics of cryptocurrencies. Such unstable fluctuations are difficult for users to predict. Several factors may influence prices of cryptocurrencies over the years. These include both internal and external factors. Factors related to the crypto market (e.g., trading volume, market beta, and volatility) can be regarded as significant determinants of cryptocurrency price (during 2010–2020). Moreover, the attractiveness of cryptocurrencies can also influence their price (Sovbetov, 2018). Figure 4 provides an overview of the various types of factors that influence cryptocurrency prices.

Generating an accurate prediction model for such a complex problem is very challenging. The problem of cryptocurrency price prediction is still in its nascent stages and requires further research efforts to explore this area. So, the main objective of this work is to review all the techniques and methods used between 2010 and 2020 in cryptocurrency price prediction. Statistical and ML approaches are the two commonly used techniques in cryptocurrency price prediction. The motivation of this work is the speed of acceptance or recognition of cryptocurrencies as financial instruments and the increased utilization of ML techniques in predicting time-series problems to seek more accurate predictions. We aim to synthesize the collective knowledge from the first 10 years of cryptocurrency operation and development and to highlight the significant techniques adopted for cryptocurrency price prediction, ranging from traditional statistical methods to the recent ML approaches.

Figure 5 gives an outline of the survey theme and paper selection. This study also compares the different ML models to identify efficient and robust approaches for price prediction. This article can help researchers in developing robust models for cryptocurrency price prediction, using the conclusions derived at the end.

We divide the rest of this article into the following sections. In Section 2, we discuss the methodology used to conduct this survey. In Section 3, we review the common statistical and ML techniques. In Section 4, traditional statistical techniques for prediction of cryptocurrency prices are discussed. In Section 5, we discuss the ML techniques for prediction of cryptocurrency prices. Section 6 provides the discussion, and future directions are listed in Section 7. Finally, Section 8 concludes our work.

2 METHODOLOGY

Research interest in the field of cryptocurrencies has started to increase considerably over the past few years. With the increasing popularity of cryptocurrencies and the worldwide attention as an emerging financial market, the number of publications in this area has been increasing since 2017, and especially in 2018 and 2019; and the trend continued in 2020. It is interesting that the research studies initially focused more on bitcoin than on the cryptocurrency topic in general. However, since 2018, the focus on the general cryptocurrency topic has started increasing (https://www.scopus.com, https://onlinelibrary.wiley.com/). The topic of cryptocurrency price prediction is still in its nascent stages and requires further research efforts to explore this area. When we consider only the finance-related literature (which includes the areas of finance, economics, business, management, and accounting), the number of publications is not very high. However, it is notable that the topic is being explored and is gaining attention in other disciplines, including mathematics, engineering, and computer science. The major milestones in cryptocurrency price prediction research are depicted in Figure 6.

In this paper, we present a systematic literature review of cryptocurrency price prediction using traditional statistical and ML techniques. We searched for topic-related keywords and short phrases such as “cryptocurrency price prediction,” “ML techniques (and) cryptocurrency price prediction,” “cryptocurrency price prediction using ML techniques,” and “DL (and) cryptocurrency price prediction.” After conducting the search process, the papers were screened and classified into different categories that included traditional statistical and ML techniques. We further classified the latter into different ML and DL techniques. The majority of the papers are related to use of ML techniques in predicting the price of a cryptocurrency, and some of the papers fall under DL and RL techniques. All of these papers

TABLE 2 Volatility trends of top cryptocurrencies currently circulating in the market

                             Performance (%)
Name            Change (%)   Weekly   Monthly   6-Month   Year to date   Yearly    Volatility
Bitcoin         −0.25        −1.05    0.40      71.32     27.71          4.98      0.55
Ethereum        −0.55        −2.48    4.13      89.97     82.11          26.91     1.33
Tether          0.03         0.05     0.13      0.31      0.56           0.19      0.20
XRP             −0.83        −0.36    12.02     29.82     2.98           −27.65    3.10
Bitcoin cash    −0.05        −3.25    1.16      25.22     9.32           −21.73    0.61
Cardano         1.09         −6.16    54.73     363.11    279.22         188.38    2.81
Bitcoin SV      −0.87        −4.51    7.31      41.61     78.80          32.06     1.19
ChainLink       −0.04        −1.84    74.45     282.45    352.10         190.72    3.30
Binance coin    3.58         −1.71    15.39     71.53     29.96          −11.50    6.19
Litecoin        −0.15        −3.32    2.53      16.94     3.04           −31.32    1.20

FIGURE 4 Overview of common factors influencing cryptocurrency price

talk about cryptocurrency price prediction using different methods. This area is still new, and the number of published papers is not very high; nevertheless, it is a very crucial area and deserves to be explored because of the impact it is having on the financial system. Figure 7 gives an illustration of the survey process and the scheme used.

3 COMMON STATISTICAL AND ML TECHNIQUES: AN OVERVIEW

The emergent price prediction schemes for cryptocurrency include strategies that are based on statistical and ML technologies. Generally, the statistical models use mathematical equations to encode information extracted from the data. The traditional techniques for cryptocurrency price prediction usually adopted statistical and econometric models (Brooks, 2019). Econometric approaches apply an integration of statistical and economic theories to estimate and predict the values of various economic variables. In some cases, statistical-model-based techniques can quickly provide adequate models (Wang & Chen, 2020). A linear statistical-model-based approach evaluates the linear relationship between prices and an explanatory variable. If multiple explanatory variables exist, it is possible to model the linear relationship between explanatory variables (independent) and response variables (dependent) with the help of multiple linear models. The commonly used linear statistical model for time-series analysis is the autoregressive moving-average model (Choi, 1992). While investigating the cryptocurrency price fluctuations using econometrics, researchers usually utilize statistical models on time-series data. Among these models, the most widely used ones are the generalized autoregressive conditional heteroscedasticity (GARCH) model, multivariate linear regression, multivariate vector autoregressive model, and extended vector autoregressive model (Fang et al., 2020).

With the advancement of big data technology and artificial intelligence, numerous research studies have applied ML models to classification and prediction problems (El-Bannany, Sreedharan, & Khedr, 2020; Sreedharan, Khedr, & El-Bannany, 2020a; 2020b). Many researchers have focused their efforts on applying these new techniques on financial markets (Dixon, Halperin, & Bilokon, 2020; El-Bannany et al., 2020; Galeshchuk & Mukherjee, 2017; Hatefi Ghahfarrokhi & Shamsfard, 2020; Nikou, Mansourfar, & Bagherzadeh, 2019; Sarlin & Marghescu, 2011; Sreedharan et al., 2020a; 2020b). Hatefi Ghahfarrokhi and Shamsfard (2020) investigated the impact of social media data in predicting the Tehran Stock Exchange variables. Galeshchuk and Mukherjee (2017) investigated the ability of deep convolutional neural networks (NNs) to predict the direction of change in forex rates. Sarlin and Marghescu (2011) constructed a neuro-genetic model for predicting currency crises by using a genetic algorithm for specifying (1) the combination of inputs, (2) the network configuration, and (3) the training parameters for a back-propagation artificial NN (ANN). Nikou et al. (2019) evaluated

FIGURE 5 Outline of the subject matter in this survey

FIGURE 6 Major milestones in cryptocurrency price prediction research

the prediction power of ML models in a stock market. In this context, in the financial industry, the application and use of ML algorithms for cryptocurrency price prediction is increasing and is getting attention from researchers in various disciplines as well. ML is a technique that has been used as a model or framework for predicting various aspects across the industry for over three decades. In recent years, ML and its associated techniques have made notable advances in various fields.

ML is categorized into supervised and unsupervised learning. In supervised learning, labeled instances are present in a data set, whereas unsupervised learning has no such labels. Examples of supervised learning techniques include NNs and support vector machines (SVMs), whereas clustering techniques fall under unsupervised learning. ANNs and SVMs are the two most widely used algorithms for predicting price fluctuation (Patel, Shah, Thakkar, & Kotecha, 2015). Deriving a function from a training data set is the main task in supervised learning. Some of the main ML techniques discussed are as follows.

• Logistic regression: Logistic regression is one of the traditional variate regression methods, mainly used in binary classification problems (Kleinbaum & Klein, 2002). The output or response variable y ∈ {0, 1} indicates a class label, which can be predicted using the input feature values x_i, where i = 1, …, k. The logistic regression model can be represented as follows:

$$\operatorname{logit}(P(y)) = \log\frac{P(y)}{1 - P(y)} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k \tag{1}$$

• Support vector machine: The SVM was introduced as an induction principle by Vapnik (1999) to avoid overfitting of data. SVMs are primarily used to identify the maximum margin hyperplane. It has a high-dimensional feature space and is also used for system learning. This technique is flexible as it creates explicit and accurate boundaries that lead to fast training results, besides being easy to use (Zhang & Wang, 2015). It performs well with a small data set and provides a nonlinear solution by applying a kernel function to map the input variables into high-dimensional space. It produces a classification hyperplane that is used to differentiate between two classes with maximum margin and is used to solve the pattern classification problem.

• Artificial neural network: An ANN comprises interconnected units called neurons that are activated depending on the input. It mimics the human brain in information processing and interacts with other processing features (Lu, 2010). The network consists of input and output neurons, where input neurons are triggered based on the sensing environment. Other neurons are activated using weighted connections from neurons. The sigmoid function is employed at each of the hidden layers as a transfer function. To adjust the weights, gradient descent with momentum is used so that training can move toward the global minimum of the error.

• Random forest: Random forest (RF) is one of the popular approaches used for performing classification tasks. It uses an ensemble of decision trees for better classification results (Chen, Li, & Sun, 2020). The decision tree is one of the vital ML methods that uses a tree structure to iteratively partition the feature space (a node) until a single class sample is obtained. This pure node at the end is called a leaf node, and a class label is assigned to this node. By performing bootstrap aggregation and a random feature selection process, RF allows a random subset of the whole feature space to be assigned to the growth of each tree.

• Deep learning: DL is a subclass of ML algorithms that uses multiple layers to gradually extract high-level features based on raw data

FIGURE 7 Survey scheme

FIGURE 8 A transaction–address graph representation of the bitcoin network

input. The term “deep” refers to the number of processing layers through which the data transformation takes place. It is used to get more precise details using the layered approach. For example, in image processing, details such as an edge can be identified using lower level layers while all high-level information, such as letters, digits, or faces, can be extracted using higher level layers. Most of the modern DL models are based on ANNs (Längkvist, Karlsson, & Loutfi, 2014).

• Reinforcement learning: In the reinforcement learning (RL) process, learning is achieved through interaction between learning objects and their related environment. Objects try to learn using the trial-and-error method. RL consists of three components: a value function, an environment, and a reinforcement function. The environment for RL is often dynamic, with a set of probable states. For each state, there exists a set of possible actions at each time (Dixon et al., 2020).

• Gradient boosting: Gradient boosting (GB) is a technique for both regression and classification problems. Similar to RFs, GB is an ensemble learner. This means it will create a final model based on a collection of individual models. It produces a prediction model that is an ensemble of weak prediction models, such as decision trees (Li, Chamrajnagar, Fong, Rizik, & Fu, 2019; Sun, Liu, & Sima, 2020). It builds the model in a stage-wise fashion, and it generalizes other boosting methods by allowing optimization of an arbitrary differentiable loss function. The implementations of this technique can have different names, among which the common ones are GB machines (GBMs) and XGBoost.

4 TRADITIONAL STATISTICAL TECHNIQUES FOR CRYPTOCURRENCY PRICE PREDICTION

The traditional approaches for cryptocurrency price prediction usually applied statistical and econometric models. Econometric approaches apply an integration of statistical and economic theories to estimate and predict the values of various economic variables (Brooks, 2019). While examining the cryptocurrency price volatility and prediction using econometrics, researchers generally adopted statistical models on time-series data. In this section, we review the existing statistical and econometric techniques for cryptocurrency price prediction.

Classical approaches were used previously, such as Holt–Winters exponential smoothing by Chatfield and Yar (1988), to forecast time-series data. This approach depends on linear assumptions. In this approach, input data are segregated into several trends and are used for predicting features that have seasonal effects, such as in sales. This approach cannot be used to predict cryptocurrency price accurately as there are no seasonal effects with cryptocurrency.

Sovbetov (2018) employed the augmented Dickey–Fuller unit-root test and bound testing approach to examine factors influencing the prices of five cryptocurrencies (bitcoin, ethereum, dash, litecoin, and monero) over 2010–2018 using weekly data. The results revealed that market beta, trading volume, and volatility have an influential impact on the prices of all five cryptocurrencies both in the short and long run.

Roy, Nanjiba, and Chakrabarty (2018), using annual bitcoin data from 2013 to 2017, applied time-series models (autoregressive integrated moving-average [ARIMA] model, autoregressive model, and moving-average model) to forecast the bitcoin price. They found that the ARIMA model was the best model to predict the bitcoin price. Guo and Antulov-Fantulin (2018) collected data about volatility and order

cryptocurrency (bitcoin, ethereum, monero, litecoin, and dash) prices using the dynamic ordinary least-squares method and found that the prices of these currencies depend on their computing power and network.

Bystrom and Krygier (2018) extracted daily, weekly, and monthly data covering 2011 to 2017 to investigate correlations, regressions, vector autoregression (VAR), and impulse response functions techniques. The variables driving the changes in bitcoin were examined, and they found that these variables change with volatility of the trade-weighted USD currency index and search pressures on bitcoin-related words on Google. Kaya (2018) used correlation and regression analyses and extracted weekly data from August 8, 2014, to May 4, 2018, from the online cryptocurrency market Bitstamp to study the impact of public interest, volatility metric of S&P 500 Index options, and political and regulatory news on cryptocurrency prices. The results revealed that public interest is the most influential factor that drives cryptocurrency prices.

Kjærland, Khazal, Krogstad, Nordstrøm, and Oust (2018) used econometric methods represented by an autoregressive distributed lag model and the GARCH model to study the determinants of bitcoin (BTC) price dynamics. The data used were daily spot rates for BTC/USD for the period between January 1, 2013, and February 20,
book related to bitcoin over the period September 2015 to April 2017
2018. The results revealed that returns on the S&P 500 are essen-
and proposed temporal mixture models to predict the changes in bit-
tial in explaining the BTC price dynamics. Phillips and Gorse (2018a)
coin price. The proposed models worked better than other models
used the wavelet coherence approach to investigate the co-movement
in predicting the changes in the price of bitcoin. Abu Bakar, Rosbi,
between a cryptocurrency price and its related factors represented
and Uzaki (2019) used a moving-average method to predict the bit-
by social media factors, Google search volume, and Wikipedia. Cryp-
coin price. Data collected from October 1 until December 20, 2019,
tocurrency price data collected from a number of exchanges over the
was used in experiments. The moving-average forecasting method
period 2010–2017, from social media factors derived from Reddit,
was implemented using 2-day, 3-day, 4-day, and 7-day calculations.
from Google search volume from Google Trends service, and from
The results revealed that the 2-day moving-average method is the
Wikipedia to track the number of new users learning about a cryp-
better prediction method with the lowest mean absolute error (MAE)
tocurrency were used for analysis. The results revealed a positive
percentage for all observation periods.
relationship between a cryptocurrency price and its related factors.
Akcora, Dey, Gel, and Kantarcioglu (2018) extracted bitcoin price
Wiedmer (2018) investigated the determinants of cryptocurrency
data over the period 2009–2018 to predict bitcoin prices. They
price using a panel of 17 cross-sections. He employed unit-root
introduced a new concept of k-chainlets on bitcoins that enlarges
and cointegration tests and estimated the effects with vector error
the ideas of motifs and graphlets to blockchain graphs. They used
correction models, dynamic ordinary least squares and fully mod-
chainlets or bitcoin subgraphs to evaluate the topological structure
ified ordinary least squares. Causality flows were tested by weak
over time. They developed an approach to comprehend chainlets and
exogeneity and Granger causality tests. The results showed that
for local topological structure and used the techniques that have a
Metcalfe's law, community factors, and search engine queries had
greater effect on the price dynamics. These important chainlets are
an influential impact on the price of the cryptocurrency. Blau (2017)
used for price prediction. The bitcoin transaction graph possesses
three main components: addresses, blocks, and transactions. Bitcoins extracted price data from Bitcoin Charts and Bloomberg over the
transferred from an input address to an output address are said to be period July 17, 2010, to June 1, 2014, and used regression analysis
one transaction. Figure 8 shows 13 addresses and four transactions to investigate the impact of speculative trading, the prior 5-day bit-
in a network. Granger causality is used for bitcoin price prediction. coin return, the prior 5-day volume turnover, outstanding bitcoins,
Addresses are represented as circles, and transactions are shown as and the volatility estimate for the exchange rates of 51 other cur-
rectangles. An edge indicates a transfer of coins. The coin at address rencies variables on the price of bitcoin. The results revealed that
a8 represents unspent coin. Chainlet analysis provides a more in-depth speculative trading is irrelevant in explaining changes in the price
insight into local topological properties of the blockchain and the role of bitcoin. Hayes (2017) used cross-sectional data about 66 of the
of those local higher order topologies in the bitcoin price formation. most widely used cryptocurrencies to investigate the impact of three
They found that specific types of chainlets have a high predictive factors on the cryptocurrency value: the level of competition in the
utility for bitcoin prices. Moreover, extreme chainlets exhibit a vital network of producers, the rate of unit production, and the diffi-
role in the bitcoin price prediction. culty of the algorithm used to ‘‘mine’’ for the cryptocurrency. The
Bhambhwani, Delikouras, and Korniotis (2019) extracted data from results revealed that the three factors have a significant impact on the
August 2015 to January 2019 to investigate the fundamental drivers of cryptocurrency value.
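The moving-average comparison attributed to Abu Bakar et al. (2019) in this section can be illustrated with a minimal sketch. The price series below is made-up illustrative data, not data from that study, and the function names are hypothetical. The error measure is assumed here to be the mean absolute percentage error, which is one plausible reading of "MAE percentage"; the study's exact definition may differ.

```python
def moving_average_forecast(prices, window):
    """Forecast each price as the mean of the previous `window` prices."""
    return [
        sum(prices[t - window:t]) / window
        for t in range(window, len(prices))
    ]


def mean_absolute_percentage_error(actual, predicted):
    """Average of |actual - predicted| / |actual|, expressed in percent."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def compare_windows(prices, windows=(2, 3, 4, 7)):
    """Return {window: error in percent}, scored on a common set of days."""
    start = max(windows)  # first day every window can forecast
    actual = prices[start:]
    scores = {}
    for w in windows:
        forecasts = moving_average_forecast(prices, w)
        # forecasts[i] predicts prices[w + i]; drop early days to align
        aligned = forecasts[start - w:]
        scores[w] = mean_absolute_percentage_error(actual, aligned)
    return scores


# Hypothetical daily closing prices (illustrative numbers only).
prices = [7200, 7250, 7300, 7280, 7350, 7400,
          7380, 7450, 7500, 7480, 7550, 7600]
scores = compare_windows(prices)
best_window = min(scores, key=scores.get)
```

Scoring every window on the same evaluation days keeps the comparison fair; otherwise shorter windows would be judged on more observations than longer ones.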
TABLE 3 Summary of traditional statistical and econometric techniques surveyed in Section 4
Technique Cryptocurrency Reference Data type Data frequency Time range Target variables Performance metric Data source
ADF unit-root Bitcoin, ethereum, Sovbetov (2018) Market data Weekly data 2010–2018 Market beta, trading Crypto50 index price BitInfoCharts,
test and dash, litecoin, and volume, and volatility Finance, World
bound testing monero Bank, and Google
approach Trends
ARIMA, AR, Bitcoin Roy et al., (2018) Market data Daily July Price prediction Accuracy CoinDesk
and MA models 2013–August
2017
Temporal mix- Bitcoin Guo and Order book data Hourly Sep Prediction of volatil- Accuracy They are not men-
ture model Antulov-Fantulin (2018) volatility 2015–April ity tioned
2017
k-chainlets Bitcoin Akcora et al., (2018) Market data Daily 2009–2018 Prediction of price RMSE, wallet gain Bitcoin core
performance
MA method Bitcoin Abu Bakar Cash data Daily October Price prediction Mean absolute error CoinDesk
et al., (2019) 1–December percentage
20, 2019
Vector error Bitcoin Abbatemarco Market data Daily November Price prediction Cost and revenues Publicly
correction et al., (2018) 2013–September databases
2017
Dynamic Bitcoin, ethereum, Bhambhwani Market data Weekly aver- August 7, Prediction of price Least squares Coinmetrics
ordinary dash, litecoin, mon- et al., (2019) age 2015–January method
least squares ero, and dash 25, 2019
method
Correlation Bitcoin Giudici and Market data clos- Daily May 18, Price prediction RMSE full, RMSE cryptocoincharts.info
network and Abu-Hashish (2019) ing price 2016–April autoreg
VAR model 30, 2018
FCVAR model Bitcoin Dos Santos Maciel High and low bit- Daily January Price prediction Range (i.e., measure coindesk.com
and Ballini (2019) coin prices 2012–February of realized volatility)
2018
MF-DCCA Bitcoin Kim et al., (2016) Daily volume and Daily July 17, Price–volume Cross-correlations cryptocompare
price index data 2010–May 2, cross-correlation coefficient
2018
Kapetanios Bitcoin, ethereum, Gunay (2019) Market capitaliza- Daily August 7, Impact of public log price, Maki CoinMarketCap
unit-root test, litecoin, and ripple tion closing price, 2015–January information arrivals cointegration test,
Maki cointe- public information 13, 2018 on cryptocurrency Granger causality
gration analysis arrivals in bull and market via Twitter
and Markov bear markets posts (causality)
regime switch-
ing regression
analysis
ARIMA model Bitcoin Anupriya and Open, low, and Daily January 1, Seasonality and Accuracy coindesk.com
Garg (2018) high close price 2015–September trend present in the
23, 2018 price data
Random matrix N = 119 cryptocur- Stosic et al., (2018) Market data (clos- Daily August 26, Cross-correlations Cross-correlations CoinMarketCap
theory and rencies ranked by ing price) 2016–January between price
minimum market capitalization 18, 2018 changes of different
spanning trees cryptocurrencies
GARCH–MIDAS Bitcoin, ethereum, Walther et al., (2019) Explanatory vari- Daily, weekly, May 1, Prediction of volatil- HMSE CoinMarketCap,
framework litecoin, ripple, and able (global real and monthly 2013–July ity in cryptocurrency CRIX from
stellar, the cryp- economic activity) 31, 2019 markets thecrix.de (Trimborn
tocurrency index & Härdle, 2019)
CRIX
Correlation in Bitcoin, ethereum, Alvarez-Ramirez Market data (mar- Sliding August Analyze major cryp- Confidence band CoinMarketCap
DFA and DCCA ripple, and litecoin et al., (2018) ket capitalization windows 2015–April tocurrencies' prices
and availability) approach 2019 and prediction
Markov-switching Bitcoin, ethereum, Caporale and Closing prices Daily Different Modeling volatility, P-values CoinMarketCap,
GARCH ripple, and litecoin Zekokh (2019) periods with prediction of VaR CoinDesk price
models end date and ES index
as April 30,
2018
ARIMA Bitcoin, XRP, and Alahmari (2019) Daily, weekly, and Time-series 2013–2018 Price prediction MAE, MSE, RMSE CoinMarketCap
ethereum monthly data
Natural Bitcoin Karalevicius Bitcoin news Daily Till February Price movement pre- Returns and Sharpe Expert news media,
language et al., (2018) media on investor 2016 diction ratios CoinDesk, Cointele-
processing sentiment graph, NewsBTC
techniques,
lexicon-based
sentiment
analyzer
Correlations, Bitcoin Bystrom and Kry- Market data Daily, weekly, 2011–2017 Prediction of volatil- RMSE, PSE, QL loss Luxembourg-based
regressions, gier (2018) and monthly ity function and the bitcoin exchange
VAR, and data R2 LOG loss function Bitstamp
impulse
response
Correlation Bitcoin Kaya (2018) Financial data Weekly data August 8, Price movement Coefficients, SE, t Bitstamp
and regression 2014–May 4, stat, P-values
analysis 2018
Autoregressive Bitcoin Kjærland et al., (2018) Sentiment-based Daily January 1, Determinants of BTC Variance inflation Quandl
distributed data 2013, and price dynamics factors
lag model and February 20,
generalized 2018
autoregressive
conditional
heteroscedas-
ticity
model
Wavelet Bitcoin, ethereum, Phillips and Sentiment-based Daily 2010–2017 Wavelet coherence P-values Brave New Coin
coherence monero Gorse (2018a) data analysis of price
approach
Vector error Bitcoin and others Wiedmer (2018) Market data Daily September Volume-weighted Unit root and cointe- Marketcap
correction 2017–January average of prices gration tests
models, 2018 from each market
dynamic OLS,
and fully
modified OLS
Regression Bitcoin Blau (2017) Financial and technical Daily July 17, Volatility estimation Correlation matrix Bitcoincharts and
analysis data 2010–June 1, Bloomberg
2014
Regression Bitcoin Hayes (2017) Data of 66 cryp- Daily 2010–2013 Cost of production SE and t-statistic CoinMarketCap,
model tocurrencies coinwarz.com,
cryptsy.com, bitcoin-
wisdom.com, and
blockchain.info
Correlations, Bitcoin Vaddepalli and Market data Daily 2014–2016 Impact of macroe- SE Fiatleak, Cryp-
regression, and Antoney (2018) conomic factors on tocomp, and
ANOVA bitcoin price BitCoiny
Regression Bitcoin Vieira (2017) Market data Daily November 1, Asymmetrical test-stat, P-value Quandl and US
analysis 2013–January impacts on price Department of Trea-
20, 2016 volatility sury

Wavelet Bitcoin Krištoufek (2015) Market data Daily 2011–2014 Trade, money supply Total circulating bit- Bitfinex, Bitstamp,
coherence and price level coins, transactions and BTC-e
analysis no., difficulty, est.
output vol., trade vol.
vs. transaction vol.,
hash rate
GARCH–MIDAS Bitcoin Conrad et al., (2018) Financial and Daily May Long- and Returns on the Bitcoinity
model macroeconomic 2013–December short-term volatility S&P 500
2017
GARCH and Bitcoin Troster et al., (2019) Market data Daily July 19, Bitcoin returns and RMSE Coindesk
GAS 2010–April risk
16, 2018
Copula quan- Bitcoin, Ethereum, Bouri et al., (2019) Detrended vol- Daily January 1, Prediction of price
tile causality and other 5 ume data 2013–December volatility, return
31, 2017
Granger-causality Bitcoin Badenhorst (2018) Market data Daily 2014–2018 Volatility prediction, SD, skewness, min CoinMarketCap
method and return and max
ARCH
MSGARCH Bitcoin Ardia et al., (2019) Bitcoin mid prices Daily August 18, Volatility dynam- Mean, median, SD, Datastream
model 2011–March ics of bitcoin's log skewness, kurtosis,
3, 2018 returns VaR forecasting
Four Bitcoin, dash, litecoin, Charles and Closing price Daily June 1, Breakpoints and sudden Jumps and structural CoinMarketCap
GARCH-type and ripple Darné (2019) 2014–November shifts in volatility breaks
models 11, 2018
Markowitz 500 most capitalized Brauneis and Mes- Market data Daily January Risk–return benefits Mean and variance CoinMarketCap
mean–variance cryptocurrencies tel (2019) 2015–December
framework 2017
Fractional 13 popular coins, Yaya et al., (2018) Price data Daily August 7, Volatility trends Dependency of the Coin Metrics
cointegration including bitcoin 2015–November paired variables
technique in 28, 2018
VAR set-up
HMM and SIR Bitcoin and other 3 Phillips and Daily closing price Daily April Price prediction and Sharpe ratio, Sortino Crypto Compare
method Gorse (2017) and volume 2015–September bubbles ratio, returns
2016
Dynamic topic Bitcoin and Phillips and Social media data Daily August 8, Future price move- Event-based data Reddit
model and ethereum Gorse (2018b) and price data 2016–August ment prediction analysis
Hawkes model 30, 2017
Sentiment Cryptocurrencies Zamuda et al., (2019) Social media data — — Crypto forecasting New sentiment anal- —
analysis, ARIM and investment ysis indicators
Granger-causality Bitcoin, ethereum Bartolucci Sentiment and — December Price prediction Mean, SD, min–max Github, CoinMarket-
method et al., (2020) market data 2010–August values Cap
2017
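Several rows of Table 3 rely on GARCH-type volatility models. As a reminder of the basic mechanism, the following minimal sketch computes the standard GARCH(1,1) conditional variance recursion, sigma2[t] = omega + alpha * r[t-1]^2 + beta * sigma2[t-1]. The function name, the parameter values, and the return series are illustrative assumptions rather than estimates from any surveyed study; in practice the parameters are usually fitted by maximum likelihood.

```python
def garch11_variance(returns, omega, alpha, beta):
    """Conditional variances from the GARCH(1,1) recursion.

    sigma2[t] = omega + alpha * returns[t-1]**2 + beta * sigma2[t-1]
    The recursion starts from the unconditional variance
    omega / (1 - alpha - beta), which requires alpha + beta < 1.
    """
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        raise ValueError("need omega > 0, alpha >= 0, beta >= 0, alpha + beta < 1")
    sigma2 = [omega / (1.0 - alpha - beta)]
    for r in returns:
        sigma2.append(omega + alpha * r ** 2 + beta * sigma2[-1])
    # One value per return plus the starting value; the last entry is
    # the one-step-ahead variance forecast.
    return sigma2


# Hypothetical daily returns and parameters (illustrative only).
returns = [0.01, -0.03, 0.02, 0.05, -0.04]
variances = garch11_variance(returns, omega=0.00001, alpha=0.10, beta=0.85)
```

A large absolute return raises the next day's variance through the alpha term, while the beta term makes that increase decay gradually, which is the volatility clustering these models are meant to capture.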

FIGURE 9 Illustration of ML model in cryptocurrency price prediction

Vaddepalli and Antoney (2018) examined the impact of financial openness, internet penetration, and inflation on the change in the price of bitcoin in the United States, Canada, Russia, Brazil, China, and South Africa. Data were collected from World Bank databases, Fiatleak, Cryptocompare, and BitCoiny over the period 2014–2016, together with 15-year average values for the explanatory variables. The results of correlations, regression, and analysis of variance indicated that the variables analyzed do not significantly affect the price of bitcoin. Vieira (2017) used regression analysis on a set of time-series data spanning the period from November 1, 2013, to January 20, 2016, to investigate the impact of the following variables on bitcoin price: S&P 500 index; daily Treasury real yield curve rates on “Treasury Inflation-Protected Securities” for a fixed maturity of 7 years; daily USD price per ounce of gold; the daily number of confirmed bitcoin transactions; the total number of unique addresses used on the bitcoin blockchain; total value of coinbase block rewards and transaction fees paid to miners; and the daily number of queries for the term “Bitcoin” made in Wikipedia. The results showed that only the index for S&P 500 is irrelevant in explaining the changes in bitcoin prices.

Krištoufek (2015), using data over the period from September 14, 2011, to February 28, 2014, employed wavelet coherence analysis to examine the impact of usage in trade, money supply, and price level on bitcoin price over the long term. The results showed that these factors have an influential impact on bitcoin price. Abbatemarco, Maria De Rossi, and Salviotti (2018) used an econometric model represented by the augmented Dickey–Fuller test, the Johansen tests for cointegration, and a vector error correction model to estimate the bitcoin price based on the price data available over the period from November 11, 2013, to September 5, 2017. The results supported the validity of the suggested model to estimate the price of bitcoin.

Giudici and Abu-Hashish (2019) introduced partial correlations and correlation networks into VAR models. The model helped to determine the cryptocurrency price dynamics in different crypto exchange markets and allowed one to understand its correlation with other traditional market prices. The application of VAR correlation networks also enabled building a model for predicting bitcoin price that leverages the information contained in different correlation patterns among various exchange prices. Dos Santos Maciel and Ballini (2019) analyzed the dynamic behavior and predictability of daily dynamics (high and low) of bitcoin prices. They applied the fractionally cointegrated vector autoregressive (FCVAR) model to understand the price patterns of bitcoin and dollar. The empirical analysis was performed for the period from January 2012 to February 2018. They made comparisons with various other algorithms, and the results indicated that fractionally cointegrated VAR performed better.

Kim et al., (2016) used multifractal detrended cross-correlations analysis to determine price–volume cross-correlation from July 2010 to May 2018 in the bitcoin market. A cross-correlation test was performed to evaluate the cross-correlations between the series. The level of cross-correlations was quantified using the detrended cross-correlations analysis coefficient. Changes in bitcoin price and trading volume mutually interact with each other in a nonlinear way that may assist the participants of the bitcoin market in boosting profit. Multifractality in cross-correlations presents the turbulent and dynamic characteristics of the bitcoin market. Public information arrival is the most essential asset of an efficient market.

Gunay (2019) explored the impact of public information arrival (official Twitter announcements) on the cryptocurrency market (bitcoin, ethereum, litecoin, and ripple) using various methods, such as Maki cointegration analysis, Markov regime-switching regression analysis, and the Kapetanios unit-root test. The results indicated that positive public information arrival has a positive impact on ripple's value. All models exhibited the same results with bitcoin as an independent variable, indicating its significant effect on ripple's value. The results were evaluated under bull and bear markets. In the bull market, public information arrival had a positive impact on ripple's value, whereas the bear market did not have enough power to divert the ripple value from a downward trend.

An ARIMA model was used by Anupriya and Garg (2018) for predicting bitcoin price. This model is recommended as it gives better results than predicting directly. This model shows good accuracy, with a mean percentage error of less than 6%. Cross-correlations between price fluctuations of various cryptocurrencies were analyzed using minimum spanning trees and random matrix theory by Stosic, Stosic, Ludermir, and Stosic (2018). To quantify correlations, the change in price or cryptocurrency return value was calculated. The results showed multiple collective behaviors in the crypto market that can be appropriate in constructing cryptocurrency investment portfolios.

Several econometric methods in time-series research, such as GARCH and the Baba–Engle–Kraft–Kroner model, have been used in the literature on cryptocurrency research. Walther, Klein, and Bouri (2019) used the GARCH–mixed data sampling (MIDAS) framework to predict the volatility of cryptocurrencies (bitcoin, litecoin, ethereum, stellar and ripple) and the cryptocurrency index CRIX. They determined most exogenous drivers and found that global real economic activity gave the most accurate predictions. Heteroscedasticity-adjusted mean squared error was used to analyze the forecast. Future work aimed to address the same issue with some other methodology or by using intra-day data to construct daily measures of cryptocurrency volatility.

Alvarez-Ramirez, Rodriguez, and Ibarra-Valdez (2018) adopted detrended fluctuation analysis over a sliding window to evaluate long-range correlations for bitcoin price returns. Caporale and Zekokh (2019) used Markov-switching GARCH (MSGARCH) models to

estimate the changes in the cryptocurrencies (bitcoin, ethereum, ripple and litecoin). The result shows that using the standard GARCH model may yield incorrect predictions resulting in ineffective risk management. This can be improved by using regime-switching, which might be useful for both regulators and investors. An ARIMA model was presented by Alahmari (2019) to predict three major cryptocurrency prices (bitcoin, XRP, and ethereum) on the basis of daily, weekly, and monthly time series. The ARIMA model outperformed other models in terms of mean squared error (MSE), MAE, and root-mean-squared error (RMSE).

The influence of media sentiment on bitcoin price was studied by Karalevicius, Degrande, and De Weerdt (2018). Natural language-processing methods were used for data preprocessing. Lexicon-based sentiment analysis methods were integrated with Harvard Psychosocial and finance-industry-specific dictionaries to quantify sentiments and to inspect the reaction patterns. Reaction patterns in the price following noteworthy news stories were analyzed, and an interday trading strategy was then proposed to maximize the return. The return and Sharpe ratio values were examined, which showed that the sentiment-driven technique outperformed other bitcoin strategies.

Conrad, Custovic, and Ghysels (2018) extracted the long- and short-term volatility components of bitcoin using a GARCH–MIDAS model. This model converted the conditional variance into low- and high-frequency components. The findings were that the volatility has a negative or significant impact on long-term bitcoin volatility and that volatility risk premium has a significant positive impact on long-term bitcoin volatility.

In Troster, Tiwari, Shahbaz, and Macedo (2019), GARCH and generalized autoregressive score (GAS) models were used to predict bitcoin returns and risks. Out-of-sample performance was compared for both of these models. The results showed that the GAS model with heavy-tailed distribution provided the best out-of-sample prediction and, owing to its flexibility, the GAS model was more robust. Bouri, Lau, Lucey, and Roubaud (2019) applied a copula-quantile causality approach on volatility of cryptocurrencies. The strategy of the experiment extended the copula–Granger-causality in distribution (CGCD) method of Lee and Yang (2014). The study used copula functions to construct two tests of CGCD. A parametric test employed six parametric copula functions to discover dependency density between variables. The performance matrix of these functions varied with independent copula density. The results provided remarkable evidence of Granger causality from trading volume to the returns of seven large cryptocurrencies.

The work of Badenhorst (2018) focused on revealing whether spot and derivative market volumes cause volatility in bitcoin price. He used the Granger-causality method and ARCH(1, 1) for analysis. The result provided evidence that the spot trading volumes have a significant positive effect on price volatility, whereas the relationship between cryptocurrency volatility and the derivative market is uncertain.

An MSGARCH model was applied by Ardia, Bluteau, and Rüede (2019) to test the existence of institutional changes in the GARCH volatility dynamics of bitcoin's logarithmic returns. The results showed that MSGARCH models clearly outperform single-regime GARCH for value-at-risk forecasting.

The findings from Charles and Darné (2019), who examined four cryptocurrencies (bitcoin, dash, litecoin, and ripple), showed cryptocurrency returns are strongly characterized by the presence of jumps and structural breaks except for the dash market. Four GARCH-type models (i.e., GARCH, integrated GARCH, asymmetric power autoregressive conditional heteroscedasticity, and fractionally integrated GARCH) and three return types with structural breaks (original returns, jump-filtered returns, and jump-filtered returns with structural breaks) were considered. Their findings indicated the importance of jumps and structural breaks in cryptocurrency volatility.

Brauneis and Mestel (2019) adopted a Markowitz mean-variance framework to analyze the risk–return benefits of cryptocurrency portfolios. In an out-of-sample analysis accounting for transaction cost, they found that combining cryptocurrencies enriches the set of low-risk cryptocurrency investment opportunities.

Yaya, Ogbonna, and Olubusoye (2018) applied the fractional cointegration technique in VAR set-up to identify the dependence and persistence of bitcoin on other popular alternatives before and after the 2017–2018 crash in cryptocurrency markets. The research focus was necessitated since market players found bitcoin to drive other cryptocurrencies, though bitcoin is the most valuable and highly capitalized coin, taking about 40% of the 2,074 cryptocurrency types market share (https://cointelegraph.com). They included 13 highly priced and data-available cryptocurrencies in their analysis. The results revealed that higher persistence of shocks is expected after the crash due to speculation in the minds of cryptocurrency traders, and more evidence of non-mean reversions, implying chances of further price falls in cryptocurrencies.

Phillips and Gorse (2017) applied a hidden Markov model and the superiority and inferiority ranking approach to analyze the bubble-like behavior in cryptocurrency time series. Considering the hidden Markov model and the superiority and inferiority ranking method, an epidemic detection mechanism was used in social media to predict cryptocurrency price bubbles, which classify bubbles through epidemic and nonepidemic labels. Experiments demonstrated a strong relationship between Reddit usage and cryptocurrency prices.

Phillips and Gorse (2018b) applied a dynamic topic model and Hawkes model to decipher relationships between topics and cryptocurrency price movements. The authors used a latent Dirichlet allocation model for topic modeling, which assumes each document contains multiple topics to different extents. The experiment showed that particular topics tend to precede certain types of price movements in the cryptocurrency market.

Zamuda et al., (2019) used new sentiment analysis indicators to analyze cryptocurrency trends. A general model evaluating the influence between a user's network action–reaction influence model is mentioned in this research. The research covered the different aspects of necessary perspectives needed when preparing forecasting and investment, supported by cryptocurrency social media sentiment analysis.

Bartolucci et al., (2020) examined cryptocurrency prices with the “butterfly effect,” which means “issues” of open-source project provides insights to improve prediction of cryptocurrency prices. Sentiment, politeness, emotions analysis of GitHub

TABLE 4 Summary of ML techniques surveyed in Section 5


Technique Cryptocurrency Reference Data type Data frequency Time range Target variables Performance metric Data source
Binary Bitcoin, ripple, and Derbentsev Time-series Daily January 1, Short-term RMSE Yahoo! Finance
autore- ethereum et al., (2019) data in log 2017–January forecasting of
gressive return 3, 2019 cryptocurrencies'
tree prices
NN, switch- STORJ Chakraborty STORJ token Daily July 2, Forecast perfor- MAE, ME, MAPE, Not mentioned
ing regression and prices and 2017–April 6, mance of various RMSE, and RMPE
model Roy (2019) related 2018 time-series mod-
clustering els
coefficients
of the
transaction
network
Linear regres- Bitcoin Greaves and Bitcoin trans- Bitcoin price at Prior to April Bitcoin blockchain MSE, accuracy CS224W website,
sion, logistic Au (2015) actions 15 s intervals 7, 2013 analysis bitcoincharts.com
regression,
ANN, SVM
Extreme ZClassic, ZCash, Li Sentiment-based Hourly pricing For a period Predicting Correlation coef- RStudio, rtweet
gradient and bitcoin private et al., (2019) on Twitter data of 3.5 weeks ficients, adjusted package
boosting and trading SD
regression volume
tree model
(XGBoost)
Linear regression, polynomial regression, RNN, and LSTM-based analysis | Bitcoin | Mittal et al. (2019) | Social media and web search data | Average daily price | April 9, 2014–January 7, 2019 | Short-term bitcoin price fluctuation | R² score, correlation coefficient | bitcoincharts
Bayesian structural time-series approach | Bitcoin | Poyser (2019) | Website data | Daily | January 2013–May 2017 | The association between bitcoin's market price and a set of internal and external factors | RMSE | Block information service
Time-series analysis using bidirectional LSTM | Bitcoin and others | Mohanty et al. (2018) | Market sentiment and social sentiment | Daily data like price, additional 26 features about the blockchain of bitcoin and market, interval data | February 2016–December 2018 | Prediction of fluctuation in the future price of cryptocurrencies | - | Users' comments and tweets from Twitter using Apache Flume, and price data were fetched from exchange (Blockchain Info)
Logistic regression | Bitcoin, ripple, ethereum, litecoin, nem, dash, and stellar | Bouri, Shahzad, and Roubaud (2019) | Market data | Daily | August 7, 2015–December 31, 2017 | Date-stamped price explosiveness | - | CoinMarketCap
Linear regression, multiple linear regression, multilayer perceptron NN, and long short-term memory NN | Bitcoin | Uras et al. (2020) | Stock market data | Daily | November 2015–August 2018 | Price prediction | Relative RMSE, MAPE | Yahoo! Finance website and CoinMarketCap
Logistic regression, naive Bayes, SVM, RF, ARIMA, and RNN | Ethereum | Chen et al. (2017) | Market data | Hourly | August 30, 2015, and December 2, 2017 | Predict price changes | Prediction | Not mentioned
ARIMA, ARMA-GARCH, VAR, α-Sutte indicator, and NNAR models | Ethereum | Bush and Choi (2019) | Network transaction data | h = 5, 20, and 50 days | July 2, 2017–March 17, 2018 | Impact of ethereum STORJ token clustering coefficients in forecasting | RMSE, MAE, MAPE | coingecko.com, coinmarketcap.com
ANFIS and ANN | Bitcoin and others | Atsalakis et al. (2019) | Historical time series data | Daily | September 13, 2011–October 12, 2017 | Price forecast | RMSE, MSE, and MAE | bitcoincharts.com
Hybrid model of hidden Markov models and optimized LSTM networks | Bitcoin | Hashish et al. (2019) | Market data: orders and trades, technical indicators | 2 min frequency | August 20–September 20, 2018 | Price prediction | Mean, SD | Coinbase exchange market
BPNN, GANN, GABPNN, NEAT | Bitcoin | Radityo et al. (2017) | Historical bitcoin price | Daily | June 10, 2013–February 4, 2017 | Predict the close value of Bitcoin in the next day | MAPE | cryptocompare.com
ANN with back-propagation | Bitcoin | Sovia et al. (2019) | Open, high, low, volume, and request | Daily close | Not mentioned | Bitcoin prices for the next hour | Not mentioned | Not mentioned

Table 4 continued on next page



TABLE 4 Summary of ML techniques surveyed in Section 5

Technique | Cryptocurrency | Reference | Data type | Data frequency | Time range | Target variables | Performance metric | Data source
Feedforward ANN with back-propagation | Bitcoin, bitcoin cash, dash | Almasri and Arslan (2018) | Market price | Hourly | - | Daily close price and hourly close price | - | -
NN, SVM, RF | Bitcoin, ethereum, ripple, and litecoin | Valencia et al. (2019) | Market and social data | Daily | 80 days of historical data | Daily price | Price movement | cryptocompare.com, Twitter
Bayesian NN | Bitcoin | Jang and Lee (2018) | Response variable, blockchain information | Daily | September 2011–August 2017 | Bitcoin price | RMSE, MAPE | bitcoincharts.com, blockchain.info
PNN, SVM | Bitcoin, ripple, ethereum | Kim et al. (2016) | User comment data | Daily | December 1, 2013–November 10, 2015 | Predict the price and the number of transactions of cryptocurrencies | Accuracy, F1 score, MCC | CoinMarketCap, CoinDesk
ANN, SVM, ensemble techniques | Bitcoin | Mallqui and Fernandes (2019) | Internal (bitcoin behavior) and external (economic factors, demand) | Daily | August 19, 2013–July 19, 2016, April 2013–2017 | Min–max and closing price | MAE, MAPE, RMSE | bitcoincharts.com, quandl.com, investing.com
ANN | Bitcoin | Almeida et al. (2015) | Historical data | Previous day's price and volume | Since 2009 | Bitcoin trend | MSE and profit analysis | quandl.com
Linear regression, SVM | Ethereum | Poongodi et al. (2020) | Market price data | 60 min intervals | - | Closing price and lowest price | Percentage accuracy | etherchain.org
Optimized SVM–PSO | Bitcoin, litecoin, ethereum, ripple, nem, and stellar | Hitam et al. (2019) | Market price data | Daily | Bitcoin, training: March 28, 2013–January 16, 2017; testing: January 17, 2017–January 16, 2018 | Open price, close price, high price, low price | Accuracy | -
Multivariate linear regression | Bitcoin, exclusive coin, ripple, litecoin, lisk, monero, siacoin, cryptonex, AdEx, guldencoin | Mittal et al. (2018) | Cryptocoin historic price data set | Daily | 2015, 2016, and 2017 | Daily price | RSE, multiple R², F-statistic, adjusted R², P-value | Kaggle
Multiple linear regression model | Bitcoin and litecoin | Jain et al. (2018) | Concurrent price data, tweets | Per minute | Bitcoin: March 2018. Litecoin: February, March 2018 | 2 hr price | R² score | CoinDesk, Twitter
XGBoost, quadratic discriminant analysis, RF, LSTM, SVM | Bitcoin | Chen et al. (2020) | Aggregated price | Daily, 5 min interval | Dataset 1: February 2, 2017–February 1, 2018. Dataset 2: July 17, 2017–January 17, 2018 | 5 min interval price, daily price | Accuracy, precision, recall, F1 score | CoinMarketCap
Bayesian regression | Bitcoin | Shah and Zhang (2014) | Market price | 10 s intervals | February 2014–July 2014 | Price variation | Profit | Okcoin.com
Binomial logistic regression, SVM, RF, binomial GLM | Bitcoin | Madan et al. (2015) | Market price data | Daily, 10 min, 10 s | - | Price change using 10 min time intervals | Sensitivity, specificity, precision, accuracy | CoinBase, Okcoin
SVM and ANN | Bitcoin | Silva de Souza et al. (2019) | Market price, gold and silver data | Daily | Bitcoin: May 2012–May 2017. Gold and silver: April 2012–May 2017 | Gold, silver, and bitcoin prices | Risk-adjusted returns | Bloomberg
RF, SVM, GB, and linear regression | Bitcoin | Virk (2017) | Date, open, high, low, close, volume, market capital | Interval data | May 2013–May 2017 | Prediction and correlation | Accuracy, precision, recall, and F1 score | Kaggle
SVM, naive Bayes, and RF | Bitcoin | Sun et al. (2019) | Historical prices | Daily | January 2011–December 2018 | Next-day price trends, maximum and minimum daily variations | Prediction accuracy | -
RFs with factors in Alpha101 | Bitcoin and others | Barnwal et al. (2019) | Market history data | 1 min, 5 min, 30 min, 1 hr, and 1 day | August 2017–December 2018 | Price prediction | Accuracy | Bitfinex, Binance

Table 4 continued on next page

comments are applied in ethereum and bitcoin markets. The results showed that these metrics have predictive power on cryptocurrency prices.

Table 3 summarizes the traditional statistical and econometric approaches for cryptocurrency price prediction. It is easily evident from this table that a lot of researchers within the

TABLE 4 Continued from previous page

Technique | Cryptocurrency | Reference | Data type | Data frequency | Time range | Target variables | Performance metric | Data source
Generative and discriminative classifiers combined using a one-layer NN | Bitcoin | Attanasio et al. (2019) | Volume, volatility, trend, and momentum | Daily | August 2017–July 2018 | Cryptocurrencies | Accuracy | quandl

ANFIS: adaptive neuro-fuzzy inference system; ANN: artificial neural network; ARIMA: autoregressive integrated moving-average; ARMA: autoregressive moving-average; BPNN: back-propagation neural network; GABPNN: genetic algorithm back-propagation neural network; GANN: genetic algorithm neural network; GARCH: generalized autoregressive conditional heteroscedasticity; GB: gradient boosting; GLM: generalized linear model; LSTM: long short-term memory; MAE: mean absolute error; MAPE: mean absolute percentage error; ME: mean error; MSE: mean square error; NEAT: neuroevolution of augmenting topologies; NN: neural network; NNAR: neural network autoregression; PNN: probabilistic neural network; PSO: particle swarm optimization; RF: random forest; RMPE: relative mean percentage error; RMSE: root-mean-square error; RNN: recurrent neural network; RSE: residual standard error; VAR: vector autoregression; SVM: support vector machine.
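
As a minimal sketch of how the error metrics abbreviated above are usually computed, the following Python functions implement MAE, RMSE, and MAPE from their standard definitions. The function names and the sample prices are illustrative assumptions and are not taken from any surveyed study.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root-mean-square error."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error in percent; assumes no actual value is zero."""
    ratios = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * ratios / len(actual)


# Hypothetical closing prices and forecasts (illustrative only).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
print(mae(actual, predicted), rmse(actual, predicted), mape(actual, predicted))
```

MAPE is scale-free, which makes it convenient for comparing coins with very different price levels, whereas RMSE penalizes large individual errors more heavily than MAE.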

statistics/econometrics field have applied various schemes for cryptocurrency price prediction. Financial data, sentiment analysis, and various internal and external factors are used in the studies. The most common and popular cryptocurrency studied by researchers was bitcoin.

5 ML TECHNIQUES FOR CRYPTOCURRENCY PRICE PREDICTION

ML refers to the automated process of learning from experience (Alon, Lokshtanov, & Saurabh, 2009). Automatic learning and adaptation with exposed data without the need of human intervention are the main emphases of ML. Rather than writing an explicit program to solve a task, in ML the computer learns from provided example data and comes up with its program (Domingos, 2012). Based on that, the computers emulate human behavioral learning towards particular decision-making and reasoning. Cryptocurrency is volatile in nature. This motivated researchers to apply DL and ML paradigms to cryptocurrency concerns. Adopting the stock market price prediction techniques can help in increasing the precision rate (El-Bannany et al., 2020; Lahmiri, 2011; Sreedharan et al., 2020a; 2020b). In recent years, ML has become one of the most researched approaches in cryptocurrency price prediction because of its ability to identify the general trend and fluctuation. Figure 9 gives an illustration of using ML in cryptocurrency price prediction. Several ML techniques are applied in cryptocurrency price prediction. These include classification, regression, DL, and RL models. The ML techniques used in cryptocurrency price prediction research are discussed in the following. We have distinguished these schemes and have added a separate section specifically on DL and RL models because of their inherent variation and wide adoption. Some researchers have focused on the comparison of different classification and regression ML methods.

A short-term forecasting model was presented by Derbentsev, Datsenko, Stepanenko, and Bezkorovainyi (2019) to predict cryptocurrency prices of ripple, bitcoin, and ethereum using an ML approach. A binary autoregression tree was implemented in that paper that combines classification and ARIMA. Simulation results proved that this algorithm is more accurate and efficient than other traditional approaches. Various ML algorithms were compared by Rane and Dhage (2019) to select an optimal technique for predicting bitcoin price. A survey of different ML techniques was presented to show which method best suited the prediction of bitcoin price. The analysis conducted will be further extended by improving the precision of less accurate techniques. STORJ tokens were presented in Chakraborty and Roy (2019) to make transactions in a STORJ network for predicting future price. Different time-series models were presented, including Box–Jenkins models, NN, and a switching regression model. The key advantage of using these technologies is that we can obtain the estimated equation for each regime with flexibility.

Greaves and Au (2015) applied blockchain data for bitcoin price prediction using an SVM, an ANN, linear regression, and logistic regression. An NN classifier with two hidden layers achieved the highest accuracy of 55%, followed by logistic regression and SVM. In addition, the research also mentions the analysis with several tree-based models and K-nearest neighbors. Limited predictability was observed in this research using only blockchain data for training and prediction. The research concluded that using features directly extracted from bitcoin exchanges, like financial flow features, would likely improve the bitcoin price prediction accuracy.

Li et al. (2019) analyzed Twitter signals to predict price fluctuations of ZClassic. Tweets were collected on an hourly basis for 3.5 weeks and each tweet was classified as positive, negative, or neutral. Tweets were then compiled to create a weighted or unweighted index. The model was trained using an extreme GB regression tree model and compared with historic price data.

Mittal, Dhiman, Singh, and Prakash (2019) used ML techniques such as linear regression, polynomial regression, recurrent NN (RNN), and long short-term memory (LSTM)-based analysis to identify the correlation among bitcoin price and Twitter and Google search patterns. Among Google Trends, tweet volumes, and tweet sentiments, tweet sentiment analysis gives the worst outcome. When LSTM, RNN, and polynomial regression were applied on Google Trends and tweet volume, an improved accuracy in performance was shown.

Atsalakis, Atsalaki, Pasiouras, and Zopounidis (2019) used a hybrid neuro-fuzzy controller called PATSOS to predict the daily price change trend of bitcoin. The scheme outperformed two other computational intelligence models, the first being developed with a simpler neuro-fuzzy approach and the second being developed with ANNs. They also stated that the performance of the PATSOS system was robust to use for other cryptocurrencies.

TABLE 5 Summary of deep- and reinforcement-learning techniques surveyed in Section 5.1

Technique | Cryptocurrency | Reference | Data type | Data frequency | Time range | Target variables | Performance metric | Data source
MLP and LSTM | Bitcoin | Misnik et al. (2018) | Market data | Timestamp of given minute | Rate at 0 s and 59 s of given minute | Predict price | Precision | Kraken
LSTM and GRNN | Bitcoin, digital cash, and ripple | Lahmiri and Bekiros (2019) | Digital currencies | Daily | Start: bitcoin, July 16, 2010; digital cash, February 8, 2010; ripple, January 21, 2015. End: October 1, 2018 | Price prediction | RMSE | Not mentioned
LSTM and RNN | Bitcoin | McNally et al. (2018) | Average of 5 major exchanges | Daily | August 19, 2013–July 19, 2016 | Closing price | RMSE, accuracy, precision, sensitivity | Coindesk, Bitcoin Price Index
LSTM and RNN | Bitcoin and litecoin | Yao et al. (2018) | Market data | Minute-to-minute updates | January 2012–March 2018 | Predict price | MASE, SMAPE, MAE, and RMSE | Marketcap
DNN, LSTM, CNN, deep residual network | Bitcoin | Ji et al. (2019) | Market data | Daily | November 29, 2011–December 31, 2018 | Predict price | Accuracy, precision, sensitivity, recall, specificity, F1 score | bitcoincharts
GBDT and LSTM | Bitcoin | Alessandretti et al. (2018) | Market data | Daily | November 2015–April 2018 | Return on investment | Geometric mean and cumulative returns | CoinMarketCap
Multiple input LSTM combined with Black–Scholes | Bitcoin | Li et al. (2019) | Social media data, market, blockchain statistics | Daily | August 8, 2015–November 17, 2018 | Price volatility | RMSE | Yahoo! Finance
Theil–Sen, Huber regression, LSTM and GRU | Bitcoin | Phaladisailoed and Numnonda (2018) | Market data | Minute | January 1, 2012–January 8, 2018 | Predict price | MSE and R² | Bitstamp
LSTM, GB | Bitcoin, ethereum, ripple, bitcoincash, litecoin, dash, ethereum classic, fiat, Korean won | Kwon et al. (2019) | Time series data | 10 min | June 9, 2017–May 8, 2018 | Classification of the cryptocurrency price time-series data | F1 score, recall, precision | Bithumb API
Neural network | Bitcoin | Jiang and Liang (2017) | Financial data | 30 min | 1 year in time span | Portfolio weights or portfolio management | Cumulative return | Poloniex
GBDT, SVM, and RF | 42 different cryptocurrencies | Sun et al. (2020) | Market data | Daily | January 1, 2018–June 30, 2018 | Price trend | Accuracy | Investing.com
Convolution LSTM | Bitcoin, dash, ether, litecoin, monero, ripple | Alonso-Monsalve et al. (2020) | Cryptocurrencies data | 1 min | July 1, 2018–June 30, 2019 | Price prediction of exchange rates | Accuracy | Cryptocompare
Multiple linear regression, RF, and LSTM ML | Bitcoin | Snihovyi et al. (2018) | Cryptocurrency data | Daily | January 25, 2017–January 22, 2018 | Prices variation | MSE and R² | Open source
LSTM | Bitcoin, ripple, dash, and ltc | Altan et al. (2019) | Digital currency data | Daily | Start: BTC prediction, July 18, 2010; XRP, January 22, 2015; dash, February 14, 2014; ltc, August 24, 2016. End: March 28, 2019 | Currency | MAE, RMSE, MAPE | Not mentioned

CNN: convolutional neural network; DNN: deep neural network; GB: gradient boosting; GBDT: gradient boosting decision tree; GRNN: generalized regression neural network; GRU: gated recurrent unit; LSTM: long short-term memory; MAE: mean absolute error; MAPE: mean absolute percentage error; MASE: mean absolute scaled error; ML: machine learning; MLP: multilayer perceptron; MSE: mean square error; RF: random forest; RMSE: root-mean-square error; RNN: recurrent neural network; SMAPE: symmetric mean absolute percentage error; SVM: support vector machine.

Almeida, Tata, Moser, and Smit (2015) predicted bitcoin's trend for the next day based on the previous day's price and volume using an ANN. They used the historical data from 2009 onwards to learn the behavior. Mohanty, Patel, Patel, and Roy (2018) used LSTM for bitcoin future price prediction, and Twitter data was used to predict public mood. In this method, some salient features from the blockchain were selected that had a major effect on demand and supply of bitcoin and then they used this to train a model that enhanced the predictive power of bitcoin price in the future. The model showed high accuracy and good precision.

Hashish, Forni, Andreotti, Facchinetti, and Darjani (2019) used hidden Markov models to describe historical movements of cryptocurrencies and LSTM networks to predict future movements. A genetic algorithm was applied to further optimize the parameters of the hybrid approach. The simulation results showed the lowest MSE, RMSE, and MAE compared with traditional models of time-series prediction, ARIMA, and conventional LSTM, which proved the effectiveness of the proposed approach. Internal details of bitcoin transactions were not accounted for in this model. So, for future work, they planned to consider additional features to provide more information about the blockchain.

FIGURE 10 Percentage of publications (cumulative) versus year

TABLE 6 Paper distribution among categories

Paper category | Distribution (%)
Statistical/econometric | 45.97
ML | 54.03 (ML: 70.22) (DL/RL: 29.78)

DL: deep learning; ML: machine learning; RL: reinforcement learning.

Bush and Choi (2019) used ARIMA, ARMA-GARCH, VAR, alpha-Sutte indicator, and NN autoregression (NNAR) to forecast the ethereum STORJ token price. The dynamics of the model were evaluated using three time horizons (h = 5, 20, and 50 days). The performance of the model was analyzed using RMSE, mean absolute percentage error (MAPE), and mean absolute error (MAE). The simulation results showed that the VAR exceeded the other models in short- and mid-term horizons, whereas NNAR performed best for long-term forecasting. Radityo, Munajat, and Budi (2017) used the bitcoin exchange rate (closing price) on the American dollar for the next-day prediction using four variants of ANN. The methods selected for comparison were genetic algorithm NN, back-propagation NN (BPNN), genetic algorithm BPNN (GABPNN), and neuroevolution of augmenting topologies (NEAT). The performance of these methods was compared depending on the accuracy (using MAPE) and complexity in terms of time required to construct a model. For each technique, generation of the model was done 30 times and the average value of MAPE and training time were calculated. Nine features were selected (open, low, high, volume, close, EMA 12, EMA 26, %R 5, and %R 14, 12, 26), for all these methods and for selection, a greedy forward-selection approach was used. The results showed that the GABPNN had greater accuracy, whereas the genetic algorithm NN was the worst, but the training time for the GABPNN is very long, which is why we cannot use this approach in real practice. So the next best candidate was the BPNN, with a training time 300 times faster and slightly less accuracy compared with the GABPNN. The price movement of bitcoin was predicted using ANNs based on back-propagation algorithm in Sovia, Yanto, Budiman, Mayola, and Saputra (2019) using graph movements: open, low, and high, bitcoin requests, volumes, and next hour prediction prices. Prediction variables with target values can be predicted using prior bitcoin price. The best network patterns can be obtained that can be useful for the prediction process.

Almasri and Arslan (2018) applied a feedforward ANN with a back-propagation learning algorithm for training the network. Data were collected from the CryptoCompare API for prediction of day and hour close prices with the help of the Encog framework for ML. Bitcoin showed good results, but other coins showed fewer variations, which shows that a specific cryptocurrency needs a particular model to show good results. Hour close-price was more stable than the day close-price. Data-mining techniques might help in future. Various ML algorithms were also compared by Valencia, Gómez-Espinosa, and Valdés-Aguirre (2019), such as NN, RF, and SVM, to predict price movements of cryptocurrencies with highest market capitalization, which included bitcoin, ripple, ethereum, and litecoin. Three approaches were used to train a model. The first approach used social data to train a model, the second approach used market data, and the third one utilized both market and social data for training. To evaluate the robustness of each model, accuracy, precision, recall, and F1 scores were used. Comparison results showed that the NN outperformed the other models.
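
Several of the studies above report accuracy, precision, recall, and F1 scores for price-movement classification. As a minimal sketch of what these figures measure, the function below computes them from actual and predicted up/down labels using the standard confusion-matrix definitions. The label encoding (1 for an upward move, 0 for a downward move) and the sample data are assumptions for illustration, not values from any surveyed paper.

```python
def classification_metrics(actual, predicted, positive=1):
    """Return accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical next-day movements: 1 = price went up, 0 = price went down.
actual_moves = [1, 1, 0, 0, 1]
predicted_moves = [1, 0, 0, 1, 1]
metrics = classification_metrics(actual_moves, predicted_moves)
print(metrics)
```

Accuracy alone can be misleading when upward and downward days are unevenly represented, which is one reason studies report precision and recall alongside it.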

FIGURE 11 Distribution of papers among various disciplines

FIGURE 12 Distribution of papers among different categories

Uras, Marchesi, Marchesi, and Tonelli (2020) used four techniques, namely linear regression, multiple linear regression, multilayer perceptron (MLP) NN, and LSTM NN, to forecast the changes in the price of bitcoin based on the daily bitcoin price series, from November 11, 2015 to August 8, 2018. The results showed that all models performed well in predicting the changes in the price of bitcoin. Chen and Narwal (2017) collected data about the price of ether sampled at approximately 1 hr intervals between August 30, 2015, and December 2, 2017, to predict the price changes in ethereum using six techniques, namely logistic regression, naive Bayes, SVM, RF, ARIMA, and RNN.

Poyser (2019) studied the determinants of bitcoin's price using a Bayesian structural time-series approach. He extracted the data from different websites over the period from January 2013 to May 2017 for 27 different countries to show the variables affecting bitcoin's price level provided by search trends as a proxy for blockchain. Macro-financial statistics and public interest were explored using a Bayesian structural time-series method. The experiments concluded that gold price and exchange rate negatively affected the price of bitcoin. The results showed that the bitcoin price was negatively associated with a neutral investor's sentiment, gold's price, and yuan-to-USD exchange rate, whereas it was positively related to the stock market index, USD-to-EUR exchange rate, and various signs among the different countries' search trends. Bouri, Shahzad, and Roubaud (2019) covered a daily data set on seven cryptocurrencies (bitcoin, ripple, ethereum, litecoin, nem, dash, and stellar) from August 7, 2015, to December 31, 2017. They used logistic regression to study the likelihood of change in the price of one cryptocurrency in response to a change in the price of other cryptocurrencies. They showed that change in the price of one cryptocurrency led to changes in the price of other cryptocurrencies.

Jang and Lee (2018) applied a Bayesian NN (BNN) model, with 10-fold cross-validation, to predict the fluctuations in bitcoin price, as it can naturally deal with the increased number of input variables. An additional regularization factor was included in the objective function of the BNN to prevent the problem of overfitting, which is crucial to any framework. The input features contained blockchain information in addition to macroeconomic variables. The experiment applied two input sets, one comprising all 26 variables collected and the other containing 16 abridged input variables by extracting important variables and eliminating the redundant ones from the linear correlation analysis. The results confirmed that the BNN is better than the

benchmarked models of SVR and linear regression. The error value of the BNN was comparatively small with all the 26 input variables, compared with the reduced variable data set. This indicates that the ignored variables explained the nonlinear relationships that accounted for the output variables. The study showed the fluctuations using the BNN model in bitcoin up to August 2017.

Kim et al. (2016) analyzed cryptocurrency communities for social activities to determine whether these attributes affected the prices of bitcoin, ripple, or ethereum. User comments and replies from these communities were extracted and modeled using ML models like the probabilistic NN and SVM to predict the price fluctuations and the transactions count. The paper concluded that the method predicted the low-cost cryptocurrencies' price variability.

Mallqui and Fernandes (2019) proposed that a selected set of attributes combined with the best data-mining model can improve the accuracy rate of cryptocurrency price prediction. In phase 1, different feature-selection algorithms, including relief technique, correlation analysis, information gain method, correlation-based feature subset selection, and principal component analysis, were evaluated to determine the most relevant attribute for prediction. In phase 2, ML models like ANN, SVM, ensemble techniques (based on RNN and k-means clustering) were evaluated. The proposed model was used for the prediction of maximum, minimum, and closing price of bitcoin. The SVM model with relief technique for attribute selection had the highest and most consistent accuracy rate in all time intervals. The proposed model showed a prediction accuracy improvement of 10% compared with the models proposed in previous research works. Whereas most of the research work focused on predicting the price of bitcoin, Poongodi et al. (2020) used ML models of linear regression and SVM for ethereum price prediction. SVM provided a higher accuracy of 96.06% over the linear regression method, which generated an accuracy of 85.46%.

Hitam, Ismail, and Saeed (2019) proposed an optimized SVM based on particle swarm optimization (PSO) in predicting price of selected cryptocurrencies, such as bitcoin, ethereum, litecoin, nem, ripple, and stellar. PSO was used for optimizing the SVM in the cryptocurrency market. Five years' daily price from 2013 to 2018 were used to predict the future price. Various classifiers were trained with the same feature set and were evaluated based on classification accuracy. The results showed that SVM–PSO exceeded others and is considered a reliable forecasting model.

Daily price changes in multiple cryptocurrencies, including bitcoin, exclusive coin, ripple, litecoin, lisk, monero, siacoin, cryptonex, AdEx, and guldencoin, were predicted in Mittal et al. (2018) using a multivariate linear regression model. In the initial stage, the independent features in the data set were examined, followed by figuring out the correlation between independent and dependent attributes. Later, the lowest and highest prices of cryptocurrencies were predicted using linear regression. Though the proposed approach yielded high efficiency, the experiments were limited to smaller data sets.

Jain, Tripathi, Dwivedi, and Saxena (2018) introduced a novel method to predict the price of two of the most widely used cryptocurrencies, namely litecoin and bitcoin, based on sentiments of users' tweets. Useful features from the tweets were analyzed and extracted in a multiple linear regression model for price prediction of the cryptocurrencies using R² score. This approach worked in two phases, in which one is the training phase and the other is the detection phase. The training phase was a single-time process in which the concurrent cryptocurrency prices and Twitter data were collected. The collected data, if not in the same format, were converted into the same format and then tweets were evaluated for sentiment polarity. Tweets were then tagged as positive (if polarity above zero), negative (if polarity less than zero), and neutral (if polarity equal to zero). The tagged tweets were then stored and broken into chunks and the number of tweets in each chunk was counted. The counted number was then mapped with the average price in time duration of 2 hr. These tags represented the features of the data set, with mapped average price as its label. The model was then evaluated using actual labels, and if the labels matched with the original labels then the model was acceptable and was ready for prediction by validating real-time tweets, otherwise a new model was generated. The same process was done until an acceptable model was generated. After that, the detection phase started and the average price was predicted for the next 2 hr. Litecoin price was found to be highly affected by the tweet sentiments. Social factors can also influence the price and its prediction; these were not considered fully in this paper and were to be considered in future studies.

Chen et al. (2020) compared the statistical methods of linear discriminant analysis and logistic regression with more complicated ML approaches, including quadratic discriminant analysis, XGBoost, RF, LSTM, and SVM, on predicting daily bitcoin price using high-dimensional features. ML models outperformed statistical models with an average prediction accuracy of 62.2% over 53.05%. The best performance was observed with the LSTM model, generating an accuracy of 67.2%.

Shah and Zhang (2014) applied Bayesian regression to predict bitcoin price and compared this with earlier research on bitcoin price prediction. The two major governing features of bitcoin price prediction are bitcoin mining speed and market capitalization. Though this method achieved high profitability, the relationship between features in space and bitcoin price was not discussed.

Madan, Saluja, and Zhao (2015) explored the link in the problem space to understand 16 additional features surrounding the bitcoin network, while also implementing different ML algorithms. In phase 1, daily bitcoin data were analyzed using three binomial classification algorithms: binomial logistic regression, SVM, and RF algorithms. In phase 2, the same set of ML algorithms was applied on bitcoin price interval data. The binomial generalized linear model outperformed in daily bitcoin price prediction with a higher percentage of true positives compared with true negatives. RF yielded a result fairly close to the original data. In phase 2, a 10 min window data yielded better results compared with a 10 s interval. RF showed higher accuracy than the generalized linear model using interval data because RF uses nonparametric decision trees, so linear separability of the data and outliers were not a concern. The error rates were higher for SVM using daily and interval-based bitcoin data because there was a necessity to generate artificial separations between data points in high-dimensional space for the classification of points. Though a prediction accuracy of 97% is noted in the research, the models were not cross-validated, limiting the generalizability of the results.
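
The remark that results reported without cross-validation generalize poorly can be made concrete. For time-ordered price data, one common option is walk-forward (expanding-window) validation, in which each test fold lies strictly after its training data so that no future information leaks into training. The sketch below is a generic illustration of that idea; it is not the procedure of any surveyed paper, and the function name and parameters are assumptions.

```python
def walk_forward_splits(n_samples, n_folds, min_train):
    """Yield (train_indices, test_indices) pairs in chronological order.

    The training window always starts at index 0 and grows by one fold
    each step; the test window is the block that immediately follows it.
    """
    fold_size = (n_samples - min_train) // n_folds
    if fold_size < 1:
        raise ValueError("not enough samples for the requested folds")
    for k in range(n_folds):
        train_end = min_train + k * fold_size
        test_end = train_end + fold_size
        yield list(range(train_end)), list(range(train_end, test_end))


# Ten hypothetical daily observations, three folds, at least four training days.
splits = list(walk_forward_splits(n_samples=10, n_folds=3, min_train=4))
for train_idx, test_idx in splits:
    print("train:", train_idx, "test:", test_idx)
```

Averaging a metric over such folds gives a more conservative estimate than a single in-sample score, because every prediction is made on data the model has not seen.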

Silva de Souza, Almudhaf, Henrique, Negredo, Ramos, Sobreiro, and Kimura (2019) showed how an SVM and an ANN could generate abnormal risk-adjusted returns. They applied their approach to bitcoin. The procedure is more suited to practitioners, although the model ignored actual economic policy uncertainty and the relationship with bitcoin. Another interesting study is that of Valencia et al. (2019), who used social media data along with ML to predict price movement. However, the work acknowledged that social media data variables are limited.

Virk (2017) compared RF, SVM, GB, and linear regression to predict the price of bitcoin. The results indicated that the SVM achieved the highest accuracy of 62.31% and a precision value 0.77 among binomial classification ML algorithms.

Sun, Zhou, and Lin (2019) used RFs with factors in Alpha101 (Kakushadze, 2016) by representing features using cryptocurrency market history data obtained from Bitfinex and Binance to build a prediction model. They collected data from API in cryptocurrency exchanges and used 5 min frequency data for backtesting. The results indicated that the performances were proportional to the quantity of data (more data, more accurate) and the factors used in the RF model appeared to have distinct significance.

TABLE 7 Comparison of common machine-learning techniques used

Attribute | ANN | SVM | RF
Network model | Type of feedforward neural network | Nonlinear model | Decision-tree-model
Layers | Single-layer and multilayer forms | N/A | It follows tree structure
Used for | Minimizing error functions | Time-series prediction task | Eliminates instability in a network
Accuracy | ANN has highest accuracy for predicting price | It eliminates irrelevant and scattered data so it shows improved precision and accuracy but not comparable to RF | Best feature with highest information efficiency is selected in this network so it outperformed SVM in predicting price

ANN: artificial neural network; N/A: not applicable; RF: random forest; SVM: support vector machine.
Barnwal, Bharti, Ali, and Singh (2019) used generative and discriminative classifiers to create a stacking model. The model included three generative and six discriminative classifiers combined using a one-layer NN to predict the cryptocurrency price trends. A discriminative classifier directly models the relationship between unknown and known data, whereas generative classifiers model the prediction indirectly through the data generation distribution. Technical indicators, such as trend, momentum, volume, and volatility, are collected as features of the model. The authors discussed how different classifiers and features affected the prediction.

Attanasio, Garza, Cagliero, and Baralis (2019) compared various classification algorithms, including SVM, naive Bayes, and RF, in forecasting the next-day price trends of a given cryptocurrency. The results indicated that forecasting models based on a series of forecasts appeared better than a single classification model, due to the volatility and heterogeneity of cryptocurrencies' financial instruments.

Table 4 summarizes the ML techniques surveyed in Section 5.

TABLE 8 Comparison of common deep-learning techniques used

Attributes | MLP | RNN | LSTM
Network model | Feedforward static neural network | Forward and backward dynamic network | Forward and back-propagated with forget/remember gates
Layers | Input, output, and hidden layers | Input, output, hidden, and context layers | Input, output, and forget gates
Used for | Classification tasks | Time-series prediction task | Recognizing long-term associations
Limitations | Vanishing gradient problem and signals only pass forward in static nature | Vanishing gradient is still an issue, but signals can pass forward as well as backward dynamically | Addresses both gradient and signal passing problem with long term dependency

LSTM: long short-term memory; MLP: multilayer perceptron; RNN: recurrent neural network.

5.1 DL and RL techniques

Whereas ML-based models have achieved some success in predicting cryptocurrency prices, their chaotic and very complex nature and fluctuations in accordance with fast-paced technological developments, security, political, and economic factors have motivated researchers to investigate the application of DL and RL models toward providing accurate predictions of cryptocurrency prices. DL models have already found numerous applications in quantitative finance, such as forecasting volatility trends. In a supervised learning scheme, NNs are a useful tool for price prediction since no strong assumption is needed for their application, which contrasts with conventional time-series models, such as ARIMA and its extensions. DL models are based on ANNs. The key difference that distinguishes a DL model from an ANN is that, generally, the DL model corresponds to an ANN with multiple hidden layers. Moreover, DL models capture patterns with a significant generalization power. Most recent LSTM networks seem more suitable and convenient for sequential data, such as time series. The commonly applied DL models in cryptocurrency price prediction research include RNNs, gated recurrent units (GRUs), convolutional NNs (CNNs), LSTM networks, and MLPs. An RNN is a class of ANN in which the connections between nodes form a directed graph with possible loops, and this structure of recurrent connections with memory makes them suitable for processing time-series data. However, they face the vanishing gradients problem, and hence different variations have recently been proposed (Kwon, Kim, Heo, Kim, & Han, 2019). Another class of standard RNN used in cryptocurrency price prediction is a GRU (Phaladisailoed & Numnonda, 2018). An MLP is a type of feedforward ANN used in predicting cryptocurrency prices (Misnik, Krutalevich, Prakapenka, Borovykh, & Vasiliev, 2018). A CNN is another class of DL used for supervised learning that has been successfully applied in
various image-processing and natural language-processing problems and is also adopted for cryptocurrency price prediction by researchers (Ji, Kim, & Im, 2019). Another widely used DL model, called LSTM, is a special RNN structure found to be superior to non-gated RNNs on financial time-series problems because of its ability to selectively remember patterns for a long time (Chen et al., 2020; Mohanty et al., 2018). RL is another class of ML that enables software agents to learn and take appropriate action in an interactive environment to maximize a cumulative reward using feedback from their own actions and experiences. Deep and reinforcement models are also being considered by researchers to provide efficient prediction of cryptocurrency prices (Jiang & Liang, 2017; Längkvist et al., 2014). The deep and reinforcement models used in cryptocurrency price prediction research are discussed in the following.

DL was used by Lahmiri and Bekiros (2019) for the first time to predict digital currency prices of the three currencies used most: bitcoin, ripple, and digital cash. Long memory was used to assess the market efficiency of cryptocurrencies. Inherent nonlinear dynamics were examined, which include inherent chaoticity and fractality, to measure the predictability of digital currency. Then they used DL to extract hidden information or patterns as an underlying dynamical system. They also focused on short-term predictability of cryptocurrency. The largest Lyapunov exponent and a detrended fluctuation analysis based on the extracted Hurst exponent were used to deal with the chaotic and fractal characteristics of digital currency. To make predictions accurate and fast, they developed a complex NN based on LSTM to extract hidden information. This process of extracting domain-specific patterns is time consuming and extremely expensive, but it introduces consistency and accuracy in predicting digital currency prices. An LSTM NN overcomes the vanishing gradient problem present with RNNs: nodes are replaced with memory cells, and a forget/remember gating mechanism is used, so that information can be stored for future processing. A well-known benchmark known as the generalized regression NN provides fast learning and optimal convergence. Owing to these unique characteristics of a generalized regression NN and LSTM, they efficiently applied them to solve various data analysis and data modeling problems. LSTM has higher predictive accuracy, but it takes more time to converge.

In the case of ANNs, the precision is highly dependent on an adequate set of inputs. Considering this, Misnik et al., (2018) applied various approaches to obtain additional data for an NN to analyze the high volatility trends exhibited by cryptocurrencies, and the impact on precision was analyzed.

McNally, Roche, and Caton (2018) used ML for predicting bitcoin price, and they compared this approach with a parallelization method that executes on multi-core and graphics processing unit environments. An ARIMA time-series model was introduced in that paper to compare the performance with DL models. An RNN and LSTM were used as DL models, with LSTM more capable of recognizing longer term associations. The closing price in USD for bitcoin was considered an independent variable for this study and was collected from the CoinDesk Bitcoin Price Index. The work considered five major bitcoin exchanges: Bitstamp, Bitfinex, Coinbase, OkCoin, and itBit. To assess the performance of the system, RMSE was used. The actual performance of the ARIMA model was worse than that of the NN model. LSTM outperformed RNN but took a significantly longer time to train.

Yao et al., (2018) developed a framework using DL for cryptocurrency price prediction. The nonlinear nature of cryptocurrency was considered in this framework. Various factors are considered for cryptocurrency price prediction, such as volume, market cap, circulating supply, and maximum supply, based on an RNN and LSTM. Different phases were involved in this method, including a data analysis phase, data filtration phase, train–test split phase, data scaling phase, model building phase, model learning and evaluation phase, and prediction phase. Then, the model was executed and tested on benchmark data sets. The proposed approach achieved excellent accuracy, and this varied with the size of the data set. Simulation results showed that the market open plays a significant role in predicting price.

Ji et al., (2019) discussed various DL approaches, including a deep NN, LSTM, a deep residual network, and a CNN, and compared them for bitcoin price prediction. Simulation results showed that the LSTM model outperformed the other models. Three methods were used to predict cryptocurrency prices of 1,681 currencies by Alessandretti, ElBahrawy, Aiello, and Baronchelli (2018). The first two methods were based on GB decision trees: one uses the same model to predict the return on investment of all currencies, and the second uses a different model for each currency and uses information related to the whole market to predict a single currency. The third method was based on an LSTM RNN, in which prediction depends on previous currency prices. LSTM is very stable and is also capable of capturing long-term dependencies.

A multiple-input LSTM-based prediction model was proposed by Li, Arab, Liu, Liu, and Han (2019) in conjunction with Black–Scholes to predict the price volatility for the next 30 days. They leveraged the multiple inputs of the LSTM-RNN as a prediction model and found that it was more efficient in predicting bitcoin price. The data set was collected using traditional market statistics, such as daily price, trading volume, and historical volatility. Blockchain statistics included wallet address liveness of bitcoin exchange. In contrast, social media trends include an impact factor of Google/Reddit/Twitter. Using blockchain statistics, accurate results were obtained that also reduced the RMSE compared with the baseline approaches.

Phaladisailoed and Numnonda (2018) compared different ML models and selected features such as open, close, high, and low values to predict the future price. MSE and R² were used to measure the accuracy of the Theil–Sen regression model, Huber regression, LSTM, and GRU. The simulation results showed that the DL models, such as LSTM and GRU, showed better results than Theil–Sen and Huber regression, with an accuracy of 99.2% in the case of GRU. In contrast, the computation time for Huber regression was much less than for LSTM and GRU.

An LSTM model is presented by Kwon et al., (2019) to classify cryptocurrency price time-series. Past data were collected and preprocessed to clean them for use as training and testing data. Then, data were encoded into a three-dimensional price tensor that represented the past price changes of cryptocurrencies. A grid-search-based k-fold cross-validation was applied to search for the most suitable parameters in the LSTM model. Comparison results showed that the LSTM outperformed the GB model and other ML models.
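
The grid-search-based k-fold cross-validation mentioned for Kwon et al., (2019) can be illustrated with a minimal sketch. It is not the procedure of that paper: a one-parameter ridge regression stands in for the LSTM, and the function names (k_fold_indices, fit_ridge, grid_search_cv), the parameter grid, and the toy data are hypothetical choices made only for this illustration.

```python
from itertools import product
from statistics import mean


def k_fold_indices(n_samples, k):
    """Return k (train, validation) index pairs over contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        folds.append((train, validation))
        start = stop
    return folds


def fit_ridge(xs, ys, alpha):
    """Fit y = slope * x with an L2 penalty alpha (stand-in for a real model)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)


def mean_squared_error(slope, xs, ys):
    """Mean squared error of the fitted line on the given points."""
    return mean((y - slope * x) ** 2 for x, y in zip(xs, ys))


def grid_search_cv(xs, ys, param_grid, k=3):
    """Try every parameter combination; keep the lowest mean validation MSE."""
    names = list(param_grid)
    best_params, best_score = None, float("inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        scores = []
        for train, validation in k_fold_indices(len(xs), k):
            slope = fit_ridge([xs[i] for i in train], [ys[i] for i in train], **params)
            scores.append(
                mean_squared_error(
                    slope, [xs[i] for i in validation], [ys[i] for i in validation]
                )
            )
        score = mean(scores)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


if __name__ == "__main__":
    xs = list(range(1, 11))      # toy inputs, not market data
    ys = [2 * x for x in xs]     # noiseless target y = 2x
    print(grid_search_cv(xs, ys, {"alpha": [0.0, 1.0, 10.0]}, k=3))
```

For price series, splits that always validate on observations later than the training window are often preferred, because ordinary k-fold lets a model train on data that come after the validation period.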
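
The error measures reported in the studies above (RMSE in McNally et al., 2018; MSE and R² in Phaladisailoed & Numnonda, 2018) follow standard definitions, sketched below in plain Python. The function names and the sample numbers are illustrative assumptions and are not taken from the surveyed papers.

```python
from math import sqrt


def mse(actual, predicted):
    """Mean squared error between two equal-length sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the prices."""
    return sqrt(mse(actual, predicted))


def r2_score(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    actual = [1.0, 2.0, 3.0, 4.0]       # made-up prices
    predicted = [1.5, 2.5, 2.5, 3.5]    # made-up forecasts
    print(mse(actual, predicted), rmse(actual, predicted), r2_score(actual, predicted))
```

RMSE is expressed in the units of the predicted price, whereas R² is unitless, which is one reason studies often report both.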

Jiang and Liang (2017) proposed using RL. Deep RL was applied to address a portfolio management problem, that is, a decision-making process in which some amount of funds is allocated to different financial products to achieve a maximum return while restraining the risk. As an input, a portfolio vector was produced from raw data and historical prices. A model-less CNN was used in this approach, in which historic prices acted as an input, outputting portfolio weights. Training was done in a reinforcement manner, maximizing the accumulative return, which serves as the reward function of the network. This network can be applied to any other financial market and is not limited to only one cryptocurrency. Its performance was backtested with a trading period of 30 min and compared with three recent portfolio-selection algorithms, achieving positive results. Simulation results showed that this method had less cumulative return than the other techniques. The major drawback of using this technique is that two assumptions are made that are not useful in the real market because history data is not involved in testing and training of the algorithm.

Sun et al., (2020) adopted a GB decision tree algorithm, light GB machine (LightGBM), to forecast the price trend of the cryptocurrency market. Though results were good, the study did not try a large data set. Alonso-Monsalve, Suárez-Cetrulo, Cervantes, and Quintana (2020) proposed a convolution LSTM NN model and tested it on six currencies: bitcoin, dash, ether, litecoin, monero, and ripple. The model was used for price prediction, and results were compared with a hybrid CNN–LSTM network, MLP, and radial basis function NN. According to the paper, the model could identify trends but may not lead to profitability.

Snihovyi, Ivanov, and Kobets (2018) developed a multiple linear regression, RF, and LSTM ML model and worked on bitcoin to predict price variation. The authors proposed a criterion for predicting cryptocurrency prices using the Python Anaconda Data Science tool combined with ML algorithms, and this combined criterion could explain more than 70% of the variation in cryptocurrency prices using either multiple regression, RF, or LSTM networks. According to the study, the data set was small. Altan, Karasu, and Bekiros (2019) used an LSTM model and applied it on four cryptocurrencies (namely, bitcoin, ripple, dash, and litecoin) for currency prediction. According to the paper, the model showed good prediction performance. The paper did not highlight any limitations.

Table 5 summarizes the DL and RL techniques surveyed in Section 5.1.

6 DISCUSSION

This section discusses and analyzes the findings from our survey of cryptocurrency price prediction research. We have included the most relevant papers related to cryptocurrency price prediction, published in the period 2010–2020. For this, we reviewed different types of research papers that included papers from categories such as journals, conferences, workshops, surveys, and other publications. Researchers' focus on cryptocurrencies has increased considerably over the past few years. The research publications in this domain have been increasing since 2017, and especially in 2018 and 2019; and this trend continued in 2020 (see Figure 10 for cumulative percentage of publications per year). Bitcoin is the most common currency when it comes to implementation and research. However, since 2018, researchers have started considering the topic of cryptocurrency price prediction as a whole.

Each of these papers focuses on cryptocurrency price prediction using various approaches. The papers surveyed are classified into different categories that include traditional statistical and ML techniques. The primary category contains various traditional statistical and econometric techniques and the secondary category contains different ML and DL techniques used in cryptocurrency price prediction research. Table 6 presents the distribution of papers analyzed in this survey versus different categories.

The majority of them are related to ML models, though conventional statistical and econometric models are still significant. Among the papers studied in this survey, 45.97% of papers belong to the category of conventional statistical or econometric approaches and the remaining 54.03% belong to the ML category. Within the ML category of papers related to cryptocurrency price prediction research, 70.22% of papers concern ML models, whereas 29.78% apply the DL/RL category of models. The distribution of papers among different publication sources is illustrated in Figure 11. Among the published papers, finance-related literature includes the finance and economics area and the business, management, and accounting area; other disciplines include physics and mathematics, computers, science and engineering, artificial intelligence, expert systems and data mining, ArXiv, and others. It is evident that the topic is being explored and is gaining attention from multiple disciplines.

From the analysis conducted in this survey, it is clear that the researchers are now focusing more on models using ML for cryptocurrency price prediction, as ML and DL approaches play a significant role in cryptocurrency price prediction research. It is clear from the papers that the accuracy of prediction depends highly on the input attributes and ML technique used. Also, the accuracy of an ML algorithm is highly dependent on the problem and the integrity and complexity of the training data set. The majority of the studies focused on daily and interval-based price data. In contrast, some studies also focused on the impact of socio-economic factors, user trends, and macro attributes on the price prediction. Irrespective of the input attributes being focused on, SVM, Bayesian network, linear regression, ANN, logistic regression, RF, and LSTM are some of the main techniques highlighted in cryptocurrency price prediction. Figure 12 gives an illustration of research paper distribution among different categories of models used in the cryptocurrency price prediction domain. Moreover, among the papers analyzed, it is noteworthy that 26.92% of researchers applied and compared different models in their work.

6.1 Analysis: Research contribution among different categories

The issue of developing suitable models for predicting prices of cryptocurrencies is relevant and significant for the scientific community, investors, financial analysts, and traders. Researchers have adopted different approaches to analyze the volatility dynamics of digital currencies and to predict their prices. One such approach is to build a cause-and-effect causal model that describes the relationship between exchange rates and other macroeconomic variables (especially economic growth rate, trade and balance of payments, inflation rates, etc.)
incorporating various economic concepts (e.g., Akcora et al., 2018; Gunay, 2019; Wiedmer, 2018). Another approach is to study the time series and make a prediction based on the processing and analysis of past observations. The most common models are the Box–Jenkins ARIMA time-series models and their modifications, GARCH models, or ANNs (Alahmari, 2019; Derbentsev et al., 2019; Ho, Xie, & Goh, 2002; Lu, 2010; Radityo et al., 2017; Sovia et al., 2019; Vapnik, 1999). Generally, the models for predicting cryptocurrency prices depend on an analyst's perception of the causal relationships in the pricing process. For example, consider a prediction model specified as a price formation model, taking into account different aspects as discussed in the following. It can be based on the interaction of market players that make economic decisions based on some indicators, taking into consideration the objective economic laws or behavioral finance laws (econometric and balance models) or by considering the production and other technological possibilities of creating the corresponding asset, such as commodity markets, fundamental, mining cryptocurrency, and so on (Alessandretti et al., 2018; Alvarez-Ramirez et al., 2018; Bouri, Shahzad, & Roubaud, 2019; Gunay, 2019; Kaya, 2018; Lahmiri, 2011; Matta, Lunesu, & Marchesi, 2015; Wang & Chen, 2020). Moreover, given the past dynamics and volatility trends, time-series models and autoregressive models are also adopted by researchers (e.g., Anupriya & Garg, 2018; Conrad et al., 2018; Derbentsev et al., 2019; Längkvist et al., 2014; Mohanty et al., 2018; Poyser, 2019; Roy et al., 2018; Troster et al., 2019).

We analyzed and compared the research papers of different categories, highlighting the contribution made in each work. Paper summaries of different categories can be found in Tables 3, 4, and 5. Our survey highlights that ML and DL play a vital role in predicting the price of cryptocurrencies, but traditional statistical and econometric approaches are still significant. The volatility forecasts can be used to gauge the cryptocurrency price fluctuations, which is also advantageous in the development and analysis of quantitative financial trading practice (Fang et al., 2020).

Traditional statistical (or econometric) approaches: Among this category of papers, the commonly used techniques include basic regression schemes and correlations (linear statistical model, function estimation, CGCD model using copula functions, etc.; Blau, 2017; Gunay, 2019; Hayes, 2017; Kaya, 2018; Kjærland et al., 2018; Vaddepalli & Antoney, 2018; Vieira, 2017; Wiedmer, 2018). Schemes based on time-series analysis include the GARCH model and its derivatives, ARIMA model, wavelet coherence analysis, and so on (Caporale & Zekokh, 2019; Conrad et al., 2018; Roy et al., 2018; Troster et al., 2019; Walther et al., 2019). In addition, multivariate linear regression, the multivariate vector autoregressive model and extended vector autoregressive model, the value-at-risk (VaR) model, the least square method, chainlets, the temporal mixture model, the augmented Dickey–Fuller test, the Johansen tests for cointegration, the vector error correction model, the autoregressive distributed lag model, and the GARCH model, and so on, were also adopted (Abbatemarco et al., 2018; Akcora et al., 2018; Bhambhwani et al., 2019; Choi, 1992; Dos Santos Maciel & Ballini, 2019; Giudici & Abu-Hashish, 2019; Guo & Antulov-Fantulin, 2018; Kjærland et al., 2018; Wang & Chen, 2020). The empirical analysis of the three most capitalized cryptocurrencies (bitcoin, ripple, and ethereum) did not reveal a static relationship between the yield of cryptocurrencies and the complexity of their extraction (Liu & Tsyvinski, 2020). Also, the macroeconomic factors, which usually determine the dynamics of currency, stock, and commodity markets, have no significant effect on the dynamics of the cryptocurrencies market. The results of Conrad et al., (2018) indicate that the influence of the US stock market (S&P 500 index) and the global stock market index (Nikkei 225 index) on bitcoin's volatility was not significant. The studies conducted by Baek and Elbeck (2015), Kaya (2018), Blau (2017), and Ciaian, Rajcaniova, and Kancs (2016) analyzed the price dynamics of cryptocurrencies and its impact in the cryptocurrency market using classical log-periodic models of price bubbles and their modifications. In the same way, a number of recent cryptocurrency market studies reveal that, unlike other financial assets, cryptocurrency prices are influenced by a number of specific factors that influence their demand, such as the number of Google Trends searches, and the number of posts in social networks and other mass media (Karalevicius et al., 2018; Lahmiri, 2011; Matta et al., 2015; Mittal et al., 2019; Mohanty et al., 2018). These studies substantiated the feasibility of using nontypical factors as predictors. All of these factors complicate the development of causal econometric models of cryptocurrency price dynamics.

As a consequence, in recent years, ML-based techniques have received increased attention in the study and analysis of cryptocurrencies and predicting their future price.

ML approaches: These techniques usually exploit supervised models trained on historical data in order to automatically generate and predict the price volatility trends of the cryptocurrencies in the financial markets (Mittal et al., 2018). In the past decade, the ML community has explored the use of ML techniques (e.g., classification, regression, time-series forecasting, DL, and RL) for cryptocurrency price prediction.

Recently, nonparametric methods based on ML and DL have gained popularity for the analysis and forecasting of financial and economic time series and received increased attention from researchers for use in cryptocurrency price prediction (Derbentsev et al., 2019). ML-based models allow one to solve the problem of prediction and classification by utilizing learning sequences in the data. The effectiveness of such models depends on the training speed and the degree of universality of approximating functions. This includes a collection of techniques, such as SVMs, ANNs, fuzzy logic, genetic algorithms, linear and nonlinear statistical models, DL and RL models, and so on (Atsalakis et al., 2019; Galeshchuk & Mukherjee, 2017; Hitam et al., 2019; Jiang & Liang, 2017; Längkvist et al., 2014; Lahmiri, 2011; Lahmiri & Bekiros, 2019; Nikou et al., 2019; Peng, Albuquerque, de Sá, Padula, & Montenegro, 2018; Radityo et al., 2017; Sarlin & Marghescu, 2011; Sin & Wang, 2017; Tupinambás, Cadence, & Lemos, 2018; Uras et al., 2020).

Though some ML models (classification and regression) were applied in predicting price volatility trends of cryptocurrencies, some researchers focused on the comparison of different statistical and ML methods, and also classification and regression-based ML schemes (Abbatemarco et al., 2018; Bush & Choi, 2019; Chen et al., 2017; Ciaian et al., 2016; Hashish et al., 2019; Mittal et al., 2019; Mohanty et al., 2018; Poongodi et al., 2020; Sovbetov, 2018; Uras et al., 2020; Vaddepalli & Antoney, 2018; Valencia et al., 2019). Furthermore, some
schemes integrated various prediction models, including some of the popular classification techniques as well as some popular time-series forecasting techniques, while considering multiple aspects (Roy et al., 2018; Längkvist et al., 2014; Chakraborty & Roy, 2019; Derbentsev et al., 2019; Wang & Chen, 2020; Poyser, 2019). For example, some researchers studied the results using classical ARIMA models and different ML techniques, such as RF, linear discriminant analysis, logistic regression, and LSTM (Amjad & Shah, 2017; McNally et al., 2018; Saxena, Sukumar, Nadu, & Nadu, 2018). Their analyses indicated that the models that relied on training proved to be more appropriate for predicting both the prices of cryptocurrencies and their volatility. Another comparative analysis of the ARIMA forecasting properties with RNNs for cryptocurrencies, such as ethereum (ETH), dash, litecoin (LTC), stellar (STR), siacoin (SC), nem (XEM), monero (XMR), and ripple (XRP), showed that RNNs had better prediction capabilities than ARIMA models (Bush & Choi, 2019; Hashish et al., 2019; Mohanty et al., 2018; Rebane, Karlsson, Papapetrou, & Denic, 2018).

Most of the research on cryptocurrency price prediction during 2010–2020 aimed at predicting the price of bitcoin, which is the trending and most capitalized cryptocurrency. For example, Hayes (2017), Jang and Lee (2018), Madan et al., (2015), McNally et al., (2018), Sin and Wang (2017), and Wu, Lu, Ma, and Lu (2018) addressed the prediction of the next-day trend of bitcoin (up or down) by adopting binary classification models trained on historical data. Various models have been used, such as logistic regression, RF (Attanasio et al., 2019; Sun et al., 2019; Virk, 2017), SVMs (Silva de Souza et al., 2019; Madan et al., 2015), MLPs and genetic algorithms (Sin & Wang, 2017), Bayesian NNs (Jang & Lee, 2018), and LSTM and RNNs (Hashish et al., 2019; Kwon et al., 2019; Li et al., 2019; McNally et al., 2018; Rebane et al., 2018; Wu et al., 2018). Parallel attempts to perform intra-day price forecasting of bitcoin have also been made (e.g., Shah & Zhang, 2014; Tupinambás et al., 2018). Since bitcoin is also a distributed network that enables users to store and transfer digital currency, particular attention has been paid to the enrichment of time-series data with ad hoc features related to bitcoin trading and the bitcoin network, such as the average transactions count per block. The feature engineering process is aimed at including new variables that describe potentially discriminating factors in the prediction models, such as user activities, the level of attractiveness for investors, and global macrofinancial factors (Attanasio et al., 2019; Ciaian et al., 2016; Mohanty et al., 2018). Moreover, the correlation between the distribution of the bitcoin's price and the volumes of the related tweets or media published on the Web has been investigated as well (Matta et al., 2015).

As bitcoin and blockchain technology have begun to shape and define new aspects, researchers have started focusing on the matter of blockchain and cryptocurrencies. The public nature of blockchain technology opens the door for new price prediction challenges. Investigations on the relationship between a given currency's transaction network and its price have increased rapidly in recent years; the growing attention on user identification also strongly supports this direction. In-depth knowledge of these networks can help researchers in future to better identify new features in price prediction. In the past few years, there has been a rapid growth of numerous cryptocurrencies, hashing algorithms, and consensus agreements in the networks (Brooks, 2019). This grabbed the attention of researchers. Moreover, as blockchain has potential applications far beyond bitcoin, some researchers have examined the applications based on blockchain and their benefits in the crypto market in providing decentralized systems. The work of O'Leary (2017) investigated the alternative configurations of different blockchain architectures that can be used for gathering and processing transactions in a range of different settings, including accounting, auditing, supply chain, and other types of transaction information. Although there has been substantial focus on the peer-to-peer and public versions of blockchain, that paper focuses primarily on cloud-based and private configuration versions of blockchains and investigates use configurations, advantages, and limitations as firms bring blockchain-based market mechanisms into their organizations. In addition, that paper investigates some emerging issues associated with blockchain use in consortium settings. Finally, it relates some proposed uses of blockchain for transaction processing to other technologies, such as data warehouses and databases. O'Leary (2018) investigated “open information transactions.” Such transactions are in contrast to traditional transactions, where typically two parties to a transaction are the only ones with information about the transaction. For example, in a sale, the seller and the purchaser typically are the only ones with information about the transaction. However, some emerging technologies, such as blockchain accounting, supply chain, social media, and hashtag commerce, are making information about the transactions potentially openly available to others. That paper investigated some of the implications and strategies that include the use of that open information. For example, open information in accounting and supply chain transactions provides the potential for both business intelligence analysis of the information and possibly misleading and illusory transactions, analogous to those that have garnered the recent attention of the Justice Department in cryptocurrencies. Finally, that paper suggests that blockchain transaction processing will provide reliable information in those settings where there is a “single truth” feed of information flow for the phenomena of interest, no ability to do off-blockchain transactions (or a large penalty cost), and limitation to a single identity for each enterprise on the blockchain.

An interesting research problem in our age of big data is that of determining provenance. Granular evaluation of provenance of physical goods (e.g., tracking ingredients of a pharmaceutical or demonstrating authenticity of luxury goods) has often not been possible with today's items that are produced and transported in complex, interorganizational, often internationally spanning supply chains. Recent adoptions of the Internet of Things and blockchain technologies hold promise for better supply-chain provenance. The work of Kim and Laskowski (2018) evaluated how ontologies can contribute to blockchain design. They were particularly interested in the blockchain, as many favored use cases of blockchain are for provenance tracking, and they examined the application of ontologies on blockchain. To support this case, they analyzed a traceability ontology and translated some of its representations to smart contracts that execute a provenance trace and enforce traceability constraints on the ethereum blockchain platform.

O'Leary (2019) reviewed some recent blockchain-based applications for information capture, distribution, and preservation. As part of that review, the paper examines two key concerns with current blockchain designs for accounting and supply-chain transactions: data
independence and multiple semantic models for the same information distribution problem. Blockchain applications typically integrate database, application, and presentation tiers all in the same ledger. This results in a general inability to query information in the ledger and other concerns. Further, since most applications appear to be private blockchain applications, there is a concern of agents needing to accommodate multiple blockchains depending on who their trading partners are and what they request. Finally, the paper uses a distributed database to design a "blockchain-like" system for virtual organizations.

Money exchange is one of the most common day-to-day activities performed by humans in the daily market. Mohamed et al. (2019) presented an approach to money tracking through a blockchain. The proposed approach consisted of three main components: serial number localization, serial number recognition, and a blockchain to store all transactions and ownership transfers. The approach was tested with a total of 110 banknotes of different currency types and achieved an average accuracy of 91.17%. They conducted a user study in real time with 21 users, and the mean accuracy across all users was 86.42%. Each user gave feedback on the proposed approach, and most of them welcomed the idea.

Behavioral science states that emotions, principles, and the manner of thinking can affect the behavior of individuals and even investors in their decision-making on financial markets. Mnif, Jarboui, Hassan, and Mouakhar (2020) measured investor sentiment using three big-data measures. The first was based on a search query of a list of words related to Islamic context. The second was inferred from the engagement degree on social media. The last measure was built from content obtained through the Twitter API and classified as positive or negative by an ML algorithm based on the naive Bayes method. Then, they investigated whether these sentiment measures had an impact on the market sentiment and the price fluctuations by means of a vector autoregression model and Granger causality analysis. In the final step, they applied the agent-based simulation by means of the sequential Monte Carlo method with the control of the Twitter measure on Islamic index returns. They showed that the three social media sentiment measures had a remarkable impact on the contemporaneous and lagged returns of the different Islamic assets studied. They also gave an estimation of the parameters of the latent variables relative to the agent model studied.

Rivas, Parras-Gutiérrez, Merelo, Arenas, and García-Fernández (2017) proposed jsEvRBF, a time-series forecasting method based on a genetic algorithm and neural nets, written in JavaScript so that it can be executed in most web browsers. Consequently, everybody can participate in the experiments, and scientists can nowadays take advantage of the available browsers and devices as computation environments. This is also a great challenge, as the language support and performance vary from one browser to another. jsEvRBF has been tested in a volunteer computing experiment, and also in a single-browser one. Both experiments were related to forecasting currency exchange rates, and the results showed the viability of the proposal.

In financial trading, technical and quantitative analysis tools are used for the development of decision-support systems. Although these traditional tools are useful, new techniques in the field of ML have been developed for time-series forecasting. The work of Aloud (2020) analyzed the role of attribute selection on the development of a simple DL-ANN multiagent framework to accomplish a profitable trading strategy in the course of a series of trading simulations in the foreign exchange market. The paper evaluated the performance of the DL-ANN multiagent framework over different time spans of high-frequency intraday asset time-series data and determined how a set of the framework attributes produced effective forecasting for profitable trading. The paper showed the existence of predictable short-term price trends in the market time series, and an understanding of the probability of price movements may be useful to high-frequency traders. The results of that paper can be used to further develop financial decision-support systems and autonomous trading strategies for the financial market.

There are also several studies that analyze the contagion risks and spillover effects in cryptocurrencies, such as the work of Catania and Sandholdt (2019), Zhang, Chan, Chu, and Nadarajah (2019), and Eross, McGroarty, Urquhart, and Wolfe (2019). The predictability of cryptocurrencies' returns and volatility has been studied by Catania and Sandholdt (2019). These authors found predictability at high frequencies up to 6 hr, but not at higher aggregation levels, while realized volatility was characterized by long memory and leverage effects. Such studies have either considered model-based estimates of bitcoin volatility or other nonparametric measures to construct daily or higher frequency time series of volatility. Additional results about the time-dependence properties of cryptocurrencies are reported and analyzed in the model proposed by Zhang et al. (2019) to forecast the VaR for bitcoin. Moreover, some studies have indicated that the standard volatility models, like GARCH, are generally not suitable for cryptocurrency time series and suggest using a more sophisticated modeling technique based on the score-driven approach (Creal, Koopman, & Lucas, 2013). The studies performed by Eross et al. (2019) pointed out that the European and North American traders are the main drivers of bitcoin trading, and the trading volume is the highest during the morning and day time, which is consistent with the other currency markets. By employing GMT timestamped tick data aggregated to the frequency of 5 min, they found that bitcoin returns have increased over time, while trading volume, volatility, and liquidity varied substantially over time. Realized volatility is fairly consistent throughout the day, although it is highest during the opening times of the three major global stock markets. Overall, it is seen that the trend for research into bitcoin, and increasingly into other cryptocurrencies, continues to grow and attract further research. Table 7 summarizes some of the findings on common ML approaches applied for cryptocurrency price prediction.

Based on the analysis conducted in this survey, we highlight the following points that can help researchers to fill the gap in existing studies and gain more insight on cryptocurrency price prediction research.

• The price fluctuations of bitcoin were anticipated by these studies to different degrees and revealed that NN-based algorithms yielded the best results (Greaves & Au, 2015; Jang & Lee, 2018; Chen et al., 2020). This is because an ANN has the ability to learn the nonlinear relationship between input and output variables contained in real-time cryptocurrency data sets. Bayesian regression provided acceptable accuracy in terms of profit and daily price prediction.
• RF outperformed the SVM in predicting bitcoin price (Madan et al., 2015). This could be because RF uses nonparametric decision trees, so outliers and linear separability of the data are not a concern. SVM, on the other hand, needs to create artificial separations between data points in higher dimensional space to classify points. However, the SVM model, combined with the relief technique for attribute selection, yielded an acceptable prediction accuracy. The performance of the SVM depends greatly on the parameters set during training, mainly the kernel function. The settings that produced the best accuracy for a problem may result in reduced efficiency for others. The price of ethereum was predicted accurately by the SVM model with the given selected features (Poongodi et al., 2020).
• The linear regression model is most often the first to be used in many problems owing to its simplicity and widespread availability. Though not exceeding other ML models, linear regression provided a reasonable accuracy in cryptocurrency price prediction (Greaves & Au, 2015; Mittal et al., 2018; Poongodi et al., 2020).
• Considering the combined performance of all the ML classifiers on any cryptocurrency price prediction, ANN and RF models are generally the best suited for cryptocurrency price prediction. The SVM provided the highest accuracy for ethereum.
• Ensemble techniques in ML have not been explored much in the field of price prediction. Also, there is not much focus on optimizing ML techniques to improve accuracy.
• Researchers are recommended to use a hybridized approach of an SVM combined with a genetic algorithm to gain more accuracy, as well as optimized results.

In the field of DL, the RNN and LSTM are well-known approaches, and they have many potential advantages compared with the traditional MLP. Some of the findings on common DL approaches applied for cryptocurrency price prediction are summarized in Table 8.

• The MLP model is a simple feedforward NN composed of input and output layers with a hidden layer in between them. Each output from one of these layers represents a unit that is similar to the neuron in a human brain (Yao et al., 2018). Connections between these units are considered as weights that are similar to synapses in the brain. The performance of the MLP is limited due to the vanishing gradient problem. This problem is the primary concern, as the gradients become too small for the network to learn. Another limitation in using an MLP is that signals can only pass forward statically, so it becomes difficult for it to recognize the temporal element of time series effectively.
• The RNN is a dynamic NN that addresses some of the limitations present in an MLP. Its structure is similar to an MLP, but in this technique the signals can pass both forward and backward in an iterative manner, which makes it easy to add another context layer in the RNN. In this new layer, the state can be overwritten at each time step. In this approach, the temporal issue of an MLP can be solved by assigning weights to events that are occurring in series rather than assigning the same weights to all inputs, but the vanishing gradient is still an issue in this approach.
• LSTM addresses both the issues present in the MLP and the RNN. This approach allows the weight preservation of signals that are forwarded and backpropagated between layers. This approach recognizes long-term dependencies. An LSTM unit is composed of a cell and three gates. The cell remembers values over arbitrary time intervals, and the three gates (i.e., input, output, and forget gates) regulate the flow of information into and out of the cell. In this way, weak signals get blocked, mitigating the vanishing gradient problem. LSTM has three state dependencies (i.e., previous cell state, previous hidden state, and current time step) for memorizing things and special gates for manipulating this memory. So this approach selects the most appropriate information from the cell state.
• From the aforementioned techniques, we can conclude that LSTM can be considered to be the best technique for solving time-series prediction problems and recognizing long-term associations by removing irrelevant information from the network.

7 FUTURE DIRECTIONS

In the cryptocurrency world, prices are very volatile. Hence, it is essential to encourage research that incorporates new techniques, strategies, and alternative approaches, such as more sophisticated prediction algorithms, advanced ensemble methods, feature engineering techniques, and other validation metrics for gaining accurate cryptocurrency price prediction. This can assist cryptocurrency investors toward potential increased profits, and support policy-makers and financial researchers in studying cryptocurrency market behavior. In general, it is seen that the trend for research into bitcoin, and increasingly into other cryptocurrencies, continues to grow and attract further research. There is no guarantee of which cryptocurrency will dominate in the future, so research work in this area is of significant importance. The discussion here could be beneficial in exploring some promising opportunities that remain open in cryptocurrency price prediction research.

• The adoption rates are increasing all the time, which includes more and more people using the network and an increase in wallets and apps. So, this may help in increasing the price, as it gives the cryptocurrency real-world usage. There is plenty of further work to be done in this area to understand the influence of these factors.
• Further study would require inclusion of all relevant factors that would influence market moods, as well as tracking over a longer time period, to understand the anomalous behavior of cryptocurrencies and their prices. This is a field with a lot of potential for research in financial time-series problems because of their high data availability and accessibility. The evidence from this survey suggests that there is much room for further work and enhancements.
• Researchers are recommended to analyze the use of LSTM models in future study, such as CNN LSTMs, bidirectional LSTMs, encoder–decoder LSTMs, and also to compare their results to attain good future insight and improve price prediction results. The behavior of intraday variables does vary over time, and this indicates that researchers should be careful to guard against this when examining any aspects of cryptocurrency (Blau, 2017;
Eross et al., 2019; Karalevicius et al., 2018; Kjærland et al., 2018; Poyser, 2019).
• Another promising perspective in the study of cryptocurrencies includes analyzing the impact of public opinion, as measured through social media traces, on the market behavior and prices in an improved way (Mittal et al., 2019; Phillips & Gorse, 2017). Social media traces are found to be effective in predicting the stock market behavior, and this makes it an interesting direction for future work in cryptocurrency price prediction research by making use of larger volumes of media input in sentiment analysis, enhancing the baseline natural language-processing models to provide more robust text preprocessing, applying NNs in label training, extending samples in terms of holding period, transaction fees, and user reputation research.
• The application of sentiment analysis for collecting social signals can further be enhanced by improving the quality of the content, and through use of an increased number of content sources. Elimination of duplicates and filtering out content from advertising or bots could further improve the prediction performance.
• From past studies, it can be noted that the use of content from other social networks, such as Facebook or Reddit (Matta et al., 2015; Mittal et al., 2019; Phillips & Gorse, 2017), can enhance the prediction power and improve the model in this direction, which would also be beneficial. It will be good to take advantage of the best models for standard sentiment analysis and then tune them using transfer learning or any other approach to enhance the prediction power. The other important refinement is to analyze different neural architectures and different data input options integrating media data (e.g., Twitter) by adding other data types that include news and Reddit posts from influential users and/or their tweets.
• Another area of opportunity would be the usage of more specialized models that have different types of approaches, such as LSTM networks and temporal MLPs. Recent work reports that the predictive performance of LSTMs is significantly higher than that of the generalized regression neural architecture (Hashish et al., 2019; Kwon et al., 2019; Li et al., 2019; Saxena et al., 2018; Wu et al., 2018). Such networks may be able to capture the inherent market trends and adapt easily as needed. Future research can make use of separate models for Twitter and market data in order to improve accuracy and precision scores of models. Also, establishing whether these predictive models can be used for creating trading strategies would be interesting.
• Considering the anomalous behavior of cryptocurrency, correlation between cryptocurrency and other assets is still waiting for further work and enhancements. Possible breakthroughs might be achieved with principal component analysis and by studying the relationship between cryptocurrency and other currencies in extreme conditions (e.g., financial collapse).
• The public nature of blockchain technology opens the door for new price prediction challenges. Investigations on the relationship between a given currency's transaction network and its price have increased rapidly in recent years; the growing attention on user identification also strongly supports this direction. In-depth knowledge of these networks can help researchers in future to better identify new features in price prediction.
• Another possible attempt is to identify new pricing methods by analyzing real-time market changes. Considering the proportion of informed traders increasing in the cryptocurrency market and the corresponding relation with the pricing process is another area to explore.
• When considering the price of a cryptocurrency, it is always better to look at real-world events. This can include improved technology, future road-map objectives, new partnerships or even regulations. The influence of Lightning-based financial infrastructure, such as Lightning Labs (which has received investments from senior individuals from Twitter, PayPal, and litecoin), on the performance of bitcoin transactions can be analyzed. Transactions are expected to become much faster and cheaper. If this is successful then it should have a very positive effect on the price of bitcoin. Moreover, regulation policies also influence the bitcoin prices. If major nations follow common regulation policies, it will give bitcoin far more legitimacy. Ultimately, the price of bitcoin should increase. Research analysis in this direction is also recommended.
• Further research with respect to the interaction of market players that make economic decisions based on some indicators, taking into consideration the objective economic laws or behavioral finance laws (econometric and balance models) or by considering the production and other technological possibilities of creating the corresponding asset, such as commodity markets, fundamentals, mining cryptocurrency, and so on, can be encouraged in future.
• Moreover, research work that involves the model implementation in a practical or real-time setting for predicting into the future can be encouraged.
• Another promising path for future studies could be the linguistic analysis of the coordination of pump-and-dumps in online chat groups, and the means by which misinformation about specific currencies is spread (e.g., on social media) and the influence on cryptocurrency price.

8 CONCLUSION

Overall, the main contribution of this paper is a recent analysis that explores and summarizes the articles published in the domain of cryptocurrency price prediction, by applying models using traditional statistical and ML techniques. We extracted information from documents published during the period from 2010 to 2020. To the best of our knowledge, this is the first study on this topic. In this study, we have also investigated various challenges present in traditional approaches of cryptocurrency price prediction. We indicate how we can solve these problems by moving to ML and DL paradigms. Despite the existence of many ML approaches, many issues and challenges still exist in predicting the price of cryptocurrency accurately. The majority of these approaches still require, and are under, further consideration. The discussion presented in this article could be beneficial in exploring future research problems and finding ways to solve them. This study will hopefully give researchers a unique insight for the future and fill the gap in existing studies. The cryptocurrency domain is still in its infancy, as can be seen from the published papers. The majority of the documents were published over the last 3 years or so. And based on the observations, one can safely say, over the next few years, we can
expect a lot of studies in this domain. We believe that this paper has provided that foundation by bringing together relevant papers and analyzing their contributions, to attract and support future research.

ORCID

Ahmed M. Khedr https://orcid.org/0000-0001-7957-7862

REFERENCES

Abbatemarco, N., De Rossi, L., & Salviotti, G. (2018). An econometric model to estimate the value of a cryptocurrency network. The Bitcoin case. Research Papers. 164. https://aisel.aisnet.org/ecis2018_rp/16
Abu Bakar, N., Rosbi, S., & Uzaki, K. (2019). Forecasting cryptocurrency price movement using moving average method: A case study of bitcoin cash. International Journal of Advanced Research, 7(12), 609–614.
Akcora, C. G., Dey, A. K., Gel, Y. R., & Kantarcioglu, M. (2018). Forecasting bitcoin price with graph chainlets, Advances in knowledge discovery and data mining. PAKDD 2018, pp. 765–776.
Akshaya, R., Eswari, B., Dharani, S., & Lalitha, R. (2019). A survey on anticipation the prices of crypto currency using deep learning. International Journal for Research in Applied Science & Engineering Technology, 7(3), 1639–1644.
Alahmari, S. A. (2019). Using machine learning ARIMA to predict the price of cryptocurrencies. ISeCure—The ISC International Journal of Information Security, 11(3), 139–144.
Alessandretti, L., ElBahrawy, A., Aiello, L. M., & Baronchelli, A. (2018). Anticipating cryptocurrency prices using machine learning. Complexity, 2018, 8983590.
Almasri, E., & Arslan, E. (2018). Predicting cryptocurrencies prices with neural networks. In 2018 6th International Conference on Control Engineering & Information Technology (CEIT), IEEE, pp. 1–5.
Almeida, J., Tata, S., Moser, A., & Smit, V. (2015). Bitcoin prediction using ANN. Neural Networks, 7, 1–12.
Alon, N., Lokshtanov, D., & Saurabh, S. (2009). Fast FAST, International colloquium on automata, languages, and programming. pp. 49–58.
Alonso-Monsalve, S., Suárez-Cetrulo, A. L., Cervantes, A., & Quintana, D. (2020). Convolution on neural networks for high-frequency trend prediction of cryptocurrency exchange rates using technical indicators. Expert Systems with Applications, 149, 113250.
Aloud, M. E. (2020). The role of attribute selection in deep ANNs learning framework for high-frequency financial trading. Intelligent Systems in Accounting, Finance and Management, 27(2), 43–54.
Altan, A., Karasu, S., & Bekiros, S. (2019). Digital currency forecasting with chaotic meta-heuristic bio-inspired signal processing techniques. Chaos, Solitons & Fractals, 126, 325–336.
Alvarez-Ramirez, J., Rodriguez, E., & Ibarra-Valdez, C. (2018). Long-range correlations and asymmetry in the bitcoin market. Physica A: Statistical Mechanics and its Applications, 492, 948–955.
Amjad, M., & Shah, D. (2017). Trading bitcoin and online time series prediction. Proceedings of Machine Learning Research, 55, 1–15.
Anupriya, & Garg, S. (2018). Autoregressive integrated moving average model based prediction of bitcoin close price. International Conference on Smart Systems and Inventive Technology (ICSSIT), pp. 473–478. Available at https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8748423
Ardia, D., Bluteau, K., & Rüede, M. (2019). Regime changes in bitcoin GARCH volatility dynamics. Finance Research Letters, 29, 266–271.
Atsalakis, G. S., Atsalaki, I. G., Pasiouras, F., & Zopounidis, C. (2019). Bitcoin price forecasting with neuro-fuzzy techniques. European Journal of Operational Research, 276(2), 770–780.
Attanasio, G., Garza, P., Cagliero, L., & Baralis, E. (2019). Quantitative cryptocurrency trading: exploring the use of machine learning techniques. In Proceedings of the 5th Workshop on Data Science for Macro-modelling with Financial and Economic Datasets (DSMM'19), pp. 1–6. Available at https://dl.acm.org/doi/pdf/10.1145/3336499.3338003
Badenhorst, J. J. (2018). Effect of bitcoin spot and derivative trading volumes on price volatility. PhD thesis, University of Pretoria, 2019. Available at https://repository.up.ac.za/bitstream/handle/2263/68877/Badenhorst_Effect_2019.pdf?sequence=1&isAllowed=y
Baek, C., & Elbeck, M. (2015). Bitcoins as an investment or speculative vehicle? A first look. Applied Economics Letters, 22(1), 30–34.
Barnwal, A., Bharti, H. P., Ali, A., & Singh, V. (2019). Stacking with neural network for cryptocurrency investment, 2019 New York scientific data summit (NYSDS), pp. 1–5.
Bartolucci, S., Destefanis, G., Ortu, M., Uras, N., Marchesi, M., & Tonelli, R. (2020). The butterfly "affect": Impact of development practices on cryptocurrency prices. EPJ Data Science, 9(1), 21.
Bhambhwani, S., Delikouras, S., & Korniotis, G. M. (2019). Do fundamentals drive cryptocurrency prices? Available at https://news.miami.edu/miamiherbert/_news-assets/files/2019/12/crypto.pdf
Blau, B. M. (2017). Price dynamics and speculative trading in bitcoin. Research in International Business and Finance, 41, 493–499.
Böhme, R., Christin, N., Edelman, B., & Moore, T. (2015). Bitcoin: Economics, technology, and governance. Journal of Economic Perspectives, 29(2), 213–238.
Bouri, E., Lau, C. K. M., Lucey, B., & Roubaud, D. (2019). Trading volume and the predictability of return and volatility in the cryptocurrency market. Finance Research Letters, 29, 340–346.
Bouri, E., Shahzad, S. J. H., & Roubaud, D. (2019). Co-explosivity in the cryptocurrency market. Finance Research Letters, 29, 178–183.
Brauneis, A., & Mestel, R. (2019). Cryptocurrency-portfolios in a mean–variance framework. Finance Research Letters, 28, 259–264.
Brooks, C. (2019). Introductory econometrics for finance. Cambridge, UK: Cambridge University Press. Available at https://www.cambridge.org/ae/academic/subjects/economics/finance/introductory-econometrics-finance-4th-edition?format=PB&isbn=9781108436823
Bush, R., & Choi, S. (2019). Forecasting ethereum storj token prices: Comparative analyses of applied bitcoin models. In International Conference on Data Mining Workshops (ICDMW). Available at: https://www.computer.org/csdl/proceedings/icdmw/2019/1gAwR0WCegU
Bystrom, H., & Krygier, D. (2018). What drives bitcoin volatility? Available at https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3223368
Caporale, G. M., & Zekokh, T. (2019). Modelling volatility of cryptocurrencies using Markov-switching GARCH models. Research in International Business and Finance, 48, 143–155.
Catania, L., & Sandholdt, M. (2019). Bitcoin at high frequency. Journal of Risk and Financial Management, 12(1), 36.
Chakraborty, D., & Roy, A. (2019). Time series methodology in storj token prediction. In 2019 International Conference on Data Mining Workshops (ICDMW), pp. 224–231. Available at https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8955655
Charles, A., & Darné, O. (2019). Volatility estimation for cryptocurrencies: Further evidence with jumps and structural breaks. Economics Bulletin, 39(2), 954–968.
Chatfield, C., & Yar, M. (1988). Holt-winters forecasting: some practical issues. Journal of the Royal Statistical Society: Series D (The Statistician), 37(2), 129–140.
Chen, M., & Narwal, N. (2017). Predicting price changes in ethereum. Available at http://cs229.stanford.edu/proj2017/final-reports/5244039.pdf
Chen, Z., Li, C., & Sun, W. (2020). Bitcoin price prediction using machine learning: An approach to sample dimension engineering. Journal of Computational and Applied Mathematics, 365, 112395.
Choi, B. (1992). ARMA model identification. New York, NY: Springer Science & Business Media. Available at https://www.springer.com/gp/book/9781461397472
Ciaian, P., Rajcaniova, M., & Kancs, A. (2016). The economics of bitcoin price formation. Applied Economics, 48(19), 1799–1815.
Conrad, C., Custovic, A., & Ghysels, E. (2018). Long-and short-term cryptocurrency volatility components: A GARCH–MIDAS analysis. Journal of Risk and Financial Management, 11(2), 23.
Corbet, S., Lucey, B., Urquhart, A., & Yarovaya, L. (2019). Cryptocurrencies as a financial asset: A systematic analysis. International Review of Financial Analysis, 62, 182–199.
Creal, D., Koopman, S. J., & Lucas, A. (2013). Generalized autoregressive score models with applications. Journal of Applied Econometrics, 28(5), 777–795.
Crypto Research Report (2020). An absolute valuation approach to crypto assets, Crypto Research Report, June 2020, Edition X, pp. 6–36. Retrieved from https://cryptoresearch.report/wp-content/uploads/2020/06/Crypto-Research-Report-June-2020-ENG.pdf
Derbentsev, V., Datsenko, N., Stepanenko, O., & Bezkorovainyi, V. (2019). Forecasting cryptocurrency prices time series using machine learning approach. SHS Web of Conferences, 65, 02001.
Dixon, M. F., Halperin, I., & Bilokon, P. (2020). Frontiers of machine learning and finance, Machine learning in finance: From theory to practice, pp. 519–541.
Domingos, P. M. (2012). A few useful things to know about machine learning. Communications of the ACM, 55(10), 78–87.
Dos Santos Maciel, L., & Ballini, R. (2019). On the predictability of high and low prices: The case of bitcoin. Brazilian Review of Finance, 17(3), 66–84.
El-Bannany, M., Sreedharan, M., & Khedr, A. M. (2020). A robust deep learning model for financial distress prediction. International Journal of Advanced Computer Science and Applications, 11, 170–175.
Eross, A., McGroarty, F., Urquhart, A., & Wolfe, S. (2019). The intraday dynamics of bitcoin. Research in International Business and Finance, 49, 71–81.
Fang, F., Ventre, C., Basios, M., Kong, H., Kanthan, L., Li, L., & Wu, F. (2020). Cryptocurrency trading: A comprehensive survey. arXiv preprint arXiv:2003.11352.
Galeshchuk, S., & Mukherjee, S. (2017). Deep networks for predicting direction of change in foreign exchange rates. Intelligent Systems in Accounting, Finance and Management, 24(4), 100–110.
Gunay, S. (2019). Impact of public information arrivals on cryptocurrency market: A case of Twitter posts on ripple. East Asian Economic Review, 23(2), 149–168.
Guo, T., & Antulov-Fantulin, N. (2018). Predicting short-term Bitcoin price fluctuations from buy and sell orders. Preprint. arXiv:1802.04065.
Hashish, I. A., Forni, F., Andreotti, G., Facchinetti, T., & Darjani, S. (2019, September). A hybrid model for bitcoin prices prediction using hidden Markov models and optimized LSTM networks. In 2019 24th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA), IEEE, pp. 721–728.
Hatefi Ghahfarrokhi, A., & Shamsfard, M. (2020). Tehran stock exchange prediction using sentiment analysis of online textual opinions. Intelligent Systems in Accounting, Finance and Management, 27(1), 22–37.
Hayes, A. S. (2017). Cryptocurrency value formation: An empirical study leading to a cost of production model for valuing bitcoin. Telematics and Informatics, 34(7), 1308–1321.
Hitam, N. A., Ismail, A. R., & Saeed, F. (2019). An optimized support vector machine (SVM) based on particle swarm optimization (PSO) for cryptocurrency forecasting. Procedia Computer Science, 163, 427–433.
Ho, S.-L., Xie, M., & Goh, T. N. (2002). A comparative study of neural network and Box–Jenkins ARIMA modeling in time series prediction. Computers & Industrial Engineering, 42(2–4), 371–375.
Holub, M., & Johnson, J. (2018). Bitcoin research across disciplines. The Information Society, 34(2), 114–126.
Jain, A., Tripathi, S., Dwivedi, H. D., & Saxena, P. (2018). Forecasting price of cryptocurrencies using tweets sentiment analysis, 2018 eleventh international conference on contemporary computing (IC3), pp. 1–7.
Jang, H., & Lee, J. (2018). An empirical study on modeling and prediction of bitcoin prices with Bayesian neural networks based on blockchain information. IEEE Access, 6, 5427–5437.
Ji, S., Kim, J., & Im, H. (2019). A comparative study of bitcoin price prediction using deep learning. Mathematics, 7(10), 898.
Jiang, Z., & Liang, J. (2017). Cryptocurrency portfolio management with deep reinforcement learning. In 2017 Intelligent Systems Conference (IntelliSys), IEEE, pp. 905–913.
Kakushadze, Z. (2016). 101 formulaic alphas. Wilmott, 2016(84), 72–81.
Karalevicius, V., Degrande, N., & De Weerdt, J. (2018). Using sentiment analysis to predict interday bitcoin price movements. The Journal of Risk Finance, 19(1), 56–75.
Kaya, Y. (2018). Analysis of cryptocurrency market and drivers of the bitcoin price: Understanding the price drivers of bitcoin under speculative environment (Dissertation). KTH Royal Institute of Technology, Stockholm, Sweden. Retrieved from http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-246060
Kim, H. M., & Laskowski, M. (2018). Toward an ontology-driven blockchain design for supply-chain provenance. Intelligent Systems in Accounting, Finance and Management, 25(1), 18–27.
Kim, Y. B., Kim, J. G., Kim, W., Im, J. H., Kim, T. H., Kang, S. J., & Kim, C. H. (2016). Predicting fluctuations in cryptocurrency transactions based on user comments and replies. PloS ONE, 11(8), e0161197. https://doi.org/10.1371/journal.pone.0161197
Garcia, D., Tessone, C. J., Mavrodiev, P., & Perony, N. (2014). The digital traces of bubbles: feedback cycles between socio-economic signals
in the bitcoin economy. Journal of the Royal Society Interface, 11(99), Kjærland, F., Khazal, A., Krogstad, E. A., Nordstrøm, F. ransB. G., & Oust,
20140623. A. (2018). An analysis of bitcoin's price dynamics. Journal of Risk and
Giudici, P., & Abu-Hashish, I. (2019). What determines bitcoin exchange Financial Management, 11(4), 63.
prices? A network VAR approach. Finance Research Letters, 28, Kleinbaum, G., & Klein, M. (2002). Logistic regression. New York, NY:
309–318. Springer.
Greaves, A., & Au, B. (2015). Using the bitcoin transac- Kriš toufek, L. (2015). What are the main drivers of the bitcoin price?
tion graph to predict the price of bitcoin. Available Evidence from wavelet coherence analysis. PLoS ONE, 10(4), e0123923.
via DIALOG. https://pdfs.semanticscholar.org/a0ce/
Kwon, D.-H., Kim, J.-B., Heo, J.-S., Kim, C.-M., & Han, Y.-H. (2019). Time
864663c100582805ffa88918910da89add47.pdf. Accessed 8 Dec
series classification of cryptocurrency price trend based on a recurrent
2015.
LSTM neural network. Journal of Information Processing Systems, 15(3), 694–706.
Kyriazis, N., Papadamou, S., & Corbet, S. (2020). A systematic review of the bubble dynamics of cryptocurrency prices. Research in International Business and Finance, 54, 101254.
Kyriazis, N. A. (2019). A survey on efficiency and profitable trading opportunities in cryptocurrency markets. Journal of Risk and Financial Management, 12(2), 67.
Lahmiri, S. (2011). A comparison of PNN and SVM for stock market trend prediction using economic and technical information. International Journal of Computer Applications, 29(3), 24–30.
Lahmiri, S., & Bekiros, S. (2019). Cryptocurrency forecasting with deep learning chaotic neural networks. Chaos, Solitons & Fractals, 118, 35–40.
Längkvist, M., Karlsson, L., & Loutfi, A. (2014). A review of unsupervised feature learning and deep learning for time-series modeling. Pattern Recognition Letters, 42, 11–24.
Lee, T.-H., & Yang, W. (2014). Granger-causality in quantiles between financial markets: Using copula approach. International Review of Financial Analysis, 33, 70–78.
Li, L., Arab, A., Liu, J., Liu, J., & Han, Z. (2019). Bitcoin options pricing using LSTM-based prediction model and blockchain statistics, 2019 IEEE international conference on blockchain (blockchain), pp. 67–74.
Li, T. R., Chamrajnagar, A., Fong, X., Rizik, N., & Fu, F. (2019). Sentiment-based prediction of alternative cryptocurrency price fluctuations using gradient boosting tree model. Frontiers in Physics, 7, 98.
Liu, Y., & Tsyvinski, A. (2020). Risks and returns of cryptocurrency. The Review of Financial Studies, hhaa113. https://doi.org/10.1093/rfs/hhaa113
Lu, C.-J. (2010). Integrating independent component analysis-based denoising scheme with neural network for stock price prediction. Expert Systems with Applications, 37(10), 7056–7064.
Madan, I., Saluja, S., & Zhao, A. (2015). Automated bitcoin trading via machine learning algorithms.
Mallqui, D. C. A., & Fernandes, R. A. S. (2019). Predicting the direction, maximum, minimum and closing prices of daily bitcoin exchange rate using machine learning techniques. Applied Soft Computing, 75, 596–606.
Matta, M., Lunesu, I., & Marchesi, M. (2015). Is bitcoin's market predictable? Analysis of web search and social media, Knowledge discovery, knowledge engineering and knowledge management. IC3K 2015, pp. 155–172.
McNally, S., Roche, J., & Caton, S. (2018). Predicting the price of bitcoin using machine learning, 2018 26th Euromicro international conference on parallel, distributed and network-based processing (PDP), pp. 339–343.
Misnik, A., Krutalevich, S., Prakapenka, S., Borovykh, P., & Vasiliev, M. (2018). Neural network approximation precision change analysis on cryptocurrency price prediction. In Fuzzy Technologies in the Industry-FTI 2018, pp. 96–101.
Mittal, A., Dhiman, V., Singh, A., & Prakash, C. (2019). Short-term bitcoin price fluctuation prediction using social media and web search data. In 2019 Twelfth International Conference on Contemporary Computing (IC3), IEEE, pp. 1–6.
Mittal, R., Arora, S., & Bhatia, M. P. S. (2018). Automated cryptocurrencies prices prediction using machine learning. ICTACT Journal on Soft Computing, 8, 2229–6956.
Mnif, E., Jarboui, A., Hassan, M. K., & Mouakhar, K. (2020). Big data tools for Islamic financial analysis. Intelligent Systems in Accounting, Finance and Management, 27(1), 10–21.
Mohamed, K., Aziz, A., Mohamed, B., Abdel-Hakeem, K., Mostafa, M., & Atia, A. (2019). Blockchain for tracking serial numbers in money exchanges. Intelligent Systems in Accounting, Finance and Management, 26(4), 193–201.
Mohanty, P., Patel, D., Patel, P., & Roy, S. (2018). Predicting fluctuations in cryptocurrencies' price using users' comments and real-time prices. In 2018 7th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO), IEEE, pp. 477–482.
Nakamoto, S. (2008). Bitcoin: A peer-to-peer electronic cash system. Bitcoin. Retrieved from https://bitcoin.org/bitcoin.pdf
Nikou, M., Mansourfar, G., & Bagherzadeh, J. (2019). Stock price prediction using deep learning algorithm and its comparison with machine learning algorithms. Intelligent Systems in Accounting, Finance and Management, 26(4), 164–174.
O'Leary, D. E. (2017). Configuring blockchain architectures for transaction information in blockchain consortiums: The case of accounting and supply chain systems. Intelligent Systems in Accounting, Finance and Management, 24(4), 138–147.
O'Leary, D. E. (2018). Open information enterprise transactions: Business intelligence and wash and spoof transactions in blockchain and social commerce. Intelligent Systems in Accounting, Finance and Management, 25(3), 148–158.
O'Leary, D. E. (2019). Some issues in blockchain for accounting and the supply chain, with an application of distributed databases to virtual organizations. Intelligent Systems in Accounting, Finance and Management, 26(3), 137–149.
Patel, J., Shah, S., Thakkar, P., & Kotecha, K. (2015). Predicting stock and stock price index movement using trend deterministic data preparation and machine learning techniques. Expert Systems with Applications, 42(1), 259–268.
Peng, Y., Albuquerque, P. H. M., de Sá, J. M. C., Padula, A. J. A., & Montenegro, M. R. (2018). The best of two worlds: Forecasting high frequency volatility for cryptocurrencies and traditional currencies with support vector regression. Expert Systems with Applications, 97, 177–192.
Phaladisailoed, T., & Numnonda, T. (2018). Machine learning models comparison for bitcoin price prediction. In 2018 10th International Conference on Information Technology and Electrical Engineering (ICITEE), IEEE, pp. 506–511.
Phillips, R. C., & Gorse, D. (2017). Predicting cryptocurrency price bubbles using social media data and epidemic modelling, 2017 IEEE symposium series on computational intelligence (SSCI), pp. 1–7.
Phillips, R. C., & Gorse, D. (2018a). Cryptocurrency price drivers: Wavelet coherence analysis revisited. PLoS ONE, 13(4), e0195200.
Phillips, R. C., & Gorse, D. (2018b). Mutual-excitation of cryptocurrency market returns and social media topics, ICFET '18: Proceedings of the 4th International conference on frontiers of educational technologies, pp. 80–86.
Poongodi, M., Sharma, A., Vijayakumar, V., Bhardwaj, V., Sharma, A. P., Iqbal, R., & Kumar, R. (2020). Prediction of the price of ethereum blockchain cryptocurrency in an industrial finance system. Computers & Electrical Engineering, 81, 106527.
Poyser, O. (2019). Exploring the dynamics of bitcoin's price: A Bayesian structural time series approach. Eurasian Economic Review, 9(1), 29–60.
Radityo, A., Munajat, Q., & Budi, I. (2017). Prediction of bitcoin exchange rate to American dollar using artificial neural network methods. In 2017 International Conference on Advanced Computer Science and Information Systems (ICACSIS), IEEE, pp. 433–438.
Rane, P. V., & Dhage, S. N. (2019). Systematic erudition of bitcoin price prediction using machine learning techniques. In 2019 5th International Conference on Advanced Computing & Communication Systems (ICACCS), IEEE, pp. 594–598.
Rebane, J., Karlsson, I., Papapetrou, P., & Denic, S. (2018). Seq2Seq RNNs and ARIMA models for cryptocurrency prediction: A comparative study. In SIGKDD Fintech'18, London, UK, August 19–23.
Rivas, V. M., Parras-Gutiérrez, E., Merelo, J. J., Arenas, M. G., & García-Fernández, P. (2017). Time series forecasting using evolutionary neural nets implemented in a volunteer computing system. Intelligent Systems in Accounting, Finance and Management, 24(2–3), 87–95.
Roy, S., Nanjiba, S., & Chakrabarty, A. (2018). Bitcoin price forecasting using time series analysis. In 2018 21st International Conference of Computer and Information Technology (ICCIT), IEEE, pp. 1–5.
Sarlin, P., & Marghescu, D. (2011). Neuro-genetic predictions of currency crises. Intelligent Systems in Accounting, Finance and Management, 18(4), 145–160.
Saxena, A., Sukumar, T., Nadu, T., & Nadu, T. (2018). Predicting bitcoin price using LSTM and compare its predictability with ARIMA model. International Journal of Pure and Applied Mathematics, 119(17), 2591–2600.
Shah, D., & Zhang, K. (2014). Bayesian regression and bitcoin. In 2014 52nd annual Allerton conference on communication, control, and computing (Allerton), IEEE, pp. 409–414.
Silva de Souza, M. J., Almudhaf, F. W., Henrique, B. M., Negredo, A. B. S., Ramos, D. G. F., Sobreiro, V. A., & Kimura, H. (2019). Can artificial intelligence enhance the bitcoin bonanza. The Journal of Finance and Data Science, 5(2), 83–98.
Sin, E., & Wang, L. (2017). Bitcoin price prediction using ensembles of neural networks. In 2017 13th International conference on natural computation, fuzzy systems and knowledge discovery (ICNC-FSKD), IEEE, pp. 666–671.
Snihovyi, O., Ivanov, O., & Kobets, V. (2018). Cryptocurrencies prices forecasting with anaconda tool using machine learning techniques. In CEUR Workshop Proceedings, Vol. 2105, pp. 453–456.
Sovbetov, Y. (2018). Factors influencing cryptocurrency prices: Evidence from bitcoin, ethereum, dash, litcoin, and monero. Journal of Economics and Financial Analysis, 2, 1–27.
Sovia, R., Yanto, M., Budiman, A., Mayola, L., & Saputra, D. (2019). Backpropagation neural network prediction for cryptocurrency bitcoin prices. Journal of Physics: Conference Series, 1339, 012060.
Sreedharan, M., Khedr, A. M., & El-Bannany, M. (2020a). A multi-layer perceptron approach to financial distress prediction with genetic algorithm. Automatic Control and Computer Sciences, 57, 475–482.
Sreedharan, M., Khedr, A. M., & El-Bannany, M. (2020b). A comparative analysis of machine learning classifiers and ensemble techniques in financial distress prediction. In 2020 17th International Multi-Conference on Systems, Signals & Devices (SSD'20), pp. 653–657.
Stosic, D., Stosic, D., Ludermir, T. B., & Stosic, T. (2018). Collective behavior of cryptocurrency price changes. Physica A: Statistical Mechanics and its Applications, 507, 499–509.
Sun, J., Zhou, Y., & Lin, J. (2019). Using machine learning for cryptocurrency trading. In 2019 IEEE International Conference on Industrial Cyber Physical Systems (ICPS), IEEE, pp. 647–652.
Sun, X., Liu, M., & Sima, Z. (2020). A novel cryptocurrency price trend forecasting model based on LightGBM. Finance Research Letters, 32, 101084.
Trimborn, S., & Härdle, W. K. (2019). CRIX an index for cryptocurrencies. Journal of Empirical Finance, 49, 107–122.
Troster, V., Tiwari, A. K., Shahbaz, M., & Macedo, D. N. (2019). Bitcoin returns and risk: A general GARCH and GAS analysis. Finance Research Letters, 30, 187–193.
Tupinambás, T. M., Cadence, R. A. L., & Lemos, A. P. (2018). Cryptocurrencies transactions advisor using a genetic Mamdani-type fuzzy rules based system. In 2018 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), IEEE, pp. 1–7.
Uras, N., Marchesi, L., Marchesi, M., & Tonelli, R. (2020). Forecasting Bitcoin closing price series using linear regression and neural networks models. PeerJ Computer Science, 6, e279.
Vaddepalli, S., & Antoney, L. (2018). Are economic factors driving bitcoin transactions? An analysis of select economies. Finance Research Letters, 163(12), 106–109.
Valencia, F., Gómez-Espinosa, A., & Valdés-Aguirre, B. (2019). Price movement prediction of cryptocurrencies using sentiment analysis and machine learning. Entropy, 21(6), 589.
Vapnik, V. N. (1999). An overview of statistical learning theory. IEEE Transactions on Neural Networks, 10(5), 988–999.
Vieira, P. J. M. (2017). Price analysis of bitcoin: Volatility, key drivers and evolution.
Virk, D. S. (2017). Prediction of bitcoin price using data mining (Doctoral dissertation). National College of Ireland, Dublin.
Walther, T., Klein, T., & Bouri, E. (2019). Exogenous drivers of bitcoin and cryptocurrency volatility – A mixed data sampling approach to forecasting. Journal of International Financial Markets, Institutions and Money, 63, 101133.
Wang, Y., & Chen, R. (2020). Cryptocurrency price prediction based on multiple market sentiment. In Proceedings of the 53rd Hawaii International Conference on System Sciences.
Wiedmer, J. (2018). The price of cryptocurrencies: an empirical analysis (MSc thesis). Universität Bern, Bern, Switzerland.
Wu, C. H., Lu, C. C., Ma, Y. F., & Lu, R. S. (2018, November). A new forecasting framework for bitcoin price with LSTM. In 2018 IEEE International Conference on Data Mining Workshops (ICDMW), IEEE, pp. 168–175.
Yao, Y., Yi, J., Zhai, S., Lin, Y., Zhang, T. K. G., & Yoonjae, L. (2018). Predictive analysis of cryptocurrency price using deep learning. International Journal of Engineering & Technology, 7(3.27), 258–264.
Yaya, O. S., Ogbonna, E. A., & Olubusoye, O. E. (2018). How persistent and dependent are pricing of bitcoin to other cryptocurrencies before and after 2017/18 crash?
Zamuda, A., Crescimanna, V., Burgillo, J. C., Dias, J. M., Wegrzyn-Wolska, K., Rached, I., … Bracciali, A. (2019). Forecasting cryptocurrency value by sentiment analysis: An HPC-oriented survey of the state-of-the-art in the cloud era. In J. Kołodziej & H. González-Vélez (Eds.), High-performance modelling and simulation for big data applications, Lecture Notes in Computer Science (Vol. 11400), Cham: Springer. https://doi.org/10.1007/978-3-030-16272-6_12
Zhang, L., & Wang, J. (2015). Optimizing parameters of support vector machines using team-search-based particle swarm optimization. Engineering Computations, 32(5), 1194–1213.
Zhang, Y., Chan, S., Chu, J., & Nadarajah, S. (2019). Stylised facts for high frequency cryptocurrency data. Physica A: Statistical Mechanics and Its Applications, 513, 598–612.

How to cite this article: Khedr A, Arif I, P V PR, El-Bannany M, Alhashmi S, S M. Cryptocurrency price prediction using traditional statistical and machine learning techniques: A survey. Intell Sys Acc Fin Mgmt. 2021;28:3–34. https://doi.org/10.1002/isaf.1488
