
Sharif University of Technology

FACULTY OF COMPUTER ENGINEERING

BACHELOR’S DISSERTATION

REPORT 1
Cryptocurrencies Price Prediction

Student Name Student ID


1. Amirhossein Alimohammadi 97110166
2. Ahmad Nosratbakhsh 97110217

Lecturer in charge:
Dr. Mohammad Javad Faraji

Submission Date: 05/09/2023


Contents
1 Introduction

2 Machine Learning Technology

3 CryptoPredictions

4 Models
4.1 Random Forest
4.2 LSTM
4.3 GRU
4.4 Orbit
4.5 ARIMA
4.6 SARIMAX
4.7 Prophet
4.8 XGBOOST

5 Model Performance
5.1 Cross Validation
5.2 Metrics
5.2.1 Mean Absolute Error (MAE)
5.2.2 Mean Squared Error (MSE)
5.2.3 Root Mean Squared Error (RMSE)
5.2.4 Mean Absolute Percentage Error (MAPE)
5.2.5 Symmetric Mean Absolute Percentage Error (SMAPE)
5.2.6 Mean Absolute Scaled Error (MASE)
5.2.7 Mean Squared Logarithmic Error (MSLE)

6 Result
6.1 Accuracy Score & F1-Score
6.2 Recall Score & Precision Score
6.3 MAPE, SMAPE, MASE, and MSLE
6.4 Results in Bitcoin
6.5 Deduction

7 Conclusion

8 References
1 Introduction
Cryptocurrency is a form of digital currency that regulates the generation of
currency units and verifies the transfer of funds using encryption techniques. No-
tably, cryptocurrencies are not governed by a central authority and operate on a
decentralized structure. Since the launch of Bitcoin in 2009, cryptocurrencies have
revolutionized the way people transfer money. Cryptocurrency was first proposed
in 1998 by a computer scientist, Wei Dai, who developed a cryptography-based
system that could be used to ease payments between parties. This system, called
"b-money," laid the groundwork for future cryptocurrencies.

The systematic structural specification of Bitcoin [1] was published in November 2008 by an unknown individual or group using the alias Satoshi Nakamoto.
Bitcoin was the first cryptocurrency to be decentralized. Since the introduction
of Bitcoin in 2009, cryptocurrencies have transformed how money is sent and
received. Bitcoin is still the most popular and valuable cryptocurrency in the
world, despite the creation of thousands of other cryptocurrencies and several
price fluctuations since then. At the time of this writing, Bitcoin’s market capi-
talization exceeds 475 billion US dollars. In addition, the market capitalization
of all active cryptocurrencies, including Bitcoin, reaches 1.17 trillion US dollars
[2].

Due to the decentralized nature of the majority of cryptocurrencies, their prices are not influenced by interest rates, inflation rates, or monetary policies,
but rather by the perception of users based on news, websites, and other non-
fundamental elements [3]. The stock markets are influenced by a variety of factors
that create uncertainty, including political and economic issues that have a local
or global impact. Understanding the key success factors that yield accurate predictions is a difficult task. We can examine the market using a range of techniques, including technical indicators, price fluctuations, and market technical analysis. There is thus a need for automated prediction tools to assist investors in deciding whether to invest in Bitcoin or other cryptocurrencies. Modern stock market prediction typically relies on automation technologies, and we can apply the same approach and strategy to the cryptocurrency domain.

2 Machine Learning Technology
Machine Learning is a powerful and effective choice for trading strategies [4].
Its ability to uncover hidden data relationships that may elude human observation
makes it invaluable in predicting numeric outputs like price or volume and iden-
tifying categorical outputs such as trends. By providing the model with heuristic
input data, traders can leverage a wide array of machine learning models to gain
insights and make informed trading decisions.
Several machine learning models have proven successful in trading. Regression
models, including linear regression [5] and support vector regression [6], offer ac-
curate price movement estimation based on historical data. Classification models
like decision trees [7] and random forests [8] excel at identifying market trends and
making categorical predictions. Neural networks, such as deep learning models
[9], are highly adept at capturing complex patterns in financial data.
Extensive research has demonstrated the efficacy of machine learning in trad-
ing, with studies showing superior performance compared to traditional strategies
and higher returns [10] [11]. Furthermore, machine learning techniques have been
employed to analyze alternative data sources like social media sentiment [12] and
news articles [13] to gain a competitive edge in the market.
Machine learning provides traders with a diverse set of models and techniques
that enhance trading strategies. As technology continues to advance and more
data becomes available, the role of machine learning in the financial markets is
expected to grow significantly.

3 CryptoPredictions
In order to provide the community with a platform in which different models and cryptocurrencies are available, we have designed a library named CryptoPredictions. Previous cryptocurrency price forecasting papers used different metrics and dataset settings, which caused ambiguities and interpretation problems. To reduce those differences, we created CryptoPredictions, a library with 8 models, 30 indicators, and 10 metrics.

CryptoPredictions could be a significant help because:

1. At the outset of our work, we faced a serious challenge of dataset scarcity. Many papers and repos fetched the data through different websites, such as Yahoo Finance. However, we have overcome this obstacle by using platforms such as Bitmex, which offer a common structure for different currencies.

2. Before the advent of our library, users had to run different codes for different
models, making it difficult to compare them fairly. Fortunately, CryptoPre-
dictions has made it possible to conduct a unified and equitable evaluation
of different models.

3. With Hydra, users can easily structure and understand the run arguments, which makes it far easier to run the code under different settings and compare the results (a sketch of such a run appears after this list).

4. While some models may perform exceptionally well in terms of accuracy, successful trading still requires a well-defined strategy. Our backtester can help users determine the effectiveness of a given model in real-world scenarios.

5. We understand that evaluating models can be challenging, which is why we offer a variety of metrics to help users measure progress towards accomplishing their tasks. By analyzing multiple metrics, it is possible to identify areas for improvement and correct what is not working. We will explain the pros and cons of each metric later in this report.

6. At CryptoPredictions, we do not fetch indicators from different websites, because that leads to problems such as null rows and missing indicator values for some cryptocurrencies. Instead, CryptoPredictions calculates the indicators itself in a way that avoids these problems and can be generalized to other datasets.
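
As an illustration of the Hydra-based workflow mentioned in item 3, a minimal sketch follows. The decorator and config handling use Hydra's standard API, but the config fields (model, coin, window_size) and the train_and_evaluate entry point are hypothetical placeholders, not CryptoPredictions' actual argument names.

import hydra
from omegaconf import DictConfig, OmegaConf

@hydra.main(config_path="configs", config_name="config", version_base=None)
def main(cfg: DictConfig) -> None:
    # Hydra composes configs/config.yaml with any command-line overrides into cfg
    print(OmegaConf.to_yaml(cfg))
    # hypothetical entry point: train the selected model on the selected coin
    # train_and_evaluate(model=cfg.model, coin=cfg.coin, window=cfg.window_size)

if __name__ == "__main__":
    main()

A run can then be launched with overrides such as python main.py model=arima coin=BTC, and Hydra records the resolved configuration of every experiment.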

4 Models
This section describes the different models that are used in the library.

4.1 Random Forest


Random Forest is a machine learning technique that employs decision trees to
define a prediction model [14]. Several decision trees are utilized by the Random
Forest technique to forecast Bitcoin prices [15]. Random forests have been used
to forecast the direction of a stock price trend. In 2016, Khaidem et al. employed
random forests to forecast trends in the values of Apple, Samsung Electronics,
and General Electric shares traded on the NASDAQ.
Specifically, Random Forest is an ensemble technique consisting of decision
trees and bagging [16]. It causes different data samples to be used to train each
tree for the same problem. Different trees view distinct chunks of the data while
employing bagging. No tree is exposed to all training data. So, when integrating
their results, certain inaccuracies are compensated for others, resulting in a more
generalizable prediction [17].
Overfitting is a major issue that can have devastating effects on results; however, with enough trees in the forest, the Random Forest algorithm tends not to overfit the model. Moreover, missing values can be handled by the random forest. As previously stated, the random forest is based on the bagging technique, which entails bootstrapping the training samples and considering only a fixed number of randomly selected input variables at each split.
A random sample of N cases is drawn from the training set with replacement; this sample constitutes the training set used to build tree i. Given that the number of input variables is M, a number m of variables (m < M) is selected at each node, and m remains constant while the forest is grown. Each node is then split using the best division among its m candidate variables. The number of predictors evaluated in each division is typically set to roughly the square root of the total number of predictors; the most effective method for determining the appropriate value is to analyze the out-of-bag MSE for various values of m. In general, if the variables chosen at each node are highly correlated, small values of m give favorable results. As a result, at each node only m of the M variables are considered for the split. Globally, given a training set D of size n, m new training sets Di, i = 1, ..., m, of size n′ are created by sampling with replacement. Afterwards, each decision tree is trained using its data set Di. To make a forecast for a new observation, the observation is passed down the nodes of each tree, as shown in Figure 1.

Figure 1: Structure of a random forest (source: https://www.paradigmadigital.com/techbiz/machine-learning-dummies/)
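
As a concrete sketch of this setup, the example below trains a random forest to classify the next-hour price direction from lagged returns and compares a few values of m (max_features) via the out-of-bag score. It is a generic scikit-learn illustration with assumed column names, not the CryptoPredictions implementation.

import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

def make_features(df: pd.DataFrame, lags: int = 5):
    # lagged returns as features, next-hour direction (up = 1) as the target
    returns = df["close"].pct_change()
    X = pd.concat({f"ret_lag_{k}": returns.shift(k) for k in range(1, lags + 1)}, axis=1)
    y = (returns.shift(-1) > 0).astype(int)
    valid = X.notna().all(axis=1) & returns.shift(-1).notna()
    return X[valid], y[valid]

df = pd.DataFrame({"close": np.cumsum(np.random.randn(500)) + 100})  # toy price series
X, y = make_features(df)

for m in (1, 2, "sqrt"):  # compare values of m through the out-of-bag error
    forest = RandomForestClassifier(n_estimators=300, max_features=m,
                                    oob_score=True, random_state=0)
    forest.fit(X, y)
    print(m, "OOB accuracy:", round(forest.oob_score_, 3))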

4.2 LSTM
LSTM (Long Short-Term Memory) is another form of RNN module. Hochreiter and Schmidhuber (1997) [19] created the LSTM, which was later developed and popularized by several researchers. The LSTM network consists of modules with recurrent connections, similar to an RNN. The distinction between LSTM and RNN lies in the connectivity between the hidden layers. The expanded RNN structure is depicted in Figure 2. The only structural difference between RNN and LSTM is the memory cell of the hidden layer, and the design of three distinct gates efficiently resolves gradient issues. Figure 3 depicts the LSTM memory structure of the hidden layer [20].
Figure 2 illustrates a deficiency of the RNN that can be observed in how it handles its inputs. This problem was discovered by Bengio et al. (1994) [21]. When the inputs X0, X1 are separated by a very long gap from the later steps Xt, Xt+1, the RNN is unable to learn to link the relevant information by the time ht+1 requires it, because the old memory becomes increasingly useless over time as it is overwritten or replaced by new memory.
As shown in Figure 3, the LSTM's special units (recurrent hidden layers) contain memory blocks. In addition to memory cells with self-connections that store the network's temporal state, the memory blocks also contain multiplicative units called gates that regulate the flow of information. In the original architecture, each memory block comprised an input gate and an output gate: the input gate controls the flow of input activations into the memory cell, and the output gate regulates the flow of cell activations from the cell to the remainder of the network. The forget gate was added to the memory block later [22].

Figure 2: The Expanded Structure of RNN [18]

This addressed a shortcoming of LSTM models that prevented them from pro-
cessing continuous input streams that were not divided into subsequences. The
forget gate scales the internal state of the cell prior to adding it as input to the
cell via its self-recurrent link, thereby forgetting or resetting the cell’s memory in
an adaptive manner. In addition, the contemporary LSTM architecture includes
peephole connections from its internal cells to the gates in the same cell in order
to learn precise output timing [23].

Figure 3: LSTM architecture: a single memory block is shown for clarity.

A mapping from an input sequence x = (X0, X1, ..., Xt) to an output sequence y = (Y0, Y1, ..., Yt) is computed by an LSTM network by calculating the network unit activations iteratively from i = 1 to t as follows:
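
A standard peephole-LSTM formulation consistent with the symbol descriptions below (an assumed reconstruction, not necessarily the report's original equations) is:

i_t = \sigma(W_{ix} x_t + W_{im} m_{t-1} + W_{ic} c_{t-1} + b_i)
f_t = \sigma(W_{fx} x_t + W_{fm} m_{t-1} + W_{fc} c_{t-1} + b_f)
c_t = f_t \odot c_{t-1} + i_t \odot g(W_{cx} x_t + W_{cm} m_{t-1} + b_c)
o_t = \sigma(W_{ox} x_t + W_{om} m_{t-1} + W_{oc} c_t + b_o)
m_t = o_t \odot h(c_t)
y_t = \mathrm{softmax}(W_{ym} m_t + b_y)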
In these formulas, σ is the logistic sigmoid function, and i, f, o, and c are respectively the input gate, forget gate, output gate, and cell activation vectors, all of which are the same size as the cell output activation vector m. The W terms denote weight matrices (e.g. W_ix is the matrix of weights between the input and the input gate, while W_ic, W_fc, and W_oc are diagonal weight matrices for the peephole connections), and the b terms denote bias vectors (e.g. b_i is the input gate bias vector). Furthermore, g and h are the cell input and cell output activation functions, respectively (typically tanh), and softmax is the network output activation function.

4.3 GRU
A gated recurrent unit (GRU) was presented [24] to enable each recurrent unit
to capture adaptive dependencies on several time scales. Similar to the LSTM
unit, the GRU possesses gating units that influence the flow of information inside
the unit, but without distinct memory cells.

The architecture of the Gated Recurrent Unit:
In Figure 4, we have a GRU cell that is comparable to an LSTM cell or RNN
cell.

Figure 4: GRU architecture

At each timestamp t, it receives an input Xt and the preceding timestamp’s


hidden state Ht−1 . Subsequently, it outputs a new hidden state Ht , which is
consequently sent to the subsequent timestamp. Currently, a GRU cell consists
mostly of two gates, as opposed to three gates in an LSTM cell. The initial gate
is the Reset gate, while the second is the update gate [25].
Reset Gate (short-term memory)
The reset gate is responsible for the short-term memory of the network i.e the
hidden state Ht . Here is the equation of the reset gate.

rt = σ(xt ∗ Ur + Ht−1 ∗ Wr )

It resembles the LSTM gate equation: the sigmoid function bounds rt between 0 and 1, and Ur and Wr are weight matrices.
Update Gate (long-term memory)
Similarly, we have an update gate for long-term memory and the equation of the
gate is shown below.

ut = σ(xt ∗ Uu + Ht−1 ∗ Wu )

The only difference is in the weight matrices, i.e., Uu and Wu.


Current Memory Gate
It is frequently disregarded in typical descriptions of the Gated Recurrent Unit network. It is a component of the reset gate, just as the Input Modulation Gate is a subcomponent of the Input Gate, and is used to introduce nonlinearity into the input and to make it zero-mean. A second reason to include it as a sub-
part of the Reset Gate is to lessen the impact that past information has on
the information that is being transmitted into the future. The calculation is as
follows:

Ĥt = tanh(xt ∗ Ug + (rt ◦ Ht−1 ) ∗ Wg )
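
For completeness, the new hidden state is then a blend of the previous hidden state and this candidate state, weighted by the update gate; in one common convention (some references swap the roles of u_t and 1 − u_t) it is written as:

H_t = u_t \circ H_{t-1} + (1 - u_t) \circ \hat{H}_t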

4.4 Orbit
Uber’s Orbit is an open source package designed to ease time series infer-
ences and forecasts using structural Bayesian time series models for real-world
applications and scientific study [26]. It employs probabilistic programming lan-
guages such as Stan [27] and Pyro [28] while providing a familiar and intuitive
initialize-fit-predict interface for time series workloads.

It introduces a collection of refined Bayesian exponential smoothing models
with a wide range of priors, model type specifications, and noise distribution
options. The model includes a novel global trend term that is effective for short-
term time series. Most significantly, it includes a well-crafted Python compute
software/package named Orbit (Object-oriented Bayesian Time Series). The un-
derlying MCMC sampling process and optimization are handled using the proba-
bilistic programming languages Stan (Carpenter et al., 2017) and Pyro (Bingham et al., 2019). Pyro, created by Uber researchers, is a universal probabilistic programming language (PPL) written in Python and backed on the backend by PyTorch and JAX. Orbit presently supports a subset of the available prediction and sampling algorithms for Pyro estimation.
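
A minimal sketch of the initialize-fit-predict interface, modeled on Orbit's documented quick-start (the module path, estimator name, and column names are assumptions that may differ across Orbit versions and datasets):

import pandas as pd
from orbit.models import DLT  # Damped Local Trend model

# assumed hourly dataframe with a timestamp column and a closing-price column
train_df = pd.read_csv("btc_train.csv", parse_dates=["timestamp"])
test_df = pd.read_csv("btc_test.csv", parse_dates=["timestamp"])

dlt = DLT(
    response_col="close",    # target series
    date_col="timestamp",    # time index
    seasonality=24,          # daily cycle in hourly data
    estimator="stan-mcmc",   # Stan-based MCMC; Pyro-based estimators also exist
)
dlt.fit(df=train_df)
predicted_df = dlt.predict(df=test_df)  # point predictions plus uncertainty intervals
print(predicted_df.head())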

4.5 ARIMA
The Autoregressive Integrated Moving Average (ARIMA) method was developed in 1970 by George Box and Gwilym Jenkins and is also known as the Box-Jenkins method [29]. The ARIMA method ignores independent variables entirely when predicting, which makes it suited to statistical data whose observations depend on one another, and it requires assumptions such as autocorrelation, trend, or seasonality. The ARIMA method can model historical data whose driving factors are difficult to understand, has a high degree of accuracy in short-term forecasting, and can deal with seasonal variations in the data.
The ARIMA method is classified into four categories: Autoregressive (AR),
Moving Average (MA), Autoregressive Moving Average (ARMA), and Autore-
gressive Integrated Moving Average (ARIMA) [30][31].

1. Autoregressive (AR)
It was introduced by Yule in 1926 and expanded by Walker in 1932. This model assumes that data from prior periods influences the current data. It is called autoregressive because the variable is regressed against its own prior values. The AR method is used to determine the order p, whose coefficients capture a value's dependence on its nearest previous values [32]. The general form of an AR model with order p (AR(p)), or an ARIMA(p, 0, 0) model, is as follows:

Xt = µ + ϕ1 Xt−1 + ϕ2 Xt−2 + ... + ϕp Xt−p + et

2. Moving Average (MA)


It was first presented by Slutzky in 1937. The MA approach is used to determine the order q, which accounts for the movement of the prior residual values [33]. The following is the generic form of the MA model with order q (MA(q)), or the ARIMA(0, 0, q) model:

Xt = et − θ1 et−1 − θ2 et−2 − ... − θq et−q

3. Autoregressive Moving Average (ARMA)


This model combines the AR and MA models. It assumes that the current period's data is influenced by the data of prior periods as well as by the residuals of the preceding periods. The general form of the ARMA, or ARIMA(p, 0, q), model is as follows:

Xt = µ + ϕ1 Xt−1 + ϕ2 Xt−2 + ... + ϕp Xt−p + et − θ1 et−1 − θ2 et−2 − ... − θq et−q

4. Autoregressive Integrated Moving Average (ARIMA)


The ARIMA model assumes that the data used must be stationary, which means that the mean and variance of the data are constant over time. Nonstationary data must first be transformed into stationary data using a differencing procedure. The ARIMA technique is represented by three parameters: the AR order applied to the data of previous periods, the number of differences taken in the Integrated step (which makes the data easier to predict), and the MA order. The common version of the ARIMA(p, d, q) model, written for the d-times differenced series ∆dXt, is as follows:

∆dXt = µ + ϕ1 ∆dXt−1 + ϕ2 ∆dXt−2 + ... + ϕp ∆dXt−p + et − θ1 et−1 − θ2 et−2 − ... − θq et−q
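
As an illustration of fitting such a model in Python (using statsmodels rather than the library's own wrapper; the file and column names are assumptions):

import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# assumed hourly closing prices indexed by timestamp
prices = pd.read_csv("btc_train.csv", index_col="timestamp", parse_dates=True)["close"]

# order=(p, d, q): 2 AR lags, first-order differencing for stationarity, 1 MA lag
fitted = ARIMA(prices, order=(2, 1, 1)).fit()

forecast = fitted.forecast(steps=24)  # forecast the next 24 hours
print(forecast.head())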

4.6 SARIMAX
ARMA is the combination of the AR and MA models. Adding an integration
operator to an ARMA model produces an ARIMA model. A SARIMAX model
incorporates exogenous variables assessed at time t that influence the value of
input data at time t, together with integer multipliers of the seasonality [34]. The parameters required to define the SARIMAX model are listed in Figure 5.

Figure 5: SARIMAX, List of symbols and model parameters.

The formula for the autoregressive model AR(p) is:

y_t = \Theta(L)^p y_t + \epsilon_t

The moving average model MA(q) is represented as follows:

y_t = \phi(L)^q \epsilon_t + \epsilon_t

The autoregressive moving average model ARMA(p, q) can be written as:

y_t = \Theta(L)^p y_t + \phi(L)^q \epsilon_t + \epsilon_t

For the autoregressive integrated moving average model ARIMA(p, d, q), the series is first differenced d times, \Delta^d y_t = y_t^{[d]} = y_t^{[d-1]} - y_{t-1}^{[d-1]}, and the model is stated as:

\Delta^d y_t = \Theta(L)^p \Delta^d y_t + \phi(L)^q \Delta^d \epsilon_t + \Delta^d \epsilon_t

Finally, the seasonal autoregressive integrated moving average model with exogenous variables, SARIMAX(p, d, q)(P, D, Q)_s, is stated as follows:

\Theta(L)^p \theta(L^s)^P \Delta^d \Delta_s^D y_t = \phi(L)^q \phi(L^s)^Q \Delta^d \Delta_s^D \epsilon_t + \sum_{i=1}^{n} \beta_i x_t^i
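
A sketch of fitting such a model with statsmodels' SARIMAX implementation, where the seasonal period and the exogenous indicator columns are illustrative assumptions:

import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

# assumed hourly data: closing price plus exogenous indicator columns
train = pd.read_csv("btc_train.csv", index_col="timestamp", parse_dates=True)
test = pd.read_csv("btc_test.csv", index_col="timestamp", parse_dates=True)
exog_cols = ["rsi", "volume"]  # hypothetical exogenous regressors (the x_t^i above)

model = SARIMAX(
    train["close"],
    exog=train[exog_cols],
    order=(1, 1, 1),              # (p, d, q)
    seasonal_order=(1, 1, 1, 24), # (P, D, Q, s): daily seasonality in hourly data
)
result = model.fit(disp=False)

# forecasting requires future values of the exogenous variables
forecast = result.forecast(steps=len(test), exog=test[exog_cols])
print(forecast.head())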

4.7 Prophet
Prophet is a method for forecasting time series data based on an additive
model in which non-linear trends are fitted with annual, weekly, and daily sea-
sonality, in addition to holiday effects. It is most effective when applied to time
series with substantial seasonal effects and multiple seasons of historical data.
Prophet is robust to missing data and fluctuations in the trend, and it typically
handles outliers well [35].

p(y_t, y_{t-1}, \ldots, y_1) = \mathcal{N}(y_t \mid m(t), \sigma^2) \cdot \mathcal{N}(y_{t-1} \mid m(t-1), \sigma^2) \cdots \mathcal{N}(y_1 \mid m(1), \sigma^2)

As long as Prophet accurately captures the conditional mean and the conditional variance, it should function adequately. Mathematically, we have:

m_{\mathrm{prophet}}(t+h) \approx \mathbb{E}[y_{t+h} \mid y_t, \ldots, y_1]

v_{\mathrm{prophet}}(t+h) = \sigma^2 \approx \mathrm{Var}[y_{t+h} \mid y_t, \ldots, y_1]

for all forecast periods t + h.

This could be the case if the underlying system is in a state of equilibrium, such as when the economy is stable. Once a significant shock occurs, however, the constant-variance assumption will almost certainly be violated; this is precisely the behaviour exhibited by volatile price series such as cryptocurrencies.
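
A minimal sketch of Prophet's interface (the input frame must have 'ds' and 'y' columns; the file name and hourly frequency are assumptions):

import pandas as pd
from prophet import Prophet  # packaged as fbprophet in older releases

# Prophet expects a dataframe with columns 'ds' (timestamp) and 'y' (value)
df = pd.read_csv("btc_train.csv", parse_dates=["timestamp"])
df = df.rename(columns={"timestamp": "ds", "close": "y"})[["ds", "y"]]

m = Prophet()  # trend, seasonality, and holiday components are configured here
m.fit(df)

future = m.make_future_dataframe(periods=48, freq="H")  # extend 48 hours ahead
forecast = m.predict(future)
print(forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].tail())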

4.8 XGBOOST
Gradient-boosted (GBM) trees are supervised learners that build an ensemble sequentially, with each new tree fitted to the residual errors of the ensemble so far. XGBoost is a popular gradient-boosting
library. It can be used for GPU training, distributed computing, and paral-
lelization. It is accurate, adaptable to all forms of data and situations, well-
documented, and extremely user-friendly.
XGBoost is an abbreviation for Extreme Gradient Boosting. It is a properly
parallelized and optimized version of the gradient boosting technique. Paralleliz-
ing the entire boosting procedure drastically reduces training time.
Rather than training the best possible model on the data (as is the case with
conventional approaches), they trained hundreds of models on various subsets of
the training dataset and then conducted a vote to determine the model with the
best performance.
In many situations, XGBoost is superior to conventional gradient-boosting
methods. The Python implementation provides access to a huge array of inner
parameters that can be modified to improve precision and accuracy.
Parallelization, regularization, non-linearity, cross-validation, and scalability
are some of XGBoost’s most essential characteristics.

The XGBoost algorithm works by estimating a function iteratively. To begin, we generate a sequence based on the gradients of the function. The following equation models a specific type of gradient descent: the gradient term specifies the direction in which the loss function F decreases, ϵ corresponds to the learning rate (the rate of change fitted to the loss function), and the fitted trees are expected to replicate the loss's behaviour adequately.

F_{x_{t+1}} = F_{x_t} + \epsilon_{x_t} \frac{\partial F}{\partial x}(x_t)
To iterate over the model and determine its optimal formulation, we describe the whole procedure as a sequence and identify a function that converges to the minimum of the loss. This function serves as an error metric that helps us reduce the loss and sustain performance over time, and the sequence approaches the minimum value of the function. The following notation denotes the error function used when evaluating a gradient boosting regressor [36]:

f(x, \theta) = \sum_{i} l(F(X_i, \theta), y_i)
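
A sketch of using the XGBoost Python package as a regressor on lagged values (the window size and parameter values are illustrative assumptions, not tuned settings):

import numpy as np
import xgboost as xgb

# toy supervised setup: predict the next value from the previous 5 values
series = np.cumsum(np.random.randn(1000)) + 100
window = 5
X = np.array([series[i:i + window] for i in range(len(series) - window)])
y = series[window:]

split = int(0.8 * len(X))
X_train, X_test, y_train, y_test = X[:split], X[split:], y[:split], y[split:]

model = xgb.XGBRegressor(
    n_estimators=300,    # number of boosted trees
    learning_rate=0.05,  # shrinkage applied to each tree's contribution
    max_depth=4,         # depth of each tree
    subsample=0.8,       # row subsampling per tree
)
model.fit(X_train, y_train)
preds = model.predict(X_test)
print("test RMSE:", float(np.sqrt(np.mean((preds - y_test) ** 2))))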

5 Model Performance
This section describes a common validation method for tuning hyperparameters, along with the different metrics used to validate the predictions.

5.1 Cross Validation


Cross-validation (CV) is a well-known method for adjusting hyperparam-
eters and generating accurate measures of model performance. k-fold cross-
validation and hold-out cross-validation are two of the most prevalent types of
cross-validation [37].
First, we divide the dataset into a training set and a test set. If any parameters require tuning, we further divide the training set into a training subset and a validation subset. The model is trained on the training subset, and the parameters that minimize the error on the validation set are selected. Finally, the model is trained on the complete training set using the selected parameters, and the error on the test set is recorded.
However, cross-validation must be applied differently to time series, for two reasons:
1. Temporal Dependencies
Particular care must be taken when splitting time series data in order to prevent data leakage. To accurately imitate the "real-world forecasting environment, in which we stand in the present and predict the future" [27], the forecaster must withhold all data about events that occur after the events used for fitting the model. Instead of k-fold cross-validation, we therefore employ hold-out cross-validation for time series data, where a temporally separated portion of the data is reserved for assessing the model's performance. For example, observe Split 4 in Figure 6, where the validation data (Fold 4) comes before Fold 5, which is part of the training data.

Figure 6

2. Arbitrary Choice of Test Set


You may notice that the choice of the test set in hold-out validation is somewhat arbitrary, which may mean that the test-set error is a poor estimate of the error on an independent test set.

Figure 7

To overcome this, we employ the nested cross-validation approach. Nested CV has an outer loop for error estimation and an inner loop for parameter tuning (see Figure 8). The inner loop operates precisely as described previously: the training set is divided into a training subset and a validation set, the model is trained on the training subset, and the parameters that minimize error on the validation set are selected. We then add an outer loop that divides the dataset into numerous training and test sets, and the error on each split is averaged to obtain a robust estimate of the model error. This is beneficial because a nested cross-validation procedure provides an almost unbiased estimate of the true error [38].

Method: after setting the number of splits, the training dataset is split into equal subsets; for instance, there are 6 equal subsets in Figure 8. In the k-th iteration of the outer loop, the first k subsets are used as the training data and the (k + 1)-th subset is used as the validation data.

Figure 8
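
A sketch of such an expanding-window outer loop using scikit-learn's TimeSeriesSplit (the model and the synthetic data are placeholders; the actual CryptoPredictions splitter may differ):

import numpy as np
from sklearn.model_selection import TimeSeriesSplit
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

# toy data: 600 hourly observations with 5 features
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 5))
y = X @ rng.normal(size=5) + rng.normal(scale=0.1, size=600)

tscv = TimeSeriesSplit(n_splits=5)  # each split trains on the past and validates on the future
errors = []
for fold, (train_idx, val_idx) in enumerate(tscv.split(X)):
    model = LinearRegression().fit(X[train_idx], y[train_idx])
    mae = mean_absolute_error(y[val_idx], model.predict(X[val_idx]))
    errors.append(mae)
    print(f"split {fold}: train={len(train_idx)}, val={len(val_idx)}, MAE={mae:.3f}")

print("averaged outer-loop error:", np.mean(errors))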

5.2 Metrics
After obtaining the final predictions of the model, validation is usually carried out by calculating the following metrics:

5.2.1 Mean Absolute Error (MAE)


It is the mean of the absolute value of the differences between the forecast and the actual value. It is easy to interpret and has the benefit of reporting errors in the units of the data and the prediction. However, it does not penalize outliers (which may not matter much in price prediction). The most notable drawback of this metric is that it is scale-dependent, so we cannot compare cryptocurrencies with different units.

\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |A_i - F_i|

5.2.2 Mean Squared Error (MSE)


It is the mean of the square of the differences between the forecast and the actual value. In this metric, outliers are heavily penalized. On the other hand, since the error is not in the original units of the data and the prediction, it is harder to interpret. It is also scale-dependent, so it shares the same issue as MAE.

\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (A_i - F_i)^2

5.2.3 Root Mean Squared Error (RMSE)


It is the same as MSE except that we take the square root of the result. In this metric, outliers are heavily penalized as in MSE, and it has the strong point of being in the units of the data and the prediction, so it combines the best of both MSE and MAE. But, since the errors are squared, it can still be less interpretable, and it remains scale-dependent.

\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (A_i - F_i)^2}

5.2.4 Mean Absolute Percentage Error (MAPE)


Mean absolute percentage error is the mean of the percentage difference between the actual value and the prediction. It is often used as the baseline metric for forecasting models. MAPE is not only easy to interpret but also scale-independent, which allows us to compare different cryptocurrencies. Nevertheless, for cryptocurrencies whose actual value is near zero, the error can become arbitrarily large. In this metric, under-forecasts are bounded at 100% error while over-forecasts can grow without bound, so it is biased toward under-forecasting.

\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left( 100 \times \frac{|A_i - F_i|}{A_i} \right)

5.2.5 Symmetric Mean Absolute Percentage Error (SMAPE)


It is an extension of MAPE. It is the mean of 200 times the absolute difference between the actual value and the prediction divided by the sum of their absolute values. It no longer favors under-forecasts, and it is fully bounded between 0% and 200%. Since the denominator can still be close to zero, there is still a chance of extreme values, and interpreting a metric between 0% and 200% can be difficult. One remaining problem with SMAPE is that it is not truly symmetric, since over- and under-forecasts are not treated equally. This is illustrated by applying the SMAPE formula to the following example:
• Over-forecasting: At = 100 and Ft = 110 give SMAPE = 9.52%
• Under-forecasting: At = 100 and Ft = 90 give SMAPE = 10.53%

\mathrm{SMAPE} = \frac{1}{n} \sum_{i=1}^{n} \left( 200 \times \frac{|A_i - F_i|}{|A_i| + |F_i|} \right)

5.2.6 Mean Absolute Scaled Error (MASE)


It has two modes. In the case of no seasonality:

\mathrm{MASE} = \frac{\frac{1}{n} \sum_{i=1}^{n} |A_i - F_i|}{\frac{1}{T-1} \sum_{t=2}^{T} |A_t - A_{t-1}|} = \frac{\mathrm{MAE}}{\frac{1}{T-1} \sum_{t=2}^{T} |A_t - A_{t-1}|}

In the case of seasonality with period m:

\mathrm{MASE} = \frac{\frac{1}{n} \sum_{i=1}^{n} |A_i - F_i|}{\frac{1}{T-m} \sum_{t=m+1}^{T} |A_t - A_{t-m}|} = \frac{\mathrm{MAE}}{\frac{1}{T-m} \sum_{t=m+1}^{T} |A_t - A_{t-m}|}

This is the mean absolute scaled error for both seasonal and non-seasonal time
series and is probably the best and most fair metric to use. This metric compares
the output to the naive forecast.
Naive forecasts are the most cost-effective forecasting model, and provide
a benchmark against which more sophisticated models can be compared. This
forecasting method is only suitable for time series data. Using the naive approach,
forecasts are produced that are equal to the last observed value. This method
works quite well for economic and financial time series, which often have patterns
that are difficult to reliably and accurately predict. If the time series is believed
to have seasonality, the seasonal naive approach may be more appropriate where
the forecasts are equal to the value from last season.
In time series notation: ŷT+h|T = yT.
In MASE, if the error is less than one, it can be concluded that the forecast is better than an averaged naive forecast; on the contrary, if it is more than one, the forecast is worse than an averaged naive forecast. The advantages of this metric are scale independence and penalizing under- and over-forecasting equally.

5.2.7 Mean Squared Logarithmic Error (MSLE)


It measures the ratio, or relative difference, between the actual and forecasted values by taking the mean of the squared logarithm of the actual value plus one divided by the forecasted value plus one. This metric punishes under-forecasting more than over-forecasting. Nonetheless, it is not easy to interpret.

\mathrm{MSLE} = \frac{1}{n} \sum_{i=1}^{n} \left( \log \frac{A_i + 1}{F_i + 1} \right)^2
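
For reference, a straightforward NumPy sketch of these metrics (the MASE denominator here uses the naive forecast of the actual series itself, with m = 1 for the non-seasonal case):

import numpy as np

def mae(a, f):   return np.mean(np.abs(a - f))
def mse(a, f):   return np.mean((a - f) ** 2)
def rmse(a, f):  return np.sqrt(mse(a, f))
def mape(a, f):  return np.mean(100.0 * np.abs(a - f) / np.abs(a))
def smape(a, f): return np.mean(200.0 * np.abs(a - f) / (np.abs(a) + np.abs(f)))

def mase(a, f, m=1):
    # scale by the mean absolute error of the (seasonal) naive forecast on the actuals
    naive_mae = np.mean(np.abs(a[m:] - a[:-m]))
    return mae(a, f) / naive_mae

def msle(a, f):
    return np.mean(np.log((a + 1.0) / (f + 1.0)) ** 2)

actual = np.array([100.0, 102.0, 101.0, 105.0])
forecast = np.array([101.0, 100.0, 103.0, 104.0])
print(mape(actual, forecast), smape(actual, forecast), mase(actual, forecast))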

6 Result
To have a fair evaluation, we compare the different models across several cryptocurrencies and metrics. The training dataset is the same for all models, spanning '2022-11-13 13:30:00' to '2023-01-01 09:30:00', and the test dataset is likewise the same for all models, spanning '2023-01-01 10:30:00' to '2023-02-16 10:30:00'. We used hourly data and report the results in the graphs below.

6.1 Accuracy Score & F1-Score


As can be seen in Figure 9 and Figure 10, the graphs of the different cryptocurrencies are close to each other. Across roughly all cryptocurrencies, ARIMA and SARIMAX give the best results. In second place is Prophet, with superb results close to ARIMA and SARIMAX. After that comes Orbit, whose results are not as good as those of the other three models but are still acceptable. Finally, XGBoost, Random Forest, LSTM, and GRU give unremarkable results close to one another, around 55% in terms of accuracy and 0.5 in terms of F1-score.

6.2 Recall Score & Precision Score


As is evident in Figure 11 and Figure 12, the results of SARIMAX, ARIMA, Prophet, and Orbit differ considerably from those of the other models, so we observe the same pattern as in the previous graphs. Meanwhile, on these two metrics the scores for ADA, AVAX, and AXS are lower than for the other cryptocurrencies, which makes them harder to predict.

Figure 9

Figure 10

Figure 11

Figure 12

Figure 13

6.3 MAPE, SMAPE, MASE, and MSLE


Nevertheless, we have a different results with regard to MAPE, SMAPE,
MASE, and MSLE. The results are shown in Figure 14 & Figure 15 & Fig-
ure 16. Despite the stunning result in terms of Accuracy and F1-Score, SARI-
MAX demonstrates a poor result. So it might not be a proper choice for price
prediction. GRU and Random Forest has a poor result in the second place in
these metrics. It can be seen that prophet shows the best result in all these met-
rics. On the other hand the results of Orbit, ARIMA, LSTM, and XGBOOST
are placed between SARIMAX and Prophet.

Figure 14

Figure 15

Figure 16

6.4 Results in Bitcoin


In this section, we repeat the previous experiments specifically on BTC. The details of the experiments are given in Table 1.

Table 1: Experimental results of the different models on BTC


Model Accuracy F1 Recall Precision MAE RMSE MAPE SMAPE MASE MSLE
Prophet 0.71 0.73 0.74 0.73 83 116 0.40 0.40 1.7 0.00004
Orbit 0.68 0.70 0.69 0.70 137 184 0.65 0.66 2.7 0.00010
SARIMAX 0.70 0.70 0.69 0.71 1119 1391 5.48 5.74 24.6 0.00846
ARIMA 0.72 0.73 0.71 0.75 1009 1163 4.47 4.55 16.6 0.00277
XGBOOST 0.56 0.58 0.57 0.59 796 1115 3.60 3.76 15.1 0.00352
LSTM 0.54 0.54 0.53 0.56 1140 1434 5.56 5.81 25.1 0.00816
GRU 0.55 0.56 0.56 0.57 1140 1424 5.60 5.82 25.2 0.00804
Random Forest 0.58 0.59 0.58 0.61 787 1098 3.55 3.71 14.8 0.00347

In the BTC case, we can see similar results. The details of the results are shown in Figure 17 to Figure 26.

6.5 Deduction
With respect to the previous graphs, it can be concluded that despite the superb accuracy of SARIMAX and ARIMA, they have problems with predicting the price level accurately. Meanwhile, Orbit works better in terms of MAE, MAPE, SMAPE, MASE, and MSLE. Lastly, Prophet not only shows good performance in accuracy and F1-score but also demonstrates a stunning result in MAE, MAPE, SMAPE, and MSLE compared to the other methods.

Figure 17

Figure 18

Figure 19

Figure 20

Figure 21

Figure 22

Figure 23

Figure 24

Figure 25

Figure 26

7 Conclusion
In summary, this report explored various aspects of cryptocurrency forecast-
ing, machine learning models, and evaluation metrics. The introduction provided
an overview of cryptocurrencies, their decentralized nature, and their significant
impact on the financial landscape.
The section on machine learning technology highlighted the suitability of ma-
chine learning models for cryptocurrency trading strategies, emphasizing their
ability to uncover hidden data relationships.
The CryptoPredictions library was introduced as a valuable platform for cryp-
tocurrency price forecasting, to overcome challenges such as dataset scarcity and
the need for unified evaluation of different models. The library’s features, includ-
ing data collection, model evaluation, and indicator calculation, were outlined.
The models section covered several prominent models used for cryptocurrency
forecasting, including Random Forest, LSTM, GRU, Orbit, ARIMA, SARIMAX,
Prophet, and XGBOOST. Each model was briefly described, showcasing their
unique characteristics and applications.
Lastly, the discussion delved into model performance evaluation, highlighting
the importance of cross-validation for hyperparameter tuning and model selec-
tion. Various metrics, such as MAPE, MAE, SMAPE, MASE, MSLE, accuracy,
and F1-score, were identified as essential tools for assessing the accuracy and
effectiveness of the forecasting models.
Overall, while different models showed varying levels of performance in terms
of accuracy and metrics, it was observed that Orbit and particularly Prophet con-
sistently demonstrated strong results across multiple evaluation criteria. These
models exhibited the potential to provide accurate and reliable cryptocurrency
price predictions.
It is worth noting that the field of cryptocurrency forecasting is dynamic
and evolving, and further research and experimentation are necessary to con-
tinually improve prediction accuracy and adapt to changing market conditions.
The CryptoPredictions library and the models discussed in this exploration pro-
vide valuable tools and insights for researchers, traders, and investors seeking to
navigate the world of cryptocurrency with greater confidence and understanding.

8 References
[1] S. Nakamoto, ”Bitcoin: A Peer-to-Peer Electronic Cash System,” 2008.

[2] "Coin Market Cap," [Online]. Available: https://coinmarketcap.com/currencies/bitcoin/. [Accessed 09 02 2023].

[3] Makarov, I., & Schoar, A. (2020). Trading and arbitrage in cryptocurrency
markets. Journal of Financial Economics, 135(2), 293–319.

[4] McNally, Sean, et al. “Predicting the Price of Bitcoin Using Machine Learn-
ing.” 2018 26th Euromicro International Conference on Parallel, Distributed
and Network-Based Processing (PDP), 2018, pp. 339–43.

[5] Geron, A. (2019). Hands-On Machine Learning with Scikit-Learn, Keras, and
TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems.
O’Reilly Media.

[6] Drucker, H., Burges, C. J., Kaufman, L., Smola, A., Vapnik, V. (1997). Sup-
port vector regression machines. Advances in neural information processing
systems, 9, 155-161.

[7] Breiman, L., Friedman, J., Stone, C. J., Olshen, R. A. (1984). Classification
and regression trees. CRC press.

[8] Breiman, L. (2001). Random forests. Machine learning, 45(1), 5-32.

[9] LeCun, Y., Bengio, Y., Hinton, G. (2015). Deep learning. Nature, 521(7553),
436-444.

[10] Zheng, H., Shi, J., Zhang, X., Li, F., Li, G. (2020). Deep reinforcement
learning for stock trading: From models to reality. IEEE Transactions on
Neural Networks and Learning Systems, 32(6), 2563-2575.

[11] Grootveld, M., Hallerbach, W. (2018). Machine learning for trading. The
Journal of Portfolio Management, 44(3), 113-125.

[12] Bollen, J., Mao, H., Zeng, X. (2011). Twitter mood predicts the stock
market. Journal of computational science, 2(1), 1-8.

[13] Ma, J., Gao, W., Fan, Y. (2020). News-driven stock market prediction
using multi-scale deep neural networks. Expert Systems with Applications,
150, 113274.

[14] Mitchell, T. M. (1999). Machine learning and data mining. Communications


of the ACM, 42(11), 30–36.

[15] Breiman, L. (2001). Random forests. Machine learning, 45(1), 5–32.

[16] Breiman, L. (1996). Bagging predictors. Mach. Learn. 24, 123–140

[17] Biau, G. (2012). Analysis of a random forests model. The Journal of Machine
Learning Research, 13(1), 1063–1095.

[18] M. W. P. Aldi, J. Jondri, and A. Aditsania, “Analisis Dan Implementasi


Long Short Term Memory Neural Network Untuk Prediksi Harga Bitcoin,”
eProceedings Eng., vol. 5, no. 2, 2018.

[19] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Com-


put., vol. 9, no. 8, pp. 1735– 1780, 1997.

[20] F. Qian and X. Chen, “Stock Prediction Based on LSTM under Different
Stability,” in 2019 IEEE 4th International Conference on Cloud Computing
and Big Data Analysis (ICCCBDA), 2019, pp. 483–486.

[21] Y. Bengio, P. Simard, and P. Frasconi, “Learning Long-Term Dependencies


with Gradient Descent is Difficult,” IEEE Trans. Neural Networks, 1994

[22] E. Kristensen, S. Østergaard, M. A. Krogh, and C. Enevoldsen, “Technical


Indicators of Financial Performance in the Dairy Herd,” J. Dairy Sci., 2008.

[23] C. Scheier and W. Tschacher, “Appropriate algorithms for nonlinear time


series analysis in psychology,” in Nonlinear dynamics in human behavior,
World Scientific, 1996, pp. 27–43.

[24] Junyoung Chung, Caglar Gulcehre, KyungHyun Cho, Yoshua Bengio “Em-
pirical Evaluation of Gated Recurrent Neural Networks on Sequence Model-
ing”

[25] https://www.analyticsvidhya.com/blog/2021/03/introduction-to-gated-recurrent-unit-gru/

[26] Edwin Ng, Zhishi Wang, Huigang Chen, Steve Yang, and Slawek Smyl. 2021.
Orbit: Probabilistic Forecast with Exponential Smoothing. arXiv:2004.08492
[stat.CO]

[27] Bob Carpenter, Andrew Gelman, Matthew D. Hoffman, Daniel Lee. “Stan :
A Probabilistic Programming Language”

[28] Eli Bingham, Jonathan P. Chen, Martin Jankowiak, Fritz Obermeyer, Neeraj
Pradhan, Theofanis Karaletsos, Rohit Singh, Paul Szerlip, Paul Horsfall,
Noah D. Goodman. “Pyro: Deep Universal Probabilistic Programming”

[29] E. B. Dagum, “The X-II-ARIMA seasonal adjustment method,” 2005.

[30] A. Hendranata, “ARIMA (Autoregressive Integrated Moving Average),”


2003.

[31] Y. S. Lee and L. I. Tong, “Forecasting time series using a methodology


based on autoregressive integrated moving average and genetic program-
ming,” Knowledge-Based Syst., vol. 24, no. 1, pp. 66–72, Feb. 2011.

[32] Hillmer, S. Craig, and George C. Tiao, “An ARIMA-Model-Based Approach


to Seasonal Adjustment,” vol. 10, no. 1, pp. 5–24, 2017.

[33] S. E. Said and D. A. Dickey, "Testing for unit roots in autoregressive-moving average models of unknown order," pp. 599–607, 1984.

[34] "From AR to SARIMAX: Mathematical Definitions of Time Series Models," https://phosgene89.github.io/sarima.html.

[35] https://facebook.github.io/prophet/

[36] Tianqi Chen, Carlos Guestrin, “XGBoost: A Scalable Tree Boosting System”

[37] L. J. Tashman. Out-of-sample tests of forecasting accuracy: an analysis and


review. International Journal of Forecasting, 16(4):437–450, 2000.

[38] S. Varma, R. Simon ”Bias in error estimation when using cross-validation


for model selection”, 2006.
