Journal of Xidian University https://doi.org/10.37896/jxu14.12/043 ISSN No:1001-2400

BITCOIN PRICE PREDICTION USING MACHINE LEARNING MODELS

ARSHDEEP KAUR, JASWINDER SINGH


DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

PUNJABI UNIVERSITY PATIALA, PUNJAB, INDIA

Abstract- In November 2020, Bitcoin, one of the best-known crypto-currencies in the
world, climbed above $17,000, a three-year high. The digital currency has undergone
significant price fluctuations since it was introduced in 2008. Investors flocked to
crypto-currencies because of the volatility the pandemic caused on global equity markets. This
research focuses on evaluating machine learning models for predicting the Bitcoin
price relative to the US dollar. The models include Linear Regression, Random Forest,
Support Vector Machine, ARIMA, LSTM and RNN. The findings will further help to
understand the behavior of this new and challenging cryptocurrency.

1. INTRODUCTION
1.1 Bitcoin

The decentralized crypto-currency Bitcoin was developed by Satoshi Nakamoto in
2008. The main objective was to enable participants to transact directly with
each other without any intermediary [1]. All transactions in the Bitcoin network
are verified by the nodes present in the network. Verified transactions are stored in a
public ledger called the Blockchain. Bitcoin mining is performed to maintain the
Blockchain. Any willing individual can earn Bitcoins by using computing
power to solve complex mathematical problems.

1.2 COVID-19 and Bitcoin

The outbreak of COVID-19 has impacted capital markets, including virtual currency
markets. Goodell [2] describes the economic and social implications of the pandemic. The
study highlights the damage to global financial markets caused by the COVID-19
crisis and the role of the pandemic in the financial sector. Crypto-currencies (e.g. BTC,
ETH, XRP) are associated with COVID-19 cases/fatalities. Initially, there was an
inverse correlation between Bitcoin and reported deaths/cases, but later the
correlation became positive [3]. The primary question is whether Bitcoin is a risky
hazard or a safe haven. How does Bitcoin behave during the turmoil in
financial markets caused by the COVID-19 pandemic? Conlon and McGee [4] address
both questions and show that Bitcoin does not behave as a safe haven; in fact, when
the crisis began, the movement of Bitcoin was similar to that of the S&P 500.

VOLUME 14, ISSUE 12, 2020 418 http://xadzkjdx.cn/

1.3 Price Prediction

Price forecasting is not a new phenomenon. A lot of research has been done on
predicting stock prices. The authors of [5] predicted stock prices using machine learning models
and achieved greater reliability and precision. Regression and LSTM models were
used for prediction. Both models showed improved precision, but the LSTM results
were better. The authors of [6] predicted share prices by observing the relation between
the news feed and stock price movement.

This paper uses various machine learning models to forecast daily Bitcoin prices. The
models used are Support Vector Machine (SVM), Linear Regression, Recurrent
Neural Network (RNN), Random Forest, Auto-Regressive Integrated Moving
Average (ARIMA), and Long Short Term Memory (LSTM). An overview of the
research is shown in Figure 1.

Figure 1 Research Overview (Machine Learning Models: Linear Regression, Random Forest, Support Vector Machine, RNN, LSTM, ARIMA; output: Bitcoin Price Forecast)


2. RELATED WORK

Because Bitcoin is young and highly volatile, research on Bitcoin price prediction
is still limited. Owing to its peer-to-peer system and decentralization, Bitcoin has
attracted a large number of users all over the world. The authors of [7] found a link
between Bitcoin price and both Twitter and Google search activity. The accuracy of Bitcoin price
prediction using polynomial regression was 77% with tweet volume and 66.66% with
Google Trends. Chen et al. [8] introduced the "latent source model", which was
applied in [9] to forecast the price of Bitcoin, giving an 89% return in fifty
days with a Sharpe ratio of 4.10. Research has also been carried out on the effect of
mass media platforms on digital currencies such as Bitcoin. The authors of [10] analyzed the behavior
of a support vector machine, an Artificial Neural Network, and an Ensemble algorithm
(k-means clustering and recurrent neural networks). The support vector machine
algorithm gave the best results in price prediction. Machine learning models such as
support vector machine, Random Forest, Long Short Term Memory, Quadratic
Discriminant Analysis, and XGBoost were used for daily Bitcoin price prediction in [11],
where the support vector machine performed best with 65.5% accuracy. Kurbucz [12] studied the
influence of the most frequent edges of the Bitcoin transaction network on price prediction and showed that
single-hidden-layer feed-forward neural networks (SLFNs) obtained an accuracy of
approximately 60.05%.

The authors of [13] predicted Bitcoin price using a recurrent neural network and a linear regression
model. The RNN model obtained a lower mean square error than the
regression model owing to its ability to recognize long-term dependencies. In [14],
linear regression, RNN and Random Forest are compared. The RNN with LSTM
improved the performance of the model for predicting Bitcoin prices, as the data is
highly volatile. The model resulted in an MAE of 0.0043, lower than that of linear
regression and random forest. McNally et al. [15] compared the accuracy of the
Autoregressive Integrated Moving Average (ARIMA), Recurrent Neural Network (RNN)
and Long Short Term Memory (LSTM) models and achieved the highest accuracy of
52% using the Long Short Term Memory network. Kristoufek [16] studied the relation
between the price of Bitcoin and Wikipedia and Google searches. The results also
showed an asymmetry in the effect of heightened interest in the crypto-currency,
depending on whether its price is below or above its trend value. The authors of [17] used the ARIMA model for short-term
forecasts, where ARIMA(4,1,4) predicted the Bitcoin price most accurately. The average
absolute error was 0.87 for the Day One forecast and 5.98 for the Day
Seven forecast. Consequently, ARIMA produced better outcomes over short horizons.


3. METHODOLOGY

3.1 Data Collection

The Bitcoin data set comes from the online data science community Kaggle. The dataset
covers October 8, 2015 to April 10, 2020. Each row includes the Date, Open, High,
Low and Close prices of BTC, together with Volume BTC and Volume USD.

Features    Explanation
Date        Date of the Bitcoin price record
Open        Opening price
Close       Closing price on that day
High        High price
Low         Low price
Volume      Volume from top exchange
Table 1 Features of dataset
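As an illustration of how rows with the features of Table 1 might be read into memory, the following is a minimal sketch in Python. The CSV layout, the column names and the three sample rows are assumptions made for this example; the values are invented and are not taken from the Kaggle dataset.

```python
import csv
import io
from datetime import datetime

# Hypothetical CSV excerpt: column names follow Table 1, values are made up
# for illustration and are NOT taken from the Kaggle dataset.
SAMPLE_CSV = """Date,Open,High,Low,Close,Volume
2015-10-09,243.0,245.5,241.0,244.0,1200.5
2015-10-08,242.5,244.0,240.0,243.0,1100.0
2015-10-10,244.0,246.0,243.5,245.0,900.25
"""

def load_rows(text):
    """Parse CSV text into dicts with typed fields, sorted by date."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        row = {"Date": datetime.strptime(raw["Date"], "%Y-%m-%d").date()}
        for name in ("Open", "High", "Low", "Close", "Volume"):
            row[name] = float(raw[name])
        rows.append(row)
    # Time-series models need the observations in chronological order.
    rows.sort(key=lambda r: r["Date"])
    return rows

rows = load_rows(SAMPLE_CSV)
closing_prices = [r["Close"] for r in rows]
```

Sorting by date first matters because every model in this study treats the closing price as an ordered sequence.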

3.2 Feature Engineering and Evaluation

Feature Engineering refers to the extraction of useful information from data to
improve the predictions of machine learning algorithms. It is an important element of
the data mining methodology and is needed for satisfactory results in later tasks. The
distribution of closing prices is shown in Figure 2.

Figure 2:Bitcoin Closing Price Distribution


3.3 Modelling

Data Mining is about extracting useful information from data. The technical basis
for this extraction is given by machine learning. Machine Learning (ML) is a
subset of Artificial Intelligence (AI) in which the machine learns automatically from
previous experience without being explicitly programmed. A data set is composed of
instances, each consisting of one or several attributes. Machine learning is generally divided into
two kinds: supervised and unsupervised machine learning. Supervised
learning models a dataset with labels. Each instance can be represented as a pair (x, y),
where x is a set of independent variables and y is a set of dependent
attributes. The target variable may be discrete or continuous. If the target attribute is
continuous, a regression model is used; if it is discrete, a classification
model is applied. Examples of supervised learning are support vector machines and
neural networks. In unsupervised learning, the model learns from observations
and discovers structures within the data set. It is used for modeling data sets where
the target attribute is unknown. The objective of this research is to predict Bitcoin prices.
Since the target is known, supervised machine learning is used. The
algorithms used are Random Forest, Linear Regression, SVM, ARIMA, LSTM and
RNN.

3.3.1. Linear Regression

Linear regression models the relation between dependent and independent variables.
The equation of the line fitted to the data points is as follows:

Y=a+bX (1)

'X' denotes the independent variable, 'Y' denotes the dependent variable, 'b' denotes
the slope, and 'a' denotes the intercept.
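To make Equation (1) concrete, the sketch below estimates the intercept a and slope b by ordinary least squares in plain Python. The function name `fit_line` and the toy price series are assumptions introduced for illustration only; the paper does not state which implementation was used.

```python
def fit_line(xs, ys):
    """Ordinary least squares estimate of a (intercept) and b (slope) in Y = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of X and Y divided by the variance of X.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    a = mean_y - b * mean_x
    return a, b

# Toy data (made up): day index as X, price as Y.
days = [0, 1, 2, 3, 4]
prices = [100.0, 102.0, 104.0, 106.0, 108.0]
a, b = fit_line(days, prices)
next_day_forecast = a + b * 5
```

With this perfectly linear toy series the fitted line reproduces the data exactly, so the forecast for day 5 simply extends the trend.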

3.3.2. Random Forest

Random Forest is a popular algorithm for regression and classification problems. It merges
multiple decision trees for better outcomes. Decision trees address a number of
classification problems. A decision tree is a tree structure in which the feature space is
partitioned recursively. Recursion terminates when further partitioning adds no value to
the forecast or when each node contains samples of a single class. In a decision
tree, every sample of the dataset ends up assigned to a leaf node.
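The recursive partitioning described above can be sketched as a small regression tree on a single feature, with several trees trained on bootstrap samples and averaged. This is a simplified illustration under stated assumptions: a full random forest also samples features at each split, and all function names, the `min_leaf` parameter and the toy data are hypothetical.

```python
import random

def sse(points):
    """Sum of squared errors of the targets around their mean."""
    ys = [y for _, y in points]
    mean_y = sum(ys) / len(ys)
    return sum((y - mean_y) ** 2 for y in ys)

def build_tree(points, min_leaf=2):
    """Recursively partition (x, y) points; a leaf is the mean target value."""
    ys = [y for _, y in points]
    mean_y = sum(ys) / len(ys)
    # Stop when the node is too small or all targets are identical.
    if len(points) < 2 * min_leaf or max(ys) == min(ys):
        return mean_y
    points = sorted(points)
    best = None
    for i in range(min_leaf, len(points) - min_leaf + 1):
        if points[i - 1][0] == points[i][0]:
            continue  # cannot split between equal feature values
        cost = sse(points[:i]) + sse(points[i:])
        if best is None or cost < best[0]:
            best = (cost, i)
    # Stop when no split improves on the unsplit node.
    if best is None or best[0] >= sse(points):
        return mean_y
    i = best[1]
    return {
        "threshold": (points[i - 1][0] + points[i][0]) / 2,
        "left": build_tree(points[:i], min_leaf),
        "right": build_tree(points[i:], min_leaf),
    }

def predict_tree(node, x):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(node, dict):
        node = node["left"] if x < node["threshold"] else node["right"]
    return node

def build_forest(points, n_trees=10, seed=0):
    """Train each tree on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    return [build_tree([rng.choice(points) for _ in points]) for _ in range(n_trees)]

def predict_forest(forest, x):
    """Average the predictions of all trees."""
    return sum(predict_tree(tree, x) for tree in forest) / len(forest)

# Toy data (made up): the target jumps from 10 to 20 at x = 5.
data = [(float(x), 10.0 if x < 5 else 20.0) for x in range(10)]
tree = build_tree(data)
forest = build_forest(data)
```

Averaging over bootstrap-trained trees is what distinguishes the forest from a single tree: each tree sees a slightly different sample, so individual errors partly cancel.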


3.3.3. Support Vector Machine (SVM)

The SVM model is used in binary classification problems [11]. The idea is to find a
hyperplane with the maximum margin between the two classes of data samples.
Assume a binary classification problem with m data points in the
real n-dimensional space $R^n$, represented by the $m \times n$ matrix A ($x_i$ is the i-th row of A).
The class of $x_i$ is denoted by $y_i \in \{1, -1\}$. The linear SVM searches for the best
decision hyperplane, which is set out below:

$$x \cdot w + a = 0 \tag{2}$$

where $w \in R^n$ is the weight vector normal to the
hyperplane and $a \in R$ is the bias. The best hyperplane is obtained by minimizing $\|w\|$
subject to $y_i (x_i \cdot w + a) \geq 1$ for every data point.
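The role of the hyperplane can be illustrated with a short sketch. It writes the weight vector as w and the bias as a, classifies a point by the sign of x·w + a, and computes the margin width 2/||w||, which is why a smaller norm of the weight vector means a wider margin. The points, labels and candidate weights are made up for this example, and no optimization is performed; the sketch only compares two hand-picked hyperplanes.

```python
import math

def decision_value(x, w, a):
    """Signed score x.w + a; zero on the hyperplane itself."""
    return sum(xi * wi for xi, wi in zip(x, w)) + a

def classify(x, w, a):
    """Assign class +1 or -1 according to the side of the hyperplane."""
    return 1 if decision_value(x, w, a) >= 0 else -1

def margin_width(w):
    """Distance between the two margin hyperplanes x.w + a = +1 and -1."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

def satisfies_margin(points, labels, w, a):
    """Check the constraint y_i * (x_i.w + a) >= 1 for every sample."""
    return all(y * decision_value(x, w, a) >= 1 for x, y in zip(points, labels))

# Toy 2-D data (made up).
points = [(2.0, 2.0), (3.0, 3.0), (0.0, 0.0), (-1.0, 0.0)]
labels = [1, 1, -1, -1]

# Two hyperplanes with the same direction; both separate the data.
wide = ((0.5, 0.5), -1.0)
narrow = ((1.0, 1.0), -2.0)
```

Both candidates satisfy the margin constraint, but the one with the smaller weight norm yields the wider margin, which is the hyperplane the SVM prefers.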

3.3.4 Recurrent Neural Network (RNN)

J. L. Elman developed the Recurrent Neural Network (RNN). RNNs are
neural networks in which signals can move both backward and forward in a repeated
manner. These networks contain a context layer. The output of every layer is transferred
to the context layer and then fed to the adjacent layer as input. At every time step,
the state is rewritten.
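The context-layer mechanism can be sketched with a single hidden unit: at each time step the new state is computed from the current input and the previous state, and then overwrites the context. The scalar weights `w_in`, `w_ctx` and `bias` are arbitrary illustrative values, not parameters from the paper, and no training is shown.

```python
import math

def elman_step(x_t, context, w_in, w_ctx, bias):
    """One recurrent step: h_t = tanh(w_in * x_t + w_ctx * h_{t-1} + bias)."""
    return math.tanh(w_in * x_t + w_ctx * context + bias)

def run_sequence(xs, w_in=0.5, w_ctx=0.8, bias=0.0):
    """Feed a sequence through the unit, rewriting the context at every step."""
    context = 0.0  # initial state of the context layer
    states = []
    for x_t in xs:
        context = elman_step(x_t, context, w_in, w_ctx, bias)
        states.append(context)
    return states

# A single impulse followed by zeros: later states are non-zero only
# because the context layer carries information forward.
states = run_sequence([1.0, 0.0, 0.0])
```

The impulse example shows the memory effect: the input is zero from the second step onward, yet the state decays gradually instead of dropping to zero.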

Figure 3: Recurrent Neural Network


3.3.6 Long Short Term Memory (LSTM)

The Long Short Term Memory (LSTM) model addresses the vanishing gradient problem
of the RNN. An LSTM cell contains three gates: the output gate, the forget gate, and the input
gate [11]. For sequence data, LSTM performs better at extracting long-term
dependencies and at representing both future and historic information.

Figure 4 Structure of LSTM

LSTM is expressed as follows:

$$X = \begin{bmatrix} x_t \\ h_{t-1} \end{bmatrix} \tag{3}$$

$$f_t = \sigma(W_f \cdot X + b_f) \tag{4}$$

$$i_t = \sigma(W_i \cdot X + b_i) \tag{5}$$

$$o_t = \sigma(W_o \cdot X + b_o) \tag{6}$$

$$\hat{C}_t = \tanh(W_c \cdot X + b_c) \tag{7}$$

$$C_t = f_t \odot C_{t-1} + i_t \odot \hat{C}_t \tag{8}$$

$$h_t = o_t \odot \tanh(C_t) \tag{9}$$

Here $x_t$ is the input at time t; $h_t$ is the hidden state at time t; $f_t$, $i_t$ and $o_t$ are the
forget, input and output gates; $C_t$ is the cell state and $\hat{C}_t$ is the candidate cell state;
$W_o$, $W_c$, $W_f$, $W_i$ are weight matrices; $b_c$, $b_i$, $b_f$ and $b_o$ are the bias terms of the LSTM.
$\sigma$ is the sigmoid activation function; $\odot$ is the element-wise multiplication operator.
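Equations (3) to (9) translate almost line by line into code. The sketch below implements one LSTM step for a scalar input and a scalar hidden state, so each weight "matrix" is a pair of numbers applied to X = [x_t, h_{t-1}]. The parameter dictionary and its zero values are assumptions chosen so that the result is easy to check by hand; they are not trained weights.

```python
import math

def sigmoid(z):
    """Logistic activation used by the three gates."""
    return 1.0 / (1.0 + math.exp(-z))

def dot(w, v):
    """Inner product of two equally long sequences."""
    return sum(wi * vi for wi, vi in zip(w, v))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step for scalar input and scalar hidden/cell state."""
    X = [x_t, h_prev]                                        # Eq. (3)
    f_t = sigmoid(dot(params["Wf"], X) + params["bf"])       # Eq. (4) forget gate
    i_t = sigmoid(dot(params["Wi"], X) + params["bi"])       # Eq. (5) input gate
    o_t = sigmoid(dot(params["Wo"], X) + params["bo"])       # Eq. (6) output gate
    c_hat = math.tanh(dot(params["Wc"], X) + params["bc"])   # Eq. (7) candidate
    c_t = f_t * c_prev + i_t * c_hat                         # Eq. (8) cell state
    h_t = o_t * math.tanh(c_t)                               # Eq. (9) hidden state
    return h_t, c_t

# Hypothetical all-zero parameters: every gate evaluates to 0.5 and the
# candidate to 0, so the cell state is simply halved.
zero_params = {
    "Wf": [0.0, 0.0], "bf": 0.0,
    "Wi": [0.0, 0.0], "bi": 0.0,
    "Wo": [0.0, 0.0], "bo": 0.0,
    "Wc": [0.0, 0.0], "bc": 0.0,
}
h_t, c_t = lstm_step(x_t=1.0, h_prev=0.0, c_prev=2.0, params=zero_params)
```

Because the cell state is updated additively in Eq. (8), information can persist across many steps, which is the mechanism behind the improved handling of long-term dependencies.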

3.3.7 Autoregressive Integrated Moving Average (ARIMA)

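The ARIMA model is used to analyze and forecast time series. It combines an
autoregressive part AR(p), differencing of order d (the integrated part, I), and a
moving average part MA(q) into ARIMA(p, d, q).

p, d, q are the three orders of ARIMA: p is the autoregression order, d is the differencing
order and q is the moving average order.

Autoregression (AR) captures the correlation between the current value and the
previous values. The ARIMA(p,0,0) model is:

$$X_t = \mu + \phi_1 X_{t-1} + \phi_2 X_{t-2} + \dots + \phi_p X_{t-p} + e_t \tag{10}$$

Integration (I) differences the time series in order to make it stationary.

The moving average (MA) part expresses the current value in terms of the previous error values.
The ARIMA(0,0,q) model is:

$$X_t = \mu + e_t - \theta_1 e_{t-1} - \theta_2 e_{t-2} - \dots - \theta_q e_{t-q} \tag{11}$$

The ARIMA(p,d,q) model applies both parts to the series differenced d times,
$W_t = (1 - B)^d X_t$, where B is the backshift operator ($B X_t = X_{t-1}$):

$$W_t = \mu + \phi_1 W_{t-1} + \phi_2 W_{t-2} + \dots + \phi_p W_{t-p} + e_t - \theta_1 e_{t-1} - \theta_2 e_{t-2} - \dots - \theta_q e_{t-q} \tag{12}$$

Here $\phi_1, \dots, \phi_p$ are the autoregressive coefficients, $\theta_1, \dots, \theta_q$ are the moving
average coefficients, $\mu$ is a constant and $e_t$ is the error term at time t.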
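The interplay of differencing and autoregression can be sketched for the simplest non-trivial case, ARIMA(1,1,0): difference the series once, fit an AR(1) model to the differences by least squares, forecast the next difference, and add it back to the last observed level. This is an illustrative simplification, not the estimation procedure used in the paper; the function names and the toy series are assumptions, and the fit assumes the differences are not all identical.

```python
def difference(series, d=1):
    """Apply first differencing d times (the 'I' part of ARIMA)."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series

def fit_ar1(w):
    """Least squares fit of w_t = mu + phi * w_{t-1} + e_t."""
    xs, ys = w[:-1], w[1:]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    phi = sxy / sxx
    mu = mean_y - phi * mean_x
    return mu, phi

def forecast_next(series):
    """One-step ARIMA(1,1,0) forecast: predict the next difference, then undo it."""
    w = difference(series, 1)
    mu, phi = fit_ar1(w)
    next_diff = mu + phi * w[-1]
    return series[-1] + next_diff

# Toy series (made up) whose differences 1, 0.5, 0.25, 0.125 halve each step.
toy_series = [10.0, 11.0, 11.5, 11.75, 11.875]
prediction = forecast_next(toy_series)
```

In the toy series the fitted coefficient is 0.5 with zero constant, so the next difference is 0.0625 and the forecast is the last value plus that amount.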


4. IMPLEMENTATION

The dataset was divided into 80% for training and 20% for testing, and the Bitcoin price was
predicted for the next thirty days. The actual and predicted prices are shown in
Figures 5 to 10:
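An 80/20 split of a price series can be sketched as follows. The paper does not say whether the split was chronological or random; this example assumes a chronological split, which is the usual choice for time series because it keeps future observations out of the training set. The function name and the toy series are hypothetical.

```python
def chronological_split(series, train_fraction=0.8):
    """Split an ordered series into an earlier training part and a later test part."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]

# Toy series of ten daily prices (made up).
daily_prices = [100.0, 101.0, 103.0, 102.0, 105.0, 107.0, 106.0, 108.0, 110.0, 111.0]
train, test = chronological_split(daily_prices)
```

Every test observation comes after every training observation, so the evaluation mimics forecasting genuinely unseen future prices.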

Figure 5:Actual and Predicted Price (Linear Regression)

Figure 6:Actual and Predicted Price (Random Forest)


Figure 7 Actual and Predicted Price (SVM)

Figure 8 Actual and Predicted Price (RNN)


Figure 9 Actual and Predicted Price (LSTM)

Figure 10 Actual and Predicted Price (ARIMA)


5. EVALUATION

To evaluate the performance of the models, the following metrics are used:

• Mean Absolute Percentage Error (MAPE)

MAPE is the mean of the absolute differences between actual and predicted values,
each expressed as a percentage of the actual value. It measures the accuracy of a forecasting system.

$$\mathrm{MAPE} = \frac{100}{n} \sum_{i=1}^{n} \left| \frac{actual_i - predicted_i}{actual_i} \right|$$

• Root Mean Square Percentage Error (RMSPE)

RMSPE is the square root of the mean squared percentage error with respect to the actual values.

RMSPE = np.sqrt(np.mean(np.square((actual - predicted) / actual))) * 100
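Both metrics can be written in a few lines of plain Python. The sketch below follows the definitions given above; the function names and the toy values are assumptions for illustration, and both functions assume that no actual value is zero.

```python
import math

def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)

def rmspe(actual, predicted):
    """Root Mean Square Percentage Error, in percent."""
    squared = [((a - p) / a) ** 2 for a, p in zip(actual, predicted)]
    return 100.0 * math.sqrt(sum(squared) / len(squared))

# Toy values (made up): both forecasts are off by exactly 10%.
actual_even = [100.0, 200.0]
predicted_even = [110.0, 180.0]

# Toy values (made up): errors of 10% and 30%.
actual_uneven = [100.0, 100.0]
predicted_uneven = [110.0, 130.0]
```

When all percentage errors are equal the two metrics coincide; when they differ, RMSPE is larger because squaring gives more weight to the bigger errors.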

6. RESULTS

The purpose of this research is to evaluate different machine learning models for
Bitcoin price prediction. Table 2 shows the results of running the Linear Regression,
Random Forest, Support Vector Machine, ARIMA, LSTM and RNN models. The
recurrent neural network achieved the best result, with a Mean Absolute Percentage
Error of 0.3174 and an RMSPE of 0.8853.

Model                                       MAPE       RMSPE
Linear Regression                           2.0795     1.8279
Random Forest                               29.6536    39.84291
Support Vector Machine                      15.6801    22.4580
Recurrent Neural Network                    0.3174     0.8853
Long Short Term Memory                      6.7365     13.9277
Autoregressive Integrated Moving Average    2.5931     3.7452

Table 2 Model comparison
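The ranking implied by Table 2 can be reproduced with a short script. The numbers below are copied from the table; the dictionary layout and variable names are assumptions made for this illustration.

```python
# (MAPE, RMSPE) per model, copied from Table 2.
results = {
    "Linear Regression": (2.0795, 1.8279),
    "Random Forest": (29.6536, 39.84291),
    "Support Vector Machine": (15.6801, 22.4580),
    "Recurrent Neural Network": (0.3174, 0.8853),
    "Long Short Term Memory": (6.7365, 13.9277),
    "Autoregressive Integrated Moving Average": (2.5931, 3.7452),
}

# Sort model names by MAPE, lowest (best) first.
ranked_by_mape = sorted(results, key=lambda model: results[model][0])
best_model = ranked_by_mape[0]
worst_model = ranked_by_mape[-1]
```

Sorting by MAPE places the recurrent neural network first and Random Forest last, in line with the comparison reported in this section.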

Figure 11 MAPE and RMSPE (per model: Linear Regression, Random Forest, SVM, RNN, LSTM, ARIMA)

7. CONCLUSION AND FUTURE WORK

The crypto-currency market is booming and has drawn attention from entrepreneurs
and investors in recent years. By providing a comparative study and conclusions based
on Bitcoin price data, this work helps in understanding this difficult and rapidly
expanding market. In conclusion, this study focuses on the use of machine learning
models to forecast the Bitcoin price. Google Colaboratory was used to implement
the models on the Bitcoin dataset. Further work can examine other datasets,
including more features, which could help predict the Bitcoin price more precisely and reliably.


References

[1] S. Nakamoto, "Bitcoin: A peer to peer electronic cash system", 2008

[2] J.W. Goodell, "COVID-19 and finance: Agendas for future research", Finance
Research Letters 35 (2020), 101512

[3] Ender Demir, "The relationship between cryptocurrencies and COVID 19
pandemic", Eurasian Economic Review 10 (2020), pp. 349-360

[4] Conlon & McGee, R.J., "Safe haven or risky hazard? Bitcoin during the
COVID-19 bear market", Finance Research Letters 35 (2020), 101607

[5] Palmer et al., "Stock Market Prediction Using Machine Learning", First
International Conference on Secure Cyber Computing and Communication
(ICSCCC), 2018, pp. 574-576

[6] Z. Wang, S. Ho and Z. Lin, "Stock Market Prediction Analysis by Incorporating
Social and News Opinion and Sentiment", IEEE International Conference on Data
Mining Workshops (ICDMW), 2018, pp. 1375-1380

[7] A. Mittal, V. Dhiman, A. Singh and C. Prakash, "Short-Term Bitcoin Price
Fluctuation Prediction Using Social Media and Web Search Data", Twelfth
International Conference on Contemporary Computing (IC3), 2019, pp. 1-6

[8] G.H. Chen, S. Nikolov and D. Shah, "A latent source model for non-parametric time
series classification", Advances in Neural Information Processing Systems, 2013, pp.
1088-1096

[9] D. Shah and K. Zhang, "Bayesian Regression and Bitcoin", 52nd Annual Allerton
Conference on Communication, Control and Computing (Allerton), IEEE,
2014, pp. 409-414

[10] D.C.A. Mallqui and R.A.S. Fernandes, "Predicting the direction, maximum,
minimum and closing prices of daily Bitcoin exchange rate using machine learning
techniques", Applied Soft Computing Journal (2018),
https://doi.org/10.1016/j.asoc.2018.11.038

[11] Zheshi Chen, Chunhong Li, Wenjun Sun, "Bitcoin price prediction using
machine learning: An approach to sample dimension engineering", Journal of
Computational and Applied Mathematics 365 (2020) 112395


[12] Marcell Tamas Kurbucz, "Predicting the price of Bitcoin by the most frequent
edges of its transaction network", Economics Letters 184 (2019) 108655

[13] H. Kavitha, U.K. Sinha and S.S. Jain, "Performance Evaluation Of Machine
learning Algorithms for Bitcoin Price Prediction", Fourth International Conference on
Inventive Systems and Control (ICISC), 2020, pp. 110-114

[14] S. Tandon, S. Tripathi, P. Saraswat and C. Dabas, "Bitcoin Price Forecasting
using LSTM and 10 Fold Cross Validation", International Conference on Signal
Processing and Communication (ICSC), 2019, pp. 323-328

[15] S. McNally, J. Roche and S. Caton, "Predicting the price of Bitcoin Using
Machine Learning", 26th Euromicro International Conference on Parallel, Distributed
and Network-based Processing (PDP), 2018, pp. 339-343

[16] L. Kristoufek, "Bitcoin meets google trends and Wikipedia: quantifying the
relationship between the phenomena of the internet era", Sci. Rep. 3 (2013), 3415

[17] I.M. Wirawan, T. Widiyaningtyas and M.M. Hasan, "Short Term Prediction on
Bitcoin Price using ARIMA method", International Seminar on Application for
Technology of Information and Communication, 2019, pp. 260-265
