Rathan 2019


Proceedings of the Third International Conference on Trends in Electronics and Informatics (ICOEI 2019)

IEEE Xplore Part Number: CFP19J32-ART; ISBN: 978-1-5386-9439-8

Crypto-Currency price prediction using Decision Tree and Regression techniques
Karunya Rathan1, Somarouthu Venkat Sai2, Tubati Sai Manikanta3
1 Dept. of Information Technology, Sathyabama Institute of Science and Technology, Chennai, Tamil Nadu
2 Dept. of CSE (3511537), Sathyabama Institute of Science and Technology, Chennai, Tamil Nadu
3 Dept. of CSE (3511593), Sathyabama Institute of Science and Technology, Chennai, Tamil Nadu

Abstract - Crypto-currencies such as Bitcoin have become increasingly popular among investors. This work studies how to forecast the Bitcoin price precisely, considering the different parameters that influence it. The study first identifies the day-to-day trend in the Bitcoin price, which gives insight into Bitcoin price trends. The dataset, taken up to the current date, contains the open, high, low and close prices of Bitcoin. Using this dataset, a machine learning module is introduced to predict price values. The aim of this work is to measure the accuracy of Bitcoin price prediction using different machine learning algorithms and to compare their accuracy. Experimental results are compared for the decision tree and regression models.

Key Words: Bitcoin, Crypto-currency, Decision tree, Regression model, machine learning.

1. INTRODUCTION

Bitcoin is a crypto-currency used worldwide for digital investment. Bitcoin is decentralized, i.e. it is not owned by anybody. Transactions made in Bitcoin are simple because they are not tied to any country. Investment is possible through different marketplaces known as bitcoin exchanges, which enable individuals to trade Bitcoin using various currencies. Mt. Gox was once the largest Bitcoin exchange, where bitcoin was stored as in a virtual bank. The record of all transactions, together with their timestamp information, is called the Blockchain. Each record in the blockchain is cryptographically secured. The names of the clients behind transactions are kept private; only the wallet ID is made public.

Bitcoin is similar to a stock, but behaves differently. Various machine learning algorithms are used for stock price prediction, but the features influencing Bitcoin are unique. For investors, it is essential to predict Bitcoin prices. Unlike the stock market, the Bitcoin price is not affected by business or government announcements. Thus, we exploit machine learning techniques to predict the price of Bitcoin.

Bitcoin is a successful crypto-currency introduced into the financial market on the basis of its unique protocol and Nakamoto's design. Bitcoin tries to achieve decentralization in the currency market. Investors in the Bitcoin market establish trust relationships through the Blockchain, which is built using cryptographic techniques. Bitcoin is nowadays gaining more interest due to innovations in Blockchain and machine learning.

High volatility significantly affects trading strategies in the crypto-currency market. Bitcoin, the major crypto-currency, still faces the volatility problem. Thus, significant research effort is needed to find an efficient way of predicting crypto-currency prices through machine learning techniques.

Volatility, as a measure of price fluctuations, significantly affects trading strategies and investment decisions as well as option pricing and measures of systemic risk. Bitcoin, as a pioneer of blockchain-based money, plays a dominant role in the total crypto-currency market capitalization. Consequently, there is growing interest in the data mining and machine learning community in predicting Bitcoin price fluctuations.

While senders of conventional electronic payments are generally identified, bitcoin investors operate anonymously. There is no central authority for bitcoin, and investors do not have to identify themselves when sending bitcoin to another user; they are identified only by their wallet address. Exchanges, however, must verify the identity of the clients they handle and are not permitted to execute buy or sell trades without doing so. Bitcoin is more secure for investors.

The objective of the proposed study is the price prediction of bitcoin by feature selection with different machine learning techniques. Intuitively, the idea is to first transform order book data into features over time, referred to as feature series, and then to develop prediction models that consume the volatility and feature series simultaneously.

Section 2 discusses price prediction and related work by different authors. Section 3 describes the proposed methodology and gives brief details of the implementation. Section 4 discusses the results and analysis. Section 5 concludes the work with options for future enhancement.

2. RELATED WORK

Researchers have studied many aspects of the crypto-currency market, such as price fluctuations and social media sentiment.

Fluctuations in crypto-currency prices have attracted many researchers. Tian et al. [1] discussed fluctuations in bitcoin prices in terms of executed orders such as buy or sell. They used regression techniques and moving average values, and derived a time series model that also applies a Gaussian time model to predict bitcoin values. Whereas they showed that their model is efficient on time series data, our proposed model considers a dataset spanning several years and is evaluated on the close price of bitcoin.

Connor et al. [2] studied the bitcoin price through the sentiment expressed by users in news columns and social media. Apart from bitcoin, they considered two more crypto-currencies in their prediction study. They applied feature selection and classification algorithms to the collected dataset, with tokens weighted by positive and negative values. They used three
models, namely Naïve Bayes, regression models and SVM (Support Vector Machine). For bitcoin, their experiments showed that the regression model outperforms the others.

Young Bin Kim et al. [3] studied a price fluctuation model for crypto-currencies using comments posted by users. Apart from bitcoin, they considered Litecoin and Ripple, the next two major currencies. They used crawled user data and classified sentiment into five types: very positive, positive, neutral, negative and very negative.

Huisu Jang and Jaewook Lee [4] discussed Bayesian neural networks that take blockchain information into account. They applied machine learning with a linear regression model. Their study models the bitcoin price and its fluctuations.

Anshul et al. [5] used LSTM for bitcoin price forecasting. LSTM, a recurrent neural network algorithm, allows bitcoin prices to be trained efficiently as time series data. In their study, although the LSTM model takes longer to compile than the existing ARIMA model, LSTM was found to be more accurate.

Jethin et al. [6] studied Ethereum and bitcoin prices using Google Trends data and Twitter texts. Twitter data was considered the main source for arriving at price prediction decisions. It is necessary to understand the effect of tweets on the price forecast, which can give bitcoin traders a quick assessment of whether to buy or sell. The authors collected the bitcoin dataset from Twitter using the hashtag #btc and the Ethereum dataset using the hashtag #eth. They also extracted search volume index data from Google Trends. They pre-processed the data and fed it to a linear regression model for bitcoin price prediction.

The authors of [7] proposed a model to identify whether the Bitcoin price depends on the volume of tweets or posts by various authors on social media. They used Google trend data and tweets with positive sentiment, since the study concerns price increases. They collected around two months of tweets, about one million in total, along with their timeline. The tweet timeline was compared with the bitcoin price over the same period, which yielded the correlations reported in that work.

The work in [8] determines sentiment using multiple machine learning techniques. The authors collected a Twitter dataset and applied several pre-processing stages, such as removing URLs, correcting spelling and replacing emoticons with their corresponding polarity values. They used classification techniques such as the Naive Bayes algorithm and Support Vector Machine. Using the NLTK package, they computed the polarity of tweets, from which they derived positive and negative sentiment.

The author of [9] studied market anomalies. He observed that market activity varies throughout the week, over the course of the year and over time. Interestingly, demand and supply increase or decrease such that prices remain fairly constant over time. He observed that prices are generally lower on Sundays, so that prospective buyers should shift their demand to this day of the week.

In [10], the authors considered the daily price trend of crypto-currency, particularly in the bitcoin market, taking various features into account. They applied more than three normalization techniques to a dataset collected from quandl.com. They then addressed the feature selection problem, in which five features are fed into machine learning algorithms such as Bayesian regression and random forest.

3. PROPOSED WORK

We collected the dataset for this work, with the following details, from quandl.com and applied machine learning algorithms such as decision tree and regression for prediction and price forecasting.

DATASET DETAILS

As Bitcoin is traded much like a stock in a stock market, datasets are plentiful at all time intervals. Live data from 2011 to date was collected from quandl.com, which provided the most comprehensive date-wise bitcoin price data. The dataset was extracted to a CSV file.

Many authorized websites are available for collecting bitcoin datasets for study. CoinMarketCap is another such website; it reports bitcoin trading activity across the 24 hours of the day, with data fed from the various exchanges handling crypto-currency.

Quandl.com hosts finance and economics datasets from five hundred publishers. Data published on Quandl.com can be used from different development platforms and analysis tools. In this work, we collected the Quandl.com dataset named “BITSTAMPUSD”.

The data was collected with the following features and stored as data.csv: “Time_stamp, Open, High, Low, Close, Volume_btc, Volume_currency, Weighted_price”. Figure 1 shows a sample of the extracted dataset in the comma-separated file.

Figure 1: Data set Sample View

The dataset variable names are described below.

Variable name        Attribute description
Date                 Trading date
Open                 Bitcoin open price for the particular time
High                 Bitcoin high price achieved for the particular time
Low                  Bitcoin low price achieved for the particular time
Close                Bitcoin close price for the particular time
Volume (BTC)         Coin volume traded
Volume (Currency)    Coin value traded
Weighted price       Price per coin traded
Table 1: Data set variable description
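As a minimal sketch of how a file with the columns listed above could be read, the snippet below parses a CSV with Python's standard `csv` module. The paper reports using pandas; the standard library is used here only to keep the example self-contained. The helper name `load_rows` and the two sample rows are hypothetical placeholders, not values from the actual BITSTAMPUSD dataset.

```python
import csv
import io

# Column names as listed in the paper for data.csv.
COLUMNS = ["Time_stamp", "Open", "High", "Low", "Close",
           "Volume_btc", "Volume_currency", "Weighted_price"]

# Two made-up rows standing in for the real file; the numbers are placeholders.
SAMPLE_CSV = """Time_stamp,Open,High,Low,Close,Volume_btc,Volume_currency,Weighted_price
2019-01-01,100.0,110.0,95.0,105.0,10.0,1030.0,103.0
2019-01-02,105.0,120.0,101.0,118.0,20.0,2240.0,112.0
"""


def load_rows(handle):
    """Parse the CSV into a list of dicts, casting the numeric fields to float."""
    reader = csv.DictReader(handle)
    missing = set(COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    rows = []
    for record in reader:
        row = {"Time_stamp": record["Time_stamp"]}
        for name in COLUMNS[1:]:
            row[name] = float(record[name])
        rows.append(row)
    return rows


rows = load_rows(io.StringIO(SAMPLE_CSV))
print(rows[0]["Close"], rows[1]["Time_stamp"])
```

For a file on disk, the same helper could be called as `load_rows(open("data.csv", newline=""))`, assuming the file carries the header row shown above.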
The dataset collected from 2011 to date is plotted in Figure 2, which clearly shows that bitcoin is a positive market.

Figure 2: Data Visualization of Bitcoin price

The architecture in Figure 3 shows the overall flow of the proposed work, which has the following advantages:

o More than one machine learning algorithm is implemented to predict the value.
o Accuracy is compared to show the best algorithm for prediction.

The execution is carried out in the following steps:

a. Dataset collection from quandl.com
b. Dataset pre-processing
c. Split dataset into train and test sets
d. Apply machine learning: decision tree and linear regression
e. Train the model
f. Give test set and predict values
g. Price forecast for five days

The Bitcoin dataset, collected live and stored as the file bitcoin.csv, is split into a train set and a test set. We used 80% of the data as training input for our machine learning models. The remaining 20% is used as the test set for prediction.

We used Lasso and regression to predict the price trend for the 20% test input; the predicted values were plotted and compared for accuracy.

Figure 3: Overall architecture of crypto-currency price prediction

The machine learning models used here, decision tree and linear regression, are briefly explained below.

a. Decision Tree

A decision tree is one of the learning models generally used for classification. In this approach, the dataset is split into two or more sets. Internal nodes of the decision tree represent a test on a feature, branches represent the outcome of the test, and leaves are the decisions made after subsequent processing.

A decision tree works as follows:

o Place the best feature of the dataset at the root of the tree.
o Split the dataset into subsets, made so that every subset contains data with the same value of a feature.
o Repeat the above steps on every subset until a leaf is reached in the tree.

The prediction of a class label for a record starts at the root of the tree. The value of the root attribute is compared with the record's attribute, and the comparison determines the branch to follow to the next node. The decision tree fitted to our dataset is depicted in Figure 4; its input columns are the Open, High, Low and Close values of the dataset. This is a sample decision taken by our model. The figure shows three layers of the tree for our dataset.
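The 80/20 split and the tree-building steps described in this section can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' scikit-learn code: the split is taken chronologically (the paper does not say whether it was shuffled), the tree is a regression tree that chooses splits by squared error rather than a classifier, the depth limit of 3 merely echoes the three layers shown in Figure 4, and all function names and the toy data are hypothetical.

```python
def split_train_test(rows, train_fraction=0.8):
    """Keep the first 80% of rows for training and the rest for testing."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def mean(values):
    return sum(values) / len(values)


def sse(values):
    """Sum of squared errors of the values around their mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(X, y):
    """Return the (feature index, threshold) that most reduces squared error, or None."""
    best, best_cost = None, sse(y)
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            if not left or not right:
                continue
            cost = sse(left) + sse(right)
            if cost < best_cost:
                best, best_cost = (j, t), cost
    return best


def build_tree(X, y, depth=0, max_depth=3):
    """Recursively split until no split helps or the depth limit is reached."""
    split = best_split(X, y) if depth < max_depth else None
    if split is None:
        return {"leaf": mean(y)}
    j, t = split
    left = [(row, yi) for row, yi in zip(X, y) if row[j] <= t]
    right = [(row, yi) for row, yi in zip(X, y) if row[j] > t]
    return {
        "feature": j,
        "threshold": t,
        "left": build_tree([r for r, _ in left], [v for _, v in left], depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [v for _, v in right], depth + 1, max_depth),
    }


def predict(tree, row):
    """Walk from the root to a leaf, comparing one feature at each internal node."""
    while "leaf" not in tree:
        tree = tree["left"] if row[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree["leaf"]


# Toy data: three features per row (standing in for Open, High, Low) and a target.
data = [([float(i), float(i + 2), float(i - 1)], 10.0 if i < 5 else 20.0) for i in range(10)]
train, test = split_train_test(data)
tree = build_tree([x for x, _ in train], [t for _, t in train])
predictions = [predict(tree, x) for x, _ in test]
print(predictions)
```

On real price data the target column (for example the closing price) and the depth limit would need to be chosen to match the experiment; the paper does not state either.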


Figure 4: Decision Tree model on Bitcoin dataset

b. Linear Regression

In simple regression, scores on one attribute are predicted from the scores on a second attribute. The attribute being predicted is known as the model variable and is denoted Y. The attribute on which the predictions are based is known as the prediction attribute and is denoted X. When there is only one prediction attribute, the method is called simple linear regression. In this model, the predictions of Y plotted as a function of X form a straight line.

The proposed work is implemented in Python 3.6.4 with the libraries scikit-learn, pandas, matplotlib and other required libraries. We downloaded the dataset from quandl.com using the necessary authentication keys. The downloaded data is up to date. 80% of the dataset is used as the train set and 20% as the test set. Machine learning algorithms, namely decision tree and regression, are applied. A five-day price forecast is made using decision tree and regression, and the values are compared.

Regression is applied to the pre-processed dataset, and the predicted price for the given test dataset is plotted against the id. Figure 5 shows the predicted versus the original value of the bitcoin price for the test set.

Figure 5: Predicted price Vs Original price

4. RESULTS AND DISCUSSIONS

The results show that bitcoin price prediction is efficient using the linear regression algorithm. Linear regression achieves about 97.6% accuracy in price prediction, whereas the decision tree achieves about 95.9%. Our proposed work outperforms the accuracy of existing works.

The following table shows the accuracy obtained in our experimental study.

Algorithm        Accuracy (%)
Decision Tree    95.88013
Regression       97.59812
Table 2: Experimental Analysis

Figure 6 shows the accuracy comparison for our proposed work, in which the regression method outperforms the decision tree.

Figure 6: Accuracy Comparison

The price forecast is made for 5 days using machine learning techniques such as decision tree and regression. The result is compared with the score value to identify the accuracy, and plotted. The price forecast predicted by our proposed method is shown in Figure 7.

Figure 7: Price Forecast for five days
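The single-predictor linear regression described in Section 3(b), together with one possible reading of the accuracy score, can be sketched as below. The paper does not define how accuracy was computed; if it is the value returned by the `score` method of a scikit-learn regressor, it would be the coefficient of determination R², and that is an assumption here. The function names `fit_line` and `r_squared` and the toy numbers are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 minus residual over total sum of squares."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean_y) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot


# Toy data lying exactly on the line y = 1 + 2x, so the fit should recover it.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]
slope, intercept = fit_line(x, y)
predicted = [intercept + slope * xi for xi in x]
score = r_squared(y, predicted)
print(slope, intercept, score)
```

Multiplying such a score by 100 would give a percentage comparable in form to the figures in Table 2, though whether the reported values were derived this way is not stated in the paper.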

5. CONCLUSIONS
Bitcoin is a booming crypto-currency market, and considerable research has been carried out in the fields of economics and price prediction. In the proposed work, a Bitcoin price dataset from 2011 to date is considered, and machine learning models, namely decision tree and linear regression, are applied to it. A five-day price forecast is also made using both models. The proposed method suggests the best algorithm to adopt for the crypto-currency prediction problem. The experimental results show that linear regression outperforms the decision tree, with higher accuracy in price prediction.

REFERENCES

[1] Tian Guo, Nino Antulov-Fantulin, "Predicting short-term Bitcoin price fluctuations from buy and sell orders", Machine Learning, arXiv:1802.04065, Feb 2018.
[2] Connor Lamon, Eric Nielsen, Eric Redondo, "Cryptocurrency Price Prediction Using News and Social Media Sentiment", Semantic Scholar, 2017.
[3] Young Bin Kim, Jun Gi Kim, Wook Kim, Jae Ho Im, Tae Hyeong Kim, Shin Jin Kang, Chang Hun Kim, "Predicting Fluctuations in Cryptocurrency Transactions Based on User Comments and Replies", Interdisciplinary Program in Visual Information Processing, Korea University, Seoul, Korea, Aug 2016.
[4] Huisu Jang and Jaewook Lee, "An Empirical Study on Modeling and Prediction of Bitcoin Prices With Bayesian Neural Networks Based on Blockchain Information", IEEE Access, December 4, 2017.
[5] Anshul Saxena, T.R. Sukumar, "Predicting bitcoin price using LSTM and compare its predictability with ARIMA model", International Journal of Pure and Applied Mathematics, Volume 119, No. 17, 2018.
[6] Jethin Abraham, Daniel Higdon, John Nelson, Juan Ibarra, "Cryptocurrency Price Prediction Using Tweet Volumes and Sentiment Analysis", SMU Data Science Review, Vol. 1, No. 3, Article 1.
[7] Matta, Martina & Lunesu, Maria Ilaria & Marchesi, Michele, "Bitcoin Spread Prediction Using Social And Web Search Media", 2015.
[8] A, Vishal & Sonawane, Sheetal, "Sentiment Analysis of Twitter Data: A Survey of Techniques", International Journal of Computer Applications, 2016, 139, 5-15, 10.5120/ijca2016908625.
[9] Nofer, M, "The value of social media for predicting stock returns: Preconditions, instruments and performance analysis", 2015, 10.1007/978-3-658-09508-6.
[10] Siddhi Velankar, Sakshi Valecha, Shreya Maji, "Bitcoin Price Prediction using Machine Learning", International Conference on Advanced Communications Technology (ICACT), 2018.

978-1-5386-9439-8/19/$31.00 ©2019 IEEE 194
