Paper Demo
In recent years, the use of cryptocurrencies has increased significantly with the
emergence of blockchain technology. Despite this growth, cryptocurrencies are not
commonly viewed as a viable investment option due to the unpredictable nature of the market
and its high price volatility. Many existing solutions for forecasting cryptocurrency prices
outlined in the literature may not be suitable for real-time price prediction due to their
deterministic characteristics. To address these challenges, we introduce a stochastic neural
network model for predicting cryptocurrency prices. This model is inspired by random
walk theory, which is widely used in financial markets for modeling stock prices.
Additionally, our approach incorporates a technique to analyze market reactions and learn
patterns, enhancing the accuracy of price predictions. We trained Multi-Layer
Perceptron (MLP) and Long Short-Term Memory (LSTM) models for Bitcoin,
Ethereum, and Litecoin. Our findings demonstrate that the proposed model outperforms
deterministic models in price prediction accuracy.
Introduction
Bitcoin, the most valuable cryptocurrency in the world, is traded on more than 40 exchanges
globally, accepting over 30 different currencies. According to https://www.blockchain.info/,
its current market capitalization stands at 9 billion USD, and it witnesses over 250,000
transactions per day. Being a relatively young currency, Bitcoin offers a unique opportunity
for price prediction due to its inherent volatility, which surpasses that of traditional fiat
currencies. Additionally, Bitcoin distinguishes itself from fiat currencies by its transparent
nature, as complete data on cash transactions or money in circulation is not available for fiat
currencies.
Extensive research has been conducted on predicting mature financial markets like the stock
market. Bitcoin presents an intriguing parallel to this, as it involves time series prediction in a
market that is still in its transitional phase. Traditional time series prediction methods, such as
Holt-Winters exponential smoothing models, rely on linear assumptions and require data that
can be decomposed into trend, seasonal, and noise components to be effective. However,
these methods are not well-suited for the Bitcoin market due to its lack of seasonality and
high volatility. Given the complexity of the task, deep learning is a promising
approach, as it has been successful in similar domains. The recurrent neural network
(RNN) and the long short-term memory (LSTM) models are preferred over the multilayer
perceptron (MLP) due to the temporal nature of Bitcoin data.
The objective of this study is to explore the accuracy of machine learning in predicting the
price of Bitcoin and compare parallelization methods implemented on multi-core and GPU
environments. This research contributes to the existing literature by focusing on machine
learning for prediction, which is a relatively underexplored area in the context of Bitcoin. Out
of approximately 653 papers published on Bitcoin, only 7 (at the time of writing) are
dedicated to machine learning for prediction.
LITERATURE SURVEY
Research on forecasting Bitcoin prices using machine learning algorithms is currently
limited. A latent source model was utilized to predict Bitcoin prices, resulting in an 89%
return over 50 days with a Sharpe ratio of 4.1. Some studies have explored the use of text
data from social media platforms and other sources to predict Bitcoin prices. Sentiment
analysis was conducted using support vector machines, Wikipedia views frequency, and
network hash rate. The relationship between Bitcoin price, tweets, and Google Trends views
for Bitcoin was also investigated. Another study focused on predicting trading volume using
Google Trends views. However, a common limitation in these studies is the small sample size
and the potential for misinformation to influence prices through social media channels like
Twitter or Reddit. Limited liquidity on Bitcoin exchanges also puts the market at a higher risk
of manipulation. For these reasons, social media sentiment is excluded from further analysis.
An analysis of the Bitcoin Blockchain was conducted to forecast Bitcoin prices using support
vector machines (SVM) and artificial neural networks (ANN), achieving a 55% accuracy in
price direction prediction with a standard ANN. The study concluded that Blockchain data
alone has limited predictability. Another study used Blockchain data with SVM, Random
Forests, and Binomial GLM, reporting a prediction accuracy of over 97%. However, the lack
of cross-validation in their models limits the generalizability of the results. Wavelets have
also been employed to predict Bitcoin prices, showing positive correlations between search
engine views, network hash rate, mining difficulty, and Bitcoin price. Building on these
findings, data from the Blockchain, including hash rate and difficulty, were incorporated into
the analysis along with data from major exchanges provided by Coin.
The prediction of Bitcoin prices can be likened to other financial time series forecasting tasks
such as forex and stock prediction. Various studies have utilized the Multilayer Perceptron
(MLP) for stock price prediction. However, the MLP only examines one data point at a time.
On the other hand, a recurrent neural network (RNN) stores the output from each layer in a
context layer to be fed back in with the output from the next layer, giving the network a form
of memory unlike the MLP. The length of the network is referred to as the temporal window
length. It is important to note that the temporal relationship of the series is explicitly modeled
by the internal states, which significantly contributes to the effectiveness of the model.
This method has been successfully applied to predicting stock returns by combining an RNN with a
genetic algorithm for network optimization.
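The context-layer feedback described above can be illustrated with a minimal sketch of an Elman-style forward pass over one temporal window. This is not the network used in this study: the function name elman_forward, the single input feature, the tanh activation, and the random untrained weights are all assumptions made for illustration.

```python
import math
import random


def elman_forward(window, w_in, w_ctx, w_out, bias):
    """Run a single-feature Elman-style RNN over one temporal window.

    window : list of floats, oldest observation first
    w_in   : input-to-hidden weights, one per hidden unit
    w_ctx  : context-to-hidden weights, a hidden x hidden matrix
    w_out  : hidden-to-output weights, one per hidden unit
    bias   : hidden biases, one per hidden unit
    """
    n_hidden = len(w_in)
    context = [0.0] * n_hidden  # the "memory" starts empty
    for x in window:
        hidden = []
        for j in range(n_hidden):
            recurrent = sum(w_ctx[j][k] * context[k] for k in range(n_hidden))
            hidden.append(math.tanh(w_in[j] * x + recurrent + bias[j]))
        context = hidden  # stored and fed back in at the next time step
    return sum(w_out[j] * context[j] for j in range(n_hidden))


# Hypothetical, untrained weights used purely for illustration.
rng = random.Random(0)
n_hidden = 4
w_in = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
w_ctx = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_hidden)]
w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
bias = [0.0] * n_hidden

window = [0.1, 0.3, -0.2, 0.4, 0.0]  # temporal window length of 5
prediction = elman_forward(window, w_in, w_ctx, w_out, bias)
```

Setting every entry of w_ctx to zero removes the feedback, so the output depends only on the last observation in the window, which is the MLP-like behaviour the text contrasts against.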
Another type of RNN is the Long Short-Term Memory (LSTM) network. Unlike the Elman
RNN, LSTMs can selectively remember or forget data based on the weight and importance of
each feature. One study implemented an LSTM for a time series prediction task and found that the
LSTM performed as well as the RNN for this particular task. This model type is also
implemented here. One challenge in training both RNNs and LSTMs is the substantial
computational resources required. For instance, training a network with a temporal window of
50 days is comparable to training 50 individual MLP models. Since the introduction of the CUDA
framework by NVIDIA in 2006, there has been significant growth in the development of applications that
leverage the highly parallel capabilities of GPUs, particularly in the field of machine
learning. One study reported a training and testing speed improvement of over three times when its
ANN model was implemented on a GPU compared to a CPU. Similarly, another study reported a
significant increase in classification speed, up to eighty times, when implementing an SVM
on a GPU compared to an alternative SVM algorithm run on a CPU.
Motivation:
Bitcoin's value fluctuates similarly to a stock, but with different influencing factors. While
many algorithms exist for predicting prices from stock market data, Bitcoin operates under
different parameters. Therefore, utilizing machine learning technology to forecast Bitcoin's
value is crucial for making informed investment decisions. Unlike the stock market, Bitcoin's
price is not influenced by business events or government interventions.
Objective:
Following the surge in popularity of Bitcoin, numerous researchers have endeavored to
develop prediction models. Constructing a prediction model for a machine learning problem
is a challenging endeavor, as there is no definitive best fit solution that can be determined
without extensive empirical testing tailored to each specific scenario.
Scope:
The primary aim of this undergraduate project is to demonstrate how a well-trained
machine learning model can accurately predict the price of a cryptocurrency given sufficient
data and computational resources. By presenting a graph of projected values, this project
highlights the potential of technology to aid in forecasting future events. With the vast
amount of data being generated daily, we are on the brink of an era where predictions can be
reliably generated based on concrete factual data.
PROJECT ANALYSIS
Existing System
Cryptocurrency poses an intriguing challenge as it involves predicting time series in a market
that is still in its early stages. Traditional methods for time series prediction, such as Holt-
Winters exponential smoothing models, rely on linear assumptions and require data that can
be broken down into trend, seasonal patterns, and noise in order to be effective. It has been
reported that implementing the ANN model on a GPU instead of a CPU resulted in training
and testing speeds that were over three times faster.
Disadvantages of the Existing System:
1. As the number of bitcoins in circulation approaches its limit, mining them becomes
increasingly difficult.
2. Analysts and researchers faced challenges in accurately predicting cryptocurrency prices.
Proposed System
For this study, the price of Bitcoin in USD from the CoinDesk Bitcoin Price Index is
considered as a separate variable. I have calculated the average price. The main objective of
my project is to investigate the accuracy of predicting Bitcoin prices using machine learning
techniques. To provide a comparison with more traditional approaches in financial
forecasting, an ARIMA time series model is developed for performance illustration purposes
alongside the neural network models. The closing prices from five major Bitcoin exchanges,
namely Bitstamp, Bitfinex, Coinbase, OKCoin, and itBit, are utilized.
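One plausible reading of the averaging step is a per-day mean of the closing prices across the listed exchanges. The sketch below assumes that reading. The function name average_close, the dictionary layout, and the sample prices are hypothetical and are not taken from the project's data.

```python
from statistics import mean


def average_close(closes_by_exchange):
    """Average the daily closing price across exchanges.

    closes_by_exchange maps an exchange name to a {date: close} dict.
    Each date is averaged over the exchanges that reported a price that day.
    """
    dates = sorted({d for closes in closes_by_exchange.values() for d in closes})
    averaged = {}
    for d in dates:
        prices = [closes[d] for closes in closes_by_exchange.values() if d in closes]
        averaged[d] = mean(prices)
    return averaged


# Made-up prices for illustration only; not real market data.
sample = {
    "Bitstamp": {"2016-07-18": 670.0, "2016-07-19": 672.0},
    "Bitfinex": {"2016-07-18": 674.0, "2016-07-19": 676.0},
    "Coinbase": {"2016-07-18": 672.0},
}
avg = average_close(sample)
```

Averaging only over the exchanges that reported on a given day is a design choice in this sketch. Dropping days with missing exchanges would be an equally reasonable alternative.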
RESULTS AND DISCUSSION
This document adheres to the CRISP-DM data mining methodology. The preference for CRISP-DM
over the more traditional KDD is due to the business context of the prediction task. The
Bitcoin dataset utilized covers the period from August 19, 2013, to July 19, 2016. Data prior
to August 2013 was excluded as it no longer accurately reflects the network. Along with the
Open, High, Low, Close (OHLC) data from CoinDesk, the difficulty and hash rate were
sourced from the Blockchain. The data was standardized to have a mean of 0 and a standard
deviation of 1. Standardization was chosen over normalization as it aligns better with the
activation functions employed by the deep learning models.
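The standardization step can be sketched as a z-score transform: subtract the mean and divide by the standard deviation. The helper names fit_standardizer and standardize, the use of the population standard deviation, and the sample values are assumptions for illustration. The source does not state how the statistics were computed.

```python
from statistics import mean, pstdev


def fit_standardizer(values):
    """Return the mean and population standard deviation of a series."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("cannot standardize a constant series")
    return mu, sigma


def standardize(values, mu, sigma):
    """Rescale values to z-scores using previously fitted statistics."""
    return [(v - mu) / sigma for v in values]


# Made-up closing prices for illustration only.
closes = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mu, sigma = fit_standardizer(closes)
z_scores = standardize(closes, mu, sigma)
```

Keeping the fitting and transform steps separate lets statistics computed on one portion of the series be reused on another. That is a common practice and not something the source confirms was done here.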