

https://doi.org/10.22214/ijraset.2022.48286
International Journal for Research in Applied Science & Engineering Technology (IJRASET)
ISSN: 2321-9653; IC Value: 45.98; SJ Impact Factor: 7.538
Volume 10 Issue XII Dec 2022- Available at www.ijraset.com

Cryptocurrency Price Prediction Using Linear Regression and Long Short-Term Memory (LSTM)
Atharva Dhande1, Shoumyadeep Dhani2, Shivang Parnami3, K. P. Vijayakumar4
School of Computer Science and Engineering, Vellore Institute of Technology, Chennai, India

Abstract: Predicting future events is difficult, particularly for cryptocurrency, where the media, influential people and
governments have a sharp and immediate impact on value. Cryptocurrency market analysis uses real-world data from the
cryptocurrency market to predict where the market will go next. An accurate forecast helps investors to buy when the price is
low (purchasing in bulk while the price is dipping) and sell once it is high so as to make a profit. This research applies two
machine learning algorithms, Long Short-Term Memory (LSTM) and Linear Regression, to predict the prices of six
cryptocurrencies: Bitcoin (BTC), Dash (DASH), Litecoin (LTC), Dogecoin (DOGE), Ethereum (ETH), and Monero (XMR). The
accuracy of the models is analyzed using mean squared error.
Keywords: Future; Market; Price; Prediction; Profit

I. INTRODUCTION
Today, all economies have embraced particular currencies (money) as a means of exchange. An excess or contraction of the money
supply generates inflation or deflation, and governments manage currencies in order to counteract both. Many governments
throughout the world are now focused on digital currencies and transactions. At the same time, many people do not want their
transactions to be regulated by the government. This has driven innovation in a new kind of currency, cryptocurrency, which is
one of the most sophisticated, ambiguous, and regulation-free currencies. Transactions are rapid, digital, safe, and
international, and records can be preserved without the fear of data being pirated, which reduces fraud to a minimum.
The first step is to collect real-world data from the cryptocurrency market and plot it to analyze the trend and predict whether
the market will be bullish or bearish. An accurate forecast enables investors to buy when the value is low (buying the dip) and
sell once it is high in order to make a profit. Technical analysis of a cryptocurrency helps to read the market. It involves
observing and analyzing price charts and graphs from various perspectives and finding a consensus within that information to
help predict where the market is going. In this paper, market prediction is done using two machine learning techniques, Long
Short-Term Memory (LSTM) and Linear Regression (LR). These approaches can mimic the simultaneous dynamic interaction of
several components, allowing for the study of complexity; they may also derive conclusions on an individual basis rather than as
average trends.
There are hundreds of cryptocurrencies in the digital market, but Bitcoin is the most prominent, and it is influenced by many
factors such as the news and social media. Bitcoin's use of open-source code and a censorship-resistant architecture has made it
the main point of reference for many cryptocurrencies and their developers. Many cryptocurrencies other than Bitcoin have also
gained importance. Dogecoin, for example, was a meme-based joke coin that became popular when Tesla CEO Elon Musk promoted
it on social media. Other examples are Ethereum, Solana, Monero, and Avalanche. In this paper, six well-known cryptocurrencies
are considered. Bitcoin and Ethereum are included because they are two of the most popular and largest cryptocurrencies in
terms of volume. Monero, Dash, and Litecoin are included because they were among the first coins to enter the market and are
comparatively easier to predict than newer coins and altcoins. Last but not least, Dogecoin is included as a challenge, because
it was launched as a satire of the cryptocurrency space and is highly influenced by Elon Musk's tweets.
The aim of this paper is to predict the future prices of Bitcoin, Ethereum, Dogecoin, Monero, Dash, and Litecoin with the help
of Long Short-Term Memory and Linear Regression models and to evaluate the results using mean squared error.
The motivation behind this paper is to help cryptocurrency investors invest at an appropriate time by predicting future prices
for six cryptocurrencies, thus improving their portfolios. This paper also compares the accuracy of the LSTM and Linear
Regression models, helping the reader to select an appropriate model for prediction.

©IJRASET: All Rights are Reserved | SJ Impact Factor 7.538 | ISRA Journal Impact Factor 7.894 | 1591

The paper is organized as follows: Section 2 reviews the literature. The system model is explained in Section 3. The proposed
system is described in Section 4. Section 5 discusses the dataset used in this paper. Results and evaluation of the proposed
system are provided in Section 6. Section 7 summarizes the paper with a conclusion and discusses future work.
II. RELATED WORK
Paper [1] compares the performance of different machine learning algorithms such as SVM, boosted NN, ANN and DL in forecasting
a wide variety of cryptocurrencies, and discusses different cryptocurrency time series data in detail. Paper [2] shows how
incorporating a cryptocurrency into a portfolio improves profit by (i) reducing the standard deviation and (ii) using asset
portfolio allocation across different cryptocurrencies. It also discusses the performance of cryptocurrencies relative to
stocks, as well as their market returns. Paper [3] implements a combined ensemble of Random Forest and a stochastic
gradient-based model. The ensemble is then applied to a variety of coins such as Bitcoin, Ripple and Ethereum.
Paper [4] presents a hybrid cryptocurrency price prediction system based on Long Short-Term Memory (LSTM) and Gated Recurrent
Unit (GRU), focused mainly on the Litecoin and Monero cryptocurrencies. Paper [6] implements traditional SVM and linear
regression methods to predict the price of Bitcoin, using the daily closing price of Bitcoin to build the prediction models.
Paper [7] implements algorithms such as linear regression, gradient descent, random forest and a deep learning classification
algorithm for Bitcoin price prediction.
Paper [8] analyses the price fluctuations of Bitcoin, Ethereum, and Ripple. The authors use multiple neural network frameworks
such as ANN and LSTM, and find that ANN relies more on long-term history, whereas LSTM relies more on short-term data. Paper
[9] uses a range of algorithms, namely LSTM, SVM, random forest, XGBoost, linear discriminant analysis (LDA), and quadratic
discriminant analysis (QDA), to predict the price of Bitcoin and compares them on precision and accuracy, with LDA achieving
the highest accuracy of 66%.
Paper [10] studies the implementation of random forest (RF), neural networks (NN), and support vector machines (SVM). It finds
that machine learning and sentiment analysis can be used to anticipate the future of cryptocurrency markets, and that the
neural network performed best among the models considered. Paper [11] implements long short-term memory (LSTM) to forecast
the price of Bitcoin using data from Yahoo Finance. The reviewed papers show that analysts are still searching for algorithms
well suited to forecasting, both by testing new algorithms with robust mechanisms and by modifying older ones.

III. SYSTEM MODEL


A. Long Short-Term Memory
Long Short-Term Memory (LSTM) is a machine learning model based on neural networks. It is an improved variant of the Recurrent
Neural Network (RNN). An LSTM cell is similar to an RNN cell but has three parts: the Forget Gate, the Input Gate and the
Output Gate (shown in Figure 1).

Fig. 1 The three parts of the LSTM cell

Fig. 2 Shows the short term and long-term memory


The construction of an LSTM cell is shown in Figure 2. The Forget Gate removes information that is no longer useful from the
LSTM cell; the information is removed by multiplying the cell state by a filter. The Input Gate adds information to the LSTM
cell in three steps: (i) value regulation using a sigmoid function, (ii) vector creation using a tanh function, and (iii)
multiplication of the created vector by the regulatory value. The Output Gate exposes the useful information from the current
cell state.
1) Working of LSTM

Fig. 3 Shows the workflow of LSTM

Working principle of the recurrent unit (a diagrammatic representation is shown in Figure 3):
a) Step 1: Get the following inputs: the current input, the previous hidden state and the previous internal cell state.
b) Step 2: Calculate the values of the gates by (i) computing the parameterized vectors for the previous hidden state and the
current input, and (ii) applying the respective activation function for every gate.
c) Step 3: Calculate the current internal cell state.
d) Step 4: Calculate the current hidden state.
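
The four steps above can be sketched in plain Python. This is a minimal illustration of the standard LSTM cell equations, not the authors' implementation; the function names (`lstm_step`, `affine`), the parameter layout and the toy weights are assumptions introduced here for readability.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(weights, bias, vector):
    """Return weights @ vector + bias for plain Python lists."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One forward step of a standard LSTM cell.

    params maps each gate name ("f", "i", "g", "o") to (weights, bias);
    the weights act on the concatenation of h_prev and x_t.
    """
    z = h_prev + x_t  # Step 1: gather the inputs (list concatenation)
    # Step 2: parameterised vectors followed by each gate's activation
    f = [sigmoid(a) for a in affine(*params["f"], z)]    # forget gate
    i = [sigmoid(a) for a in affine(*params["i"], z)]    # input gate
    g = [math.tanh(a) for a in affine(*params["g"], z)]  # candidate values
    o = [sigmoid(a) for a in affine(*params["o"], z)]    # output gate
    # Step 3: current internal cell state
    c_t = [f_k * c_k + i_k * g_k
           for f_k, c_k, i_k, g_k in zip(f, c_prev, i, g)]
    # Step 4: current hidden state
    h_t = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c_t)]
    return h_t, c_t


# Toy example (hypothetical sizes): hidden size 2, input size 1, all-zero
# parameters, so every sigmoid gate equals 0.5 and the candidate equals 0.
hidden, inputs = 2, 1
zeros = ([[0.0] * (hidden + inputs) for _ in range(hidden)], [0.0] * hidden)
params = {name: zeros for name in "figo"}
h_t, c_t = lstm_step([0.3], [0.0, 0.0], [1.0, -2.0], params)
```

With all-zero parameters the forget gate simply halves the previous cell state, which makes the role of each gate easy to check by hand.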

B. Linear Regression
Linear regression is one of the most fundamental and widely used algorithms for forecasting and prediction. It models the
relationship between two or more variables with a linear equation. Researchers often try to relate one or more independent
(predictor) variables to a dependent variable. Both correlation and regression offer a way to understand such a relationship,
for example the relationship between risk factors and illness. While correlation gives a quantitative estimate of the degree or
strength of the relation between two variables, regression analysis describes this relationship numerically. With a single
predictor, linear regression is represented by the equation y = mx + c, where m is the slope and c is the intercept.
The dependent variable must be continuous, whereas the independent variable may or may not be continuous. The relationship
between continuous variables is generally represented by a scatter plot, which shows whether the relationship is linear or
nonlinear, as in Figures 4 and 5 respectively.

Fig. 4 A Scatter plot

Fig. 5 A scatter plot showing an exponential relationship


Fig. 6 A scatter plot showing the corresponding regression line and regression equation between the dependent variable (body
weight in kg) and the independent variable (height in m).

In Figure 6, a univariable linear regression depicts the linear relationship between a single independent variable X and a
dependent variable Y. The regression line allows the value of the dependent variable Y to be predicted from the value of the
independent variable X.
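
The paper does not state how the slope and intercept of the regression line are obtained. Assuming the usual ordinary least-squares fit over n observations (x_i, y_i), with sample means \bar{x} and \bar{y}, the coefficients of y = mx + c would be:

```latex
\hat{y} = m x + c, \qquad
m = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad
c = \bar{y} - m\,\bar{x}
```

These values minimise the sum of squared vertical distances between the observed points and the line.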

IV. PROPOSED SYSTEM


Long Short-Term Memory and Linear Regression are used to predict the future prices of the cryptocurrencies. Both machine
learning models are trained on time series data and then used to forecast future values.

A. Long Short-Term Memory


In the proposed system using Long Short-Term Memory, the following steps are carried out:
1) Feature Selection: From the dataset, only two features are considered for prediction using LSTM: Date and Closing price.
2) Train-Test Split: After attribute filtering, the data is divided into training data and testing data, with 80% and 20% of
the records respectively.
3) Formatting of Training Data: The training data is reshaped into windows of size 5 with 2 features each.
4) Building the LSTM Model: The formatted data is used to build the LSTM model with a dropout of 0.2, a dense output layer with
1 unit, 100 neurons, the 'linear' activation function, 'mse' as the loss and 'adam' as the optimiser.
5) Training the LSTM Model: After the model is built, it is trained with the following hyperparameters: 20 epochs, a batch
size of 32, verbose set to 1 and shuffling enabled.
6) Prediction and Error Analysis: After the model is trained, future values are predicted from the testing data, and the error
(mean squared error) between the predicted data and the test data is calculated.
7) Plotting: After prediction, the predicted data and the testing data are plotted on the same graph.
8) Repeating for all Datasets: Steps 1 to 7 are repeated for every dataset in a loop.
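
Steps 2 to 5 can be sketched as follows. This is only an illustration under stated assumptions: the paper does not name its library, so the use of Keras, the layer order, the choice of the next closing price as the target, and the helper names (`chronological_split`, `make_windows`, `build_model`) are all hypothetical. Only the numeric settings come from the steps above.

```python
WINDOW = 5       # window size stated in step 3
N_FEATURES = 2   # Date and Closing price (step 1)


def chronological_split(rows, train_fraction=0.8):
    """Split rows into train/test in time order (80/20 as in step 2)."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def make_windows(rows, window=WINDOW):
    """Turn [date_value, close] rows into (samples, targets).

    Each sample holds `window` consecutive rows; the target is the closing
    price of the row that follows (an assumption, not stated in the paper).
    """
    samples, targets = [], []
    for start in range(len(rows) - window):
        samples.append(rows[start:start + window])
        targets.append(rows[start + window][1])
    return samples, targets


def build_model(window=WINDOW, n_features=N_FEATURES):
    """Keras model using the settings listed in step 4."""
    # Imported lazily so the helpers above work without TensorFlow installed.
    from tensorflow.keras import Sequential
    from tensorflow.keras.layers import Dense, Dropout, Input, LSTM

    model = Sequential([
        Input(shape=(window, n_features)),
        LSTM(100),                      # 100 neurons
        Dropout(0.2),                   # dropout of 0.2
        Dense(1, activation="linear"),  # one output unit, linear activation
    ])
    model.compile(loss="mse", optimizer="adam")
    return model


# Toy data (made up): 20 rows of [day_index, close]; real rows would come
# from the per-coin CSV files.
rows = [[float(day), 100.0 + day] for day in range(20)]
train_rows, test_rows = chronological_split(rows)
X_train, y_train = make_windows(train_rows)

# Training call with the values from step 5 (needs TensorFlow and NumPy):
# import numpy as np
# model = build_model()
# model.fit(np.array(X_train), np.array(y_train),
#           epochs=20, batch_size=32, verbose=1, shuffle=True)
```

Splitting before windowing keeps every test window strictly later in time than the training windows, which avoids leaking future prices into training.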

B. Linear Regression
In the proposed system using Linear Regression, the following steps are carried out:
1) Feature Selection: From the dataset, only two features are considered for prediction using Linear Regression: Date and
Closing price.
2) Date Formatting: The date in the dataset is converted into a timestamp, as ordinary date notation does not work with Linear
Regression algorithms.
3) Train-Test Split: The input data is split into training data and testing data, with 80% and 20% of the records
respectively.
4) Building and Training the Linear Regression Model: After the data is split, the Linear Regression model is built with no
extra hyperparameters. The model is then trained using the training dataset.
5) Prediction and Error Analysis: After the model is trained, future values are predicted from the testing data, and the error
(mean squared error) between the predicted data and the test data is calculated.
6) Plotting: After prediction, the predicted data and the testing data are plotted on the same graph.
7) Repeating for all Datasets: Steps 1 to 6 are repeated for every dataset in a loop.
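
Steps 2 to 5 can be sketched in plain Python as follows. The date format ("YYYY-MM-DD"), the time-ordered split, the toy prices and the helper names are assumptions made for this example; a library estimator could be used in place of the hand-written `fit_line`, which stands in for whatever implementation the authors used.

```python
from datetime import datetime, timedelta, timezone


def to_timestamp(date_text):
    """Convert a 'YYYY-MM-DD' date string to a Unix timestamp (UTC)."""
    parsed = datetime.strptime(date_text, "%Y-%m-%d")
    return parsed.replace(tzinfo=timezone.utc).timestamp()


def fit_line(xs, ys):
    """Ordinary least squares for y = m*x + c with one predictor."""
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    m = sxy / sxx
    return m, y_mean - m * x_mean


def mean_squared_error(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Toy data (made up): ten daily closing prices rising by 2 per day.
start = datetime(2021, 1, 1)
dates = [(start + timedelta(days=d)).strftime("%Y-%m-%d") for d in range(10)]
closes = [100.0 + 2.0 * d for d in range(10)]

xs = [to_timestamp(d) for d in dates]       # step 2: dates to timestamps
cut = int(len(xs) * 0.8)                    # step 3: 80/20, in time order
m, c = fit_line(xs[:cut], closes[:cut])     # step 4: fit on training data
predicted = [m * x + c for x in xs[cut:]]   # step 5: predict the test period
mse = mean_squared_error(closes[cut:], predicted)
```

Because the toy prices lie exactly on a line, the test error here is essentially zero; real price series will not behave this way.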


V. DATASET
The dataset was collected from [12]. The original dataset consisted of 6 sheets of data (one sheet per coin), each containing
384 records; for simplicity, every sheet was converted into a '.csv' (comma-separated values) file. Sample data for every
cryptocurrency is shown in Figures 7-12. Table I describes the dataset.

TABLE I
DATASET SPECIFICATION
Feature Name | Feature Description | Feature Type
Unix Timestamp | Timestamp of the record generated in Unix format. | Number
Date | Date of the record generated. | Date
Symbol | Symbolic name of the coin. | String
Open | The price at 12:01 AM UTC of any given day for the quoted cryptocurrency. | Number
High | The highest price reached during the last 24 hours for the quoted cryptocurrency. | Number
Low | The lowest price reached during the last 24 hours for the quoted cryptocurrency. | Number
Close | The price at 11:59 PM UTC of any given day for the quoted cryptocurrency. | Number
Volume (coin name) | Total quantity of a traded asset (disclosed in cryptocurrency). | Number
Volume USD | Total quantity of a traded asset (disclosed in US Dollar currency). | Number


Table II shows the training and testing split of the dataset used by the models.

TABLE II
TRAIN-TEST SPLIT OF DATA
Model | Amount of Training data | Amount of Testing data
LSTM | 80% (307 records for every cryptocurrency) | 20% (76 records for every cryptocurrency)
Linear Regression | 80% (307 records for every cryptocurrency) | 20% (76 records for every cryptocurrency)

VI. RESULTS
A. Output using LSTM
Figure 13 shows the closing price of Bitcoin. The price increases at the start, but around the middle of the period it drops
sharply, as Bitcoin and almost all other cryptocurrencies fell following China's crackdown on cryptocurrency and proposals by
governments around the world to ban cryptocurrencies. Bitcoin and other coins are expected to rise sharply in price eventually
because of the upcoming metaverse.

Figures 14 and 15 show the closing prices of DASH and LTC. Both are heavily influenced by BTC, so their rise and fall closely
follow those of BTC. DASH and LTC are also predicted to rise sharply within some months because of the introduction of the
metaverse.


Figure 16 shows the closing price of ETH within the collected dataset. As ETH is the second biggest cryptocurrency in terms of
volume, its trend is very similar to that of BTC. Figure 17 shows the closing price of DOGE within the collected dataset. It is
highly influenced by social media, mainly Elon Musk's tweets, so the prediction of DOGE is very inconsistent. In 2021, Elon
Musk tweeted "Doge", which made the price of DOGE rise sharply. Figure 18 shows the closing price of XMR within the collected
dataset. Like LTC and DASH, XMR is influenced by BTC and shows similar trends to BTC.

B. Output using Linear Regression


Figures 19-21 show the output of linear regression applied to the dataset. A single regression line is plotted for every
cryptocurrency, from which the trend of the data can be analysed and the future closing prices of the respective cryptocurrency
can be predicted.

From the evaluation metrics of the LSTM and LR models, it is observed that LSTM is much more effective at predicting the trend
of each cryptocurrency.


The Mean Squared Error (MSE) for each coin is given in Table III. For most coins the MSE is moderate, but for BTC it is very
high because of the highly volatile and independent nature of BTC, and for DOGE it is very low, as there are not many external
factors influencing DOGE that could be used for prediction. The results also show that the coefficient of determination is very
low, because linear regression is not as effective as LSTM.
TABLE III
EVALUATION METRIC FOR ALGORITHMS
Coin | MSE for Linear Regression | MSE for Long Short-Term Memory
Bitcoin (BTC) | 9052114.61 | 0.003315
Dash coin (DASH) | 87153.95 | 0.005315
Doge coin (DOGE) | 0.00 | 0.007844
Ethereum coin (ETH) | 34902.86 | 0.007146
Lite coin (LTC) | 3711.05 | 0.007742
Monero coin (XMR) | 6770.64 | 0.005358
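
When reading Table III, it helps to remember that MSE is expressed in the squared unit of the target, so its size depends on the price scale of each coin. The paper does not say whether prices were normalised before either model was evaluated; the sketch below only illustrates, with made-up numbers and a hypothetical `min_max_scale` helper, how the same predictions give very different MSE values on raw and on min-max scaled prices.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def min_max_scale(values, low, high):
    """Map values linearly so that low -> 0 and high -> 1."""
    return [(v - low) / (high - low) for v in values]


# Made-up prices on a Bitcoin-like scale (not taken from the dataset).
actual = [30000.0, 32000.0, 31000.0, 35000.0]
predicted = [30500.0, 31000.0, 31500.0, 34000.0]

raw_mse = mean_squared_error(actual, predicted)

low, high = min(actual), max(actual)
scaled_mse = mean_squared_error(
    min_max_scale(actual, low, high),
    min_max_scale(predicted, low, high),
)

# Scaling by the range divides every error by (high - low), so the MSE
# shrinks by the square of the range.
ratio = raw_mse / scaled_mse
```

Under this illustration, MSE values are directly comparable between two models only when both are computed on the same price scale.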

VII. CONCLUSION
Prediction helps in making important economic decisions cautiously and in preventing economic disasters caused by
miscalculation. In this work, six cryptocurrencies are considered, and LSTM and Linear Regression are applied to predict their
trends. In future work, it is planned to develop a model using various machine learning algorithms that will use a larger
amount of data and incorporate more cryptocurrencies for testing the accuracy of the model. In addition, an application
programming interface (API) is planned to pipeline real-time data to the model so that it can produce real-time predictions for
cryptocurrency trades.

REFERENCES
[1] N. A. Hitam and A. R. Ismail, “Comparative performance of machine learning algorithms for cryptocurrency forecasting,” Indonesian Journal of Electrical Engineering and Computer Science, vol. 11, no. 3, p. 1121, 2018.
[2] Y. Andrianto, “The effect of cryptocurrency on investment portfolio effectiveness,” Journal of Finance and Accounting, vol. 5, no. 6, p. 229, 2017.
[3] “Comparative performance of machine learning ensemble algorithms for forecasting cryptocurrency prices,” International Journal of Engineering, vol. 34, no. 1, 2021.
[4] M. M. Patel, S. Tanwar, R. Gupta, and N. Kumar, “A deep learning-based cryptocurrency price prediction scheme for Financial Institutions,” Journal of Information Security and Applications, vol. 55, p. 102583, 2020.
[5] R. Miura, L. Pichl, and T. Kaizoji, “Artificial Neural Networks for realized volatility prediction in cryptocurrency time series,” Advances in Neural Networks – ISNN 2019, pp. 165–172, 2019.
[6] S. Karasu, A. Altan, Z. Sarac, and R. Hacioglu, “Prediction of bitcoin prices with machine learning methods using time series data,” 2018 26th Signal Processing and Communications Applications Conference (SIU), 2018.
[7] M. Saad and A. Mohaisen, “Towards characterizing blockchain-based cryptocurrencies for highly-accurate predictions,” IEEE INFOCOM 2018 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), 2018.
[8] W. Yiying and Z. Yeze, “Cryptocurrency price analysis with artificial intelligence,” 2019 5th International Conference on Information Management (ICIM), 2019.
[9] Z. Chen, C. Li, and W. Sun, “Bitcoin price prediction using machine learning: An approach to sample dimension engineering,” Journal of Computational and Applied Mathematics, vol. 365, p. 112395, 2020.
[10] F. Valencia, A. Gómez-Espinosa, and B. Valdés-Aguirre, “Price movement prediction of cryptocurrencies using sentiment analysis and machine learning,” Entropy, vol. 21, no. 6, p. 589, 2019.
[11] F. Ferdiansyah, S. H. Othman, R. Zahilah Raja Md Radzi, D. Stiawan, Y. Sazaki, and U. Ependi, “A LSTM-method for bitcoin price prediction: A case study yahoo finance stock market,” 2019 International Conference on Electrical Engineering and Computer Science (ICECOS), 2019.
