Hybrid LSTM and GRU For Cryptocurrency Price Forecasting Based On Social Network Sentiment Analysis Using FinBERT


Received 5 September 2023, accepted 19 September 2023, date of publication 13 October 2023, date of current version 2 November 2023.


Digital Object Identifier 10.1109/ACCESS.2023.3324535

ABBA SUGANDA GIRSANG AND STANLEY
Computer Science Department, BINUS Graduate Program-Master of Computer Science, Bina Nusantara University, Jakarta 11480, Indonesia
Corresponding author: Abba Suganda Girsang ([email protected])
This work was supported by Bina Nusantara University, Jakarta, Indonesia.

ABSTRACT Cryptocurrencies are digital assets that are widely used for trading and investing. Traders
seek to profit from the high volatility of their prices, and these volatile, rapidly changing
prices have made cryptocurrency price prediction a challenging and highly sought-after research
topic. Prediction models usually use only historical prices, although price movements are also
influenced by other factors such as the sentiment expressed on social media. This study proposes a
new machine learning method for predicting the prices of the Ethereum and Solana cryptocurrencies
that integrates historical price data and social media sentiment as inputs to the prediction model.
FinBERT, a pre-trained sentiment analysis model, is used to convert the sentiment implied in social
network tweets into a daily sentiment score, which is then combined with the historical market
price data. A hybrid LSTM-GRU model is trained on the combined dataset and used to predict
cryptocurrency prices. The experimental results show that the presented method can successfully
predict Ethereum and Solana price movements and outperforms all the benchmark models.

INDEX TERMS FinBERT, social network, sentiment analysis, hybrid LSTM-GRU, Ethereum prediction,
Solana prediction.

I. INTRODUCTION
Cryptocurrency is a digital financial asset that has revolutionized the financial sector with its
rapid development, especially in recent years [1]. Bitcoin, as the pioneer of cryptocurrencies, can
overcome some of the problems faced by traditional economic systems. In a traditional economic
system, payments or transactions can only be made with the help of third parties such as banks,
which can lead to shortcomings in trust, transparency, flexibility, and security. With bitcoin,
online transactions between two parties can take place directly, without intermediaries, using
blockchain technology [2]. Every transaction is permanently recorded in a chain of blocks, and each
new transaction is validated using a proof-of-work consensus mechanism, which prevents parties from
creating fake transactions or trying to manipulate records of transactions that have occurred [3].
A virtual currency that is exchanged or transacted using blockchain technology is called a
cryptocurrency. In the case above, the cryptocurrency is bitcoin itself.

Many cryptocurrencies are used across many blockchain projects. Each has its own function and
usage, but one characteristic they share is a constantly changing price. Cryptocurrency prices are
much more volatile than those of other traded assets such as stocks, foreign exchange, and gold.
Traders use this volatility to seek profits from the difference between buying and selling
prices [4]. The price of cryptocurrency has piqued the interest of many researchers around the
world, who have tried various methods to predict it, especially in the field of machine
learning [5]. Machine learning is known as an effective approach to prediction. Some machine
learning algorithms, such as XGBoost and LSTM (Long Short

The associate editor coordinating the review of this manuscript and approving it for publication
was Pasquale De Meo.
2023 The Authors. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.
120530 For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/ VOLUME 11, 2023
A. S. Girsang, Stanley: Hybrid LSTM and GRU for Cryptocurrency Price Forecasting

Term Memory), have outperformed the moving average method in predicting cryptocurrency prices [6].

The performance of various machine learning algorithms has been compared on the cryptocurrency
price prediction or forecasting problem [7], [8], [9]. Recurrent Neural Network (RNN) algorithms
such as LSTM and GRU have achieved better cryptocurrency price prediction performance than ARIMA
(Autoregressive Integrated Moving Average), a classical statistical model commonly used in price
forecasting [10]. A hybrid LSTM-GRU model achieved the highest performance among the compared
algorithms in foreign exchange price prediction [11].

Cryptocurrency price prediction or forecasting is a challenging subject of research because of
unstable and rapidly changing prices. Cryptocurrency price volatility is caused by several factors
such as popularity, transaction cost and speed, market trends, news, and public sentiment [12].
Sentiment on social media has an important correlation with, and influence on, cryptocurrency price
movements [13], [14], [15]. One example is a tweet from Elon Musk about one of the famous
cryptocurrencies, Dogecoin. The prices of Bitcoin and Dogecoin went up and down rapidly, mainly
because of that particular tweet [16]. Many traders make buying and selling decisions based on the
macro and micro public sentiment on social media.

Without a doubt, bitcoin is the most famous cryptocurrency, and most research on cryptocurrency
price prediction revolves around it. As mentioned before, a cryptocurrency is the currency of a
specific blockchain project. Several aspects of blockchain make many parties want to build their
projects on it, one of which is its decentralized nature: no single large party owns the network
or can manage every activity and manipulate the data recorded on the blockchain [17]. With the
smart contract feature, users can build their applications or programs on top of the blockchain
network, so that the applications are decentralized and protected from hacker attacks [18].
Ethereum is the most widely adopted blockchain network with smart contract features, and its
cryptocurrency, Ether (ETH), has the second largest market capitalization after Bitcoin [19].
Another widely adopted smart contract blockchain network, which offers faster transaction speeds
than Ethereum, is Solana, with its token, SOL [20].

A. MOTIVATION
In this research, the daily closing prices of the Ethereum and Solana cryptocurrencies (ETH and
SOL) are predicted using a hybrid RNN machine learning architecture, LSTM-GRU. Recurrent Neural
Networks (RNNs) are a type of neural network architecture commonly used for sequential data
processing. However, they suffer from a significant weakness known as the "vanishing gradient"
problem. The vanishing gradient problem arises during the training phase of RNNs, particularly
when backpropagation through time (BPTT) is used to update the network's weights. As the network
processes sequential data, it calculates gradients that indicate how much each weight should be
adjusted to minimize the error in the predictions. These gradients are backpropagated through time
to update the network's parameters. The issue occurs when the gradients become extremely small as
they are backpropagated from the later time steps to the initial time steps. This happens because
each gradient is the product of many intermediate gradients through time. As these factors are
multiplied, the result shrinks exponentially and vanishes, or becomes very close to zero. LSTM and
GRU are RNN architectures designed specifically to address the vanishing gradient and exploding
gradient problems. The hybrid LSTM-GRU architecture is one of the best machine learning algorithms
for time-series forecasting tasks such as cryptocurrency price prediction [21].

Although both can address the vanishing gradient and exploding gradient problems, LSTM and GRU
have different gates. LSTM has three gates: the input gate, the forget gate, and the output gate.
GRU has two gates: the reset gate and the update gate. In certain cases, GRU may be simpler and
require fewer parameters, while LSTM can have a larger memory capacity but be more computationally
complex [11]. Combining the two algorithms is therefore expected to produce a more dynamic gating
behavior that improves the model.

Moreover, several previous studies indicate that sentiment can influence stock price
movements [22], [23].

B. CONTRIBUTION
The combination of LSTM and GRU is intended to address the vanishing gradient and exploding
gradient problems while balancing the complexity and memory capacity required by the model. It
helps control the flow of information and gradients within the network. The attributes used for
these cryptocurrency price predictions are the daily closing, opening, high, and low prices.

This research also takes advantage of the sentiment contained in the social network's tweets,
which is combined with historical cryptocurrency data such as the daily closing, opening, high, and
low prices to predict the daily closing price of the cryptocurrency. Tweet sentiment data is
obtained using FinBERT, a model derived from BERT that focuses on Natural Language Processing
(NLP) problems in the financial context.

Therefore, the main contribution of this paper is a cryptocurrency price prediction model based on
the LSTM-GRU algorithm that includes social network sentiment data extracted with FinBERT.

LSTM, GRU, and LSTM-GRU are evaluated and compared both with and without sentiment data. The
performance is measured in MSE, RMSE, MAE, and MAPE.

The remainder of this paper is organized as follows. Section II gives a brief overview of related
work. Section III provides a detailed description of the proposed method. Section IV presents the
analysis and the performance, while conclusions are offered in Section V.
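
The vanishing-gradient effect described in Section I-A can be made concrete with a small numeric illustration. The sketch below is not taken from the paper: it assumes a one-unit tanh recurrence h_t = tanh(w * h_{t-1} + x_t) with a hypothetical weight w = 0.5 and made-up inputs, and it multiplies the per-step derivatives that backpropagation through time would chain together.

```python
import math


def bptt_gradient_factors(inputs, w=0.5, h0=0.0):
    """Return d h_last / d h_k for every step k of a one-unit tanh RNN.

    The recurrence h_t = tanh(w * h_{t-1} + x_t) and the weight w are
    illustrative assumptions, not the model used in the paper.
    """
    # Forward pass: store the local derivative d h_t / d h_{t-1} of each step.
    h = h0
    local = []
    for x in inputs:
        h = math.tanh(w * h + x)
        local.append(w * (1.0 - h * h))  # derivative of tanh is 1 - tanh^2

    # Backward pass: chain the local derivatives from the last step backwards.
    factors = []
    running = 1.0
    for d in reversed(local):
        factors.append(running)
        running *= d
    factors.reverse()
    return factors


if __name__ == "__main__":
    factors = bptt_gradient_factors([0.1] * 30)
    print(f"gradient w.r.t. the last step:  {factors[-1]:.3e}")
    print(f"gradient w.r.t. the first step: {factors[0]:.3e}")
```

Because every local derivative here is at most 0.5 in magnitude, the factor reaching the first of 30 steps is many orders of magnitude smaller than the one at the last step, which is the behavior the gated architectures are meant to counteract.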

II. RELATED WORKS

A. PREVIOUS RESEARCH
With the rapid development of cryptocurrency and blockchain technology, and the growing number of
cryptocurrency traders and investors, much research is being conducted on prediction, forecasting,
and sentiment analysis of cryptocurrency prices. Traders often use trading indicators to help them
make trading decisions. For price prediction, machine learning algorithms such as XGBoost and LSTM
(Long Short Term Memory) outperform the Moving Average, one of the most widely used trading
indicators [6]. ARIMA (Autoregressive Integrated Moving Average), a well-known and widely used
statistical model for time-series forecasting, achieves a lower error rate than XGBoost in Bitcoin
price prediction [24].

As machine learning technology developed, the Recurrent Neural Network (RNN), an algorithm that
retains information from earlier inputs while making predictions, began to emerge. RNN algorithms
such as LSTM and GRU (Gated Recurrent Unit) have achieved better performance than ARIMA in
predicting the Bitcoin price [10]. Research on Foreign Exchange (Forex) price prediction combined
LSTM and GRU into a hybrid model, and the LSTM-GRU model achieved the best performance among the
compared algorithms [11].

A study of social media sentiment and cryptocurrency price fluctuations shows that people's
behavior on social media, on the social network, and in Google searches has a significant
correlation with, and impact on, the price of cryptocurrencies [25]. Naïve Bayes is one of the best
algorithms for sentiment classification of text data, as shown in a study that compared 10
classifier methods, 5 using the lexicon approach and 5 using the machine learning approach [26].
In an experiment on stock market sentiment analysis, however, Naïve Bayes and 4 other algorithms
performed worse than BERT (Bidirectional Encoder Representations from Transformers) [27].

BERT has been used by developers to create many pre-trained models for specific needs, one of
which is FinBERT. FinBERT is a pre-trained Natural Language Processing model aimed at the financial
sector that improved on the previous state of the art in financial NLP by 15% in accuracy [28]. A
study that used several machine learning algorithms together with sentiment analysis to predict
several cryptocurrencies, drawing on market price data and sentiment from the social network,
concluded that it is very possible to predict cryptocurrency prices using social network sentiment
data and historical market prices [29].

B. LONG SHORT TERM MEMORY
Long Short Term Memory, commonly abbreviated as LSTM, is a neural network architecture derived
from the Recurrent Neural Network (RNN). An RNN is a neural network algorithm that can remember
information from previous inputs. However, a plain RNN cannot retain information across more than
about 5 to 10 discrete time steps of sequential data; this is known as the long-term dependencies
problem [30]. It is closely tied to the vanishing gradient: the training signal from distant time
steps fades, so the information they carry is effectively forgotten. LSTM addresses the vanishing
gradient problem by adding a memory cell that can learn which information to remember and which to
forget [31].

The main components of an LSTM cell are as follows:

Cell State (C_t): The cell state acts as a memory and flows through the entire sequence of the
data. It can preserve information over long periods, making it ideal for handling long-term
dependencies.

Input Gate (i): This gate determines how much of the new information should be added to the cell
state.

Forget Gate (f): The forget gate decides how much of the previous cell state should be retained
and passed along to the next time step. This helps the LSTM forget irrelevant information.

Output Gate (o): The output gate controls how much of the cell state should be exposed to the
output and passed to the next hidden state.

As shown in Figure 1, C_{t-1} is the cell state of the previous cell, h_{t-1} is the output of the
previous cell, C_t is the

FIGURE 1. LSTM architecture.

FIGURE 2. GRU Architecture.
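
For reference, the LSTM components of Section II-B and the GRU gates of Section II-C are commonly written as the update equations below. This is the standard textbook formulation, not a formula given by the authors: the weight matrices W, the bias vectors b, the sigmoid \sigma, and the element-wise product \odot are notation introduced here, and libraries differ in minor details, for example in which of z_t and 1 - z_t multiplies the old GRU state.

```latex
% LSTM cell: inputs x_t, h_{t-1}, C_{t-1}; outputs h_t, C_t
\[
\begin{aligned}
f_t &= \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) && \text{(forget gate)} \\
i_t &= \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right) && \text{(input gate)} \\
\tilde{C}_t &= \tanh\left(W_C\,[h_{t-1}, x_t] + b_C\right) && \text{(candidate cell state)} \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t && \text{(new cell state)} \\
o_t &= \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) && \text{(output gate)} \\
h_t &= o_t \odot \tanh(C_t) && \text{(output)}
\end{aligned}
\]

% GRU cell: inputs x_t, h_{t-1}; output h_t (no separate cell state)
\[
\begin{aligned}
z_t &= \sigma\left(W_z\,[h_{t-1}, x_t] + b_z\right) && \text{(update gate)} \\
r_t &= \sigma\left(W_r\,[h_{t-1}, x_t] + b_r\right) && \text{(reset gate)} \\
\tilde{h}_t &= \tanh\left(W_h\,[r_t \odot h_{t-1}, x_t] + b_h\right) && \text{(candidate output)} \\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t && \text{(output)}
\end{aligned}
\]
```

The additive form of the cell-state update C_t is what lets gradients pass across many time steps without being repeatedly squashed, and the GRU reaches a similar effect with one gate fewer and no separate cell state.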


FIGURE 3. Proposed method steps.
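
Two of the steps in the proposed method, tweet selection (Section III-B1) and tweet cleaning (Section III-B2), can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions rather than the authors' code: the `(date, text, retweets)` tuple layout, whole-word keyword matching, substring matching for the blocked terms, and keeping the first 50 qualifying tweets of a day are all choices made here. Collecting the tweets themselves (the paper uses SNScrape) is outside the sketch.

```python
import re

# Thresholds and terms as stated in Section III-B1.
KEYWORDS = ("ethereum", "solana", "eth", "sol")
BLOCKED = ("giveaway", "airdrop")
MIN_RETWEETS = 20
MAX_TWEETS_PER_DAY = 50

URL_RE = re.compile(r"https?://\S+|www\.\S+")
RETWEET_RE = re.compile(r"\bRT\b")
WORD_RE = re.compile(r"[a-z0-9]+")


def keep_tweet(text, retweets):
    """Apply the retweet threshold, the blocked terms, and the keyword filter."""
    lowered = text.lower()
    if retweets < MIN_RETWEETS:
        return False
    if any(term in lowered for term in BLOCKED):
        return False
    # Assumption: keywords must appear as whole words ("sol", not "solution").
    words = set(WORD_RE.findall(lowered))
    return not words.isdisjoint(KEYWORDS)


def clean_tweet(text):
    """Remove URLs, the RT marker, '#' and '@' symbols, and newlines."""
    text = URL_RE.sub(" ", text)
    text = RETWEET_RE.sub(" ", text)
    text = text.replace("#", "").replace("@", "")
    return " ".join(text.split())  # also collapses newlines and extra spaces


def select_daily_tweets(tweets):
    """Group cleaned tweets by date, keeping at most 50 per day.

    `tweets` is an iterable of (date, text, retweets) tuples (assumed layout).
    """
    per_day = {}
    for date, text, retweets in tweets:
        if not keep_tweet(text, retweets):
            continue
        bucket = per_day.setdefault(date, [])
        if len(bucket) < MAX_TWEETS_PER_DAY:
            bucket.append(clean_tweet(text))
    return per_day
```

Filtering before cleaning keeps the keyword check on the raw text, so a tweet whose only mention of the coin is a hashtag such as "#ETH" still qualifies.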

current cell state, h_t is the output of the current cell, and X_t is the input to the current
cell. The incoming information is filtered by the input gate, forget gate, and output gate to
determine which information should be remembered and which should be forgotten. The operations
within an LSTM cell are based on sigmoid and element-wise multiplication operations, which allow
the network to learn when to add new information, forget old information, and output relevant
information. The LSTM architecture has been widely used in various applications, such as natural
language processing, speech recognition, and time series prediction. Its ability to handle
long-term dependencies and mitigate gradient-related issues makes it a powerful choice for
sequence modeling tasks.

C. GATED RECURRENT UNIT
The gated recurrent unit, or GRU, is another RNN architecture. Like LSTM, GRU addresses the
vanishing gradient problem that occurs in RNNs. The GRU has a similar architecture to the LSTM but
is simpler, because it has no cell state (C_t) and fewer gates. The GRU takes the output of the
previous cell, h_{t-1}, and the input to the current cell, X_t, and generates the output of the
current cell. It has a reset gate for short-term memory and an update gate for long-term memory.
The architecture of the GRU is depicted in Figure 2.

D. BERT
Bidirectional Encoder Representations from Transformers, commonly abbreviated as BERT, is a
machine learning model in the field of Natural Language Processing (NLP) that achieved
state-of-the-art results on 11 NLP tasks [32]. BERT is a modification of the Transformer
architecture designed to pre-train bidirectional representations from unlabeled text data. The
pre-trained BERT model can be adapted by adding an output layer to solve various problems with
specific needs and goals (fine-tuning).

There are two main stages in BERT, namely pre-training and fine-tuning. In the pre-training stage,
BERT is trained to understand the input language or text and the context of each sentence. The
pre-training tasks are Masked Language Modeling (MLM) and Next Sentence Prediction (NSP). Before
BERT is trained with MLM and NSP, all input texts must go through the input embeddings stage, the
details of which can be seen in Figure 3. In MLM, about 15% of the input tokens are selected and
most of them are replaced by the [MASK] token; BERT is trained to fill each masked position with
the correct word. In NSP, BERT is given pairs of sentences and trained to predict whether the
second sentence follows the first in the original text, which helps it model the relationship
between sentences. To support these tasks, the input text first passes through the Sentence
Embedding and Positional Embedding processes. As depicted in Figure 3, each sentence and each word
is given a specific code that identifies the sentence it belongs to and its position. Knowing the
position of each word allows BERT to interpret the word according to the context of the sentence.

After BERT has understood the meaning and context of each input text, it will proceed to the part
where the

fine-tuning stage is needed. Fine-tuning is the stage of modifying the output layer of the BERT
architecture and adapting it to specific tasks such as sentiment classification, word
classification, question answering, and sentence tagging, according to the dataset to be used.

E. FINBERT
FinBERT is a specialized language model based on the BERT (Bidirectional Encoder Representations
from Transformers) architecture that is specifically designed for financial sentiment analysis and
financial text classification tasks. It is trained on a large corpus of financial news articles,
earnings call transcripts, and other financial text data [28].

FinBERT leverages the pre-training capabilities of BERT, which is a transformer-based model
trained on a massive amount of general-domain text data. FinBERT then further trains the BERT
model on financial text data to make it more adept at understanding the nuances of financial
language and capturing financial sentiment.

The advantages of FinBERT lie in its ability to understand the unique vocabulary, jargon, and
context of financial text. By pre-training on financial data and then fine-tuning on specific
financial tasks, FinBERT can provide more accurate sentiment analysis and classification for
financial text [33].

III. PROPOSED METHOD
The proposed method is shown in Figure 3. The detailed steps are described in the following
subsections.

A. DATA COLLECTION FROM HISTORICAL MARKET

1) BINANCE'S DATA OF SOL AND ETH
Historical market data for Solana (SOL) and Ethereum (ETH) are obtained from
www.cryptodatadownload.com, which contains data on 28 cryptocurrency markets on Binance, one of the
largest cryptocurrency trading sites in the world. Tweets and market data for Solana and Ethereum
are obtained separately and retrieved daily for one year, from January 1, 2021, to December 31,
2021.

2) FEATURE SELECTION
Several features are available from www.cryptodatadownload.com. Feature selection is performed on
the full set, and four features are chosen: closing price, opening price, highest price, and
lowest price.

3) NORMALIZATION
The resulting dataset is normalized (feature scaling) using the MinMaxScaler class provided by the
Scikit-Learn library. Min-max scaling maps numeric data that previously had different scales into
values between 0 and 1, where 0 represents the lowest value of a feature and 1 represents the
highest value, so that all features share a uniform range.

B. DATA COLLECTION FROM SOCIAL MEDIA

1) TWEETS ABOUT SOLANA AND ETHEREUM
The data collected consists of the text of social network tweets about Solana and Ethereum, to be
processed and used as sentiment data. The other data collected is the historical cryptocurrency
market data, which contains information about price movements in the daily timeframe. Tweet data is
obtained directly from the social network using a Python library called SNScrape, which collects
social network data according to the search query entered by the user. In this case, up to 50
tweets with a minimum of 20 retweets are collected per day. Only tweets that contain the words
"ethereum", "solana", "eth", or "sol" are included, and tweets with the word "giveaway" or
"airdrop" are filtered out to minimize outlier data.

2) TWEETS PRE-PROCESSING
The tweet data must go through a pre-processing stage before sentiment extraction is carried out.
This stage is very important and can improve the accuracy of sentiment classification, because
tweet data contains elements that cannot be considered features and have no influence on
sentiment. The data must therefore be cleaned before moving on to the next process. Pre-processing
of the social network data includes removing signs or symbols such as "#" and "@", removing the
"RT" retweet marker, deleting newline characters, and removing all URLs or links.

C. FINBERT SENTIMENT CLASSIFICATION

1) INPUT EMBEDDING AND PRE-TRAINING
Tweet data that has been cleaned in the pre-processing stage is then processed by FinBERT, one of
the pre-trained models derived from BERT. The FinBERT model itself has gone through several
training stages. In the first stage, the model is pre-trained on a general corpus consisting of
text from BookCorpus and Wikipedia.

2) FINE TUNING
In the second stage, the model is adapted to the financial context by further training on
TRC2-financial, a financial subset of the Reuters TRC2 corpus of 1.8 million news articles. After
training on the financial corpus, the model is trained to perform the classification task. This
classification training was conducted using the Financial PhraseBank sentiment dataset, which
contains 4845 English sentences from various financial news sources [28].

3) AVERAGE DAILY SENTIMENT SCORE
FinBERT produces a sentiment score for each tweet that ranges from 0 to 1: a score close to 0 means
the sentiment is close to negative, and a score close to 1 means it is close to positive. Since
the dataset that will be used is data for the daily period, the


TABLE 1. Features of combined dataset.

TABLE 2. Training parameters.
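
The combined dataset of Table 1 (Sections III-C3 and III-D) and the min-max normalization of Section III-A3 can be sketched as follows. This is an illustrative plain-Python version, not the authors' pipeline: the paper uses Scikit-Learn for scaling, while the arithmetic is spelled out here; the column names `open`, `high`, `low`, `close`, `sentiment`, the dictionary layout keyed by date, and the decision to skip days without tweets are assumptions. Per-tweet FinBERT scores in the range 0 to 1 are taken as given.

```python
from collections import defaultdict

COLUMNS = ("open", "high", "low", "close", "sentiment")


def daily_mean_sentiment(scored_tweets):
    """Average per-tweet scores (0 = negative, 1 = positive) for each day.

    `scored_tweets` is an iterable of (date, score) pairs (assumed layout).
    """
    by_day = defaultdict(list)
    for date, score in scored_tweets:
        by_day[date].append(score)
    return {date: sum(scores) / len(scores) for date, scores in by_day.items()}


def combine(market_rows, sentiment_by_day):
    """Join daily OHLC rows with the daily sentiment score, keyed by date."""
    combined = {}
    for date, row in market_rows.items():
        if date in sentiment_by_day:  # assumption: skip days without tweets
            combined[date] = {**row, "sentiment": sentiment_by_day[date]}
    return combined


def min_max_scale(combined):
    """Rescale every feature column to the range [0, 1]."""
    lo = {c: min(row[c] for row in combined.values()) for c in COLUMNS}
    hi = {c: max(row[c] for row in combined.values()) for c in COLUMNS}
    scaled = {}
    for date, row in combined.items():
        scaled[date] = {
            # A constant column has no spread; map it to 0.0.
            c: (row[c] - lo[c]) / (hi[c] - lo[c]) if hi[c] > lo[c] else 0.0
            for c in COLUMNS
        }
    return scaled
```

In a real experiment the scaling bounds would normally be computed on the training split only and reused for the test split, so that no information from the test period leaks into training; the paper does not state how this was handled.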

sentiment values of the various tweets per day are averaged to obtain a daily sentiment score,
which is later combined with the selected market data.

D. COMBINED DATASET
The combined dataset is formed by joining the historical market data with the average daily tweet
sentiment. The historical market data selected as prediction features are the closing price,
opening price, lowest price, and highest price, all in the daily timeframe.

The closing price of a cryptocurrency is the last recorded price at the end of a specific time
period, typically a day. It is an important feature for price prediction because it encapsulates
all the trading activity and market sentiment that occurred during that time frame. Traders and
investors often use closing prices to identify trends, support and resistance levels, and overall
market sentiment. By analyzing the historical closing prices, predictive models can capture
patterns and trends that might repeat in the future, helping to forecast potential price movements.

The opening price of a cryptocurrency is the first recorded price at the beginning of a trading
period, usually a day. It is significant because it sets the initial tone for trading activity and
reflects the market sentiment at the start of the day. The opening price can be influenced by news,
events, and sentiment from the previous day or from non-trading hours. Including the opening price
as a feature allows prediction models to capture the immediate reaction of the market to new
information, which can be crucial for short-term price movements.

The lowest price of a cryptocurrency during a specific time period represents the point at which
the market saw the least demand or the most selling pressure. This feature is valuable for
prediction because it indicates the level at which buyers stepped in to support the price,
potentially acting as a support level in the future. Analyzing the historical lowest prices can
help models identify key support levels that traders often pay attention to. Predictive models can
use this information to estimate potential price rebounds or reversals when the price approaches
historical intraday lows.

The highest price of a cryptocurrency during a certain time frame signifies the peak of buying
interest or positive market sentiment during that period. This feature is crucial for prediction
as it shows the level of demand and enthusiasm among traders. The historical highest prices can
help models identify potential resistance levels, where the price might encounter selling pressure
in the future. These resistance levels can guide predictions about potential price pullbacks or
consolidation phases.

The daily sentiment data is combined with the daily closing price, daily opening price, daily
highest price, and daily lowest price of the given cryptocurrency. The combined dataset therefore
consists of 5 feature variables, with the date as the index variable, as shown in Table 1.

E. PRICE PREDICTION WITH LSTM-GRU
The dataset with its 5 variables is supplied in vector form to the prediction algorithm as time
series data. The algorithm used to make predictions is derived from the Recurrent Neural Network
(RNN): a hybrid model of LSTM and GRU. Both algorithms are known for their ability to remember
important information from previous steps of the data, and RNN-derived algorithms have been shown
to perform well in time series forecasting.

As can be seen in Figure 4, the data is processed by an LSTM layer and a GRU layer separately, each
with 30 neurons. The output of each is fed to a dropout layer to prevent overfitting. The output
of the dropout layer on the GRU branch is followed by a dense layer. The output of the dropout
layer on the LSTM branch is fed to a second LSTM layer with 50 neurons, followed by a dense layer.
The outputs of the dense layers of the LSTM and GRU branches are then concatenated and passed
through a final dense layer with ReLU as the activation function, which yields the prediction of
the hybrid LSTM-GRU algorithm. The hybrid LSTM-GRU and all other models involved in this study use
the parameters shown in Table 2. The window period is 7 (seven), corresponding to one week. Batch
size refers to the number of training examples used in a single iteration of the training process.
In practice, different batch sizes such as 8, 16, 32, and 64 are tried. In this study, a batch
size of 16 is used.

F. EVALUATION
Prediction results using LSTM-GRU on historical data and sentiment values were compared with
LSTM-GRU prediction results on historical data without using sentiment, LSTM


FIGURE 4. Hybrid LSTM-GRU prediction model architecture.
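
The architecture of Figure 4, as described in Section III-E, might be expressed with the Keras functional API roughly as follows. This is a hedged sketch, not the authors' implementation: the layer sizes of 30 and 50 units, the two dropout layers, the concatenation, and the final ReLU dense layer come from the text, while the dropout rate, the width of the intermediate dense layers, the optimizer, the loss, and the feature order (close at index 3) are placeholders, because the contents of Table 2 are not available here. The helper `make_windows` assumes that a window of 7 daily rows is used to predict the next day's closing price.

```python
def make_windows(rows, window=7, target_index=3):
    """Cut daily feature rows into (window of rows, next-day target) pairs.

    Assumes each row is [open, high, low, close, sentiment], so the closing
    price sits at index 3 (hypothetical column order).
    """
    samples, targets = [], []
    for start in range(len(rows) - window):
        samples.append(rows[start:start + window])
        targets.append(rows[start + window][target_index])
    return samples, targets


def build_hybrid_lstm_gru(window=7, n_features=5, dropout=0.2, dense_units=16):
    """Build the two-branch LSTM-GRU model sketched in Figure 4.

    dropout, dense_units, optimizer, and loss are assumed values.
    """
    # Imported lazily so that make_windows works without TensorFlow installed.
    from tensorflow import keras
    from tensorflow.keras import layers

    inputs = keras.Input(shape=(window, n_features))

    # LSTM branch: LSTM(30) -> dropout -> LSTM(50) -> dense.
    lstm = layers.LSTM(30, return_sequences=True)(inputs)
    lstm = layers.Dropout(dropout)(lstm)
    lstm = layers.LSTM(50)(lstm)
    lstm = layers.Dense(dense_units)(lstm)

    # GRU branch: GRU(30) -> dropout -> dense.
    gru = layers.GRU(30)(inputs)
    gru = layers.Dropout(dropout)(gru)
    gru = layers.Dense(dense_units)(gru)

    # Merge both branches and predict one value with a ReLU dense layer.
    merged = layers.Concatenate()([lstm, gru])
    outputs = layers.Dense(1, activation="relu")(merged)

    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mse")
    return model
```

The first LSTM layer returns the full sequence so that the second LSTM layer still receives one vector per day; the GRU branch only needs its final state because it feeds a dense layer directly. With the batch size stated in the text, training would then be a call such as `model.fit(x, y, batch_size=16, ...)` on arrays built from the windows.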

TABLE 3. Experiment results (ethereum without sentiment).

with and without sentiment, and also GRU with and without sentiment. The non-hybrid LSTM and GRU
models each have 50 neurons and follow the same training parameters as described in Table 2. The
comparison uses several error metrics, namely Mean Absolute Error (MAE), Mean Absolute Percentage
Error (MAPE), and Root Mean Squared Error (RMSE), to measure the prediction error relative to the
actual data. For all three metrics, the smaller the error, the more accurate the prediction. These
three error measures were chosen because they are often used to evaluate time series
forecasting [34]. Previous research comparing algorithms for cryptocurrency price prediction also
used these evaluation methods [21].

IV. EXPERIMENT RESULT
The experiment in this study has two parts. The first part is a comparison between several machine
learning models without the sentiment score, and the second part is a comparison between the same
models with the sentiment score added to the historical dataset. The machine learning models used
in the experiment are LSTM, GRU, and the hybrid LSTM-GRU. Ethereum and Solana are tested
separately, but with the same treatment.

Table 3 and Table 4 show the experiment results for Ethereum and Solana without sentiment. RNN
algorithms such as LSTM and GRU usually produce somewhat different results from run to run because
of their random elements. To deal with this, each model is trained and tested five times, and the
average value is used for comparison. The results show that GRU is better than LSTM, and the hybrid
LSTM-GRU is better than GRU. MAE, MAPE, and RMSE are the evaluation metrics used to determine
which model has the lowest error rate.
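
The three error metrics named in Section III-F, and the averaging over five runs described above, follow their usual definitions and can be written in a few lines. The sketch below uses plain Python for clarity; the function names and the example numbers in the usage note are illustrative and are not results from the paper.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent.

    Assumes no actual value is zero, which holds for prices.
    """
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def rmse(actual, predicted):
    """Root Mean Squared Error."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def mean_over_runs(run_scores):
    """Average one metric over repeated training runs (five in the paper)."""
    return sum(run_scores) / len(run_scores)
```

For example, with made-up actual prices `[100, 200, 400]` and predictions `[110, 190, 400]`, the absolute errors are 10, 10, and 0, so the MAE is 20/3 and the MAPE is (10% + 5% + 0%) / 3 = 5%. MAE and RMSE are in price units and so are not comparable between ETH and SOL, whereas MAPE is scale-free.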


TABLE 4. Experiment results (solana without sentiment).

TABLE 5. Experiment results (ethereum with sentiment).

Table 5 and Table 6 show the experiment results for Ethereum and Solana with sentiment. As in the
runs without sentiment, the hybrid LSTM-GRU again performs better than GRU and LSTM. To obtain
more reliable results and to make comparison easier, the results of the five training and testing
runs for the Ethereum and Solana prices have been averaged and


TABLE 6. Experiment results (solana with sentiment).

TABLE 7. Experiment results (ethereum average error value).

TABLE 8. Experiment results (solana average error value).

summarized in Table 7 and Table 8. Table 7 and Table 8 show that the MAE, MAPE, and RMSE of the
hybrid LSTM-GRU model with the sentiment dataset are the smallest in all test cases. This shows
that the proposed method is better than the

hybrid LSTM-GRU which does not use the daily sentiment score for the prediction. This holds not only for the hybrid LSTM-GRU: all models that included the sentiment dataset achieved a 0.5% to 1% MAPE improvement over models trained without the sentiment dataset.
It is also shown that although hybrid LSTM-GRU has the best performance, GRU performs better than LSTM in experiments both with and without the sentiment dataset. Another finding is that although including the sentiment dataset improves the performance of LSTM, GRU, and hybrid LSTM-GRU, the performance is also highly dependent on the machine learning model used. This is shown in Table 7 and Table 8, where the performance of hybrid LSTM-GRU without the sentiment dataset is better than that of LSTM or GRU with sentiment. A line plot of the Ethereum price prediction, made using the Pyplot and Seaborn Python libraries, can be seen in Figure 5. These plots visualize the prediction results made by the LSTM-GRU hybrid algorithm using sentiment from the social network. The blue and orange lines represent the results of training and testing, with a ratio of 80% of the data for training and 20% for testing and prediction. The predicted results are represented by a purple line. This study does not report computation cost because computation time was not measured. This research combines LSTM and GRU in parallel, which will have a higher computation cost than LSTM or GRU individually.

FIGURE 5. Visualized Ethereum price prediction result.

V. CONCLUSION
This study proposed a new method to predict prices of two widely known cryptocurrencies, namely Ethereum and Solana. Our time-series prediction method utilizes the sentiment value contained in every tweet that discusses Ethereum or Solana. Sentiment values extracted using FinBERT are grouped per day into daily sentiment values, which are then combined with daily historical data. The combined dataset is then fed into the hybrid LSTM-GRU, one of the best models for time-series forecasting according to previous research.
This research shows that the proposed method of adding a sentiment score from the social network, extracted using FinBERT, can improve prediction performance compared with using only commonly used time series prediction models such as LSTM, GRU, and even hybrids of the two. Social media is a platform where traders and investors express their opinions, which also affect changes in asset prices. Most cryptocurrency price predictions do not consider the sentiments of social media. The use of this sentiment data is expected to help researchers improve the performance of the model and help traders and investors maximize their profits.
This study has several limitations that can be addressed in future research. The sentiment is sourced only from the social network, and it is difficult to choose relevant tweets, because some tweets can be considered outliers that do not express sentiment well. For higher performance improvements in future research, sentiment for cryptocurrencies may be taken from several other sources such as Google Trends, forums, and the crypto community. It is also possible that sentiment filtering can be more focused and that other sentiment extraction models can be used.

ACKNOWLEDGMENT
This work was supported by Bina Nusantara University, Jakarta, Indonesia.


REFERENCES
[1] H. Mittal and S. Goel, "Economic, legal and financial perspectives on cryptocurrencies: A review on cryptocurrency growth, opportunities and future prospects," World Rev. Entrepreneurship, Manage. Sustain. Develop., vol. 16, no. 6, p. 611, 2020.
[2] S. Nakamoto, "Bitcoin: A peer-to-peer electronic cash system," Decentralized Bus. Rev., 2008.
[3] C. S. Wright, Satoshi’s Vision: The Art of Bitcoin. Kydala Publishing, Inc., 2019.
[4] N. A. Kyriazis, "A survey on efficiency and profitable trading opportunities in cryptocurrency markets," J. Risk Financial Manage., vol. 12, no. 2, p. 67, Apr. 2019.
[5] A. D. Yuliyono and A. S. Girsang, "Artificial bee colony-optimized LSTM for Bitcoin price prediction," Adv. Sci., Technol. Eng. Syst. J., vol. 4, no. 5, pp. 375–383, 2019.
[6] L. Alessandretti, A. ElBahrawy, L. M. Aiello, and A. Baronchelli, "Anticipating cryptocurrency prices using machine learning," Complexity, vol. 2018, pp. 1–16, Nov. 2018.
[7] N. A. Hitam and A. R. Ismail, "Comparative performance of machine learning algorithms for cryptocurrency forecasting," Indonesian J. Electr. Eng. Comput. Sci., vol. 11, no. 3, p. 1121, Sep. 2018.
[8] E. Akyildirim, A. Goncu, and A. Sensoy, "Prediction of cryptocurrency returns using machine learning," Ann. Oper. Res., vol. 297, nos. 1–2, pp. 3–36, Feb. 2021.
[9] T. Phaladisailoed and T. Numnonda, "Machine learning models comparison for Bitcoin price prediction," in Proc. 10th Int. Conf. Inf. Technol. Electr. Eng. (ICITEE), Jul. 2018, pp. 506–511.
[10] M. Rizwan, S. Narejo, and M. Javed, "Bitcoin price prediction using deep learning algorithm," in Proc. 13th Int. Conf. Math., Actuarial Sci., Comput. Sci. Statist. (MACS), Dec. 2019, pp. 1–7.
[11] M. S. Islam and E. Hossain, "Foreign exchange currency rate prediction using a GRU-LSTM hybrid network," Soft Comput. Lett., vol. 3, Dec. 2021, Art. no. 100009.
[12] Y. Sovbetov, "Factors influencing cryptocurrency prices: Evidence from Bitcoin, Ethereum, Dash, Litcoin, and Monero," J. Econ. Financial Anal., vol. 2, no. 2, pp. 1–27, 2018.
[13] T. Aste, "Cryptocurrency market structure: Connecting emotions and economics," Digit. Finance, vol. 1, nos. 1–4, pp. 5–21, Nov. 2019.
[14] N. Smuts, "What drives cryptocurrency prices? An investigation of Google Trends and telegram sentiment," ACM SIGMETRICS Perform. Eval. Rev., vol. 46, no. 3, pp. 131–134, Jan. 2019.
[15] V. Karalevicius, N. Degrande, and J. De Weerdt, "Using sentiment analysis to predict interday Bitcoin price movements," J. Risk Finance, vol. 19, no. 1, pp. 56–75, Jan. 2018.
[16] S. J. H. Shahzad, M. Anas, and E. Bouri, "Price explosiveness in cryptocurrencies and Elon Musk’s tweets," Finance Res. Lett., vol. 47, Jun. 2022, Art. no. 102695.
[17] Y. Chen and C. Bellavitis, "Blockchain disruption and decentralized finance: The rise of decentralized business models," J. Bus. Venturing Insights, vol. 13, Jun. 2020, Art. no. e00151.
[18] B. K. Mohanta, S. S. Panda, and D. Jena, "An overview of smart contract and use cases in blockchain technology," in Proc. 9th Int. Conf. Comput., Commun. Netw. Technol. (ICCCNT), Jul. 2018, pp. 1–4.
[19] T. Hu, X. Liu, T. Chen, X. Zhang, X. Huang, W. Niu, J. Lu, K. Zhou, and Y. Liu, "Transaction-based classification and detection approach for Ethereum smart contract," Inf. Process. Manage., vol. 58, no. 2, Mar. 2021, Art. no. 102462.
[20] M. F. Chania, O. Sara, and I. Sadalia, "Analisis risk dan return investasi pada Ethereum dan saham LQ45," Studi Ilmu Manajemen dan Organisasi, vol. 2, no. 2, pp. 139–150, Oct. 2021.
[21] M. M. Patel, S. Tanwar, R. Gupta, and N. Kumar, "A deep learning-based cryptocurrency price prediction scheme for financial institutions," J. Inf. Secur. Appl., vol. 55, Dec. 2020, Art. no. 102583.
[22] Y. Tanulia and A. S. Girsang, "Sentiment analysis on Twitter for predicting stock exchange movement," Adv. Sci., Technol. Eng. Syst. J., vol. 4, no. 3, pp. 244–250, 2019.
[23] N. Das, B. Sadhukhan, T. Chatterjee, and S. Chakrabarti, "Effect of public sentiment on stock market movement prediction during the COVID-19 outbreak," Social Netw. Anal. Mining, vol. 12, no. 1, p. 92, Dec. 2022.
[24] M. Iqbal, M. Iqbal, F. Jaskani, K. Iqbal, and A. Hassan, "Time-series prediction of cryptocurrency market using machine learning techniques," EAI Endorsed Trans. Creative Technol., vol. 8, no. 28, Aug. 2021, Art. no. 170286.
[25] K. Wołk, "Advanced social media sentiment analysis for short-term cryptocurrency price prediction," Expert Syst., vol. 37, no. 2, Apr. 2020, Art. no. e12493.
[26] J. Hartmann, J. Huppertz, C. Schamp, and M. Heitmann, "Comparing automated text classification methods," Int. J. Res. Marketing, vol. 36, no. 1, pp. 20–38, Mar. 2019.
[27] M. G. Sousa, K. Sakiyama, L. de Souza Rodrigues, P. H. Moraes, E. R. Fernandes, and E. T. Matsubara, "BERT for stock market sentiment analysis," in Proc. IEEE 31st Int. Conf. Tools Artif. Intell. (ICTAI), Nov. 2019, pp. 1597–1601.
[28] D. Araci, "FinBERT: Financial sentiment analysis with pre-trained language models," 2019, arXiv:1908.10063.
[29] F. Valencia, A. Gómez-Espinosa, and B. Valdés-Aguirre, "Price movement prediction of cryptocurrencies using sentiment analysis and machine learning," Entropy, vol. 21, no. 6, p. 589, Jun. 2019.
[30] F. A. Gers, J. Schmidhuber, and F. Cummins, "Learning to forget: Continual prediction with LSTM," Neural Comput., vol. 12, no. 10, pp. 2451–2471, Oct. 2000.
[31] A. Graves, "Long short-term memory," in Supervised Sequence Labelling With Recurrent Neural Networks, Studies in Computational Intelligence. Springer, 2012, pp. 37–45.
[32] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, "BERT: Pre-training of deep bidirectional transformers for language understanding," 2018, arXiv:1810.04805.
[33] Y. Yang, M. C. S. Uy, and A. Huang, "FinBERT: A pretrained language model for financial communications," 2020, arXiv:2006.08097.
[34] R. J. Hyndman and G. Athanasopoulos, Forecasting: Principles and Practice, 2nd ed. Melbourne, VIC, Australia: OTexts.com, 2018.

ABBA SUGANDA GIRSANG received the bachelor’s degree from the Department of Electrical Engineering, Gadjah Mada University (UGM), Yogyakarta, Indonesia, in 2000, the master’s degree from the Department of Computer Science, UGM, in 2008, and the Ph.D. degree from the Department of Electrical Engineering, Institute of Computer and Communication Engineering, National Cheng Kung University, Tainan, Taiwan, in 2014. He was a Staff Consultant Programmer with the Bethesda Hospital, Yogyakarta, in 2001, and a Web Developer, from 2002 to 2003. Then, he joined the Department of Informatics Engineering, Universitas Janabadra, as a Lecturer, from 2003 to 2015. He also taught some subjects at some universities, from 2006 to 2008. His research interests include swarm intelligence, business intelligence, machine learning, and social media text mining.

STANLEY received the bachelor’s and master’s degrees in computer science from Bina Nusantara University (BINUS), in 2022. His research interests include data mining and text mining.
