Unveiling Cryptocurrency Conversations Insights From Data Mining and Unsupervised Learning Across Multiple Platforms
ABSTRACT The rapid growth of the cryptocurrency market has led to an increasing interest in the subject.
Cryptocurrency is now recognized as an asset, and laws and financial regulations have begun to emerge
for supporting its practical use. As a result, it has become essential to perform data mining and attain
knowledge from text data related to cryptocurrency. Previous studies have focused on analyzing data from
a single source such as Twitter. However, there are unique insights to be gained from data across multiple
platforms. In the present study, we utilized data mining techniques to extract insights from LexisNexis,
Web of Science, and Reddit, representing the media, academia, and general public, respectively. Among
unsupervised learning technologies, topic modeling was employed for the analysis. Topic modeling is a
methodology that uncovers hidden meanings within the collected data. Among the diverse topic modeling
techniques available, bidirectional encoder representations from transformers topic (BERTopic) was chosen for the
analysis. BERTopic is considered to be state-of-the-art in the field of topic modeling. Dynamic topic modeling
was employed to track changes in themes over time. Our experimental results reveal a tendency in the
news to cover major events related to cryptocurrencies, such as regulatory developments and market trends.
Academic papers, on the other hand, tend to focus on the technology behind cryptocurrencies and related
research. Finally, social media conversations center more around information delivery from an investor’s
psychological perspective, such as market sentiment and investment strategies.
INDEX TERMS Bitcoin, cryptocurrency, data mining, machine learning, natural language processing,
unsupervised learning, topic modeling, BERTopic.
level of interest on Bitcoin can serve as an indicator of the broader interest in cryptocurrencies [3], [4], [5]. This approach enabled the authors to extract valuable insights from a wide spectrum of sources, encompassing mass media, academia, and social media, providing a more comprehensive perspective on the subject.

Topic modeling is a methodology that automatically clusters words based on statistics and machine learning [6]. The method can help attain insights by extracting the main theme from a given data source [7]. Although numerous topic modeling methods, such as latent Dirichlet allocation (LDA) and non-negative matrix factorization (NMF), are available, the state-of-the-art (SOTA) model known as bidirectional encoder representations from transformers topic (BERTopic) was applied in this study [8]. BERTopic is a topic modeling technique that employs Bidirectional Encoder Representations from Transformers (BERT) embeddings and class-based TF-IDF (c-TF-IDF) to formulate compact clusters that are easy to interpret while preserving important words in topic explanations [9]. Additionally, BERTopic is recognized for its modularity, enabling the incorporation and utilization of diverse algorithms within its framework. Considering these advantages, the authors selected modified BERTopic as the most suitable algorithm for investigating perceptions related to Bitcoin.

At the beginning of the analysis on each specific platform, the authors assessed the coherence values of baseline models (LDA, NMF) and the modified BERTopic. This process confirmed the capability of BERTopic to accurately depict topics in the chosen domain. In summary, modified BERTopic demonstrated the highest coherence values in all three analyses, and the subsequent findings are therefore derived from its application.

The resulting topic representations revealed that news sources primarily cover major events, academic journals focus on technological advancements, and social media platforms discuss the sentiments of cryptocurrency investors, affirming the distinct characteristics within each domain. In summary, our research not only addresses the existing research problem but also endeavors to bridge the research gap by offering a comprehensive analysis of Bitcoin-related text data from diverse sources. The insights gained from this study have the potential to advance understanding of the challenges and issues faced by blockchain technology, setting the stage for future developments in this field. In addition, the utilization of modified BERTopic across various platforms constitutes a novel approach that remains notably unexplored within the existing research.

II. RELATED WORKS
The following subsections present brief reviews of literature that aims to provide insight by applying natural language processing (NLP) to Bitcoin. Subsequently, the concept of topic modeling is explained, and studies that sought to obtain knowledge from various domains using topic modeling are exemplified. Lastly, the default structure and advantages of using BERTopic are illustrated.

A. RESEARCH ON BITCOIN USING NATURAL LANGUAGE PROCESSING
NLP is utilized to extract valuable information from texts across various fields, with the objective of deriving insights and practical applications. The current section reviews prior studies that utilized NLP to gain insights into Bitcoin.

In [10], the relation between social media topic discussion and cryptocurrency market price fluctuations was analyzed via statistical and NLP models (i.e., DMR). In [11], NLP algorithms were used to measure the relationship between investment sentiment and Bitcoin price fluctuations using data from the subreddits “r/bitcoin” and “r/investing.” In [12], a prediction was made on the direction and magnitude of Bitcoin price fluctuations using sentiment analysis and post volume extracted from Twitter data. A relative accuracy of 63% was achieved using a model based on recurrent neural networks (RNN) and convolutional neural networks (CNN). Sattarov et al. [13] confirmed that sentiment analysis of tweets related to Bitcoin can be used to predict Bitcoin price changes. An accuracy of 62.48% was attained by a random forest regression model when applying sentiment analysis to Twitter data. Jung et al. [14] aimed to predict Bitcoin price trends by analyzing both the volume and sentiment of Reddit data, as well as technical indicators of chart analysis. The authors achieved an accuracy of 90.57% and an area under the curve (AUC) value of 97.48% using an extreme gradient boosting (XGBoost) model.

B. TOPIC MODELING
Topic modeling is a statistical approach used in the field of NLP to discover abstract main themes, referred to as topics, within sets of documents [15]. In other words, it is a text mining technique that is utilized to uncover hidden semantic structures within textual data. Examples of representative topic modeling technologies are DMR and LDA.

Yin and Yuan [16] employed LDA topic modeling to analyze research subjects and progress trends related to blended learning using keyword analysis, with results showing that the ratio of element topics in blended learning has been increasing every year. Moreover, the text analysis provided theoretical and methodological reference materials to facilitate future research. Polyzos and Wang [17] conducted an LDA analysis on Twitter data to quantify energy market efficiency. The extracted topics were then applied to a classification model to measure prediction accuracy for market movements. Sharma et al. [18] collected research papers related to blockchain technology from various databases and attempted to create a semantic map using the LDA model. Through metadata analysis, an abstract perspective of blockchain was attained. Avasthi et al. [19] conducted a comparison of various topic models, including LDA, correlated topic
model (CTM), hierarchical Dirichlet process (HDP), and DMR, using adolescent drug use and depression as keywords. Egger and Yu [20] applied LDA, NMF, and BERTopic to Twitter posts and conducted a comparative analysis for each topic modeling algorithm.

C. DEFAULT BIDIRECTIONAL ENCODER REPRESENTATIONS FROM TRANSFORMERS TOPIC (BERTOPIC) STRUCTURE
BERTopic is a SOTA framework for topic modeling that consists of five sub-models, each of which can be selected and used independently [9]. In the current section, the default structure of BERTopic is presented.

1) DOCUMENT EMBEDDING
The default BERTopic approach utilizes sentence-transformers (SBERT) to convert text data into numerical representations that capture semantic similarity, which significantly enhances clustering tasks in comparison to LDA [21].

2) DIMENSIONALITY REDUCTION
Because clustering models struggle with high-dimensional data due to the curse of dimensionality, it is essential to perform dimensionality reduction after obtaining the document representations [22]. By default, BERTopic utilizes uniform manifold approximation and projection (UMAP) for this task. UMAP is a dimensionality reduction procedure that preserves both local and global structures in data, thereby enabling the clustering of semantically similar documents [23].

3) CLUSTERING DOCUMENTS AND BAG OF WORDS
Default BERTopic uses hierarchical density-based spatial clustering of applications with noise (HDBSCAN), a density-based clustering technique, to cluster text data [24]. This approach allows for outlier detection and the identification of different cluster shapes, preventing text data from being forcibly included in the wrong cluster.

Because HDBSCAN generates clusters with varying densities and shapes, centroid-based topic representation techniques may not be suitable. Instead, BERTopic combines all documents within each cluster into a single document to create a cluster-level bag-of-words (BoW) that records the frequency of each word in each cluster. This is significant because topic modeling primarily examines words at the topic (i.e., cluster) level.

4) ATTAINING TOPIC REPRESENTATION
Finally, L1 normalization is applied to the BoW representation to account for clusters of varying sizes. Term Frequency-Inverse Document Frequency (TF-IDF) for BoW must consider topics or clusters, rather than individual documents. By extracting the most significant words from each cluster, it is possible to obtain an explanation of the topic. This model is known as class-based TF-IDF (c-TF-IDF).

$$W_{x,c} = tf_{x,c} \cdot \log\left(1 + \frac{A}{f_x}\right) \tag{1}$$

In the aforementioned equation, $tf_{x,c}$ is the frequency of word $x$ in class $c$, $f_x$ is the frequency of word $x$ across all classes, and $A$ is the average number of words per class. Similar to the traditional TF-IDF approach, the importance score of a word in each class is obtained by multiplying the term frequency $tf_{x,c}$ and the inverse document frequency $\log(1 + A/f_x)$.

In contrast to traditional topic modeling methodologies, BERTopic stands out by leveraging pretrained language models for document and word representations, making it adept at capturing complex relationships between words and context. Furthermore, its non-linear dimensionality reduction approach and modularity enable BERTopic to enhance the quality of topic representation compared to conventional methods.

This study is meaningful in that it is the first to combine the keyword ‘Bitcoin’ with modified BERTopic, the SOTA in topic modeling, to obtain knowledge from three distinct sources: LexisNexis, Web of Science, and Reddit.

III. METHOD
The following subsections outline the experimental procedures followed in this study. Firstly, the sources and descriptions of the collected data are presented. Secondly, the preprocessing steps undertaken for the text data are described. Finally, the application of BERTopic for deriving topic modeling results is explained.

A. DATA COLLECTION
It is possible to gauge the general sentiment toward cryptocurrency by analyzing data on Bitcoin, which is representative of blockchain technology and cryptocurrency [3], [4], [5]. Accordingly, all data examined in this study were collected using search queries for “Bitcoin” and “BTC.” To interpret the sentiment toward cryptocurrency from media, academic, and public perspectives, data were collected from LexisNexis, Web of Science, and Reddit, respectively. Data obtained from LexisNexis comprise the full body of news articles, whereas those collected from Web of Science encompass the abstracts of academic papers, and those collected from Reddit encompass both posts and comments from the r/Bitcoin and r/BTC subreddits. The data encompass a period of six years from March 1, 2017, to March 1, 2023. In total, 17,230 news articles, 9,520 academic papers, and 10,914,149 social media texts were collected.

B. DATA PREPROCESSING
First, any instances of data wherein the acronym BTC was used in unrelated contexts (e.g., Cu-BTC, biliary tract cancer) were eliminated. All texts that use English spelling to express other languages, as well as duplicates and missing values, were also eliminated. A total of 6,011 academic papers,
1) DOCUMENT EMBEDDING
Although SBERT is the default embedding model, the authors utilized spaCy as an alternative embedding algorithm [25]. One of the motivations for utilizing spaCy lies in its ability to deliver rapid processing speed while ensuring higher quality in capturing the textual semantics and maintaining pertinent information during the embedding generation process.
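
Whichever embedding model feeds the clustering step, the topic words themselves come from the class-based TF-IDF weighting of equation (1), W = tf · log(1 + A / f). The following Python sketch illustrates that weighting on a toy example. It is not the authors' code and does not call the BERTopic library; the function names `c_tf_idf` and `top_words`, the cluster labels, and the toy documents are all hypothetical, and the clusters are assumed to be given already (in the paper they would come from HDBSCAN).

```python
import math
from collections import Counter


def c_tf_idf(clusters):
    """Return {cluster: {word: weight}} with weight = tf * log(1 + A / f).

    `clusters` maps a cluster label to a list of tokenized documents.
    (Hypothetical helper for illustration; not part of any library.)
    """
    # Cluster-level bag of words: all documents of a cluster are merged.
    bows = {
        label: Counter(word for doc in docs for word in doc)
        for label, docs in clusters.items()
    }
    # f: frequency of each word across all clusters.
    total = Counter()
    for bow in bows.values():
        total.update(bow)
    # A: average number of words per cluster.
    avg_words = sum(total.values()) / len(bows)

    weights = {}
    for label, bow in bows.items():
        size = sum(bow.values())  # L1 normalization of the raw counts
        weights[label] = {
            word: (count / size) * math.log(1 + avg_words / total[word])
            for word, count in bow.items()
        }
    return weights


def top_words(weights, k=2):
    """Pick the k highest-weighted words of every cluster."""
    return {
        label: sorted(ws, key=ws.get, reverse=True)[:k]
        for label, ws in weights.items()
    }


# Toy, made-up clusters of tokenized documents (assumed already clustered).
toy_clusters = {
    "mining": [["bitcoin", "mining", "energy"], ["mining", "energy", "power"]],
    "price": [["bitcoin", "price", "market"], ["price", "market", "crash"]],
}

weights = c_tf_idf(toy_clusters)
print(top_words(weights))
```

In this toy setting the word "bitcoin" appears in both clusters, so its inverse-frequency factor is smaller and it receives a lower weight than words that are specific to one cluster, which is the behavior the weighting is meant to produce.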
[9] M. Grootendorst, “BERTopic: Neural topic modeling with a class-based TF-IDF procedure,” 2022, arXiv:2203.05794.
[10] M. Ortu, S. Vacca, G. Destefanis, and C. Conversano, “Cryptocurrency ecosystems and social media environments: An empirical analysis through Hawkes’ models and natural language processing,” Mach. Learn. Appl., vol. 7, Mar. 2022, Art. no. 100229, doi: 10.1016/j.mlwa.2021.100229.
[11] B. Mcmillan, J. Myers, A. Nguyen, D. Robinson, and M. Kennard, “Analysis and comparison of natural language processing algorithms as applied to Bitcoin conversations on social media,” J. Investing, vol. 31, no. 2, pp. 38–59, Jan. 2022, doi: 10.3905/joi.2021.1.213.
[12] J. V. Critien, A. Gatt, and J. Ellul, “Bitcoin price change and trend prediction through Twitter sentiment and data volume,” Financial Innov., vol. 8, no. 1, p. 45, May 2022, doi: 10.1186/s40854-022-00352-7.
[13] O. Sattarov, H. S. Jeon, R. Oh, and J. D. Lee, “Forecasting Bitcoin price fluctuation by Twitter sentiment analysis,” in Proc. Int. Conf. Inf. Sci. Commun. Technol. (ICISCT), Karachi, Pakistan, Nov. 2020, pp. 1–4.
[14] H. S. Jung, S. H. Lee, H. Lee, and J. H. Kim, “Predicting Bitcoin trends through machine learning using sentiment analysis with technical indicators,” Comput. Syst. Sci. Eng., vol. 46, no. 2, pp. 2231–2246, 2023, doi: 10.32604/csse.2023.034466.
[15] D. M. Blei, “Probabilistic topic models,” Commun. ACM, vol. 55, no. 4, pp. 77–84, Apr. 2012, doi: 10.1145/2133806.2133826.
[16] B. Yin and C.-H. Yuan, “Detecting latent topics and trends in blended learning using LDA topic modeling,” Educ. Inf. Technol., vol. 27, no. 9, pp. 12689–12712, Nov. 2022, doi: 10.1007/s10639-022-11118-0.
[17] E. Polyzos and F. Wang, “Twitter and market efficiency in energy markets: Evidence using LDA clustered topic extraction,” Energy Econ., vol. 114, Oct. 2022, Art. no. 106264, doi: 10.1016/j.eneco.2022.106264.
[18] C. Sharma, S. Sharma, and Sakshi, “Latent Dirichlet allocation (LDA) based information modelling on BLOCKCHAIN technology: A review of trends and research patterns used in integration,” Multimedia Tools Appl., vol. 81, no. 25, pp. 36805–36831, Oct. 2022, doi: 10.1007/s11042-022-13500-z.
[19] S. Avasthi, R. Chauhan, and D. P. Acharjya, “Topic modeling techniques for text mining over a large-scale scientific and biomedical text corpus,” Int. J. Ambient Comput. Intell., vol. 13, no. 1, pp. 1–18, Jan. 2022, doi: 10.4018/ijaci.293137.
[20] R. Egger and J. Yu, “A topic modeling comparison between LDA, NMF, Top2Vec, and BERTopic to demystify Twitter posts,” Frontiers Sociol., vol. 7, May 2022, Art. no. 886498, doi: 10.3389/fsoc.2022.886498.
[21] N. Reimers and I. Gurevych, “Sentence-BERT: Sentence embeddings using Siamese bert-networks,” 2019, arXiv:1908.10084.
[22] I. Assent, “Clustering high dimensional data,” Wiley Interdiscipl. Rev., Data Mining Knowl. Discovery, vol. 2, no. 4, pp. 340–350, Jun. 2012, doi: 10.1002/widm.1062.
[23] L. McInnes, J. Healy, and J. Melville, “UMAP: Uniform manifold approximation and projection for dimension reduction,” 2018, arXiv:1802.03426.
[24] L. McInnes, J. Healy, and S. Astels, “HDBSCAN: Hierarchical density based clustering,” J. Open Source Softw., vol. 2, no. 11, p. 205, Mar. 2017, doi: 10.21105/joss.00205.
[25] M. Honnibal and I. Montani, “SpaCy 2: Natural language understanding with Bloom embeddings, convolutional neural networks and incremental parsing,” To Appear, vol. 7, no. 1, pp. 411–420, 2017.
[26] M. Röder, A. Both, and A. Hinneburg, “Exploring the space of topic coherence measures,” in Proc. 8th ACM Int. Conf. Web Search Data Mining, New York, NY, USA, Feb. 2015, pp. 399–408.
[27] C. Meaney, M. Escobar, T. A. Stukel, P. C. Austin, and L. Jaakkimainen, “Comparison of methods for estimating temporal topic models from primary care clinical text data: Retrospective closed cohort study,” JMIR Med. Informat., vol. 10, no. 12, Dec. 2022, Art. no. e40102.
[28] R. Grinberg, “Bitcoin: An innovative alternative digital currency,” Hastings Sci. Technol. Law J., vol. 4, p. 160, Dec. 2011.
[29] G. P. Dwyer, “The economics of Bitcoin and similar private digital currencies,” J. Financial Stability, vol. 17, pp. 81–91, Apr. 2015, doi: 10.1016/j.jfs.2014.11.006.
[30] J. Li, N. Li, J. Peng, H. Cui, and Z. Wu, “Energy consumption of cryptocurrency mining: A study of electricity consumption in mining cryptocurrencies,” Energy, vol. 168, pp. 160–168, Feb. 2019, doi: 10.1016/j.energy.2018.11.046.
[31] K. J. O’Dwyer and D. Malone, “Bitcoin mining and its energy footprint,” in Proc. 25th IET Irish Signals Syst. Conf. China-Ireland Int. Conf. Inf. Commun. Technol. (ISSC/CIICT), Limerick, Ireland, Jun. 2014, pp. 280–285.
[32] E. Livni and O. Lopez, “El Salvador’s adoption of Bitcoin is off to a rocky start,” The New York Times, Sep. 7, 2021. Accessed: Aug. 4, 2023. [Online]. Available: https://www.nytimes.com/2021/09/07/business/el-salvador-Bitcoin.html?searchResultPosition=2
[33] E. Barrett, “Ukraine already trades more crypto than fiat currency. Now assets like Bitcoin are officially legal,” Fortune, Feb. 18, 2020. Accessed: Aug. 4, 2023. [Online]. Available: https://fortune.com/2022/02/18/ukraine-legalizes-cryptocurrency-Bitcoin-russia-digital-assets/
[34] Triple A, Singapore. (2021). Cryptocurrency Information About Venezuela. Accessed: Aug. 4, 2023. [Online]. Available: https://triple-a.io/crypto-ownership-venezuela-2021/
[35] L. Davison and C. Condon, “Treasury calls for crypto transfers over $10,000 to be reported to IRS,” Bloomberg, May 20, 2021. Accessed: Aug. 4, 2023. [Online]. Available: https://www.bloomberg.com/news/articles/2021-05-20/treasury-calls-for-crypto-transfers-over-10-000-reported-to-irs#xj4y7vzkg
[36] D. A. Liedel, “The taxation of Bitcoin: How the IRS views cryptocurrencies,” Drake Law Rev., vol. 66, no. 1, pp. 107–146, 2018.
[37] K. Solodan, “Legal regulation of cryptocurrency taxation in European countries,” Eur. J. Law Public Admin., vol. 6, no. 1, pp. 64–74, Sep. 2019.
[38] M. Lerer, “The taxation of cryptocurrency: Virtual transactions bring real-life tax implications,” CPA J., vol. 89, no. 1, pp. 40–43, Jan. 2019.
[39] Y. Liu, Z. Yang, and Y. Benslimane, “Bitcoin data analysis using deep learning and statistical modeling,” in Proc. IEEE Int. Conf. Ind. Eng. Eng. Manage. (IEEM), Kuala Lumpur, Malaysia, Dec. 2022, pp. 127–131.
[40] C. Jones, “Bitcoin debate: Warren Buffett bear vs. Winklevoss twins bull,” Forbes, Feb. 23, 2018. Accessed: Aug. 4, 2023. [Online]. Available: https://www.forbes.com/sites/chuckjones/2018/02/23/Bitcoin-debate-warren-buffett-bear-vs-winklevoss-twins-bull/?sh=48452fd51331
[41] G. Ahlstrand and E. Gkritsi, “Binance Singapore drops crypto license plans in city-state,” CoinDesk, Dec. 13, 2021. Accessed: Aug. 4, 2023. [Online]. Available: https://www.coindesk.com/policy/2021/12/13/binance-singapore-drops-crypto-license-plans-in-city-state/
[42] J. Lim, “MAS orders crypto exchange platform Binance.com to stop services in Singapore,” The Straits Times, Sep. 9, 2021. Accessed: Aug. 4, 2023. [Online]. Available: https://www.straitstimes.com/business/banking/binancecom-placed-on-mas-investor-alert-list
[43] A. Singh, R. Rajak, H. Mistry, and P. Raut, “Aid, charity and donation tracking system using blockchain,” in Proc. 4th Int. Conf. Trends Electron. Informat. (ICOEI), Tirunelveli, India, Jun. 2020, pp. 457–462.
[44] E. Shaheen, M. A. Hamed, W. Zaghloul, E. A. Mostafa, A. E. Sharkawy, A. Mahmoud, A. Labeb, M. O. A. Enany, and G. Attiya, “A track donation system using blockchain,” in Proc. Int. Conf. Electron. Eng. (ICEEM), Menouf, Egypt, Jul. 2021, pp. 1–7.
[45] B. Lindrea, “Ukraine netted $70M in crypto donations since start of Russia conflict,” Cointelegraph, Feb. 27, 2023. Accessed: Aug. 4, 2023. [Online]. Available: https://cointelegraph.com/news/ukraine-netted-70m-in-crypto-donations-since-start-of-russia-conflict
[46] Forbes. (Sep. 20, 2022). What Really Happened to LUNA Crypto? Accessed: Aug. 4, 2023. [Online]. Available: https://www.forbes.com/sites/qai/2022/09/20/what-really-happened-to-luna-crypto/?sh=63540a624ff1
[47] S. Lee, J. Lee, and Y. Lee, “Dissecting the Terra-LUNA crash: Evidence from the spillover effect and information flow,” Finance Res. Lett., vol. 53, May 2023, Art. no. 103590, doi: 10.1016/j.frl.2022.103590.
[48] A. Briola, D. Vidal-Tomás, Y. Wang, and T. Aste, “Anatomy of a Stablecoin’s failure: The Terra-Luna case,” Finance Res. Lett., vol. 51, Jan. 2023, Art. no. 103358, doi: 10.1016/j.frl.2022.103358.
[49] M. H. Joo, Y. Nishikawa, and K. Dandapani, “Cryptocurrency, a successful application of blockchain technology,” Managerial Finance, vol. 46, no. 6, pp. 715–733, Aug. 2019, doi: 10.1108/mf-09-2018-0451.
[50] R. B. Fekih and M. Lahami, “Application of blockchain technology in healthcare: A comprehensive study,” in Proc. 18th Int. Conf. Smart Homes Health Telematics (ICOST), Hammamet, Tunisia, 2020, pp. 268–276.
[51] M. H. Miraz and M. Ali, “Applications of blockchain technology beyond cryptocurrency,” 2018, arXiv:1801.03528.
[52] P. Tasatanattakool and C. Techapanupreeda, “Blockchain: Challenges and applications,” in Proc. Int. Conf. Inf. Netw. (ICOIN), Chiang Mai, Thailand, Jan. 2018, pp. 473–475.
[53] D. G. Baur, K. Hong, and A. D. Lee, “Bitcoin: Medium of exchange or speculative assets?” J. Int. Financial Markets, Inst. Money, vol. 54, pp. 177–189, May 2018, doi: 10.1016/j.intfin.2017.12.004.
[54] E. Bouri, P. Molnár, G. Azzi, D. Roubaud, and L. I. Hagfors, “On the Hedge and safe haven properties of Bitcoin: Is it really more than a diversifier?” Finance Res. Lett., vol. 20, pp. 192–198, Feb. 2017, doi: 10.1016/j.frl.2016.09.025.
[55] A. Kliber, P. Marszałek, I. Musiałkowska, and K. Świerczyńska, “Bitcoin: Safe haven, Hedge or diversifier? Perception of Bitcoin in the context of a country’s economic situation—A stochastic volatility approach,” Phys. A, Stat. Mech. Appl., vol. 524, pp. 246–257, Jun. 2019, doi: 10.1016/j.physa.2019.04.145.
[56] S. Corbet, A. Meegan, C. Larkin, B. Lucey, and L. Yarovaya, “Exploring the dynamic relationships between cryptocurrencies and other financial assets,” Econ. Lett., vol. 165, pp. 28–34, Apr. 2018, doi: 10.1016/j.econlet.2018.01.004.
[57] R. Zhang and W. K. V. Chan, “Evaluation of energy consumption in block-chains with proof of work and proof of stake,” J. Phys., Conf. Ser., vol. 1584, no. 1, Jul. 2020, Art. no. 012023, doi: 10.1088/1742-6596/1584/1/012023.
[58] N. Lasla, L. Al-Sahan, M. Abdallah, and M. Younis, “Green-PoW: An energy-efficient blockchain proof-of-work consensus algorithm,” Comput. Netw., vol. 214, Sep. 2022, Art. no. 109118, doi: 10.1016/j.comnet.2022.109118.
[59] T. Pano and R. Kashef, “A complete VADER-based sentiment analysis of Bitcoin (BTC) tweets during the era of COVID-19,” Big Data Cogn. Comput., vol. 4, no. 4, p. 33, Nov. 2020, doi: 10.3390/bdcc4040033.
[60] R. H. D. Neves, “Bitcoin pricing: Impact of attractiveness variables,” Financial Innov., vol. 6, no. 1, pp. 1–18, Apr. 2020, doi: 10.1186/s40854-020-00176-3.
[61] T. Panagiotidis, T. Stengos, and O. Vravosinos, “The effects of markets, uncertainty and search intensity on Bitcoin returns,” Int. Rev. Financial Anal., vol. 63, pp. 220–242, May 2019, doi: 10.1016/j.irfa.2018.11.002.
[62] J. Korn, “Record $3.8 billion stolen in crypto hacks last year, report says,” CNN, Feb. 1, 2022. Accessed: Aug. 4, 2023. [Online]. Available: https://edition.cnn.com/2023/02/01/tech/crypto-hacks-2022/index.html
[63] D. Nelson and N. De, “FTX U.S. temporarily froze crypto withdrawals, adding to chaos of bankruptcy proceedings,” Coindesk, Nov. 12, 2022. Accessed: Aug. 4, 2023. [Online]. Available: https://www.coindesk.com/business/2022/11/11/ftx-us-freezes-crypto-withdrawals-sending-millions-in-assets-to-bankruptcy-limbo/
[64] D. Johnson, A. Menezes, and S. Vanstone, “The elliptic curve digital signature algorithm (ECDSA),” Int. J. Inf. Secur., vol. 1, no. 1, pp. 36–63, Aug. 2001, doi: 10.1007/s102070100002.
[65] R. Gennaro, S. Goldfeder, and A. Narayanan, “Threshold-optimal DSA/ECDSA signatures and an application to Bitcoin wallet security,” in Proc. 14th Int. Conf. Appl. Cryptogr. Netw. Secur. (ACNS), vol. 14, Guildford, U.K., 2016, pp. 156–174.
[66] A. Feign, “What is an ICO?” Coindesk, Dec. 12, 2022. [Online]. Available: https://www.coindesk.com/learn/what-is-an-ico/
[67] D. Boreiko and N. K. Sahdev. (2018). To ICO or Not to ICO-Empirical Analysis of Initial Coin Offerings and Token Sales. [Online]. Available: https://ssrn.com/abstract=3209180
[68] L. Rhue. (2018). Trust is All You Need: An Empirical Exploration of Initial Coin Offerings (ICOs) and ICO Reputation Scores. [Online]. Available: https://ssrn.com/abstract=3179723
[69] O. A. Karpenko, T. K. Blokhina, and L. V. Chebukhanova, “The initial coin offering (ICO) process: Regulation and risks,” J. Risk Financial Manage., vol. 14, no. 12, p. 599, Dec. 2021, doi: 10.3390/jrfm14120599.
[70] Y. Tsuchiya and N. Hiramoto, “How cryptocurrency is laundered: Case study of coincheck hacking incident,” Forensic Sci. Int., Rep., vol. 4, Nov. 2021, Art. no. 100241, doi: 10.1016/j.fsir.2021.100241.
[71] K. Grobys, “When the blockchain does not block: On hackings and uncertainty in the cryptocurrency market,” Quant. Finance, vol. 21, no. 8, pp. 1267–1279, Aug. 2021, doi: 10.1080/14697688.2020.1849779.
[72] U. W. Chohan. (2018). The Problems of Cryptocurrency Thefts and Exchange Shutdowns. [Online]. Available: https://ssrn.com/abstract=3131702
[73] L. Wang, “The challenge and prospect of scalability of blockchain technology,” in Proc. 5th Int. Conf. Comput. Sci. Artif. Intell., Beijing, China, Dec. 2021, pp. 296–301.
[74] M. Zamani, M. Movahedi, and M. Raykova, “RapidChain: Scaling blockchain via full sharding,” in Proc. ACM SIGSAC Conf. Comput. Commun. Secur., Toronto, ON, Canada, Oct. 2018, pp. 931–948.
[75] H. Dang, T. T. A. Dinh, D. Loghin, E.-C. Chang, Q. Lin, and B. C. Ooi, “Towards scaling blockchain systems via sharding,” in Proc. Int. Conf. Manage. Data, Amsterdam, The Netherlands, Jun. 2019, pp. 123–140.
[76] S. Bonini, T. Shohfi, and M. Simaan. (2022). Buy the Dip? [Online]. Available: https://ssrn.com/abstract=3835376
[77] N. Sherman and J. Tidy, “Crypto giant FTX collapses into bankruptcy,” BBC, Nov. 11, 2022. Accessed: Aug. 4, 2023. [Online]. Available: https://www.bbc.com/news/business-63601213
[78] Immunebytes, New Delhi, India. (2022). Mainnet vs Testnet in Blockchain. Accessed: Aug. 4, 2023. [Online]. Available: https://www.immunebytes.com/blog/mainnet-vs-testnet-in-blockchain/
[79] D. M. Blei and J. D. Lafferty, “Dynamic topic models,” in Proc. ICML, Pittsburgh, PA, USA, 2006, pp. 113–120.
[80] A.-D. Vo, “Sentiment analysis of news for effective cryptocurrency price prediction,” Int. J. Knowl. Eng., vol. 5, no. 2, pp. 47–52, Dec. 2019, doi: 10.18178/ijke.2019.5.2.116.
[81] S. Boulianne, “Social media use and participation: A meta-analysis of current research,” Inf., Commun. Soc., vol. 18, no. 5, pp. 524–538, May 2015, doi: 10.1080/1369118x.2015.1008542.
[82] S. Bourgi, “Institutional investors increase their crypto holdings for 5th straight week,” Cointelegraph, Sep. 20, 2021. Accessed: Aug. 4, 2023. [Online]. Available: https://cointelegraph.com/news/institutional-investors-increase-their-crypto-holdings-for-5th-straight-week
[83] Y. B. Kim, J. Lee, N. Park, J. Choo, J.-H. Kim, and C. H. Kim, “When Bitcoin encounters information in an online forum: Using text mining to analyse user opinions and predict value fluctuation,” PLoS ONE, vol. 12, no. 5, May 2017, Art. no. e0177630, doi: 10.1371/journal.pone.0177630.
[84] S. Bibi, S. Hussain, and M. I. Faisal, “Public perception based recommendation system for cryptocurrency,” in Proc. 16th Int. Bhurban Conf. Appl. Sci. Technol. (IBCAST), Islamabad, Pakistan, Jan. 2019, pp. 661–665.

HAE SUN JUNG is currently pursuing the Ph.D. degree with the Department of Applied Artificial Intelligence, Sungkyunkwan University. His research interests include natural language processing, computer vision, deep learning, and machine learning.

HAEIN LEE is currently pursuing the Ph.D. degree with the Department of Applied Artificial Intelligence and the Department of Human-Artificial Intelligence Interaction, Sungkyunkwan University. Her research interests include natural language processing, deep learning, and machine learning.

JANG HYUN KIM is currently a Professor with the Department of Human-Artificial Intelligence Interaction, the Department of Interaction Science, and the Department of Applied Artificial Intelligence, Sungkyunkwan University. He has authored more than 50 articles in major journals, such as Information Processing and Management, Telematics and Informatics, Cities, Government Information Quarterly, Technological Forecasting and Social Change, and Journal of Computer-Mediated Communication. His research interests include social/semantic data analysis, social media, and future media.