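BitExTract: Interactive Visualization for Extracting Bitcoin Exchange Intelligence
IEEE Transactions on Visualization and Computer Graphics, Vol. 25, No. 1, January 2019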
Fig. 1. With BitExTract, we can observe the evolution of transaction and connection patterns of Bitcoin exchanges from different
perspectives. (A) The comparison view is designed to be highly interactive for comparing different indices of multiple exchanges. (B) The
exchanges list panel reveals the historical transaction volume of Bitcoin exchanges. (C) The massive sequence view (MSV) provides an
overview of the Bitcoin exchange market; users can focus on one exchange to examine its connections holistically. (D) The connection
view illustrates connection details intuitively with a node-link design, which facilitates the recognition of unique patterns.
Abstract—The emerging prosperity of cryptocurrencies, such as Bitcoin, has come into the spotlight during the past few years.
Cryptocurrency exchanges, which act as the gateway to this world, now play a dominant role in the circulation of Bitcoin. Thus,
delving into the analysis of the transaction patterns of exchanges can shed light on the evolution and trends in the Bitcoin market,
and participants can gain hints for identifying credible exchanges as well. Not only Bitcoin practitioners but also researchers in the
financial domains are interested in the business intelligence behind the curtain. However, the task of multiple exchanges exploration
and comparisons has been limited owing to the lack of efficient tools. Previous methods of visualizing Bitcoin data have mainly
concentrated on tracking suspicious transaction logs, but it is cumbersome to analyze exchanges and their relationships with existing
tools and methods. In this paper, we present BitExTract, an interactive visual analytics system, which, to the best of our knowledge,
is the first attempt to explore the evolutionary transaction patterns of Bitcoin exchanges from two perspectives, namely, exchange
versus exchange and exchange versus client. In particular, BitExTract summarizes the evolution of the Bitcoin market by observing the
transactions between exchanges over time via a massive sequence view. A node-link diagram with ego-centered views depicts the
trading network of exchanges and their temporal transaction distribution. Moreover, BitExTract embeds multiple parallel bars on a
timeline to examine and compare the evolution patterns of transactions between different exchanges. Three case studies with novel
insights demonstrate the effectiveness and usability of our system.
Index Terms—Bitcoin exchange, transaction data, comparative analysis, visual analytics, FinTech
• Xuanwu Yue, Xinhuan Shu, Xinyu Zhu, Xinnan Du and Zheqing Yu are with the Hong Kong University of Science and Technology. E-mail: {xuanwu.yue, xinhuan.shu, xzhuah, xduac, zyuaf}@connect.ust.hk
• Dimitrios Papadopoulos is with the Hong Kong University of Science and Technology. E-mail: [email protected]
• Siyuan Liu is with the Penn State University. E-mail: [email protected]
Manuscript received 31 Mar. 2018; accepted 1 Aug. 2018. Date of publication 16 Aug. 2018; date of current version 21 Oct. 2018.
For information on obtaining reprints of this article, please send e-mail to: [email protected], and reference the Digital Object Identifier below.
Digital Object Identifier no. 10.1109/TVCG.2018.2864814
YUE ET AL.: BITEXTRACT: INTERACTIVE VISUALIZATION FOR EXTRACTING BITCOIN EXCHANGE INTELLIGENCE

1 INTRODUCTION

The rocketing Bitcoin value and turbulent market in the past two years have elicited considerable attention from both the finance and technology sectors, making this emerging payment system one of the most compelling topics recently. The booming of Bitcoin has also catalyzed the development of many other cryptocurrencies, including Ethereum and Litecoin [6]. All these alternative coins are built upon variations of blockchain technology, which distinguishes them from the fiat currency system. A blockchain is a public decentralized ledger using a distributed database to verify, propagate, and record digital asset transactions between anonymous addresses in the form of public key
hashes. However, because public keys are usually generated and used randomly for the sake of privacy, it is difficult to associate the key used in a specific transaction with the key owner in reality. Research and analysis on entity authentication using public keys have revealed the potential to unearth transaction patterns. Various works [21, 39] have thus focused on clustering public keys into corresponding entities based on external knowledge and heuristic rules. In our system, we leverage WalletExplorer [10], which has already tagged tens of millions of public keys with their corresponding entities, covering a rich variety of organizations, such as Bitcoin exchanges, Bitcoin mining pools, online gambling sites, and cryptocurrency services.

Among the diverse kinds of entities, Bitcoin exchanges distinguish themselves by providing the dominant trading platform where most ordinary customers exchange fiat currency for Bitcoins and vice versa. The thriving interest in Bitcoin has led to the emergence of hundreds of cryptocurrency exchanges of various sizes around the world since it was first introduced by Nakamoto [37]. The rapid development of exchanges has promoted the circulation of Bitcoin and driven the growth of its value. A closer look at the Bitcoin blockchain data further reveals the dominance of the exchanges, with a huge number of Bitcoins being transferred between exchanges, thus building up a sophisticated network of transactions. Furthermore, the analysis of Bitcoin transactions between exchanges with the knowledge of exchange trading mechanisms represents the best approach to understanding the Bitcoin market. Thus, our intent is to learn the evolution of both individual exchanges and their associated networks so as to understand the development history of Bitcoin exchanges and potential patterns that could be helpful to various interested parties. Moreover, we expect to provide a tool for the public to compare and look for a credible or stable exchange to park their money online.

The critical roles of exchanges have already aroused much research enthusiasm in the areas of finance and statistics. Financial researchers focus mostly on the business cooperation and competition between exchanges [22], while the statistics field uses diverse models to study the life cycle and survival probability of exchanges [35]. However, these works are mainly based on empirical hypotheses validated through a trial-and-error process, which often lacks the ability to reveal hidden patterns and relationships. Hence, visualization can be introduced to empower users to utilize their visual senses and intuition to glean insights from the data. Nevertheless, to the best of our knowledge, there is little visual analysis work with respect to the Bitcoin blockchain data. Previous visualization regarding Bitcoin has mainly focused on illegal service discovery [36], anomaly detection [13], and transaction presentation [5]; at their core, these works merely concentrate on tracking blockchain transaction logs. These visualization techniques are therefore insufficient for exploring the exchange-centered evolution of the Bitcoin market. Despite having the same data source, which is based mainly on the Bitcoin blockchain, the design requirements of our work call for a multi-layer structure for each exchange, containing aggregated transactions instead of directly using raw logs. Therefore, our work delves into a list of 60 Bitcoin exchanges with millions of transaction records ranging from 2011 to 2018. Our study provides a state-of-the-art visualization system that investigates individual exchanges, analyzes inter-exchange correlation and networks, and allows users to observe the evolving Bitcoin market trend.

It should be noted that developing such a visual analytic system with huge volumes of data presents three major challenges. First, visualizing tens of gigabytes of transaction data involves severe scalability problems. The multi-perspective demonstration of exchange-specific details increases the complexity as well. Second, describing the sophisticated trading networks among exchanges is non-trivial due to a wealth of influencing factors, such as their transaction volumes, trading frequencies, and geographic regions. The dynamic changes of the network should also be traced. Third, providing a holistic and comprehensive visual design is challenging. The system must take into account people's ability to absorb massive information and the capacity of limited user interfaces to display multi-dimensional data.

To address the aforementioned challenges, we introduce a highly interactive visualization system called BitExTract, as it extracts Bitcoin exchange intelligence. The system leverages a massive sequence view (MSV) to depict an overview of the Bitcoin market evolution with all inter-exchange transactions during the 7-year period. Furthermore, the connection view, which is based on a node-link diagram, provides a comprehensive and intuitive demonstration of Bitcoin exchange networks with multiple influential factors taken into consideration. Meanwhile, the comparison view embeds multiple parallel bars on a timeline, which allows users to gain insights into individual exchange evolution by interactively tracking the changing momentum compared with others. The well-coordinated system is built on a superimposed hierarchy, which guides users to explore the whole market progressively and interactively. To verify BitExTract, we conducted three case studies to evaluate its capability and efficiency. The novel insights into Bitcoin exchanges and further interviews with domain experts have provided promising feedback on our system.

The major contributions of this paper are as follows:
• To the best of our knowledge, the first interactive visualization system which allows users to explore, analyze, and compare the evolution of different Bitcoin exchanges.
• A new dynamic timeline visualization combined with parallel bars that demonstrates time-varying, multi-variable transaction data features for comparison between exchanges.
• Case studies in which domain experts and senior practitioners integrate our visualization system into their analysis, leading to the in-depth discovery of and valuable insights into Bitcoin-related events and policies.

2 RELATED WORK

Our work is related to the Bitcoin market and exchange analysis in the financial area, blockchain data visualization, and more generally dynamic graph visualization.

2.1 Bitcoin Market and Exchange Analysis

The thriving interest in Bitcoin transactions has resulted in the emergence of various works in the area. According to Yli-Huumo et al. [50], research related to Bitcoin covers a wide range of topics, such as security, wasted resources, usability, privacy, smart contracts, new cryptocurrencies, botnets, P2P broadcasting protocols, and trustworthiness. From the perspective of applications, anomaly detection [41], anonymity analysis [43], and Bitcoin price prediction [25] have also drawn great interest in the area of computer science. The majority of previous work focuses on the whole blockchain system without emphasizing a specific community. In contrast, this paper concentrates on the analysis of the Bitcoin market and exchanges.

Kiran and Stanett [27] conducted comprehensive research on the risks of the Bitcoin market from social, legal, economic, and security perspectives. Gandal and Halaburda [22] studied the competition among Bitcoin exchanges and claimed that the trading frequency and quantity are related to market news and the social community. Other work [16, 45] focused on threats that cryptocurrencies are facing due to the limitations of the intrinsic mechanism of blockchain technology. Moore and Christin [35] employed the survival model to calculate the risk of an exchange going bankrupt based on its transaction volume. While extensive studies on the Bitcoin market have been conducted from the economic and technical perspectives, few studies have empirically analyzed the detailed Bitcoin transaction history, since it is difficult to extract inter-exchange transactions. Ranshous et al. [42] made use of transaction data to construct a directed hyper-graph, from which transaction patterns of Bitcoin exchanges can be revealed and leveraged in fraudulent pattern mining, but their work mainly aimed at the tasks of characterizing and understanding patterns centered around a specific individual exchange. Thus, to the best of our knowledge, there is no previous work emphasizing the evolution of connections among exchanges and the analysis of the influence of events.

2.2 Blockchain Data Visualization

The innate nature of the Bitcoin public ledger has brought an unprecedented opportunity for blockchain visualization. A wealth of websites and tools provide visualization artifacts, including real-time transaction visualization [1, 2, 4, 8] and transaction-network information navigation [5, 7, 9]. However, most of them present only descriptive
statistics or transaction details. Recently, visual analytics on blockchain-related data has received increasing attention. BitConeView [13] is a pioneering work inferring illegal activities like money laundering; it utilizes tailored flow charts to illustrate suspicious Bitcoin flows. McGinn et al. [32] deployed high-fidelity visualizations on a large-scale observatory facility, displaying unexpected dynamic transaction patterns. These prior studies are based on block contents to detect abnormal patterns, rarely involving particular users or organizations. Regarding this, heuristic clustering algorithms with graph visualizations [23, 34] have been adapted to analyze the blockchain data for user network characterization. Isenberg et al. extended their previous work on the visual exploration of the activities of entities exchanging Bitcoins [26] and further revealed transaction histories of individual entities [24]. However, these existing works mainly fulfill the classification and presentation tasks, instead of digging into the behavior analysis within the targeted categories, such as exchanges. Moreover, the visualization techniques used are straightforward and unsuitable for tackling the complex problems of revealing connection patterns among exchanges.

In practice, the Bitcoin blockchain consists of linked blocks storing all executed Bitcoin transactions. Visualizations working on transaction data have already arisen in different contexts. SellTrend [31] combines a diverse set of techniques to analyze airline travel purchase requests in real time. Many studies have focused on analyzing pre-collected data. Ko et al. [29] surveyed a series of visual analysis approaches for transaction data. Various types of transaction data, such as wire transfers [18], foreign currency exchanges [30], bids and asks [38] and stock deals [28], have been investigated, resulting in some interesting and insightful visualizations. These studies explore anomaly detection, detail display, pattern mining and violation identification, respectively. Xie et al. [48] also introduced a visualization system called "VAET" to detect salient transactions from large e-transaction time series. In these studies, the identities of buyers and sellers were fixed. However, in our case, each exchange can have dual roles, which means each exchange can both buy and sell Bitcoins. A particular design supporting this kind of transaction relationship remains absent.

2.3 Dynamic Graph Visualization

Considering our focus on the evolution of connections between Bitcoin exchanges, dynamic graph visualization techniques should be introduced to address the time-varying process. Two surveys [14, 15] provided a broad perspective of this growing research discipline, in which a list of visual approaches is proposed that can be categorized into three types, namely, animation, timeline, and a hybrid of the two. Considering the mental map required in the animation techniques, the timeline technique has an advantage for the intuitive connection comparison task. In general, this time-to-space mapping consists of node-link-based and matrix-based approaches. Previous dynamic graph visualization techniques mainly focused on the display of entire graph variations. However, the evolution of connections between Bitcoin exchanges focuses on the parts of interest rather than the overview topology, which requires a particular design to highlight the details. The exchange-centric analysis is also emphasized. Nevertheless, only one application work [47] addresses the financial area, and it illustrates the simulation of the financial network.

Despite the state-of-the-art methods, most dynamic graph visualizations have a scalability problem: the visualization should change correspondingly according to various data dimensions. For instance, although von Landesberger et al. [47] analyzed a financial network whose dataset was similar to ours, the design proposed for the contagion simulation cannot satisfy our requirements, as it lacked the consideration of other critical variables such as surplus or standing index. Further, there have been several attempts to utilize dynamic multivariate graphs, but the applications were dominated by social networks [17, 49] and software engineering [11]. With respect to the application-specific data characteristics and analytical requirements, it was unsuitable to directly use the existing techniques to analyze the dynamically evolving connections between exchanges, or to display details of multiple data dimensions with millions of records. Thus, we developed an integrated visualization system combining dynamic and multivariate graph techniques to analyze the inter-exchange connections.

Table 1. The Bitcoin exchange companies we studied together with their geographic locations.
Continent: Exchange name
Asia: 796.com, Btc38.com, BTCC.com, BtcTrade.com, Bter.com, ChBtc.com, Hashnest.com, Huobi.com, OKCoin.com, Vircurex.com, AnxPro.com, Bitfinex.com, BitVC.com, Exchanging.ir, Korbit.co.kr, UrduBit.com-cold, CoinHako.com, FYBSG.com, MaiCoin.com, BX.in.th
Europe: SimpleCoin.cz, CoinMotion.com, LocalBitcoins.com, Bitcoin.de, C-Cex.com, SpectroCoin.com, TheRockTrading.com, HappyCoins.com, LiteBit.eu, BitBay.net, Bitcurex.com, BTC-e.com, Matbea.com, YoBit.net, BitBargain.co.uk, Bitstamp.net, BitX.co, Bit-x.com, Cex.io, Coinmate.io, Exmo.com, HitBtc.com, Vaultoro.com
North America: Igot.com, Cavirtex.com, Coins-e.com, CoinTrader.net, Bittrex.com, BlockTrades.us, CampBX.com, Cryptsy.com, Kraken.com, Paxful.com, Poloniex.com
South America: Bleutrade.com, FoxBit.com.br, MercadoBitcoin.com.br
Australia: BtcMarkets.net, CoinSpot.com.au, VirWoX.com

3 BACKGROUND

This section first introduces the background knowledge of blockchain and Bitcoin transaction data. Thereafter, three levels of analytical tasks are proposed and discussed in detail.

3.1 Data Abstraction

3.1.1 Raw data collection and pre-processing

The primal Bitcoin transaction data stored in the blockchain mainly consists of four parts: a list of input public keys, input values, a list of output public keys and output values. Some keys can be recognized as belonging to the same wallet according to generally accepted rules [34]. We began by collecting historical blockchain data maintained by the website WalletExplorer [10], on which every public key has been assigned a wallet ID. Since our focus is on the exchanges, we downloaded the whole transaction history (2011-01-01 to 2017-12-31) of 60 different exchange companies. The companies that we studied are listed in Table 1. The original transactions we obtained from the blockchain might have more than one input wallet and output wallet. This kind of N-to-N transaction structure is not suitable for analyzing the relationship between two specific entities. Therefore, we broke each transaction into one or more 1-to-1 transactions with the same timestamp, such that each 1-to-1 transaction contained only one input wallet and one output wallet. The transaction value of each 1-to-1 transaction was calculated proportionally. These 1-to-1 transactions are the raw data stored in our database. The following are the fields and descriptions for each record:
• Transaction ID: The 64-character hash of the transaction.
• Exchange ID: The exchange wallet discriminator provided by WalletExplorer [10].
• Client ID: A 16-character wallet discriminator provided by WalletExplorer [10]. It has the same functionality as the exchange ID, except that we can tell the name of the exchange from its exchange ID, while the client ID is anonymous. When the client is another exchange, the exchange ID will be placed here instead.
• Time: The timestamp when the transaction was made.
• Transaction amount: The amount of Bitcoin sent by the exchange to the client. It can be a negative number when the exchange is the Bitcoin receiver.
• Balance: The amount of Bitcoin that "remained" in the exchange. This field is maintained by WalletExplorer [10].
We also collected some other auxiliary data and information:
• Historical Bitcoin price: We retrieved the historical daily Bitcoin prices provided by Coindesk [3]. This is an average value based
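The decomposition of N-to-N transactions into 1-to-1 records described in Section 3.1.1 can be illustrated with a minimal Python sketch. The paper states only that values are "calculated proportionally", so the exact rule used here is an assumption: each input wallet is taken to contribute to every output in proportion to its share of the total input value. The names `PairTransfer` and `split_transaction` are hypothetical and are not part of BitExTract or WalletExplorer.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class PairTransfer:
    """One 1-to-1 record derived from an N-to-N transaction (hypothetical type)."""
    tx_id: str
    timestamp: int
    sender: str      # input wallet ID
    receiver: str    # output wallet ID
    amount: float    # Bitcoin attributed to this sender-receiver pair


def split_transaction(tx_id: str,
                      timestamp: int,
                      inputs: Dict[str, float],
                      outputs: Dict[str, float]) -> List[PairTransfer]:
    """Break an N-to-N transaction into 1-to-1 transfers with the same timestamp.

    Assumed rule: a sender's share of each output equals its share of the
    total input value. The paper does not spell out its proportional rule.
    """
    total_in = sum(inputs.values())
    if total_in <= 0:
        return []
    pairs = []
    for sender, in_value in inputs.items():
        share = in_value / total_in
        for receiver, out_value in outputs.items():
            pairs.append(
                PairTransfer(tx_id, timestamp, sender, receiver, share * out_value)
            )
    return pairs


# Example: wallets A and B jointly pay wallets X and Y.
example = split_transaction("tx0", 1514678400,
                            {"A": 3.0, "B": 1.0},
                            {"X": 2.0, "Y": 2.0})
```

Under this assumed rule, the amounts of the derived records sum to the total output value of the original transaction, so aggregate volumes are preserved after the split.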
views. The massive sequence view summarizes the evolution of the Bitcoin market over time. The connection view depicts the trading network of exchanges and their temporal transaction distribution. The comparison view compares the evolution trends of different exchanges, taking network standings into consideration. Besides, an additional panel listing all exchanges is provided to help users get an overview and quickly select a certain exchange.

We describe a common workflow to show how a user utilizes our system. The user can first select exchanges of interest on the Exchanges List (Fig. 1B) on the right side of the interface. Different sorting methods could provide hints for users to select exchanges of interest with particular patterns. Three other views then update according to the selected exchanges. When focusing on an exchange, the massive sequence view (Fig. 1C) shows inter-exchange transactions involving that exchange. The user can further brush a time period of interest on the price panel (Fig. 1C3). Then the news panel (Fig. 1C1) displays related events during this period. The user can also explore the trading network of the selected exchange on the connection view (Fig. 1D) and compare evolution trends, surplus situations, and different network standing combinations on the comparison view (Fig. 1A).

5 DATA ANALYSIS

In this section, we will introduce our analytic approach to organizing the tremendous trading data, and we will provide linear ordering and business proximity for the exchanges through an interactive user interface. First, we will describe the adaptive ranking algorithm, which reflects the market share and network standing of exchanges to some extent. Then we will give the definition of business proximity in the context of Bitcoin exchanges.

Table 2. Terminology table used in financial analysis for Bitcoin exchanges.
Notation: Meaning
$\mathrm{Stand}_t^i$: Network standing of an exchange company $i$ in time period $t$
$I_t^i$: Inverse volatility of exchange $i$ in time period $t$
$E$: The set of all the exchange companies examined in this paper
$\mathrm{Share}_t^i$: Market share, i.e., the trading volume of exchange $i$ with other exchanges during time period $t$
$V_t^{p,i}$: Trading volume between exchange $i$ and $p$ during time period $t$
$P_t(a, b)$: Business proximity between exchange $a$ and $b$ at the end of time period $t$
$V_t(a, b)$: Trading volume between exchange $a$ and $b$ at the end of time period $t$
$F_t(a, b)$: Trading frequency between exchange $a$ and $b$ in time period $t$

5.1 Adaptive Network Standing Index

A weighted average is widely used in scientific research in order to determine the relative importance of several potential factors [46]. Therefore, we apply this general rule to the context of quantifying Bitcoin exchanges. We combine three attributes suggested by Bitcoin and financial domain experts, and define a network standing index of an exchange platform. We define our model as follows. Let $\mathrm{Stand}_t^i$ be the network standing of an exchange company $i$ in time period $t$, which is a function of the balance volatility, its market share during that time period, and also the weighted network standing of the exchange companies which have incurred transactions with it in the previous time period.

$$\mathrm{Stand}_t^i = \alpha \cdot I_t^i + \beta \cdot \mathrm{Share}_t^i + \gamma \cdot \frac{\sum_{p \in E} \left( V_{t-1}^{p,i} \cdot \mathrm{Stand}_{t-1}^{p} \right)}{\sum_{p \in E} V_{t-1}^{p,i}} \qquad (2)$$

In the financial domain, volatility measures are usually described by the standard deviation of change in price or value of a financial security [20]. In this context, we expect an exchange company to send Bitcoins to its clients within its own capability without a volatile record of balance. Therefore, in order to model the potential risk of investing in an exchange company, we should take into account the fluctuation of its remaining Bitcoin balance. Moreover, though the distribution of the transaction data is symmetric bell-shaped, it is yet highly skewed. Therefore, we apply the widely used log-normalization on the volatility of balance, $V = \log\left(\frac{\mathrm{balance}_t}{\mathrm{balance}_{t-1}}\right)$, and pass the inverse of volatility through a logistic function to make it fit nicely into the $[0, 1]$ range. The result of this term is then $I_t^i = \frac{1}{1 + e^{1 - \mathrm{std}(V)}}$.

The market share of exchange $i$ is scaled by an activation function so as to fit nicely into the range between zero and one. That is, $\mathrm{Share}_t^i = 2 \cdot \left(1 - \mathrm{logistic}\left(-\frac{1}{3} \cdot \log_{10}\left(\frac{V_t^i}{V_t}\right)\right)\right)$.

The third term is inspired by the idea of PageRank [40], which propagates the influence of one node to its adjacent nodes so as to measure its relative importance within the set. Each partner $p$ of exchange $i$ will exert influence weighted by $\frac{V_{t-1}^{p,i}}{\sum_{a \in E} V_{t-1}^{a,i}}$, which is $p$'s fraction of transaction volume with $i$.

$\alpha$, $\beta$, $\gamma$ are adaptive factors which control the relative importance of the three features in terms of network standing. Users can adjust them via the "Network Standing Parameters" sector of the comparison view in order to evaluate different ranking results and the dominant factor.

5.2 Business Proximity

Business proximity is an active research area in Information Systems [44] which measures the relatedness of companies in terms of geographical distribution, market, technology, etc. We tailor this idea to measuring the business connection of exchanges as a linear combination of their transaction volume and trading frequency. The formal definition is as follows:

$$P_t(a, b) = \alpha \cdot V_t(a, b) + \beta \cdot F_t(a, b) \qquad (3)$$

where $V_t(a, b)$ represents the trading volume between exchange $a$ and $b$, and $F_t(a, b)$ represents their trading frequency. Currently, $\alpha$ and $\beta$ are set with equal weights since we regard the two factors as equally important for proximity analysis.

Business proximity is effective when it comes to understanding the market connection among exchanges. Users can easily identify the strongest partner of a specific exchange in the ego-view of the Connection View, where business proximity serves as the distance metric. The detailed visual encoding will be discussed in Section 6.3.

6 VISUAL DESIGN

Our visualization system consists of four components, namely the comparison view (Fig. 1A), the exchanges list (Fig. 1B), the massive sequence view (Fig. 1C), and the connection view (Fig. 1D).

We followed the design rationales below to guide the process.

Follow user-centric design. In order to aid domain experts with visual analysis, one of the design goals was providing multi-stage problem-solving views which involve end users from the beginning. Considering the demands and learning curve of first-time users, our system should be highly interactive and the task flow should be intuitive.

Overview first, zoom and filter, details on demand. Given the massive scale of the temporal transaction data of Bitcoin exchanges, the system should provide sufficient interactions so that users can zoom in and filter the data on demand. According to the principle that eyes beat memory [12], it is easier to compare views side by side than to compare a visible item with one held in memory. Therefore, we should fully utilize the limited pixels on the screen so that users would not necessarily rely on a mental map to perceive the comprehensive information and make comparisons.

Quantifying network standings of exchanges. Quantifying the network standing of exchanges provides hints for Bitcoin investors to seek credible or stable exchanges. Domain experts emphasize the importance of modifiable weights on factors of exchange evaluation. Thus, one of the design goals was providing direct interactions for adaptive ranking parameters with intuitive visual feedback.

We use the following unified color encoding to represent the continents where exchange companies are registered, i.e., yellow for Asia, purple for Europe, green for North America, orange for South America, and blue for Australia.

6.1 Exchanges List Panel

We provide an exchanges list panel (Fig. 1B) for users to quickly select a certain exchange and observe its historical transaction volume in USD. Each card in the panel is essentially a bar chart encoding the exchange's transaction volume with respect to time. Users can choose to sort the
exchanges according to their entry time or continent. When the entry-time sorting criterion is selected, the newest entrants will be stacked on the top, while the oldest players will appear at the bottom.

6.2 Massive Sequence View

Description: The massive sequence view (MSV) (Fig. 1C) provides a compact and comprehensive overview of the temporal transaction patterns of all the examined exchanges, which further reflects the whole market evolution of Bitcoin. Based on this view, we can answer the analytical questions about the overall market (T1, T2) as well as inter-exchange relationships (T3).

As Fig. 1C2 illustrates, the x-axis represents the timeline, and the exchanges are stacked along the vertical axis. Each row represents an individual exchange. We order all exchanges chronologically along the y-axis. That is, the "older" exchanges lie at the bottom and the newly emerging ones are laid on top. In particular, the contour of the MSV exhibits the development speed of the market. For example, the S-shaped contour in Fig. 1C4 presents a concave curve that then turns into a convex curve, showing that a great number of exchanges emerged from around 2013 to 2015 (the concave part), but the acceleration stagnated afterwards.

The diverging red-blue color represents the surplus of an exchange as defined in Eq. 1. That is, an exchange i with a larger gap between the amount it sends and the amount it receives during time section t is shown in a more reddish color. Deviation above and below the zero-surplus level is well represented by the diverging color scheme.

This view also supports analysis on two types of transaction data, i.e., the transactions between exchanges and clients ("Surplus"), and the aggregated transaction behavior between exchanges ("Inter-exchange"). Users can switch between these modes from the selection bar above. When a certain exchange row is clicked, inter-exchange transactions involving that particular exchange will be highlighted, with other irrelevant records filtered out.

The price panel (Fig. 1C3) shares the timeline with the MSV

within the selected time period. There are five arcs inside the outermost ring, whose lengths represent the market share of each continent. Once an arc is selected, the diagram will enter the "continent view", which only shows the inner-continent transactions.

Apart from the "world view" and the "continent view" described above, we also support an "ego view" for analyzing a specific exchange in this diagram. By dragging a node (i.e., the ego) into the center of the circle, we arrange the exchanges that trade with the ego during the selected time range and those that do not into two concentric rings. Arrows along the path that links the ego and each of its partners, which is essentially the timeline, encode the trading volume. For example, an exchange having larger volume transactions only at the beginning of the selected time period will have long and dense arrows closer to the center and short and sparse arrows further from the center. The highly summarized layout provides visual patterns for users to compare and analyze across different time periods. One can always go back to the general inter-exchange view by clicking on the ego node.

Justification: We considered alternative visual designs for illustrating the inter-exchange connection. One option was to use a matrix-based design [19]. However, this design emphasizes the hierarchical group structure, which may not be necessary given the shallow hierarchy of our dataset, and it may induce a steeper learning curve for users. Our design based on the node-link diagram could better utilize the limited pixel space and is extendable to the ego-view for further analysis. Specifically, if users find an exchange interesting and drag it to the center, the ego-view appears to illustrate its connections with other exchanges, as well as their detailed transaction patterns from the beginning of the selected period to the end via path-with-arrow encoding, which is hard to achieve through a matrix design.

6.4 Comparison View

We draw upon the design of the parallel coordinate view for reference to encode the overall evolution pattern of Bitcoin exchanges with their clients. While parallel coordinate targets at encoding high-dimensional
(Fig. 1C2 ), which shows historical prices of Bitcoin in USD in blue data, our comparison view leverages the similar vertically-aligned bars
background. It also embeds a time brush function which can trigger to encode time-varying transaction patterns. Generally speaking, this
interaction in multiple other views. The news panel (Fig. 1C1 ) on the view first provides users a handy comparison of exchange transaction
top-left will display major Bitcoin-related events corresponding to the patterns, and second, it provides a zoom-and-expand feature which
selected time period. When the cursor is hovered on a certain news, a supports more detailed visual analysis on a specific exchange (T5,
blue vertical line will appear on the price panel to help you align news T6). Moreover, through this view users can quickly identify the big
with patterns in MSV as well as Bitcoin price. exchanges in the Bitcoin market (T5). The visual encoding is as follows.
Justification: We discussed several candidate designs, such as bar Description: As shown in Fig. 1A, each vertical bar in the compari-
chart or line chart to encode the time-varying transaction amount of son view stands for the exchange transaction history in one month. The
all exchange companies. However, we found that these approaches width of a bar stands for the aggregated client number of all the ex-
would induce visual clutter, given the massive size of the datasets, thus changes in that month. Horizontal rows in one bar represent exchange
impede the visual analytics process. Moreover, the MSV provides one companies sorted by certain rules. Vertical lines in each row record a
more dimension of encoding compared to the bar chart and the line summarization of daily transactions of that company.
chart, from which we can encode the chronological order of entry-time We support multi-ranking (i.e., network standing, surplus and trans-
of exchanges by stacking them along the y-axis. Moreover, we also action volume) and two different data filling (i.e., surplus (Eq. 1) and
considered using Wordle [33] for the text representation of Bitcoin transaction volume) options. This flexible design is intended to help
news. However, we found it is hard to extract meaningful tokens for users analyze the influence of the filling index on the ranking index. If
reasoning, which is still an active research domain in Natural Language sorted by the rule of “surplus”, the companies on the top will have a red
Processing. Currently, we adopt a simple design which relies on Google color since they “send” more than they “receive”, while the companies
News. Other techniques can be developed further to summarize the on the bottom will have a blue color. While the “surplus” filling-scheme
financial text data so as to relieve the scalability pressure. follows divergent color encoding, the “transaction volume” filling fol-
lows sequential color scheme. When network standing ranking-scheme
6.3 Connection View is selected, users can customize three factors to the linear combination
The connection view, which is essentially a graph, aims at visualizing of the input, as shown in Section 5.1, by clicking on a point representing
inter-exchange behavior with geographic information(T3, T6), helping α, β , γ in the triangle. The closer distance to a triangle vertex, the more
users better understand the exchange relationships, and demonstrating weighted this factor is.
the transaction patterns of an ego-exchange with respect to its partners The same companies between consecutive time steps are linked by
(T2). Note that the “ego” in social network analysis means a specific lines, thus it is easier for users to trace a particular exchange throughout
individual for detailed investigation. history and observe the change in the rankings. Multiple exchanges
Description: The connection view (Fig. 1D) is built upon the design can be selected and highlighted simultaneously for easy comparison
of node-link diagram and extended with an ego-view layout where users (Fig. 4). Users can easily grasp the temporal evolution pattern of all
can drag any exchange to the center for detailed exploration. With each exchanges and apply customized rules to re-sort the exchanges through
node representing an exchange, it is colored with respect to different this comparison view, which could compare by different measures.
continents, and its opacity varies to distinguish countries inside the According to the empirical theory of visualization, it is hard to get
same continent. The link between two nodes shows their connection as an accurate value estimation on the opacity channel. Therefore, we
is described in Section 5.2. Thicker links denote stronger connections provide an interaction from which the horizontal row inside bars will be
168 IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, VOL. 25, NO. 1, JANUARY 2019
expanded when users double-click a company in a bar, and the transaction amount on each day will be projected onto the length of the expanded line. While the rows encode every vertical line with equivalent length but different opacity, the expanded lines encode the daily send-versus-receive amounts of the month on their left with varying lengths, which enables users to better perceive the quantity through the length channel.

Justification: We considered some standard charts, for instance, line charts, before we adopted the current design. These alternative graphs are usually too dense and difficult to comprehend. Meanwhile, our comparison view can encode information on both vertical bars and horizontal lines, which provides a compact high-level summary of temporal patterns. Aggregating every month’s information on the vertical bars and zooming in on demand becomes necessary given the massive data volume in this context.

Fig. 3. Exchange network patterns on connection view. A) Summarized inter-exchange transactions from November 2015 to March 2016. B) Summarized inter-exchange transactions from May 2017 to December 2017. C) Bitfinex gradually reduced its direct connections with its partners. D) Kraken maintained some direct connections with its partners. Both Bitfinex (C) and Kraken (D) stopped sending Bitcoins to their direct partners after integrating with the BitGo service, as the red arrows disappear entirely.

6.5 Cross-view Interaction

BitExTract supports various interactions, empowering users with strong visual analytic abilities. The detailed explanation is as follows.

Temporal-related Interaction. In order to better convey the temporal patterns, our system by default shows the data spanning from 2011 to 2018. However, users can always zoom in to an interesting period on demand. We provide cross-view interaction from the MSV to the comparison view and the connection view. When users brush the time axis of the MSV, the data in the comparison view and the connection view are filtered accordingly. This further facilitates the exploration of the relationship between the tendency in the comparison view and the patterns in the connection view. For example, we can study how the change in surplus of an exchange is related to its connections with other exchanges (T4).

Exchange-centered Interaction. The exchanges list panel (Fig. 1B) serves as the main controller for selecting a specific exchange for analysis. Clicking on a card on the panel will trigger the MSV to display transactions with regard to that exchange, the comparison view to highlight the corresponding line, and the connection view to display the ego-view. Such interaction facilitates exchange-centered exploration by displaying visual patterns from multiple views at the same time.

7 EVALUATION

This section presents three case studies. The ultimate goal of BitExTract is to provide users with an efficient and effective tool to capture and analyze the movements of different scales in the Bitcoin exchange network. To test and evaluate the performance, we conducted three case studies with our domain experts, Bitcoin traders and researchers.

7.1 Detect and compare special periods

Since many people are interested in the development history of the Bitcoin market, the expert EA planned to review and summarize the key time periods in its history (T1). He was looking forward to dating active market periods, describing the market situation during these periods, and getting hints about the causes of this activity.

In this scenario, the expert EA used BitExTract to go through certain periods in which major shifts in the exchange market took place. He began by looking through the comparison view from left to right and noticed that there were two special periods in which the width of the bars increased significantly, which indicated an increase in the Bitcoin exchanges’ activity. As can be seen in Fig. 4, the Bitcoin exchange market showed an increase in activity during the periods of November 2015 to March 2016 and May 2017 to December 2017 (T1). The increase in the bars’ width shows an increasing number of total clients in the market.

Fig. 4. Comparison views ranked by transaction amount and filled by daily surplus. A) The Chinese exchanges’ rising tendency together with the increasingly active Bitcoin market from July 2015 to December 2015. B) The American exchanges’ polarized tendency together with the flourishing Bitcoin market from May 2017 to October 2017.

In order to identify the reasons behind these changes, EA then delved into a further comparison between these two periods. He first slid the time window to the first one. The connection view showed the overall market share comparison as well as the networking picture between all the exchanges, as shown in Fig. 3A. From this, he was able to notice that exchanges geographically located in Asia (and in particular in China) accounted for more than half of the market share for this period (T1). Aided by this finding from the connection view, EA moved back to the comparison view and filtered exchanges from China, as in Fig. 4. He found that many Chinese exchanges rose significantly in terms of transaction amount during these months (T5). Therefore, EA concluded that the flourishing period between 2015-11 and 2016-03 was mainly led by Chinese exchanges. This is in line with the market behavior during that period, and it was the result of certain market policies in China, in combination with the low economic growth in the greater Asia area, as validated by relevant news articles from the period (footnotes 1, 2).

1 https://bit.ly/2KceI67

For the second period, which was between May 2017 and December 2017, EA followed the same exploration path using the diagrams in Fig. 3B and Fig. 4B. According to Fig. 3B, EA found that USA-based exchanges accounted for most of the transactions, despite the fact that the number of such exchanges was relatively small compared to the number of exchanges based in Asia or Europe. For this period, Asian exchanges became less important (T1). Subsequently, in the comparison view EA filtered the bars to show the tendency of American exchanges. He found that this
view included a polarization. In particular, three of the American-based exchanges took the top positions on the transaction amount ranking list while others lay well below the average (T5). Therefore, EA concluded that this second flourishing period was led by a relatively small number of American exchanges. Again, this finding is supported by news articles from that time period (footnotes 3, 4).

7.2 The impact of a policy event

One of our expert co-authors, EB, who is a financial researcher interested in Bitcoin risk analysis, mentioned the effect that the introduction of a new policy can have on the Bitcoin market. One of the most influential such events was the Chinese government’s decision to prohibit Yuan-to-Bitcoin trading in September 2017 (footnote 5). In the period prior to the release of the official announcement by the Chinese government, various rumors had been circulating, impacting the Bitcoin market. We recorded how EB, EC and ED attempted a comprehensive comparison for this time period with the help of BitExTract.

In terms of the capitalization of the 2016-2017 cryptocurrency market, the China-based Huobi exchange was definitely the largest Bitcoin trading platform, both domestically and globally. In February 2017, the People’s Bank of China (the Chinese central bank) inspected Huobi and OKCoin (footnote 6). These were two of China’s most widely used exchanges, and they both announced their intent to suspend Bitcoin and Litecoin withdrawals, effective immediately after the inspection. After this, China’s central bank issued a warning to domestic exchanges, going so far as to state it would move to shutter startups that violated its guidance through the necessary government channels. Around four months later, Huobi announced that it would resume Bitcoin withdrawal services (footnote 7). This policy and this event changed the global Bitcoin exchange market to a significant extent.

[...] reflected especially in the Asian area. Note that dominance in the Bitcoin market shifted mostly to North America, followed by Europe (Fig. 5B) (T2). EC mentioned that North American exchanges seized this moment to attract Asian customers. Even though Huobi eventually recovered some connections with foreign exchanges, the connections with Asia-based ones remained unchanged. Also, the North American exchanges maintained the worldwide top place (Fig. 5C). ED then explained that Huobi was the central point of the Chinese market and, to some extent, even of the entire Asian region. Once Huobi suspended its services, the other Chinese exchanges failed to establish relations with other (internal or external) exchanges, possibly due to the regulations imposed by the Chinese government. Beyond these insights gained from the connection view, our experts also noticed an extreme drop in the “Network Standing” and “Transaction Amount” modes of the comparison view (Fig. 1A).

7.3 The different effects of BitGo adoption on Kraken and Bitfinex

How to choose a reliable exchange for Bitcoin investment is a major concern for almost every Bitcoin trader. Many people discuss this with each other via online platforms. For instance, there is a discussion raised by Quora users to explore the reliability of three big exchanges: Coinbase, Kraken and Bitfinex (footnote 8) (T5). However, online information is sometimes biased. For example, some people mentioned that Kraken and Bitfinex were good and safe because they had lots of currencies available and stored all their information offline. This kind of information is ubiquitous on the Internet, but it is too general and superficial. How many currencies count as “lots”, and why is storing information offline safer?

We consulted expert EA on the problem of comparing exchanges such as Bitfinex and Kraken. EA pointed out that these two exchanges were similar to each other because they both adopted the BitGo custodian service to accelerate their customer transactions. Both of them kept part of their Bitcoin in the BitGo wallet, which could process the transactions faster. However, EA had little idea about which exchange was more reliable because he didn’t know the detailed strategies used by these two. Therefore, we invited him to make a comparison via BitExTract.
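The comparison in this case study leans on the network standing index, which Section 6.4 describes as a linear combination of three factors whose weights α, β, γ are chosen by clicking a point inside a triangle. The sketch below shows one way such a weighting could work. It assumes that the clicked point is converted to weights with barycentric coordinates, and that the three factors are market share, partner connection, and wallet stability (taken here as one minus a normalized balance volatility), each scaled to [0, 1]. The names `Factors`, `barycentric_weights`, and `network_standing`, the factor-to-vertex mapping, and all sample numbers are hypothetical and not taken from the system.

```python
from dataclasses import dataclass

Point = tuple[float, float]


@dataclass(frozen=True)
class Factors:
    """Hypothetical per-exchange inputs, each assumed to be scaled to [0, 1]."""
    market_share: float
    connection: float   # strength of links to partner exchanges
    stability: float    # assumed to be 1 - normalized balance volatility


def barycentric_weights(p: Point, a: Point, b: Point, c: Point) -> tuple[float, float, float]:
    """Map a clicked point p inside triangle (a, b, c) to (alpha, beta, gamma).

    The weights sum to 1, and each weight grows as p approaches its vertex.
    """
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    if det == 0:
        raise ValueError("triangle vertices are collinear")
    alpha = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    beta = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return alpha, beta, 1.0 - alpha - beta


def network_standing(f: Factors, weights: tuple[float, float, float]) -> float:
    """Weighted linear combination of the three factors."""
    alpha, beta, gamma = weights
    return alpha * f.market_share + beta * f.connection + gamma * f.stability


# Assumed mapping: vertex A = market share, B = connection, C = stability.
A, B, C = (0.0, 0.0), (1.0, 0.0), (0.0, 1.0)

# Made-up snapshots in which only the stability factor changes.
before = Factors(market_share=0.30, connection=0.60, stability=0.90)
after = Factors(market_share=0.30, connection=0.60, stability=0.20)

for label, click in [("market share only", A), ("stability only", C), ("equal mix", (1 / 3, 1 / 3))]:
    w = barycentric_weights(click, A, B, C)
    drop = network_standing(before, w) - network_standing(after, w)
    print(f"{label}: drop in standing = {drop:.3f}")
```

Moving the clicked point toward one vertex isolates the corresponding factor, so a drop that shows up only near one vertex can be attributed to that factor. This is the kind of reasoning an analyst could apply when adjusting the weighting scheme to explain a change in standing.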
[...] connections with other exchanges. Turning his attention to Kraken, EA observed an interesting turning point around November 2015. Similar to Bitfinex, Kraken also reduced its connections with other exchanges and no longer sent out Bitcoin. It only received Bitcoins from its partners after that time. EA swept over these time periods to see what happened to these exchanges. With the help of our real-time news plugin, EA learned that precisely on June 4th, 2015 Bitfinex started using the BitGo service (footnote 9), and on November 10th, 2015 Kraken also integrated with BitGo for multi-sig security (footnote 10). Then on August 2nd, 2016 Bitfinex reported being hacked in an attack that resulted in approximately 70 million US dollars’ worth of Bitcoins being stolen (footnote 11), and that may have been partially related to its BitGo implementation (footnote 12) (T6).

9 https://bit.ly/2MOxb7f
10 https://blog.kraken.com/post/247/kraken-integrates-with-bitgo/
11 http://fortune.com/2016/08/03/bitcoin-stolen-bitfinex-hack-hong-kong/
12 https://themerkle.com/is-bitgo-to-blame-for-the-bitfinex-bitcoin-theft/

Networking evolution comparison using the connection view. By focusing on the connection views during different time periods, EA found additional details about these periods, as in Fig. 3C and Fig. 3D. After integrating the BitGo service, the connections of Bitfinex and Kraken with other exchanges grew “weaker”, as they no longer sent Bitcoins directly to any of their partners but used BitGo’s wallet as an intermediary. The network of Bitfinex kept shrinking, leaving it with fewer and fewer direct partners, while Kraken (following a more conservative strategy regarding BitGo) still maintained a number of direct partners (T3).

Fig. 7. Bitfinex’s and Kraken’s network standing. Bitfinex’s network standing drops significantly after integrating with the BitGo service in June 2015. Kraken’s network standing remained high after integrating with the BitGo service in November 2015.

Network standing comparison using the comparison view. Switching to the comparison view (Fig. 7), EA attempted to infer the tendency of the network standing of Bitfinex and Kraken. Around the first timepoint, he observed that the network standing of Bitfinex shows a steep decrease (T5). This indicates that Bitfinex’s wallet was at higher risk after the adoption of the BitGo wallet service. By adjusting the weighting scheme, he found that this drop was mainly caused by the increase in balance volatility, since the market share and the connection with its partners did not drop significantly. Bitfinex was still using its old wallet to serve its customers. Apart from that, it first sent Bitcoins to its BitGo wallet and then to its customers, so its market share remained roughly the same. The only difference was that Bitfinex maintained only a small amount of Bitcoin in its old wallet, which made its wallet volatility quite high, thus affecting its network standing. On the other hand, Kraken kept most of its Bitcoin in its own wallet and transferred only a necessary amount to the BitGo hot wallet; therefore, its network standing remained virtually unaffected.

8 DISCUSSION

Three case studies demonstrate the effectiveness and usability of our system in extracting intelligence about Bitcoin exchanges. Overall, our collaborating experts are satisfied with its strong analytical abilities, especially with its straightforward visualization. The massive sequence view presents a whole picture of the Bitcoin world based on transactions between exchanges and clients. Supplementary information is provided by the news panel. The connection view displays the business proximity with a highly interactive design. Smooth interactions with prompt visual feedback allow users to explore from the worldwide situation to the inner-continent connections. The individual exchange is further displayed in the comparison view considering different attributes. For now, the order within each bar only reflects exchanges’ relative status instead of the absolute values. Compared with standard data analysis software, our system has already processed the raw transaction data and provides a comprehensive analysis of the whole market and of individual exchanges with well-coordinated views. To the best of our knowledge, BitExTract is the first visualization system analyzing Bitcoin exchanges, and it closely follows domain tasks and requirements. Both domain experts and general users with basic knowledge about Bitcoin could step into this world via BitExTract.

However, it still faces several limitations. Firstly, scalability was the major concern when designing BitExTract. For example, we show a maximum of approximately 10 months at once in the comparison view. Although scrolling in this view enables users to explore more than that, such interactions still increase users’ memory burden. A possible improvement could be aggregating transactions at different time granularities according to the length of the selected time period. The massive sequence view currently gives a clear demonstration of the evolution of 60 exchanges. However, this view might not perform as well when the number of exchanges is much larger than this. Displaying only dominant exchanges could be an acceptable trade-off, which would still indicate the overall patterns of the whole market. Besides, there might be severe visual clutter in the connection view in some extreme cases, such as showing the transactions of a major exchange over a long period, although this can be solved by grouping the arrows. Another limitation lies in data uncertainty. Due to the anonymity properties of the blockchain, identifying all the public keys belonging to a specific exchange with 100% accuracy seems impossible, despite the use of state-of-the-art heuristic algorithms for classification. Although this factor introduces uncertainty into the data, we believe the exchanges examined in our system are a representative subset of the current Bitcoin market. This is further supported by our findings in Section 7. Thirdly, the three attributes used to define the network standing index are derived from our experts’ and senior Bitcoin traders’ empirical knowledge. They might be partially incomplete and subjective.

Besides, the current BitExTract system can be extended in many promising directions. First, most exchanges support trading in multiple cryptocurrencies. We can further apply our visualization approach to other cryptocurrency analyses and dig into exchanges’ business intelligence. Secondly, the design of the comparison view could be extended. Owing to the characteristics of financial attributes, multiple sortable parallel bars which encode alternative attributes on a timeline are suitable for other time-varying, multivariate transaction data.

9 CONCLUSION

In this work, we delved into the visual analysis of the Bitcoin market development, encompassing the evolution of Bitcoin exchanges. Close cooperation with domain experts allowed us to characterize two major targeted problems, namely, exchange selection priority for participants and exchange network evolution. Thus, we developed BitExTract and integrated this highly interactive visualization system into the exploration process to tackle these problems. We further validated our proposed system through three representative case studies and four well-designed structured interviews with domain experts. The results indicated that BitExTract performs efficiently in Bitcoin exchange intelligence extraction and cryptocurrency transaction data exploration.

In the future, we plan to integrate short-term impact analysis, e.g., of cryptographic attacks, Bitcoin forks, or dramatic price fluctuations. These kinds of events take place frequently and influence the whole network thoroughly. Besides, the investigation of clients is an essential part of trading analysis. The analysis would be more comprehensive if users’ behaviour patterns could be revealed. Moreover, beyond the exchanges, analysis of other communities like pools, dark markets or gambling websites could also reveal insights into the cryptocurrency world.

10 ACKNOWLEDGEMENTS

We would like to thank all the reviewers and domain experts for their valuable comments. This work is supported in part by grant RGC GRF 16208514 and NSFC grants 61572488 and 61673241.
YUE ET AL.: BITEXTRACT: INTERACTIVE VISUALIZATION FOR EXTRACTING BITCOIN EXCHANGE INTELLIGENCE 171