Daniel Fenn
St. Anne’s College
University of Oxford
Trinity 2010
To my parents
Acknowledgements
I would like to thank HSBC bank and the EPSRC for funding this work.
I am grateful to Mark McDonald and Stacy Williams for helping me to
understand the intricacies of the financial data and for many useful (and
occasionally enjoyable) discussions. I would particularly like to thank Nick
Jones and Mason Porter for guiding this research and for the countless sug-
gestions and insights that have helped to shape the ideas that I present.
I would also like to thank them for comments on various manuscripts,
which encouraged me to make my explanations clearer and my arguments
more precise. I am also grateful to Sam Howison for helpful conversations
and guidance and to J.-P. Onnela for many fruitful discussions and sug-
gestions. Finally, I would like to thank Neil Johnson and Peter Mucha for
valuable input at various stages of this work.
Abstract
Many systems studied in the biological, physical, and social sciences are
composed of multiple interacting components. Often the number of com-
ponents and interactions is so large that attaining an understanding of
the system necessitates some form of simplification. A common represen-
tation that captures the key connection patterns is a network in which
the nodes correspond to system components and the edges represent in-
teractions. In this thesis we use network techniques and more traditional
clustering methods to coarse-grain systems composed of many interacting
components and to identify the most important interactions.
This thesis focuses on two main themes: the analysis of financial systems
and the study of network communities, an important mesoscopic feature
of many networks. In the first part of the thesis, we discuss some of the
issues associated with the analysis of financial data and investigate the
potential for risk-free profit in the foreign exchange market. We then use
principal component analysis (PCA) to identify common features in the
correlation structure of different financial markets. In the second part of
the thesis, we focus on network communities. We investigate the evolving
structure of foreign exchange (FX) market correlations by representing the
correlations as time-dependent networks and investigating the evolution
of network communities. We employ a node-centric approach that allows
us to track the effects of the community evolution on the functional roles
of individual nodes and uncovers major trading changes that occurred in
the market. Finally, we consider the community structure of networks
from a wide variety of different disciplines. We introduce a framework
for comparing network communities and use this technique to identify
networks with similar mesoscopic structures. Based on this similarity,
we create taxonomies of a large set of networks from different fields and
individual families of networks from the same field.
Publications
Much of the work in this thesis has either been published or is under review in submitted manuscripts. Details of these publications are given below.
∗ These authors are listed as joint first authors on these papers. I performed all of the analysis that we describe in publication [P4] and that I present in this thesis.
I have undertaken additional research during my D.Phil. that I do not include in this thesis due to its disparate nature. For completeness, I list the publications resulting from this work below.
The research in this thesis is a result of collaboration between myself and my coauthors
on the listed publications. My collaborators have helped develop the ideas described
in this thesis, but I have performed all of the analysis leading to the results that I
present.
Contents

1 Introduction
 1.1 Networks
  1.1.1 Topology and weighted networks
  1.1.2 Community structure
  1.1.3 Dynamics of and on networks
 1.2 Financial systems
 1.3 Outline
7 Conclusions
 7.1 Outlook
References

List of Figures
D.1 The problem with summing over all i, j in the Potts Hamiltonian
Chapter 1
Introduction
1.1 Networks
A variety of systems studied across a range of academic disciplines are composed of
multiple components that interact with each other in some way. Often these systems
are described as complex [47]. Although there is no precise definition of a complex
system, roughly speaking, a system is considered complex if it possesses many parts,
whose behaviours are highly variable and strongly dependent on the behaviours of
the other parts [270,274]. Many authors also agree that for a system to be considered
complex, it should possess emergent properties that arise through the interactions of
the components in the absence of any central controller [13]. However, the concept of
emergence is also slippery and there is currently no standard definition [25, 51, 172].
Irrespective of the precise definitions of complex systems and emergence, for systems composed of interacting components, the pattern of connections between the components is often crucial to the behaviour of the system. The system cannot be
understood by studying the parts in isolation; it is essential to consider the interac-
tions.
When studying systems that possess many components and interactions, it is often necessary to make the analysis tractable by focusing on a subset of key interactions. A common way of studying the pattern of interactions
in a given system is to construct a network (or graph) in which the components
are represented as nodes and the connections are represented as edges [9, 60, 217,
223].1 A network is therefore a simplified representation that reduces a system to a
structure that captures only the key connection patterns; however, information is lost
in the simplification process, so for any analysis to be meaningful it is important to
ensure that the discarded details are not critical to the properties of the system being investigated. Networks can take different forms: they can be embedded in Euclidean space, such as airline networks and neural networks; or they can be defined in an abstract space, such as social networks2 and language networks [46].

1 Nodes are sometimes referred to as vertices and edges as links. In this thesis, we use these terms interchangeably.
Traditionally, the study of networks lay within the domain of graph theory [49],
which is usually considered to date back to 1736 when Euler published a solution to
the Königsberg bridge problem. Initially graph theory focused on regular graphs, but
since the 1950s graph theorists have also investigated random graphs [50]. This shift
was stimulated by the work of Rapoport [249, 250, 277] and Erdős and Rényi (ER) [86–88] on
a simple random graph model. In the model, one begins with N nodes and connects each pair of nodes uniformly at random with probability p, creating a graph which on average has pN(N − 1)/2 edges distributed at random.
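The expected edge count in this model is easy to check with a short simulation. The following sketch uses illustrative parameter values and is not code from the thesis:

```python
import random

def erdos_renyi(n, p, rng):
    """Connect each of the n*(n-1)/2 node pairs independently with probability p."""
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                edges.append((i, j))
    return edges

rng = random.Random(42)
n, p = 200, 0.1
edges = erdos_renyi(n, p, rng)
expected = p * n * (n - 1) / 2  # = 1990 for these parameters
print(len(edges), expected)
```

With these parameters, the realized edge count should fall close to the expected 1990, since the number of edges is binomially distributed with a standard deviation of roughly 42.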
In addition to the developments in mathematical graph theory, beginning in the
1920s, social scientists started to use networks to study the relationships between
social entities, e.g., [76, 98, 111, 213, 249, 251, 258, 304]. Because of the difficulty in
collecting and analyzing large data sets, most of the early studies of social networks were small and the networks usually included only tens of nodes. In many of these studies, the social scientists were interested in answering questions relating to the meaning of edges in the networks, such as whether they arose through friendship, obligation, strategic alliance, or something else [127].
In the late 1990s, a surge of interest in network research across a wide range of
disciplines [225] was sparked by the publication of seminal papers by Watts and Stro-
gatz [305] and Barabási and Albert (BA) [26]. These and subsequent developments
in network science were made possible by two key factors: (1) the computerization of
data acquisition, which meant that it was significantly easier to collect data for large
networks, and (2) the increase in computational resources, which enabled researchers
to analyze these data sets. An important observation from the new data was that,
in contrast to ER random graphs, real-world networks often possess significant inhomogeneities.3 For example, many networks display the “small-world” phenomenon: despite containing a large number of nodes, there is often a relatively short path between any two nodes (i.e., a small average path length). This phenomenon was studied by Milgram [209], whose work spawned the phrase “six degrees of separation”. Watts
and Strogatz observed that networks that possess the small-world property often also
show high levels of clustering – two nodes with a common neighbour are more likely to be connected [305]. Another important observation was that the degree distribution of many real-world networks significantly deviates from the Poisson distribution expected for random graphs, with some nodes having significantly more edges than expected. This led Barabási and Albert [26] to propose a preferential-attachment model for network growth (in which new edges attach with higher probability to high-degree nodes) that produces networks with a power-law degree distribution.4

2 Social networks can sometimes contain implicit geographical information.
3 An example of a homogeneous property of ER random graphs is the degree (number of neighbours) of each vertex. Because there is an equal probability of all edges existing, most nodes in ER networks have similar degrees.
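A minimal sketch of degree-proportional attachment in the spirit of the BA model (the parameters and implementation details below are illustrative, not those of Ref. [26]):

```python
import random

def preferential_attachment(n, m, rng):
    """Grow a network: each new node attaches m edges to existing nodes,
    chosen with probability proportional to their current degree."""
    # Start from a small clique of m + 1 nodes.
    targets = []  # each node appears once per unit of degree
    edges = []
    for i in range(m + 1):
        for j in range(i + 1, m + 1):
            edges.append((i, j))
            targets += [i, j]
    for new in range(m + 1, n):
        # Sampling uniformly from `targets` implements degree-proportional choice.
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(targets))
        for t in chosen:
            edges.append((new, t))
            targets += [new, t]
    return edges

rng = random.Random(0)
edges = preferential_attachment(500, 2, rng)
degree = {}
for u, v in edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1
mean_deg = 2 * len(edges) / len(degree)
# Heavy tail: a few hubs acquire degrees far above the mean.
print(max(degree.values()), round(mean_deg, 2))
```

In contrast to the ER model, where most degrees cluster near the mean, a run of this sketch yields a maximum degree several times the mean, reflecting the heavy-tailed distribution.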
A third possibility is the comparison of networks for interactions aggregated over non-overlapping time windows, e.g., [92].
The spreading processes on networks (such as the spread of diseases, information,
or rumours) mentioned earlier in this section in the context of network topology are
particular examples of the more general concept of dynamical systems on networks
[223]. A dynamical system is any system whose state, as represented by some set of
variables, changes over time according to a set of rules or equations [281]. Typically,
dynamical systems on networks consist of independent dynamical variables associated
with each node that are only coupled together along the edges of the network. Many
real-world processes can be represented as dynamical systems operating on a network; examples include the flow of traffic on roads, the flow of electricity over power grids, and the changing concentrations of metabolites in cells [223]. One particular area of focus is the study of
the synchronization of coupled oscillators on networks, which represents an important
feature of many real-world systems [15,46]. For example, evidence suggests that there
is a pathological synchronization of neural populations during epileptic attacks [46].
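As a toy illustration of a dynamical system on a network (not an analysis performed in this thesis), the following sketch evolves identical Kuramoto-style phase oscillators coupled along the edges of a small complete graph; with positive coupling they synchronize:

```python
import cmath
import math
import random

def kuramoto_step(theta, adj, coupling, dt):
    """One Euler step of identical phase oscillators coupled along edges."""
    n = len(theta)
    new = []
    for i in range(n):
        drive = sum(math.sin(theta[j] - theta[i]) for j in adj[i])
        new.append(theta[i] + dt * coupling * drive)
    return new

def order_parameter(theta):
    """r = |mean of exp(i*theta)|: 1 means full synchrony, near 0 means disorder."""
    return abs(sum(cmath.exp(1j * t) for t in theta)) / len(theta)

rng = random.Random(1)
n = 6
adj = [[j for j in range(n) if j != i] for i in range(n)]  # complete graph
theta = [rng.uniform(0, 2 * math.pi) for _ in range(n)]
r0 = order_parameter(theta)
for _ in range(2000):
    theta = kuramoto_step(theta, adj, coupling=0.5, dt=0.05)
r1 = order_parameter(theta)
print(round(r0, 3), round(r1, 3))
```

The coupling along edges is the key structural feature: each oscillator's state depends only on its network neighbours, which is the general form of the node-coupled dynamical systems described above.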
In this thesis, we investigate the properties of networks possessing each of the
characteristics described in the previous three sections. In Chapter 5, we analyze
the evolving community structure of a dynamic, weighted network, and in Chapter 6
we compare the community structures of a wide variety of weighted and unweighted
networks.
Several studies have attempted to tackle this problem using networks. For example,
networks have been used to analyze the trade relationships between nations [275] and
liabilities in the inter-bank lending market [52]. Perhaps the most common application of networks to financial markets is in the study of the relationships between the
price time series of financial assets, e.g., [197,198,229]. In this approach, each node in
the network represents an asset and each weighted edge represents a time-dependent
correlation between the asset price time series.
In Chapter 5, we study FX market networks in which each node represents an
exchange rate and the edges represent the correlations between rates. An argument
made to justify the study of networks constructed from asset price time series is as
follows [228]. In markets, traders repeatedly compete for a limited resource, as they
buy and sell assets, with the exact timing of these trading decisions often driven by exogenous events, such as news announcements and scheduled economic data releases. Although the exact nature of the interactions between market
participants is often not known, the asset prices should reflect the complex pattern of
actions, feedback, and adaptation of traders, so the price time series can be considered
as the manifestation of these interactions. Under these assumptions, instead of the
nodes representing the interacting components (i.e., the traders), they represent the
resource that the components are competing for (i.e., the assets) and instead of the
edges representing the interactions between the components (i.e., buying and selling
actions) they represent correlations in a signal that results from this process (i.e., the
asset price).
Irrespective of the exact relationship between the network and the underlying
financial system, it is insightful to investigate networks based on the correlations
between different assets. In fact, this example demonstrates the power of the network
framework. Because we work with networks in an abstract form, the tools of network
analysis can in theory be applied to any system that can be represented as a network
[223]. In essence, network methods are simply a set of techniques for studying and identifying patterns in data generated by interacting systems. Of course, the insights that can be gained using a network approach depend on the suitability of the technique to the problem, and for some systems other methods will be more appropriate.
1.3 Outline
This thesis is organized into six additional chapters. In each chapter in which we
present new research we provide an overview of the relevant literature and a motivation for the work. The chapters are largely self-contained and can be read in isolation.
However, a continuous thread runs through the thesis as we move from an analysis of
financial systems to an investigation of communities in financial systems to a study
of communities in systems from a wide variety of different disciplines.
In Chapter 2, we discuss some of the problems associated with analyzing financial
data and present a study in which the type of data used is critical to the output of the
analysis. We also describe the FX market, which is the focus of Chapters 2 and 5. The
results of Chapter 2 answer a question of particular interest to market practitioners
regarding the possibility of making risk-free profit in the FX market. In Chapter 3, we continue to investigate financial markets, but we extend the analysis to include assets
from a variety of different markets. We study the correlation structure across these
different markets by using principal component analysis to coarse-grain the data and
identify common features. We then study the way in which these relationships evolve
through time and discuss how the features are affected by different market events.
In the remainder of the thesis, we focus on communities in networks. In Chapter 4
we describe some of the most widely used techniques for detecting communities in
networks and present a relatively comprehensive review of the literature on communi-
ties in dynamic networks. In Chapter 5, we study the structure of the FX market by
representing the correlations between currency exchange rates as time-dependent net-
works and investigating the evolution of network communities. We propose a method
for tracking communities in dynamical networks and use this approach to identify
significant changes in the structure of the FX market. In Chapter 6, we investigate
the community structure of networks from a range of different disciplines, including
biology, sociology, politics, and finance, and introduce a framework for comparing
network communities. We use this technique to identify networks with similar meso-
scopic structures. Based on this similarity, we create taxonomies of a large set of
networks from different fields and individual families of networks from the same field.
Finally, in Chapter 7, we offer some conclusions and suggest some possible directions
for future research.
Chapter 2
Triangular Arbitrage in the FX Market
The work described in this chapter has been published in reference [P2]. We highlight
that this is an empirical chapter and that the analysis we present is not technical. However, this simplicity serves to emphasize one of the main purposes of this chapter, which is
to demonstrate that one needs to exercise caution when analyzing financial data. If
one uses data that is inappropriate for a particular analysis, it is easy to reach false
conclusions. We show how even for the simplest financial questions, seemingly similar
data can produce very different results. In demonstrating this, we answer a question
of interest to financial market practitioners.
2.1 Introduction
The advance in computing power during the last two decades has facilitated the stor-
age and analysis of increasingly large data sets. The increased storage capacity is
particularly useful in financial markets because, as well as enabling market partici-
pants to record details of executed transactions, institutions are now able to record
additional market information even if a trade is not executed (such as the best avail-
able price and the volume available at this price). The increased computing power has
also enabled exchanges to publish prices at increasingly higher frequencies, with some
exchanges in the FX market now publishing price updates every 250 milliseconds.
The availability of these enormous, accurate, high-frequency data sets has pro-
vided economists and financial mathematicians with unprecedented resources to test
their models and has resulted in many researchers from other disciplines studying financial markets.
In the FX market, prices are quoted as exchange rates of the form XXX/YYY, which indicate the amount of
currency YYY that one would receive in exchange for one unit of currency XXX. In
this thesis, we refer to currencies with the standard three letter abbreviations (tickers)
used to identify them in the FX market. The codes for the currencies we study are
USD - U.S. dollar, CHF - Swiss franc, JPY - Japanese yen, EUR - euro, DEM -
German mark, AUD - Australian dollar, CAD - Canadian dollar, XAU- gold2 , GBP
- pounds sterling, NZD - New Zealand dollar, NOK - Norwegian krone, and SEK -
Swedish krona. In contrast to most other markets, the FX market is liquid 24 hours
a day.3 There are two prices quoted for an exchange rate: a bid and an ask price. These give the prices at which one can sell and buy currency, respectively, with the ask price tending to be larger than the bid price. The exchange rate between
EUR and USD may, for example, be quoted as 1.4085/1.4086. A trader then looking
to convert USD into EUR might have to pay 1.4086 USD for each EUR, while a
trader looking to convert EUR to USD may receive only 1.4085 USD per EUR. The
difference between the bid and ask prices is the bid-ask spread.
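The quoting convention can be made concrete with a small sketch using the example quote above; the function names are my own:

```python
def usd_to_eur(usd, ask):
    """Buy EUR with USD: pay the ask price (USD per EUR)."""
    return usd / ask

def eur_to_usd(eur, bid):
    """Sell EUR for USD: receive the bid price (USD per EUR)."""
    return eur * bid

bid, ask = 1.4085, 1.4086   # example EUR/USD quote from the text
eur = usd_to_eur(1_000_000, ask)   # EUR received for 1 million USD
usd_back = eur_to_usd(eur, bid)    # USD received converting straight back
cost = 1_000_000 - usd_back        # a round trip loses the bid-ask spread
print(round(eur, 2), round(cost, 2))
```

Converting 1 million USD to EUR and immediately back costs about 71 USD here, which is the bid-ask spread expressed on the traded amount; this cost is why the small deviations from parity studied later do not automatically imply profit.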
by a few basis points.4 The main purpose of an indicative price is to supply clients of
banks with a gauge of where the price is. A large body of academic research into the
FX market has been performed using indicative quotes, often under the assumption that, due to reputational considerations, “serious financial institutions” are likely to trade at exactly the quoted price, especially if they are hit a short time after the quote is posted [73, 74, 138]. The suitability of indicative quote data for certain analyses has, however, been drawn into question, e.g., [193, 200]. In Ref. [193], Lyons
highlights some of the key problems with indicative prices: indicative prices are not transactable; the indicative bid-ask spread, despite “bracketing” the actual tradeable spread, is usually two to three times as large (i.e., the tradeable bid and ask prices usually lie between the indicative bid and ask prices); during periods of
high trading intensity market makers are too busy to update their indicative quotes;
and market makers themselves are unlikely to garner much of their high-frequency
information from indicative prices. In the FX market today, indicative prices are typically updated by automated systems; nevertheless, the quoted price is still not necessarily a price at which one could actually execute a trade.
Goodhart et al. [131] performed a comparison of indicative bid-ask quotes from
the Reuters FXFX page and executable prices from the Reuters D2000–2 electronic
broking system over a 7-hour period and found that the behaviour of the bid-ask
spread and the frequency at which quotes arrived were quite different for the two
types of quote. In particular, the spread from the D2000–2 system showed greater
temporal variation, with the variation dependent upon the trading frequency. In
contrast, the indicative price spread tended to cluster at round numbers, a likely
artifact of the use of indicative prices as a market gauge. This discrepancy between
indicative and executable prices is likely to be less important if one is performing a
low frequency study, arguably down to time scales of 10–15 minutes [138]. If, however,
one is considering very high-frequency data, this difference becomes highly significant.
For example, in Ref. [130] Goodhart and Figliuoli find a negative first-order auto-
correlation in price changes at minute-by-minute frequencies using indicative data.
In Ref. [131], however, Goodhart et al. find no such negative auto-correlation when real transaction data are used. Indicative data seem particularly unsuitable to many market
analyses today because banks are now able to provide their clients with automated
executable prices through an electronic trading platform, so there is even less incentive for them to make their indicative quotes accurate.

4 A basis point is equal to 1/100th of a percentage point. In this thesis we also discuss points, where a point is the smallest price increment for an exchange rate. For example, for the EUR/JPY exchange rate, which takes prices of the order of 139.60 over the studied period, 1 point corresponds to 0.01. In contrast, for the EUR/USD rate, with typical values around 1.2065, 1 point corresponds to 0.0001.
2.3 Data
The data we use for the analysis consists of second-by-second executable prices for
{EUR/USD, USD/CHF, EUR/CHF, EUR/JPY, USD/JPY}. We investigate trian-
gular arbitrage opportunities for the transactions involving {EUR/USD, USD/CHF,
EUR/CHF} and {EUR/USD, USD/JPY, EUR/JPY} for all week days over the pe-
riod 02/10/2005–27/10/2005 and we compare the results with those for two earlier
periods: 27/10/2003–31/10/2003 and 01/10/2004–05/10/2004.5 The full data set
consists of approximately 2.6 million data points for each of the rate products γ1 and γ2, 5.2 million data points for each of the currency groups, and 10.4 million data points in total. A rate product, indicating whether or not a triangular arbitrage opportunity existed, was found for each of these points. We show a sample of one of the
sets of exchange rates and the corresponding time series of bid-ask spreads in Fig.
2.1.
Figure 2.1: Exchange rate time series for EUR/USD, USD/CHF, and EUR/CHF on 12/10/2005. Upper: bid and ask prices. Lower: bid-ask spread. Each marker represents the spread at a single time step. The vertical axes have been truncated to make the detail around the typical values clearer.
5 All times in this thesis are given in GMT. The full day 28/10/2005 is excluded from the analysis for the JPY group of exchange rates due to an error with the data feed on this day. During periods of lower liquidity it is possible that there were times at which no party was offering a bid and/or ask price. At these times it would not have been possible to complete a triangular transaction involving the missing exchange rate, so we set the associated rate product to zero.
Figure 2.2: Rate product evolution for the period 03/10/2005–07/10/2005 for the transaction EUR→USD→JPY→EUR. Upper: all rate products, with a few extreme values removed so that the structure around the typical values is clearer. All points above the red line correspond to potential triangular arbitrages. Lower: the same plot truncated vertically at γ = 1 so that each spike represents an arbitrage opportunity.
The distributions in Fig. 2.3 show that, as expected, the rate product tends to be
slightly less than one and typically γ ∈ [0.9999, 1]. The log-linear plots also highlight
that the distributions possess long tails extending to smaller values of the rate product
and that there are some times when γ > 1. This means that for the majority of
deviations from triangular parity the individual exchange rates are shifted in such a direction that triangular arbitrage is not possible, but that occasionally potential profit opportunities do occur. Over the four-week period analyzed there are 10,018 triangular arbitrage opportunities for the two CHF-based transactions given by Eqs. (2.2) and (2.3) and 11,367 for the equivalent JPY transactions.

6 Triangular parity implies that the direct exchange rate is equal to the exchange rate generated through the cross-rates. For example, EUR/USD = (EUR/JPY)/(USD/JPY), where one needs to use the correct bid and ask prices to construct the synthetic exchange rate.
We now establish both the duration and magnitude of these potential arbitrages
and attempt to determine whether or not they represent genuine, executable profit
opportunities.
Figure 2.3: Occurrence frequency for rate products of different magnitudes for the period 02/10/2005–27/10/2005. Upper: aggregated results for both JPY transactions and CHF transactions. Any parts of the histograms to the right of the line at γ = 1 correspond to potential triangular arbitrages. The JPY panels show all data points within this period and the CHF panels all points except a few at very small and very large γ. Lower: the same distributions on a log-linear scale.
2.4.2 Durations
First, we consider the length of the periods for which γ > 1 and thus over which triangular arbitrage opportunities exist. We define an X-second arbitrage as one for which γ > 1 for more than X − 1 seconds but less than X consecutive seconds. In Fig. 2.4,
we show the distributions of the observed durations of arbitrage opportunities and
we provide summary statistics for these distributions in Table 2.1. The vast majority
of arbitrage opportunities are very short in duration; although some opportunities
17
Chapter 2 | Triangular Arbitrage in the FX Market
appear to exist for in excess of 100 seconds, for both currency groups 95% last for 5 seconds or less and 60% for 1 second or less.
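The duration labelling described above amounts to a run-length computation over the second-by-second rate products. A minimal sketch, with made-up γ values for illustration:

```python
def arbitrage_durations(gammas):
    """Lengths (in consecutive one-second samples) of maximal runs with gamma > 1."""
    durations, run = [], 0
    for g in gammas:
        if g > 1:
            run += 1
        elif run:
            durations.append(run)  # a run just ended; record its length
            run = 0
    if run:
        durations.append(run)      # handle a run that reaches the end of the data
    return durations

# Hypothetical second-by-second rate products.
gammas = [0.9999, 1.00003, 1.00001, 0.9998, 1.00002,
          0.99995, 1.00001, 1.00002, 1.00004, 0.9999]
print(arbitrage_durations(gammas))  # → [2, 1, 3]
```

A run of length k corresponds to a k-second arbitrage in the terminology above, and a histogram of the returned durations is the kind of distribution shown in Fig. 2.4.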
Figure 2.4: Distributions showing the durations of arbitrage opportunities for the period 02/10/2005–27/10/2005 for JPY (upper) and CHF (lower) transactions. The insets show the region of the distributions for arbitrage opportunities with durations of between 1 and 10 seconds.
Table 2.1: Summary statistics for the duration of arbitrage opportunities for the
two JPY and two CHF transactions for the period 02/10/2005–27/10/2005. An
opportunity labelled as Xs lasted for more than X − 1 but less than X seconds.
position in USD and a short position in CHF.7 The trader may choose to unwind8 this
position immediately by converting USD into CHF and this transaction will cost the
amount by which the price has moved from the arbitrage price. Over a short time-
scale, this is likely to be 1–2 points (approximately 1.5–2 basis points). Incomplete
arbitrage transactions therefore typically cost a small number of basis points.
The extremely short time scales involved in these trades mean that the physical distance between the traders and the location where their trades are filled is important
in determining which trade arrives first and is completed at the arbitrage price. This
explains why a number of exchanges have begun to offer the possibility of locating
trading systems on their premises. A trader has a higher chance of completing an
arbitrage transaction for opportunities with longer durations because the arbitrage
prices remain active in the market for longer. When an arbitrage signal is received,
however, there is no way of knowing in advance how long the arbitrage will exist.
Over half of all arbitrage opportunities last for less than 1 second, so there is a high
probability that any signal that is traded on is generated by an opportunity of less
than a second. This includes many opportunities that last for only a few milliseconds.
For these opportunities there is a smaller chance of the transaction being completed
at an arbitrage price. For each attempted arbitrage, one cannot eliminate the risk
that one of the prices will move to an arbitrage-free price before the transaction is
completed.
7 In market parlance, a trader buying an asset is opening a long position and a trader selling an asset is opening a short position.
8 This is the closure of an investment position by executing the opposite transaction. For example, if a trader has bought an asset A, they can unwind their position in A by selling the asset.

2.4.3 Magnitudes
Given these risks, one possible criterion that could be used to decide whether or not to trade is the magnitude of the apparent opportunity. If the value of the rate product is large, and thus it appears that a significant profit could potentially be gained, one may decide that the potential reward outweighs the associated risks and execute the arbitrage transactions. In this section we consider the magnitudes of the arbitrage opportunities.
Table 2.2: The number and mean duration of arbitrage opportunities exceeding dif-
ferent thresholds for the two JPY transactions and two CHF transactions for the
period 02/10/2005–27/10/2005. A one basis point threshold corresponds to a rate
product of γ ≥ 1.0001.
Table 2.2 demonstrates that most arbitrage opportunities have small magnitudes, with 94% less than 1 basis point for both the JPY and CHF transactions. An arbitrage opportunity
of 1 basis point corresponds to a potential profit of 100 USD on a 1 million USD
trade. A single very large trade (or a large number of smaller trades) would thus be
required in order to realize a significant profit on such an opportunity. Large volume
trades are, however, often not possible at the arbitrage price. For example, consider
the transaction EUR→USD→JPY→EUR at a time when EUR/USDbid = 1.2065,
USD/JPYbid = 115.72 and EUR/JPYask = 139.60, resulting in γ = 1.000115903. If
there are only 10 million EUR available on the first leg of the trade at an arbitrage
price then the potential profit is limited to 1, 159 EUR. In practice, the amount
available at the arbitrage price may be substantially less than 10 million USD and
consequently the potential profit correspondingly smaller.
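The rate-product arithmetic in this example can be sketched as follows (an illustrative helper, not code from this thesis; the quoted prices are those given above):

```python
# Illustrative sketch (not thesis code): the rate product for the triangular
# transaction EUR -> USD -> JPY -> EUR, using the quotes given in the text.
# A rate product gamma > 1 signals a potential triangular arbitrage.

def rate_product(eur_usd_bid, usd_jpy_bid, eur_jpy_ask):
    """Rate product gamma for EUR -> USD -> JPY -> EUR."""
    return eur_usd_bid * usd_jpy_bid / eur_jpy_ask

def potential_profit(stake_eur, gamma):
    """Profit (in EUR) from converting stake_eur around the full loop."""
    return stake_eur * (gamma - 1.0)

gamma = rate_product(1.2065, 115.72, 139.60)
print(round(gamma, 9))                       # ~1.000115903
print(round(potential_profit(10e6, gamma)))  # ~1159 EUR on a 10 million EUR stake
```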
This calculation also assumes that it is possible to convert the full volume of
currency at an arbitrage price for each of the other legs of the transaction. In practice,
however, the volumes available on these legs will also be limited. For example, again
consider the case where there are 10 million EUR available at an arbitrage price on
the first leg of the above transaction. If the full 10 million are converted into USD,
the trader will hold 12.065 million USD. There may, however, only be 10 million USD
available at an arbitrage price on the next leg of the trade. In order for the full volume
to be traded at an arbitrage price, the trader should therefore limit the initial EUR
trade to 10/1.2065 = 8.29 million EUR. The volume available on the final leg of the
trade would also need to be considered in order to determine the total volume that
2.4 | Arbitrage properties
can be traded at an arbitrage price. This volume and the total potential profit are
therefore determined by the leg with the smallest available volume.
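The leg-by-leg volume constraint described above can be sketched as follows (a hypothetical helper, not thesis code; each leg's available volume is propagated back to the starting currency):

```python
# Sketch (hypothetical helper, not thesis code): the volume that can be traded
# at an arbitrage price is capped by the leg with the smallest available volume.
# Each leg is a (rate, available) pair in trade order, where `rate` converts the
# leg's input currency into its output currency and `available` is the volume
# available at the arbitrage price, in the leg's input currency.

def max_initial_stake(legs):
    """Largest initial stake (in the starting currency) fillable on every leg."""
    cap = float("inf")
    cumulative_rate = 1.0  # units of the current leg's input currency per initial unit
    for rate, available in legs:
        # A cap of `available` on this leg limits the initial stake to
        # available / cumulative_rate units of the starting currency.
        cap = min(cap, available / cumulative_rate)
        cumulative_rate *= rate
    return cap

# As in the text: 10m EUR available on the first leg, 10m USD on the second.
legs = [(1.2065, 10e6), (115.72, 10e6), (1 / 139.60, float("inf"))]
print(round(max_initial_stake(legs) / 1e6, 2))  # ~8.29 million EUR
```

With 10 million EUR available on the first leg and 10 million USD on the second, the cap works out to 10/1.2065 ≈ 8.29 million EUR, as in the worked example above.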
Occasionally, larger magnitude arbitrage opportunities do occur. Table 2.2 shows
that, over the studied period, there are potential arbitrages of more than 9 basis points
for both currency groups, with a mean duration9 in excess of 2 seconds for the large
CHF opportunities. This duration suggests that one would have stood a good chance
of completing an arbitrage transaction for one of these opportunities. However, this
mean was calculated using only six opportunities, so it does not represent a reliable
estimate of the expected duration. Because these large opportunities occur so
infrequently (only around 20 potential arbitrages in excess of 4 basis points occurred
for each transaction over the four-week period analyzed), trading strategies that only
trade on these larger opportunities would need to make large volume trades in order
to realize significant profits. As we have already discussed, however, there is only
ever a limited volume available at the arbitrage price.
Figure 2.5: Daily arbitrage statistics for the JPY and CHF transactions for the period
02/10/2005–27/10/2005. Upper: the number of arbitrage opportunities. Lower: mean
duration of arbitrage opportunities (in seconds).
9 Each mean duration represents an upper bound. This is because each opportunity labelled as X s may have existed for anywhere between X − 1 and X seconds, but in calculating the mean duration we assume that it lasted for exactly X seconds.
We now consider whether there is any seasonality in the number and duration
of arbitrage opportunities by investigating daily and hourly statistics. Figure 2.5
shows that the number of arbitrage opportunities per day and their mean duration
are reasonably uniform across days. However, Fig. 2.6 demonstrates that there is a
large amount of variation in these quantities across different hours of the day. Both the
JPY and CHF transactions show a particularly small number of opportunities, with a
large mean duration, between approximately 22:00 and 01:00, and a large number of
opportunities, with a short duration, between 13:00 and 16:00. In general, the hours
with larger numbers of arbitrage opportunities correspond to those with shorter mean
durations, and vice versa.
Figure 2.6: Hourly arbitrage statistics for the JPY and CHF transactions for the period
02/10/2005–27/10/2005. Upper: the number of arbitrage opportunities. Lower: mean
duration of arbitrage opportunities (in seconds).
Table 2.3: Grey blocks indicate the hours (0–23) corresponding to high liquidity for
the Asian, European, and American markets.
Table 2.3 indicates the periods during which the Asian, European, and American FX
markets are at their most liquid. The period of highest liquidity is from 08:00–16:00;
over almost all of this period, two of the markets are highly liquid at the same time.
The period of least liquidity is from around 22:00–01:00. The hours
with the largest number of arbitrage opportunities and the shortest mean durations
in Fig. 2.6 thus correspond to the periods of highest liquidity. The observation of
more arbitrage opportunities during the periods of highest liquidity seems counter-intuitive,
but it can be explained as follows. During liquid periods, the bid-ask spread is
narrower (see Fig. 2.1) and prices change more frequently because of the large
volume of trading. This results in more price mis-alignments and consequently more
potential arbitrages. The high trade frequency, however, also ensures that the mis-pricings
are quickly traded away and that any arbitrage opportunities are
short-lived. In contrast, during less liquid periods the spread is wider and the trading
volume lower, which leads to fewer arbitrage opportunities. The smaller number of
traders available to correct any mis-pricings during less liquid times also results in
the arbitrages having longer durations.
Table 2.4 shows that the number of arbitrage opportunities decreased from 2003–
2005 for the JPY and CHF transactions. This can be explained by the increasingly
widespread use of electronic trading platforms and trading algorithms over this period.
Figure 2.7: Comparison of the rate product distributions for the JPY and CHF
transactions for the periods 27/10/2003–31/10/2003, 01/11/2004–05/11/2004, and
17/10/2005–21/10/2005. Lower: distributions on a log-linear scale.
These systems enabled traders to execute trades faster and to react more quickly
to price changes, which in turn gave rise to increased trading efficiency, fewer mis-
pricings and fewer triangular arbitrage opportunities. Table 2.4 also demonstrates the
significant effect that this increased execution speed had on the duration of arbitrage
opportunities. From 2003–2005, the proportion of opportunities lasting less than 1
second increased from 40% to 62% for the JPY transactions and from 41% to 64%
for the CHF transactions; and the proportion of opportunities lasting in excess of 5
seconds halved for both sets of transactions.
The distributions in Fig. 2.7 and the distribution statistics in Table 2.4 provide
further evidence of the increased pricing efficiency of the FX market from 2003 to
2005. Over this period the distribution of rate products became concentrated in
a sharper peak, with a smaller standard deviation and mean closer to one, which
demonstrates that triangular parity held a larger proportion of the time.
2.5 Profitability
We provide further insights into the profitability of trading on triangular arbitrage
signals by running simulations to determine the profit or loss that could potentially be
achieved using different trading strategies. For the full time series of JPY and CHF
(2) All arbitrages with a duration l ≥ 1 second are definitely filled. All opportunities
traded on with a duration l < 1 second are filled with probability P2 .
Figure 2.8: Mean profit/loss per trade (in basis points) as a function of the fill
probability P1 (the probability of a transaction being filled at an arbitrage price) and
the loss L per unfilled arbitrage (in basis points), for JPY transactions over the period
02/10/2005–27/10/2005. We assume a trade threshold γt = 1 and scenario (1). The
grey curve shows the break-even fill probabilities. The probabilities are averaged over
100 simulations.
In order to achieve the 400,000 EUR profit, it would have been necessary to stake
1 million EUR more than 17,000 times. If we estimate transaction fees and settlement
costs at 2 EUR per trade, then each arbitrage transaction (which comprises three
trades) costs 6 EUR. The total cost of 17,000 transactions is then 102,000 EUR, which
is a significant proportion of the potential profits. This profit is also likely to be a
significant over-estimate. In the simulations, we assumed that each arbitrage
transaction is completed for the full 1 million EUR initially staked. As discussed in
Section 2.4.3, however, the amount available at the arbitrage price is limited and may
be less than this amount. More importantly, a 100% fill probability is extremely
unrealistic; in practice, the achievable fill probability will be significantly smaller. At
a still unrealistic fill probability of P2 = 0.8, for scenario (2), the potential profit is
reduced to around 100,000 EUR. This potential profit is already less than the
estimated transaction costs, and there are additional infrastructure costs that also
need to be considered.
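The per-trade economics underlying these estimates can be sketched as an expected-value calculation (illustrative only, not the thesis simulation; the 6 EUR cost and the 1.5 basis point loss on unfilled attempts match the figures above, while the 0.5 basis point mean opportunity is a hypothetical value):

```python
# Illustrative expected-value sketch of the per-trade economics discussed above
# (not the thesis simulation). The 6 EUR cost and 1.5 bp loss match the text;
# the 0.5 bp mean opportunity size is a hypothetical value.

def expected_profit_per_trade(p_fill, mean_gain_bp, loss_bp, stake, cost):
    """Expected profit (in currency units) per attempted arbitrage."""
    bp = 1e-4  # one basis point
    gain = p_fill * mean_gain_bp * bp * stake      # filled at arbitrage price
    loss = (1.0 - p_fill) * loss_bp * bp * stake   # missed: price moved away
    return gain - loss - cost

# Expected profit on a 1 million EUR stake for different fill probabilities:
for p in (1.0, 0.8, 0.5):
    print(p, round(expected_profit_per_trade(p, 0.5, 1.5, 1e6, 6.0), 2))
```

Even under these generous assumptions, the expected profit turns negative well before the fill probability falls to realistic levels.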
2.6 | Fill probabilities
Figure 2.9: Total profit (in EUR) for JPY transactions over the period 02/10/2005–
27/10/2005, for trade thresholds γt = 1, γt = 1.00005, and γt = 1.0001. Each arbitrage
transaction is traded with an initial currency outlay of 1 million EUR and each
completed transaction is filled for the full traded volume. We assume a fixed loss
L = 1.5 basis points for each incomplete arbitrage transaction. Left: scenario (1),
profit Π1 as a function of the fill probability P1 . Right: scenario (2), profit Π2 as a
function of the fill probability P2 . Error bars indicate the standard deviation in the
profit over 100 simulations. The standard deviations in the profit for P1 = 0, P1 = 1,
P2 = 0, and P2 = 1 are zero because the same arbitrage opportunities are filled for
each simulation.
where ⟨γ − 1 | γ > γt⟩ denotes the average value of γ − 1 over the interval W given
that γ > γt , and the break-even fill probability P1b (found by setting Π1 = 0) is given by

P_1^b = \left( 1 + \frac{\langle \gamma - 1 \mid \gamma > \gamma_t \rangle}{L} \right)^{-1}. \quad (2.5)

The break-even fill probability P1b is therefore independent of the number of arbitrage
opportunities and decreases with increasing ⟨γ − 1 | γ > γt⟩. This can be seen in Fig.
2.10, where the break-even fill probabilities are smaller for larger γt . For scenario (2),
we take Na = ng + n, where ng is the number of opportunities over W that last for
l ≥ 1 second and n is the number with l < 1 second. The total profit Π2 is then given
by
For this scenario, the break-even fill probability P2b therefore depends on the proportion
of arbitrage opportunities with duration l ≥ 1 second, the mean value of the rate
product for opportunities with duration l ≥ 1 second, and the mean rate product for
opportunities with duration l < 1 second.
Figure 2.10 shows break-even fill probabilities generated by trading simulations
and highlights that P2b is lower than P1b for the corresponding loss, and that
the break-even fill probabilities tend to be slightly lower for the CHF than for the
JPY transactions. This difference is most marked for scenario (2) with γt = 1.0001.
In this case, if a fixed loss of 2 basis points per unfilled arbitrage is assumed, a fill
probability of only 17% is needed to break even.
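Equation (2.5) for scenario (1) can be evaluated directly (a sketch with hypothetical illustrative values for the mean excess rate product):

```python
# Sketch of Eq. (2.5), scenario (1): the break-even fill probability depends
# only on the mean excess rate product <gamma - 1 | gamma > gamma_t> and the
# loss L per unfilled arbitrage (both expressed here in basis points; only the
# ratio matters). The numerical values below are hypothetical illustrations.

def break_even_fill_probability(mean_excess_bp, loss_bp):
    """P1^b = (1 + <gamma - 1 | gamma > gamma_t> / L)^(-1)."""
    return 1.0 / (1.0 + mean_excess_bp / loss_bp)

# A larger threshold gamma_t raises the mean excess and lowers the break-even
# fill probability, as in Fig. 2.10:
print(break_even_fill_probability(0.5, 2.0))  # 0.8
print(break_even_fill_probability(2.0, 2.0))  # 0.5
```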
Although this fill probability seems low, it would nevertheless be difficult to
achieve. Consider a strategy where a similar fill probability of 20% is required to
break even. This implies that one would need to be filled on 1 in 5 of the arbitrage
opportunities traded on. If there were 5 market participants trading on each
opportunity, each able to transact at the same speed, then this fill frequency would be feasible.
Figure 2.10: The fill probability required to break even as a function of the loss
incurred per incomplete arbitrage transaction, for trade thresholds γt = 1, γt = 1.00005,
and γt = 1.0001, for the JPY and CHF transactions. Upper: scenario (1), break-even
fill probability P1b . Lower: scenario (2), break-even fill probability P2b . Error bars
indicate the standard deviation in the fill probability over 100 simulations.
However, in the FX market there are many more market participants than this
competing for each arbitrage opportunity, so to achieve this fill probability one would
need to identify and execute each arbitrage opportunity faster than most of these
competitors. These competitors are also likely to be continually striving to increase
their execution speeds in the electronic trading “arms race”. Given the costs associated
with staying ahead in this race, it would be extremely costly to maintain the
fastest execution speeds and to regularly beat the majority of other competitors to
the arbitrage prices over a prolonged period of time. The fill probabilities required
to realize the profits indicated in Fig. 2.9 are therefore very difficult to achieve and,
as a consequence, the indicated profit levels are also extremely unrealistic.
The calculated fill probabilities also represent lower bounds of acceptability
because, to justify trading on an opportunity, a trader would require a reasonably high
expected profit rather than simply breaking even. When one factors in costs such as
brokerage, the internet connections required to access the market, and the cost of
developing and supporting a sophisticated electronic trading system, the actual fill
probabilities necessary to achieve an acceptable level of profit would be substantially
higher than those calculated. It therefore appears that, although mis-pricings do
appear in the FX market, an unfeasibly large fill probability would need to be sustained
over a prolonged period of time to realize any significant profits from them.
2.7 Summary
We have shown that triangular arbitrage opportunities exist in the FX market, but
the vast majority of these opportunities are less than 1 second in duration and 1 basis
point in magnitude. The longer, larger opportunities that do occur appear with a
significantly lower frequency. We showed that, somewhat counter-intuitively, more ar-
bitrage opportunities occur during periods of higher liquidity, but these opportunities
tend to be removed from the market very rapidly. The increased number of oppor-
tunities during liquid periods was attributed to the higher trading frequency, which
resulted in more mis-pricings, but also ensured that they were quickly corrected. We
have also shown that from 2003 to 2005 the market became increasingly efficient at
eliminating mis-pricings and explained this by the increased use of electronic trading
platforms, which enabled traders to react faster to price changes.
Finally, we used trading simulations to investigate the profitability of trading on
triangular arbitrage signals. Considering the strong competition for each arbitrage,
the costs of trading, and the costs required to maintain a technological advantage, it
seems that a trader would need to beat other market participants to an unfeasibly
large proportion of arbitrage opportunities for triangular arbitrage to remain prof-
itable in the long-term. We therefore conclude that the FX market appears internally
self-consistent and these results provide a limited verification of FX market efficiency.
This chapter has also demonstrated the critical importance of using the correct
type of data to study financial markets. If one uses data that is inappropriate for
a particular analysis, it is easy to conclude that data artifacts represent meaningful
structure. For example, when indicative FX price data is used to investigate triangular
arbitrage, arbitrage opportunities appear to remain active in the market longer, and
to be more profitable, than when executable data is used.
Chapter 3
In this chapter, we continue to investigate financial markets, but extend the analysis
to include assets from other major markets in addition to the FX market. We have
submitted a paper based on this work for publication [P5]. We return to the FX
market in Chapter 5.
3.1 Introduction
The global financial system is composed of a multitude of markets spread across
a range of geographic locations with a wide variety of assets traded in each mar-
ket. There is strong coupling between different financial markets such that the price
changes of particular assets can be driven not only by the price changes of assets
traded in the same market, but also by price changes of assets traded in other
markets. Because of the close relationships between different assets and markets,
a primary concern of market practitioners is estimating the correlations between the
changes in asset price time series. There are many reasons for wanting to understand
correlations in price movements; perhaps the most common motivation is for risk
management purposes. For a portfolio of assets, the likelihood of large losses can
be significantly higher when the assets held in the portfolio are correlated [176]; an
understanding of the correlation between different financial instruments can therefore
help in managing the risk associated with a portfolio. The standard approach for
representing the correlations of a group of financial assets is to calculate the linear
correlation coefficient between pairs of assets. However, for N assets, this results in
N(N − 1)/2 correlation coefficients, so simultaneous investigation of these interactions
Chapter 3 | Financial Market PCA
Factor models can be separated into two classes: confirmatory and exploratory. Al-
though this classification is slightly fuzzy, it helps to illustrate the different techniques.
Many of these models assume that the observed price series can be written as lin-
ear combinations of common factors. In confirmatory factor modelling, a number
of indicator variables are selected that are posited to drive the observed price time
series and a factor model is constructed to test this prediction. There are two main
types of confirmatory factor models in finance [296]: macroeconomic and fundamen-
tal models. Macroeconomic factor models attempt to explain the behaviour of asset
returns using macroeconomic variables, such as interest rates and GDP growth rates.
The relationship between these indicator variables and the observed assets is usually
determined using linear regression. Fundamental factor models try to explain price
movements using fundamental properties of the assets; for example, using properties
such as market capitalization and industrial sector to explain equity price movements.
Again, the relationships between the indicator variables and the observed assets are
usually determined using linear regression [63, 296].
The second class of factor models is exploratory methods. In contrast to confirmatory
models, exploratory methods make no assumptions about which variables
correspond to the underlying factors; instead, the factors are estimated directly from
the asset return time series. Two widely used exploratory techniques are factor anal-
ysis1 (FA) and PCA [63, 301]. The two names are often used interchangeably in the
literature and the two approaches share the common goal of reducing a set of N
observed variables to a set of m < N new variables [301]. However, there are clear
distinctions between the two techniques.
In FA, the aim is first to identify factors that are common to two or more variables;
these factors can be either correlated or uncorrelated. In addition to the common
factors, a set of unique factors are also identified that are specific to each variable and
orthogonal to each other and to all of the common factors [103]. In PCA the PCs are
identified on the basis of variance. The first component accounts for as much of the
variance in the system as possible, the second as much of the remaining variance as
possible, and so on. In PCA the components are chosen such that they are mutually
orthogonal and account for all of the variability in the system; there are no unique
components specific to particular variables.
The differences between PCA and FA can be better understood by considering
the decomposition of the covariance matrix of the observed variables in the two cases.
If we let Z represent an N × T matrix of T observations of the return time series of
1 Note the distinction between factor modelling and factor analysis – factor analysis is a type of factor modelling.
\Sigma_Z = \Omega_F \Sigma_F \Omega_F^T - \varepsilon^2, \quad (3.1)

\Sigma_Z = \Omega \Sigma_Y \Omega^T, \quad (3.2)
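The PCA factorization in Eq. (3.2) can be illustrated numerically (a sketch with synthetic data, numpy assumed; with the rows-as-eigenvectors convention used here the identity reads Σ_Z = Ωᵀ Σ_Y Ω, which is Eq. (3.2) up to the transpose convention for Ω):

```python
# Numerical sketch of the PCA decomposition in Eq. (3.2), using synthetic data
# (numpy assumed). With the convention used here (rows of omega are the
# eigenvectors of Sigma_Z), the identity reads Sigma_Z = omega^T Sigma_Y omega;
# with eigenvectors as columns it takes the Omega Sigma_Y Omega^T form of
# Eq. (3.2). Either way, Sigma_Y, the covariance of the principal components
# Y = omega @ Z, is diagonal and the factorization is exact.

import numpy as np

rng = np.random.default_rng(0)
N, T = 5, 500
Z = rng.standard_normal((N, T))       # synthetic stand-in for return series

sigma_z = np.cov(Z)
eigvals, eigvecs = np.linalg.eigh(sigma_z)
omega = eigvecs.T                     # rows = eigenvectors (principal axes)

Y = omega @ Z                         # principal components
sigma_y = np.cov(Y)                   # diagonal up to floating-point error

reconstructed = omega.T @ sigma_y @ omega
print(np.allclose(reconstructed, sigma_z))  # True
```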
3.2 Data
3.2.1 Description
We study correlations for a wide variety of markets, but several factors limit the
assets that we can include. For example, for some time series there are a large
number of missing data points. One solution to this problem is to fill in the missing
data by interpolating between the data points that we do have, but this approach
is inappropriate when there are several consecutive missing data points. A second
reason for excluding time series is that they begin too recently. In this study, we use
time series beginning in January 1999 because this time interval contains periods of
very different market behaviour that we can compare. However, data is only available
for some instruments for a few years. For example, although some emerging market
corporate bond indices are now published, they have only been published since 2005
and consequently we do not include them. Other time series are complete and cover
the full time period, but are excluded because of the nature of the data. For example,
we exclude any exchange rate that has been pegged3 during any part of the studied
period. The peg implies that the value of the pegged exchange rate can be determined
using the rate (or rates) to which it is pegged, so the pegged rate is redundant.
Taking all of these factors into account, we include time series for 98 financial
products from all of the major markets. This includes 25 developed market equity
indices, 3 emerging market equity indices, 4 corporate bond indices, 20 government
bond indices, 15 exchange rates, 9 metals, 4 fuel commodities, and 18 other com-
modities. (See Table A.1 in Appendix A for a description of all of the assets that we
include.) For many markets, we study indices rather than specific assets so that we
have an aggregate view of the market.4 For all of the commodities, we use futures5
contracts because commodities are most widely traded in the futures market. How-
ever, for single futures contracts, the price time series will have a discontinuity at the
contract expiry date. To minimize this discontinuity, we use the “Generic 1st futures”
contract for each commodity, which is the price of the nearest dated futures contract
(i.e., the contract with the closest expiry date).
We include assets from a range of geographical regions, so many are traded during
different hours of the day. For example, stocks included in the Nikkei 225 are traded
on the Tokyo Stock Exchange, which operates between midnight and 6 AM GMT,
whereas stocks included in the FTSE 100 index are traded on the London Stock
Exchange, which operates between 8 AM and 4:30 PM GMT. To minimize any effects
resulting from the non-synchronicity of the price time series for markets from different
time zones, we use weekly price data. We take the weekly price of an asset to be the
last price posted each week. In this study, we investigate the period from 08/01/1999–
01/01/2010 and we have 575 prices for each asset.
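The weekly sampling rule ("the last price posted each week") can be sketched in plain Python (hypothetical prices, not thesis data; the grouping key is the ISO calendar week):

```python
# Sketch of the weekly sampling rule described above (plain Python, with
# hypothetical prices): the weekly price of an asset is the last price posted
# each week, grouped here by ISO calendar week.

from datetime import date, timedelta

# Hypothetical business-day price series starting Monday 4 January 1999.
start = date(1999, 1, 4)
daily = [(start + timedelta(days=d), 100.0 + d)
         for d in range(28) if (start + timedelta(days=d)).weekday() < 5]

def weekly_last(prices):
    """Map ISO (year, week) -> last posted price in that week."""
    out = {}
    for day, price in prices:      # assumes chronological order
        year, week, _ = day.isocalendar()
        out[(year, week)] = price  # later prices overwrite earlier ones
    return out

weekly = weekly_last(daily)
print(weekly[(1999, 1)])  # the price posted on Friday 8 January: 104.0
```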
3 A pegged exchange rate is an exchange rate regime wherein a currency’s value is matched to the value of a single currency or a basket of currencies.
4 By using indices, the data is also representative of a larger set of assets than if we included time series for individual assets. For example, if we studied individual equities, we would need to include several stocks from each industrial sector to obtain a representative cross-section of the market; however, we can only include a limited number of assets in the analysis. We discuss the reasons for this in Section 3.2.3.
5 A future is a standardized contract to buy or sell an asset at a specified future date at a price agreed on the day that the contract is entered into.
3.2.2 Returns
We denote the price of asset i at discrete time t as pi (t), i = 1, . . . , N, and define a
logarithmic return zi (t) for asset i between consecutive time steps as6

z_i(t) = \ln \left[ \frac{p_i(t)}{p_i(t-1)} \right]. \quad (3.3)
In Fig. 3.1, we show that there are large differences between the return distributions
for assets from different classes. For example, returns for U.S. government bonds are
concentrated in a sharp peak around zero, whereas the distribution for oil has much
more weight in the tails as a result of regular large moves in the oil price.
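Equation (3.3) can be sketched directly (a minimal helper with illustrative prices, not thesis data):

```python
# Minimal sketch of Eq. (3.3): logarithmic returns from a price series
# (illustrative prices, not thesis data).

import math

def log_returns(prices):
    """z(t) = ln(p(t) / p(t-1)) for consecutive prices."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

prices = [100.0, 105.0, 102.0, 102.0]
print([round(z, 5) for z in log_returns(prices)])  # [0.04879, -0.02899, 0.0]
```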
In calculating a correlation coefficient for a pair of time series, it is important to
ensure that the time series are stationary [134]. We study returns rather than prices
because, in general, return time series are close to stationary whereas price time series
are not [73]. This can be demonstrated by considering the autocorrelation function
(acf) of the two types of time series. The autocorrelation αi of a time series zi is given
by

\alpha_i(\tau) = \frac{\sum_{t=1}^{T-\tau} \left[ z_i(t) - \langle z_i \rangle \right] \left[ z_i(t+\tau) - \langle z_i \rangle \right]}{\sum_{t=1}^{T} \left[ z_i(t) - \langle z_i \rangle \right]^2}, \quad (3.4)

where ⟨· · ·⟩ indicates a time average over T − τ returns and τ is the lag between time
steps over which the autocorrelation is calculated. For stationary time series, the acf
decays rapidly with increasing lag, but this is usually not the case for non-stationary
series [73]. In Fig. 3.2(a), we demonstrate for the EUR/USD exchange rate that the
return time series acf decays rapidly, whereas the price time series decays slowly. In
Fig. 3.2(b), we show that the return time series for all of the studied assets decay
rapidly, with most values falling within the 95% confidence bounds for Gaussian white
noise.7 The rapid decay of the acfs of the return time series suggests that the return
process is stationary, which implies that these time series are suitable for investigating
market correlations.
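The acf comparison can be sketched with a direct implementation of Eq. (3.4) (plain Python; synthetic series stand in for the data, with a trending series mimicking a non-stationary price and a white-noise series mimicking a stationary return):

```python
# Direct implementation of Eq. (3.4) (plain Python; synthetic stand-ins for
# the data): a stationary, white-noise-like "return" series decorrelates
# quickly, whereas a trending, "price-like" series does not.

import random

def autocorrelation(z, tau):
    """alpha(tau): autocovariance at lag tau over the full-sample variance."""
    T = len(z)
    m = sum(z) / T
    num = sum((z[t] - m) * (z[t + tau] - m) for t in range(T - tau))
    den = sum((z[t] - m) ** 2 for t in range(T))
    return num / den

random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(2000)]  # stationary stand-in
trend = [0.01 * t for t in range(500)]                 # non-stationary stand-in

print(round(autocorrelation(noise, 10), 3))  # close to 0
print(round(autocorrelation(trend, 10), 3))  # close to 1
```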
6 An alternative return is the arithmetic return, which is defined as z_i^a(t) = [p_i(t) − p_i(t−1)]/p_i(t−1). This is equal to the first term in the Taylor expansion of the logarithmic return, so arithmetic and logarithmic returns are approximately equal for small returns. Logarithmic returns are, however, often used instead of arithmetic returns because logarithmic returns are symmetric [73]. For example, an investment of £100 that yields an arithmetic return of 50% followed by an arithmetic return of −50% results in £75. In contrast, an investment of £100 that yields a logarithmic return of 50% followed by a logarithmic return of −50% results in £100.
7 We note that there are two spikes in the acf for frozen pork bellies (PB1) at 26 and 52 week lags, which suggests that there is an interesting periodicity in this return time series.
Figure 3.1: Distributions of returns z for a selection of assets from different classes
(USGATR, MOODCAA, SPX, GBP/USD, XAU, and CO1).
Figure 3.2: a) Comparison of the autocorrelation function for the EUR/USD price
(upper green line) and return (lower blue line) time series. The horizontal red lines
show the 95% confidence intervals assuming Gaussian white noise. b) Autocorrelation
functions for all of the studied return time series. The horizontal red lines show the
95% confidence intervals assuming Gaussian white noise.
3.2.3 Correlations
To simplify the notation for the definition of the empirical correlation matrix, we
define a standardized return as
\hat{z}_i(t) = \frac{z_i(t) - \langle z_i \rangle}{\sigma(z_i)}, \quad (3.5)

where \sigma(z_i) = \sqrt{\langle z_i^2 \rangle - \langle z_i \rangle^2} is the standard deviation of zi over a time window of
T returns and ⟨· · ·⟩ indicates a time average over T . We represent the standardized
returns as an N × T matrix Ẑ, so the empirical correlation matrix R is given by

R = \frac{1}{T} \hat{Z} \hat{Z}^T, \quad (3.6)
and has elements r(i, j) that lie in the interval [−1, 1]. Note that, because we have
standardized the time series, the correlation matrix R of returns Ẑ is equal to the
covariance matrix ΣẐ of Ẑ.
We create an evolving sequence of correlation matrices by rolling the time window
of T returns through the full data set. The choice of T is a compromise between
overly noisy and overly smoothed correlation coefficients [227, 229], but T is usually
chosen such that Θ = T /N ≥ 1. In this study, we fix T = 100 (each window then
contains just under two years of data and Θ = 1.02) and we roll the time window
through the data one week at a time. By only shifting the time window by one data
point, there is a significant overlap in the data contained in consecutive windows;
however, this approach enables us to track the evolution of the market correlations
and to identify time steps at which there were significant changes in the correlations.
This choice of T results in 452 correlation matrices for the period 1999–2010.
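The rolling-window construction can be sketched as follows (numpy assumed; synthetic Gaussian returns stand in for the asset data):

```python
# Sketch of the rolling-window construction described above (numpy assumed;
# synthetic Gaussian returns stand in for the asset data): each window is
# standardized and R = (1/T) Zhat Zhat^T is formed as in Eq. (3.6).

import numpy as np

def rolling_correlation_matrices(returns, T, step=1):
    """One N x N correlation matrix per window of T returns, rolled by step."""
    matrices = []
    for start in range(0, returns.shape[1] - T + 1, step):
        window = returns[:, start:start + T]
        zhat = (window - window.mean(axis=1, keepdims=True)) \
            / window.std(axis=1, keepdims=True)
        matrices.append(zhat @ zhat.T / T)
    return matrices

rng = np.random.default_rng(1)
Z = rng.standard_normal((98, 574))   # N = 98 assets, synthetic weekly returns
mats = rolling_correlation_matrices(Z, T=100)
print(len(mats), mats[0].shape)      # one 98 x 98 matrix per window position
```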
To compare with null models, we also independently shuffle each return time series
in time and calculate a correlation matrix for each position of the window. We produce simulated data by
independently generating N time series of returns (where each series is the same
length as the original data) whose elements are drawn from a Gaussian distribution
with mean zero and unit variance. We again roll a time window of length T through
the data and calculate a correlation matrix for each window.
Figure 3.3 shows that the distribution of correlation coefficients for the market
data is significantly different from the two random distributions, with more large
positive and negative correlations for the market returns. The differences between the
distributions demonstrate that there are temporal correlations between the returns
of financial assets that are incompatible with the random null models that we consider.
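The two null models can be sketched as follows (numpy assumed; a small synthetic example in place of the asset returns):

```python
# Sketch of the two null models described above (numpy assumed; a small
# synthetic example in place of the asset returns): (i) shuffle each series
# independently in time, (ii) draw fresh Gaussian series. Off-diagonal
# correlations under both nulls should be small.

import numpy as np

rng = np.random.default_rng(2)
N, total = 20, 574
observed_like = rng.standard_normal((N, total))

shuffled = np.array([rng.permutation(row) for row in observed_like])  # null (i)
simulated = rng.standard_normal((N, total))                           # null (ii)

def offdiag_correlations(Z):
    """Upper-triangular (i < j) correlation coefficients."""
    R = np.corrcoef(Z)
    return R[np.triu_indices_from(R, k=1)]

for data in (shuffled, simulated):
    r = offdiag_correlations(data)
    print(round(float(r.mean()), 3), round(float(r.std()), 3))
```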
Figure 3.3: Distribution of correlation coefficients r(i, j) aggregated over all time
windows for the observed, shuffled and simulated data.
Figure 3.3 shows the distribution of correlations between the return time series of
all assets, but it is instructive to disaggregate this distribution and to consider only
correlations between assets in the same class. In Fig. 3.4 we show that there are clear
differences in the intra-class correlations for different assets. For example, corporate
bonds and government bonds tend to be highly correlated, whereas many of the
assets within the “other commodities” class are uncorrelated (which is unsurprising
given the variety of commodities that we include in this class). The distributions
of correlation coefficients for all of the asset classes deviate from the distributions
Chapter 3 | Financial Market PCA
expected for random returns. The deviations from random distributions imply that
financial market correlation matrices contain structure that warrants investigation.
Figure 3.4: The distribution of correlation coefficients r(i, j) between the return time
series for assets within each class aggregated over all time windows.
Y = ΩẐ, (3.7)
3.3.1 Eigenvalues
In Section 3.1.2, we highlighted the close links between RMT and PCA. A standard
financial application of RMT is to compare the eigenvalues of the correlation matrix
of market returns with the distribution of eigenvalues for random matrices, e.g., [176,
239]. Any deviations from the predictions of RMT are usually considered to indicate
non-random, and potentially insightful, structure in the correlation matrix [176,239].
The correlation matrix for N mutually uncorrelated time series of length T with
elements drawn from a Gaussian distribution is a Wishart matrix [176, 239]. In the
limit N → ∞, T → ∞, and with the constraint that Θ = T /N ≥ 1, the probability
density function ρ(β) of the eigenvalues β of such correlation matrices is given by [269]
ρ(β) = Θ √((β+ − β)(β − β−)) / (2π σ²(Ẑ) β),    (3.9)
where σ 2 (Ẑ) denotes the variance of the elements of Ẑ, and β+ and β− are the maxi-
mum and minimum eigenvalues and are given by
β± = σ²(Ẑ) (1 + 1/Θ ± 2 √(1/Θ)).    (3.10)
When Θ = 1, the lower bound of the range of eigenvalues is β− = 0, the upper bound is β+ = 4σ²(Ẑ), and as β → β− = 0, the density of eigenvalues ρ(β) diverges as ∼ 1/√β. The above results are only valid in the limit N → ∞; for finite N, the
boundaries of the eigenvalue distribution are blurred with a non-zero probability of
finding eigenvalues larger than β+ and smaller than β− . For the standardized return
matrix Ẑ that we investigate, σ 2 (Ẑ) = 1 so β+ = 4.
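Equations (3.9) and (3.10) are straightforward to evaluate numerically, as in the following sketch (illustrative Python; `var` stands for σ²(Ẑ)):

```python
import math

def mp_bounds(theta, var=1.0):
    """Upper and lower eigenvalue bounds beta_± from Eq. (3.10)."""
    s = 2.0 * math.sqrt(1.0 / theta)
    return (var * (1.0 + 1.0 / theta - s),
            var * (1.0 + 1.0 / theta + s))

def mp_density(beta, theta, var=1.0):
    """Eigenvalue density rho(beta) from Eq. (3.9); zero outside the support."""
    lo, hi = mp_bounds(theta, var)
    if not lo < beta < hi:
        return 0.0
    return theta * math.sqrt((hi - beta) * (beta - lo)) / (2.0 * math.pi * var * beta)

lo, hi = mp_bounds(1.0)    # Theta = 1 with unit variance: support (0, 4)
```

For Θ = 1 and unit variance the support is (0, 4), matching the upper bound β+ = 4 quoted above; for Θ = 1.02 the support narrows slightly and the density remains normalized.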
9 PCA is sometimes performed on the covariance matrix rather than the correlation matrix.
However, if there are large differences in the variances of the time series used as inputs in the PCA,
the variables with the largest variances will tend to dominate the first few PCs when the covariance
matrix is used. Figure 3.1 shows that there are differences in the variances of the returns for different
assets, so we use the correlation matrix. Of course, the correlation matrix is simply the covariance
matrix for standardized variables.
In Fig. 3.5, we compare the eigenvalue distribution for market data (aggregated
over all time windows) with the distributions for shuffled and simulated data. In
Fig. 3.5(a), we show that the eigenvalue distribution for market correlations differs
from that of random matrices. There are many eigenvalues larger than the upper bound
β+ = 4 predicted by RMT (with several eigenvalues almost 10 times as large as the
upper bound). In prior studies of equity markets, the eigenvector corresponding to
the largest eigenvalue has been described as a “market” component, with roughly
equal contributions from each of the N equities studied, and the eigenvectors cor-
responding to the other eigenvalues larger than β+ have been identified as different
market sectors [176,239]. In Section 3.5, we discuss the interpretation of the observed
eigenvectors with eigenvalues β > β+ . For now, we simply note that the deviations
of the empirical distribution of eigenvalues from the predictions of RMT again im-
ply that the correlation matrices contain structure that is incompatible with the null
models that we consider.
In Figs. 3.5(b) and (c), we illustrate that the distributions for shuffled and simu-
lated data are very similar and that they agree very well with the analytical distribu-
tion given by Eq. (3.9) over most of the range of β. In particular, both distributions
have an upper bound close to the theoretical maximum β+ = 4. However, for Θ ≈ 1.02
(the value that corresponds to the selected T and N), the observed distribution of
eigenvalues for random data does not fit the distribution in Eq. (3.9) as β → 0. For
both the simulated and shuffled data, we observe a much higher probability density
near β = 0 than that predicted by RMT. The high probability density near zero is
a result of the fact that T ≈ N. When we simulate eigenvalue distributions for data
with T ≫ N, we observe a much smaller probability density near zero. In Figs. 3.5(b)
and (c), we also show the theoretical distribution for Θ = 1. In this case, ρ(β) diverges
as β → 0, which fits the randomly generated distributions reasonably well.
3.3.2 Eigenvectors
We now investigate the distribution of the elements ωij (the PC coefficients) of the
eigenvectors of the correlation matrices. We denote the ith element of the k th eigen-
vector as ωk (i) and use the standard approach of normalizing each eigenvector such
that Σ_{i=1}^{N} ωk(i)² = N [176, 238, 239]. Correlation matrices R are real symmetric ma-
trices, so we compare the eigenvector properties of the matrices R with those for real
symmetric random matrices. Such random matrices display the universal properties
of the canonical ensemble of matrices known as the Gaussian orthogonal ensemble
3.3 | Principal component analysis
[Figure 3.5: eigenvalue density ρ(λ) for (a) the observed, (b) the shuffled, and (c) the simulated correlation matrices; panels (b) and (c) also show the theoretical distributions for Θ = 1.02 and Θ = 1.]
(GOE) [238, 239]. For the GOE, the probability density ρ(ωk ) of the elements of the
k th eigenvector is a Gaussian distribution with mean zero and unit variance [137].
In Fig. 3.6, we show the distribution of elements of the eigenvectors k = {1, . . . , 6}
(the six components with the largest eigenvalues), k = {25, 40, 50, 75} (whose eigen-
values lie within the interval [β− , β+ ] for random matrices), and k = {97, 98} (the two
components with the smallest eigenvalues). We aggregate the distributions for each
eigenvector over all time windows; however, the sign of each eigenvector is arbitrary so,
to ensure that the signs of the eigenvectors are consistent through time, we choose the
sign of the kth eigenvector ω_k^t at time step t to maximize Σ_{i=1}^{N} sgn[ω_k^{t−1}(i)] sgn[ω_k^{t}(i)], where sgn[x] is the sign function.10
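This sign convention can be implemented as in the following sketch (illustrative Python; the vectors are toy examples):

```python
def sgn(x):
    """Sign function: +1, 0, or -1."""
    return (x > 0) - (x < 0)

def align_sign(prev_vec, vec):
    """Flip `vec` if doing so increases its sign agreement with `prev_vec`,
    i.e., choose the sign that maximizes sum_i sgn[w(t-1, i)] sgn[w(t, i)]."""
    agreement = sum(sgn(a) * sgn(b) for a, b in zip(prev_vec, vec))
    return vec if agreement >= 0 else [-x for x in vec]

prev = [0.5, -0.2, 0.8]
flipped = [-0.49, 0.25, -0.77]     # same direction as prev, opposite sign
aligned = align_sign(prev, flipped)
```

Flipping is triggered whenever the sign agreement with the previous window's eigenvector is negative, so eigenvectors that merely change sign between windows are treated consistently.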
Figure 3.6 shows that the RMT distribution closely matches the distributions for
shuffled and simulated data, but there are differences between these distributions
and the distributions for the market correlation matrices. These differences are most
pronounced for the first and second PCs; in particular, there are asymmetries in
the distributions for market data that are not present in the random distributions.
The eigenvector distributions for eigenvalues within the interval [β− , β+ ] also deviate
from the predictions of RMT, which contrasts with the results of similar studies of
equity markets. In Refs. [176, 238], the distributions of elements of the eigenvectors
corresponding to eigenvalues falling within the interval [β− , β+ ] were found to fit a
Gaussian distribution, which was taken as an indication that these eigenvectors did
not contain any information [176]. However, the eigenvector distributions that we
observe for eigenvalues in the interval [β− , β+ ] have excess kurtosis compared with a
Gaussian distribution. A key difference between the analysis that we present and prior
studies is that we investigate multiple asset classes, whereas prior studies focused on
a single type of asset. The addition of inter-asset-class correlations may explain the
differences that we observe in the eigenvector distributions.
3.4 | Temporal evolution
Figure 3.6: The distribution ρ(ωk) of the elements ωk(i), i = 1, . . . , N, of the kth eigenvector aggregated over all time windows. We show the distributions
for k = 1, . . . , 6 (the six eigenvectors with the largest eigenvalues), k = 25, 40, 50, 75
whose eigenvalues fall within the range [β− , β+ ] predicted by RMT, and k = 97, 98
(which have very small eigenvalues). The red line histograms show the equivalent
distributions for shuffled data and the green lines show the distribution predicted by
RMT.
where ΣẐ is the covariance matrix for Ẑ and σ 2 (ẑi ) is the variance of the vector ẑi
of returns for asset i. The proportion of the total variance in Ẑ explained by the k th
PC is then given by
σ²(yk) / Σ_{i=1}^{N} σ²(ẑi) = βk / (β1 + · · · + βN) = βk / N,    (3.13)
i.e., the ratio of the k th largest eigenvalue βk of the correlation matrix R to the number
of assets studied N is equal to the proportion of the variance accounted for by
the k th PC.
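Equation (3.13) reduces the variance computation to an eigenvalue calculation, as in this sketch (illustrative Python; the toy correlation matrix is an assumption):

```python
import numpy as np

def variance_fractions(R):
    """Fraction of the total variance explained by each PC: beta_k / N.

    R is an N x N correlation matrix; its eigenvalues sum to N (the trace
    of R), so the fractions sum to one. Returned in decreasing order.
    """
    betas = np.sort(np.linalg.eigvalsh(R))[::-1]
    return betas / R.shape[0]

# Toy 3-asset correlation matrix with one strongly coupled pair.
R = np.array([[1.0, 0.9, 0.1],
              [0.9, 1.0, 0.1],
              [0.1, 0.1, 1.0]])
fracs = variance_fractions(R)
```

In this toy example the first PC accounts for well over a third of the variance because two of the three assets are strongly correlated.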
In Fig. 3.7 we show the fraction of the variance βk /N accounted for by the first
five PCs (k = 1, . . . , 5) as a function of time. From 2001–2004 the fraction of the vari-
ance explained by the first PC increased; between 2004 and 2006 it decreased before
gradually increasing again with a sharp rise as the week including 15th September
2008 entered the rolling time window. This was the day that Lehman Brothers filed
for bankruptcy and Merrill Lynch agreed to be taken over by Bank of America. The
variance explained by the first PC peaks as the week ending 5th December 2008 en-
ters the rolling window (which was the week during which the National Bureau of
Economic Research officially declared that the U.S. was in recession) at which point
it accounts for nearly 40% of the variance in Ẑ.
The amount of variance in market returns explained by a single component is
quite striking and demonstrates that there is a large amount of common variation
in financial markets; this highlights the close links between different assets. The in-
crease in the variance accounted for by the first PC between 2001 and 2010 also implies
that markets have become more closely related in recent years. In particular, the
significant rise in the variance of the first PC following the collapse of Lehman Broth-
ers demonstrates that markets became more correlated during the period of crisis
following the failure of this major bank.
Although the changes in the variance accounted for by the higher PCs are smaller
than the changes for the first PC, the variance explained by the second and third
PCs appears to be anti-correlated with the variance explained by the first PC. This
is expected because the total variance is constrained to sum to N, so when the
first PC accounts for a higher proportion, less remains to be explained by the other
components.
It is also instructive to consider the combined variance explained by the first few
PCs. In 2001 the first twelve PCs accounted for approximately 65% of the variance of
market returns; by 2010 only five PCs explained the same proportion of the variance.
The fact that only a few components account for such a large proportion of the
variance in market returns highlights the close ties between different markets. The
larger amount of common variance also suggests that market correlations can be
characterized by fewer than N components.
Figure 3.7: Fraction of the variance in Ẑ accounted for by each of the first five PCs
as a function of time. The highest line shows the variance accounted for by the first
PC, the next highest line the variance accounted for by the second PC, and so on.
The date axis shows the date of the last data point lying in each time window.
The IPR can be better understood by considering two limiting cases: (1) an eigenvector with identical contributions ωk(i) = 1/√N from all N assets has Ik = 1/N; (2) an eigenvector with a single component ωk(i) = 1 and remaining components equal to zero has Ik = 1. The IPR quantifies the reciprocal of the number of elements that
make a significant contribution to each eigenvector. For ease of interpretation, we
define a participation ratio (PR) as 1/Ik . A large PR for a PC indicates that many
assets are contributing to it.
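The two limiting cases above are reproduced by the standard fourth-moment definition Ik = Σi ωk(i)⁴ (for eigenvectors normalized to unit length, as in footnote 11), which the following sketch (illustrative Python) verifies:

```python
import math

def ipr(vec):
    """Inverse participation ratio I_k = sum_i w_k(i)**4 for a vector
    normalized so that sum_i w_k(i)**2 = 1."""
    return sum(x ** 4 for x in vec)

def pr(vec):
    """Participation ratio 1/I_k: roughly the number of assets that
    contribute significantly to the eigenvector."""
    return 1.0 / ipr(vec)

N = 98
uniform = [1.0 / math.sqrt(N)] * N       # all assets contribute equally
localized = [1.0] + [0.0] * (N - 1)      # a single asset contributes
```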
In Fig. 3.8(a), we show the PR of the first three PCs as a function of time. The PR
of the first PC increases from 2001–2010, with sharp increases when the weeks ending
12th May 2006 and 19th September 2008 enter the rolling time window. The second
increase is a result of the market turmoil that followed the collapse of Lehman Brothers
and occurs at the same time as a significant increase in the variance accounted for by
the first PC (see Fig. 3.7). The first increase is largely attributable to surging metal
prices. During the week ending 12th May 2006, the price of gold rose to a 25 year
high, reaching over $700 per ounce, and the prices of several other metals also rose to
record levels: platinum and copper reached all time highs; aluminium hit an 18-year
peak; and silver prices rose to their highest levels since February 1998. In addition,
during the same week, corporate bond prices reached a 2 year high and the price of
emerging market equities reached record levels. Although these events resulted in a
11 We now normalize the eigenvectors such that Σ_{i=1}^{N} ωk(i)² = 1.
significant increase in the PR of the first PC, this increase was not accompanied by
a sharp rise in the variance explained by this component.
The sharp rise in the PR of the first PC following the collapse of Lehman Brothers
implies that many different assets were highly correlated during this period of market
turmoil. Based on the value of the PR, over 70% of the assets that we study signif-
icantly contribute to the first PC after Lehman’s collapse. To test the significance
of the PR of the first PC, in Fig. 3.8(b) we compare it to the corresponding PR for
random returns. Figure 3.8(b) shows that between 2006 and 2010 the PR of the
observed returns was significantly larger than the PR expected for random returns.
This demonstrates the strength of the correlations between a wide range of different
assets during this period.
We observe very different behaviours for the evolution of the PRs of the higher
components. For example, between 2001 and 2003 the PR of the second PC doubles;
it then fluctuates around the same level until the collapse of Lehman Brothers, at
which point it decreases sharply. Similarly, the PR of the third PC increases from
2001 until Lehman’s collapse when it also falls sharply. This suggests that following
the collapse of Lehman Brothers the first PC influences a large number of assets at the
expense of higher components. The dominance of a single component again implies a
large amount of common variance in asset returns and further suggests that the key
market correlations can be described using only a few PCs.
Figure 3.8: The participation ratio [Ik]−1 as a function of time for (a) the three PCs with the largest variances (k = 1, 2, 3) and (b) the PC with the largest variance (k = 1). The horizontal solid line (green) shows the mean PR for 100,000 simulations of randomized returns with T = 100 and N = 98 and the dashed horizontal lines (red) show the one standard deviation error bars.
3.5 | Asset-component correlations
A scree plot shows the magnitude of the eigenvalues as a function of the eigenvalue index, where the eigenvalues are sorted such that
β1 > β2 > . . . > βN ; the leftmost data point in a scree plot shows the magnitude of the largest eigenvalue and the rightmost the smallest eigenvalue. The number of
significant PCs is considered to be equal to the number of eigenvalues in the scree
plot for which the eigenvalue for the observed data is larger than the corresponding
eigenvalue for random data.
In Figs. 3.9(b) and (c), we show the number of significant components as a function
of time calculated using the Kaiser-Guttman criterion and the scree plot technique,
respectively. There are large differences in the number of significant components iden-
tified using the two approaches, but both agree that the number decreased between
2001 and 2010. The discrepancies in the results for the two methods imply that we
cannot reliably determine the exact number of significant PCs; however, the similar
trends provide evidence that the number of significant components decreased between
2001 and 2010. This again implies that markets have become more closely related in
recent years. Both methods also agree that the number of significant components is
much lower than the number of assets that we study. Therefore, although we cannot
determine the number of significant components using the methods described in this
section, the results nonetheless suggest that market correlations can be characterized
by fewer than N components.
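As a sketch of the first criterion: the Kaiser–Guttman rule retains components whose eigenvalues exceed 1, the average eigenvalue of an N × N correlation matrix (whose eigenvalues sum to N). The following is an illustrative Python fragment; the toy matrices are assumptions:

```python
import numpy as np

def kaiser_guttman(R):
    """Number of 'significant' PCs under the Kaiser-Guttman criterion:
    eigenvalues of the correlation matrix R that exceed the average
    eigenvalue of 1."""
    return int(np.sum(np.linalg.eigvalsh(R) > 1.0))

# One strongly coupled pair plus a weakly coupled third asset:
# eigenvalues are approximately {1.82, 0.98, 0.20}.
R = np.array([[1.0, 0.8, 0.1],
              [0.8, 1.0, 0.1],
              [0.1, 0.1, 1.0]])
```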
Figure 3.9: (a) Scree plot showing the magnitude of the eigenvalues as a function of
the eigenvalue index, where the eigenvalues are sorted such that β1 > β2 > . . . > βN .
We show curves for correlation matrices for time windows ending on 09/03/2001 and
27/11/2009. We also show the eigenvalues for random correlation matrices, where
we have averaged the eigenvalues over 500 realizations of the correlation matrix. The
inset zooms in on the region in which the two example curves for observed data cross
the curve for random data. The two plots on the right show the number of significant
components as a function of time determined using (b) the Kaiser-Guttman criterion and (c) a comparison of the scree plots of the observed and random data.
This implies that the covariance of the returns of asset i and the j th PC is given by
Σ(ẑi , yj ) = ωij βj and the correlation R(ẑi , yj ) by
R(ẑi, yj) = ωij βj / (σ(ẑi) σ(yj)) = ωij √βj,    (3.16)

where σ(ẑi) = 1 is the standard deviation of ẑi over T returns and σ(yj) = √βj is the
standard deviation of yj . The correlations between the PCs and the original variables
are therefore simply equal to the PC coefficients scaled by the square roots of the appropriate eigenvalues.
The signs of the PC coefficients are arbitrary, so the signs of the PCs and the signs
of the correlations R(ẑi , yj ) are also arbitrary. To avoid having to choose a sign
for each correlation coefficient, we consider the absolute correlations |R(ẑi , yk )|. By
considering absolute correlations, we cannot tell if an asset is positively or negatively
correlated with a PC; however, we are interested only in determining which assets
contribute to each component, so it is reasonable to ignore the signs.
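Equation (3.16) can be checked numerically, as in this sketch (illustrative Python with synthetic data; the dimensions and seed are assumptions). The correlations between each series and each PC computed directly agree with the scaled PC coefficients:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy standardized returns: T observations of N correlated series.
T, N = 2000, 4
raw = rng.standard_normal((T, N)) @ rng.standard_normal((N, N))
Z = (raw - raw.mean(axis=0)) / raw.std(axis=0)

R = (Z.T @ Z) / T                        # correlation matrix of Z
betas, omega = np.linalg.eigh(R)         # unit-norm eigenvectors (columns)
Y = Z @ omega                            # principal components y_j

# Eq. (3.16): |R(z_i, y_j)| = |omega_j(i)| * sqrt(beta_j)
predicted = np.abs(omega) * np.sqrt(np.clip(betas, 0.0, None))
observed = np.abs(np.corrcoef(Z, Y, rowvar=False)[:N, N:])
```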
In Fig. 3.10, we show the variation through time of the correlation of every as-
set with each of the first six PCs. Figure 3.10 highlights that the number of large
correlations is significantly lower for the higher components. For the first PC, many
of the correlation coefficients are greater than 0.8, but the correlations between the
asset returns and the sixth PC very rarely exceed 0.5. As one looks at increasingly
higher components the maximum correlation decreases until, for the highest compo-
nents, all correlations are less than 0.2. The low correlations between the asset return
time series and the higher PCs imply that much of the key structure from the cor-
relation matrices is contained within the first few PCs. Based on the correlations
shown in Fig. 3.10, it appears that the first five PCs describe the main features of
the correlations for the studied assets.
Figure 3.10 also demonstrates the changing correlations between the different as-
set classes. From 2001–2002, all of the corporate and government bonds (with the
exception of Japanese government bonds) are strongly correlated with the first PC.
Over the same period, most of the equity indices are strongly correlated with the
second PC and most of the currencies with the third PC; six grain commodities (soy-
bean, soybean meal, soybean oil, corn, wheat, oats) are strongly correlated with the
fourth PC; and fuel commodities are strongly correlated with the fifth PC. Therefore,
each of the first five PCs corresponds to a specific market over this period and the
separation into components implies low correlations between different asset classes.
During 2002, however, these relationships begin to break down as bonds and equi-
ties both become strongly correlated with the first PC and both asset classes have a
correlation of approximately 0.5 with the second PC. The strong correlation of both
bonds and equities with the same PCs marks the start of a period during which the
coupling between asset classes increased and different markets became more closely
related.
There are three major changes in the correlations between the asset return time
series and the PCs between 2002 and 2009. These changes are most obvious for the
second PC in Fig. 3.10. The first change corresponds to a local peak in corporate bond
prices; the second change corresponds to surging metal prices (see Section 3.4.1); and
the third, and most striking, change occurs following the collapse of Lehman Brothers.
After Lehman’s bankruptcy, the first PC becomes strongly correlated with nearly all of
the assets, including equities, currencies, metals, fuels, other commodities, and some
government bonds. The major exceptions are corporate bonds and, to a lesser extent,
government bonds, but both sets of bonds are strongly correlated with the second
PC. During this period, only a few assets are strongly correlated with the third PC,
including EUR/USD, CHF/USD, gold, silver, and platinum; and very few assets are
strongly correlated with any of the higher PCs. The strong correlations between the majority of the studied assets and the first PC following Lehman Brothers' collapse further demonstrate the strength of market correlations during this crisis period and highlight the common behaviour of nearly all markets.
Figure 3.10 also shows that for a system in which the first few PCs account for
a significant proportion of the variance, a consideration of the correlations between
these components and the original variables provides a parsimonious framework to
uncover the key relationships within the system. Instead of having to identify the
important correlations within a matrix with N(N − 1)/2 elements, one only needs to
consider the correlations between the N variables and the first few PCs, which reduces
the number of correlations to consider by a factor of N. Figure 3.10 demonstrates
that this method uncovers the changing relationships between the different asset
classes and highlights assets, such as Japanese government bonds, whose behaviour is
unusual. This approach also uncovers notable changes that occurred in markets and
the assets that were significantly affected by these changes.
3.6 Summary
We used PCA to investigate the evolving correlation structure of a variety of financial
assets and to identify common features across different markets. We found that the
percentage of the variance in market returns accounted for by the first PC steadily
Figure 3.10: The absolute correlation |R(ẑi , yk )| between each asset and the first six PCs (k = 1, . . . , 6) as a function of time.
Each point on the horizontal axis represents a single time window and each position along the vertical axis an asset. Dark
red regions indicate strong correlations (positive or negative) between assets and PCs and dark blue regions indicate weak
correlations.
increased from 2001–2010, with a sharp rise following the 2008 collapse of Lehman
Brothers. We further found that the number of significant components decreased and
the number of assets making significant contributions to the first PC increased over
this period. We investigated the evolving relationships between the different assets
by analyzing the correlations between the asset price time series and the first few
PCs. From 2001–2002, each of the first five components corresponded to a specific
market; however, after 2002 these relationships broke down and by 2010 nearly all of
the studied assets were significantly correlated with the first PC. The major changes
in the correlation structure following the collapse of Lehman show the extent to
which market correlations increased during this crisis period.
Chapter 4
Community Structure in Networks
In this chapter, we describe several concepts that we will use in Chapters 5 and
6 in which we study network communities. We define a community, explain some
of the most widely used techniques for detecting communities in static networks,
discuss attempts to cluster networks using mesoscopic structures, and present a rel-
atively comprehensive review of the literature investigating communities in dynamic
networks.
4.1 Introduction
Network communities are cohesive groups of nodes that are relatively densely connected to each other but sparsely connected to other dense groups in the network.
Communities can represent functionally-important subnetworks [2, 75, 105, 107, 121,
139,243,244,295]. For example, a community in a cellular or genetic network might be
related to a functional module; a community in a stock market network might corre-
spond to stocks belonging to the same industrial sector; and a community in a social
network might correspond to a group of friends or a group of work colleagues. Com-
munities can affect dynamical processes (such as the spread of opinions and diseases)
that operate on networks [75, 107, 244], so their identification, and an understanding
of their structure, can potentially provide insights into these processes.
In this chapter we review some of the most widely used community detection
techniques. Because there is no rigorous definition of a community, different methods
often define communities in different ways; the main difference between methods is
essentially their precise definition of “relatively densely connected”. A vast amount
of research has been published on community detection in recent years, so this review
is not exhaustive. More detailed reviews of the community detection literature can be
found in Refs. [105, 244]. However, these review articles focus on communities in static networks.
Chapter 4 | Community Structure in Networks
4.2 Notation
First, we introduce some of the notation that we use in the remainder of the thesis.
We consider undirected networks G = (V, E) that consist of a set of vertices V and
a set of edges E. We represent a network as an adjacency matrix A with elements
Aij . For unweighted networks Aij = 1 if an edge exists between nodes i and j and
is 0 otherwise; in weighted networks Aij can take other values (which are always
real numbers for the networks that we consider) that indicate the strength of the tie
between i and j.
We consider partitions P = {C1, · · · , Cη} of a network G into η disjoint communities C such that Ck ∩ Ck′ = ∅ for k ≠ k′ and ∪_{k=1}^{η} Ck = V, where |V| = N is the number of nodes in the network. We use the letter C to identify a community and the scripted letter C to represent the set of nodes in that community. We also reference the communities
in two different ways: C k is the set of nodes in community k (k = 1, · · · , η), whereas
Ci is the set of nodes in the same community as node i (i = 1, · · · , N). We represent
the number of nodes in community k as |C k | = nk .
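The disjointness and covering conditions on a partition can be checked as in this sketch (illustrative Python; the node labels and communities are toy assumptions):

```python
def is_partition(V, communities):
    """Check that `communities` is a partition of the node set `V`:
    pairwise disjoint (C_k ∩ C_k' = ∅ for k != k') and covering
    (their union is V)."""
    seen = set()
    for C in communities:
        if seen & C:             # overlap with an earlier community
            return False
        seen |= C
    return seen == set(V)

V = {1, 2, 3, 4, 5}
good = [{1, 2}, {3, 4, 5}]
overlapping = [{1, 2, 3}, {3, 4, 5}]
```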
4.3 | Community detection methods
combine them based on their similarity [84].2 In contrast, divisive algorithms (which
include some spectral methods [221]) begin with all objects in a single cluster and
find smaller groups by iteratively splitting clusters.
Data clustering is closely linked to community detection. The widespread inter-
est of statistical physicists and applied mathematicians in community detection was
sparked by the 2002 publication of a paper by Girvan and Newman [121] in which they
proposed a technique for identifying communities using (geodesic) edge betweenness.
Betweenness [110] is widely used in social network analysis to quantify the extent
to which edges lie on paths that connect agents.3 In the Girvan-Newman method,
edges with the largest betweenness are iteratively removed from the network. After
each edge removal the betweenness of the remaining edges is recalculated, which is
important because it can cause previously low-betweenness edges to have higher be-
tweenness. As the edges are removed the network breaks up into progressively smaller
isolated communities.
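A minimal sketch of the procedure follows (illustrative Python, not an optimized implementation; it recomputes geodesic edge betweenness with a Brandes-style breadth-first accumulation after each removal, as the method prescribes):

```python
from collections import deque

def edge_betweenness(nodes, edges):
    """Geodesic edge betweenness for an unweighted, undirected graph."""
    adj = {u: [] for u in nodes}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    eb = {frozenset(e): 0.0 for e in edges}
    for s in nodes:
        # BFS from s, counting shortest paths (sigma) and predecessors.
        dist, sigma = {s: 0}, {u: 0 for u in nodes}
        preds = {u: [] for u in nodes}
        sigma[s] = 1
        order, queue = [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies back along the BFS ordering.
        delta = {u: 0.0 for u in nodes}
        for w in reversed(order):
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                eb[frozenset((v, w))] += c
                delta[v] += c
    return {e: b / 2.0 for e, b in eb.items()}   # each path counted twice

def components(nodes, edges):
    """Connected components of the graph defined by the remaining edges."""
    adj = {u: set() for u in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    comps, unseen = [], set(nodes)
    while unseen:
        stack = [unseen.pop()]
        comp = set(stack)
        while stack:
            for w in adj[stack.pop()] - comp:
                comp.add(w)
                stack.append(w)
        comps.append(comp)
        unseen -= comp
    return comps

def girvan_newman_step(nodes, edges):
    """Remove highest-betweenness edges until the network first splits."""
    edges = list(edges)
    while len(components(nodes, edges)) == 1:
        eb = edge_betweenness(nodes, edges)
        worst = max(eb, key=eb.get)
        edges = [e for e in edges if frozenset(e) != worst]
    return components(nodes, edges)

# Toy graph: two triangles joined by a bridge edge (2, 3).
nodes = range(6)
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
```

On this toy graph the bridge has the highest betweenness, and removing it splits the network into the two triangle communities.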
The main problem with the Girvan-Newman approach is that it tends to be slow
for large networks (unless they are very sparse) and typically produces poor results
for dense networks [244]. Nevertheless, Ref. [121] was the catalyst that led to the
explosion of research on community detection in networks and since its publication
many alternative community detection methods have been proposed. We explain
three of the more prominent methods below.
where Ci is the community of node i and the null model Pij denotes the probability
that nodes i and j are connected for unweighted networks and the expected weight of
the link with which nodes i and j are connected for weighted networks. Modularity
is therefore easily extended from unweighted to weighted networks. The quantity m
represents the total number of edges in the network for unweighted networks and
the total edge weight for weighted networks and is given by m = (1/2) Σ_i ki, where ki = Σ_j Aij is known as the degree of node i for unweighted networks and the strength for weighted networks.4
Communities are identified by finding the partition P that maximizes Q. The
choice of null model is not entirely unconstrained because it is axiomatically the case
that Q = 0 when all of the nodes are placed in a single group.5 One is then restricted
4 For an unweighted network a node's strength is equal to its degree.
5 When all of the nodes in a network are in a single community, the number of edges within the community and the expected number of such edges are both equal to m.
to null models in which the expected edge weight is equal to the actual edge weight
in the original network [220]. The simplest null model satisfying this criterion is a
uniform null model in which a fixed average edge weight occurs between nodes [244].
However, the strength distribution produced by this model differs significantly
from the distribution observed for many real-world networks. The most popular choice
of null model (introduced by Newman and Girvan [224]) is
$$P_{ij} = \frac{k_i k_j}{2m}, \qquad (4.2)$$
which preserves the strength distribution of the network and is closely related to the
configuration model [212].6
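To make these definitions concrete, the following sketch computes Q directly from an adjacency matrix and a vector of community labels, using the null model of Eq. (4.2); the two-triangle example network is illustrative and of our own construction, not drawn from any cited study.

```python
def modularity(A, labels):
    """Modularity Q of a partition, using the null model P_ij = k_i k_j / 2m.

    A is a (possibly weighted) symmetric adjacency matrix given as a list of
    lists, and labels[i] is the community of node i.
    """
    n = len(A)
    k = [sum(A[i]) for i in range(n)]   # degree (strength) of each node
    two_m = sum(k)                      # 2m: twice the total edge weight
    Q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                Q += A[i][j] - k[i] * k[j] / two_m
    return Q / two_m

# Two triangles joined by a single edge: a natural two-community network.
A = [[0.0] * 6 for _ in range(6)]
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i][j] = A[j][i] = 1.0

Q_split = modularity(A, [0, 0, 0, 1, 1, 1])   # one community per triangle
Q_single = modularity(A, [0] * 6)             # all nodes in one community
```

Placing all nodes in a single community gives Q = 0, as the constraint on the null model requires, whereas splitting the network into its two triangles gives Q = 5/14 ≈ 0.36.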
An important issue with using modularity as a quality function to identify com-
munities is that it suffers from a resolution limit [106]. Modularity optimization has
been shown to fail to find communities smaller than a threshold that depends on the
total size of the network and on the degree of interconnectedness between the network
communities. Communities smaller than the threshold tend to be merged into larger
communities, thereby missing important structures. However, many modularity-
maximization techniques can easily be adapted to other quality functions, and several
alternatives have been proposed that avoid the resolution limit by uncovering com-
munities at multiple resolutions, e.g., [17, 179, 220, 254].
An alternative approach, based on the Potts model from statistical physics, seeks the
ground state of the Hamiltonian
$$H(\lambda) = -\sum_{i \neq j} J_{ij}\,\delta(C_i, C_j), \qquad (4.3)$$
where Ci is the state of spin i and Jij is the interaction energy between spins i and j.
The coupling strength Jij is given by Jij = Aij − λPij , where Pij again denotes the
expected weight of the link with which nodes i and j are connected in a null model
and λ is a resolution parameter. If Jij > 0, spins i and j interact ferromagnetically
6 The difference between the two models is that the configuration model is conditioned on the
actual degree (strength) distribution, whereas Newman and Girvan’s model is conditioned on the
expected degree (strength) distribution. In the configuration model, the probability of an edge
falling between nodes i and j is also given by Eq. 4.2 in the limit of large network size; however, for
smaller networks, there are corrections of order 1/N [220].
and seek to align in the same spin state (join the same community); if Jij < 0,
spins i and j interact antiferromagnetically and try to have different orientations (join
different communities). One can find communities by assigning each spin to a state
and minimizing the interaction energy of these states given by Eq. (4.3). Within
this framework, community identification is equivalent to finding the ground state
configuration of a spin glass.
Tuning λ allows one to find communities at different resolutions; as λ becomes
larger, there is a greater incentive for nodes to belong to smaller communities. The
Potts method therefore allows the investigation of communities below the resolution
limit of modularity. One can write a scaled energy Qs in terms of the Hamiltonian in
Eq. (4.3) as
$$Q_s = \frac{-H(\lambda)}{2m}. \qquad (4.4)$$
The modularity is then the scaled energy with λ = 1. Community detection using
modularity optimization is therefore a special case of the Potts method.7 In this
thesis, we use the Potts method to detect communities and we employ the standard
model of random link assignment Pij = ki kj /2m as the null model.
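The scaled energy of Eq. (4.4) can be evaluated directly for a given partition and resolution parameter; the sketch below does so with the null model Pij = ki kj/2m, summing over distinct pairs of spins, and the example network is again illustrative only.

```python
def scaled_energy(A, labels, lam):
    """Scaled energy Q_s = -H(lambda) / 2m for the Potts Hamiltonian with
    couplings J_ij = A_ij - lam * k_i * k_j / (2m)."""
    n = len(A)
    k = [sum(row) for row in A]
    two_m = sum(k)
    H = 0.0
    for i in range(n):
        for j in range(n):
            if i != j and labels[i] == labels[j]:
                H -= A[i][j] - lam * k[i] * k[j] / two_m
    return -H / two_m

# Two triangles joined by a single edge, split into its two triangles.
A = [[0.0] * 6 for _ in range(6)]
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i][j] = A[j][i] = 1.0
labels = [0, 0, 0, 1, 1, 1]
```

Increasing λ lowers the scaled energy of any fixed partition containing non-singleton communities, which is why larger values of λ favour smaller communities.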
The number of possible community partitions grows rapidly with the number of
nodes [218], so it is typically computationally impossible to sample the energy space
by exhaustively enumerating all partitions [55]. A number of different heuristic pro-
cedures have been proposed to balance the quality of the identified optimal partition
with computational costs, e.g., [75, 107, 244]. In this thesis, we minimize Eq. (4.3)
at each resolution using the greedy algorithm of Ref. [44], which finds good-quality
partitions and is computationally fast, so it can be used to detect communities in large
networks. We note that quality functions like Eq. (4.3) have complex energy landscapes
and we are optimizing with a heuristic, so care needs to be taken when
interpreting the results of this method for real networks [128]. With this in mind, we
validate the results obtained using the greedy algorithm by reproducing the analysis
using spectral [221] and simulated annealing [141] algorithms.8
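As a minimal illustration of how such heuristics operate, one can repeatedly move single nodes to whichever candidate community most increases the energy gain of Eq. (4.3); the toy scheme below illustrates this general idea only and is not the algorithm of Ref. [44] (nor the spectral or simulated annealing algorithms).

```python
def greedy_partition(A, lam=1.0, sweeps=50):
    """Greedy local-move heuristic for minimizing the Potts Hamiltonian:
    starting from singleton communities, repeatedly reassign single nodes
    to the neighbouring community that gives the largest energy gain."""
    n = len(A)
    k = [sum(row) for row in A]
    two_m = sum(k)
    labels = list(range(n))   # every node starts in its own community

    def gain(i, c):
        # contribution to -H from placing node i in community c
        return sum(A[i][j] - lam * k[i] * k[j] / two_m
                   for j in range(n) if j != i and labels[j] == c)

    for _ in range(sweeps):
        moved = False
        for i in range(n):
            old = labels[i]
            labels[i] = None   # remove i while evaluating candidate moves
            candidates = {labels[j] for j in range(n) if j != i and A[i][j] > 0}
            candidates.add(old)
            best = max(candidates, key=lambda c: gain(i, c))
            labels[i] = best
            moved = moved or (best != old)
        if not moved:
            break
    return labels

# Two triangles joined by a single edge are recovered as two communities.
A = [[0.0] * 6 for _ in range(6)]
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i][j] = A[j][i] = 1.0
part = greedy_partition(A)
```

Like all local-move heuristics, this sketch can become trapped in local optima of the energy landscape, which is precisely why results should be validated with independent algorithms.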
7 Recently, an alternative version of the Potts method has been proposed that is able to deal with
both positive and negative links [294]. Even more recently, a new framework of network quality func-
tions based on modularity has been proposed that can deal with time-evolving networks, networks
with multiple types of links, and multiple scales [214]; we discuss this method in Section 4.6.12
8 See Ref. [105] for more details of these and other modularity (energy) optimization heuristics.
4.4 | Edge communities
the internet at the level of autonomous systems; and the final family included four
Barabási-Albert preferential attachment networks.
In Ref. [142], Guimerà et al. clustered networks based on the over- or under-
representation of nodes with particular network roles. Guimerà et al. first assigned
each node a role based on the node’s pattern of inter- and intra-community links and
then created profiles indicating the under- or over-representation of node roles within
each network compared with randomized networks.9 Based on these profiles, Guimerà
et al. identified two clusters of networks. The first cluster included metabolic and
airport networks, and the second cluster included protein interaction and internet
networks. The authors hypothesized that the division of the networks into these
two clusters might result from the fact that the networks in the first cluster are
transportation networks, in which strong conservation laws must be obeyed, whereas
the networks in the second cluster could be considered as signalling networks, which do
not obey conservation laws. We discuss further the clustering in Refs. [210] and [142]
in Section 6.6.4.
4.6 | Community dynamics
the changes in correlations between groups of financial assets can lead to better risk
management tools.
Although the study of dynamic communities is still in its infancy, several methods
have been proposed for detecting and tracking communities in dynamic networks. In
the remainder of this chapter, we describe the different techniques.
and indices for measuring the rate at which split and merge events occurred. They
applied this framework to an evolving sequence of networks of Japanese web archives
for each of the years 1999–2002 and found that changes in the community structure
were largely a result of merge and split events. This observation seems unsurprising
given the tight constraints on nodes’ community membership for the other types of
event. They also found that the distribution of community sizes and the distributions
of the sizes of emerged and dissolved communities followed power laws.
In another early study [152], Hopcroft et al. employed a very different approach
to study the evolution of weighted citation networks10 in which the nodes represented
papers and the weights were given by the cosine similarity [84] between vectors rep-
resenting each paper’s citations. Hopcroft et al. identified communities using ag-
glomerative hierarchical clustering, but found that the clusters were very sensitive to
random perturbations in the network. This led them to define “natural communi-
ties” as clusters that were robust to the random removal of nodes. To identify natural
communities, Hopcroft et al. created n realizations of the network in which 5% of the
nodes were randomly removed and compared clusters C in the original network with
clusters C ′ in the perturbed networks using the similarity function
$$\min\left(\frac{|C \cap C'|}{|C|}, \frac{|C \cap C'|}{|C'|}\right), \qquad (4.5)$$
where |C ∩ C ′ | indicates the cardinality of the intersection of nodes in the two clusters.
They defined a natural community as a cluster in the original network whose similarity
with any cluster in a perturbed network exceeded a pre-defined threshold p for a
fraction f of the networks with nodes removed.11
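The matching procedure can be sketched as follows; the clusters below are illustrative, and the default thresholds are the values that Hopcroft et al. used for smaller clusters.

```python
def similarity(C, Cp):
    """Eq. (4.5): the smaller of the two containment fractions of clusters
    C and Cp (given as sets of nodes)."""
    inter = len(C & Cp)
    return min(inter / len(C), inter / len(Cp))

def is_natural(C, perturbed_clusterings, p=0.7, f=0.6):
    """C is a natural community if, in at least a fraction f of the
    perturbed networks, some cluster matches C with similarity above p."""
    hits = sum(
        any(similarity(C, Cp) > p for Cp in clustering)
        for clustering in perturbed_clusterings
    )
    return hits >= f * len(perturbed_clusterings)

# Three illustrative clusterings of perturbed networks.
perturbed = [
    [{1, 2, 3, 4}, {6, 7}],       # close match for {1, ..., 5}
    [{1, 2, 3, 4, 5}, {6, 7}],    # exact match
    [{1, 6, 7}, {2, 8}],          # no good match
]
```

For example, the cluster {1, ..., 5} matches a cluster with similarity above 0.7 in two of the three perturbed clusterings, so it qualifies as a natural community with these thresholds.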
Hopcroft et al. tested their method by comparing citation data for the years
1990–1998 with data for 1990–2001, which enabled them to identify changes that
occurred in the community structure during the period 1999–2001. They associated
each community with a research topic by considering the most frequent words in the
titles of the papers in the communities. They then classified the natural communities
in the second period as either established or emerging depending on their overlap with
communities in the first period. They found that some of the established communities
grew rapidly while others stagnated, and that some of the communities that emerged
had split from communities that existed in the first period. Based on the evolution
10 The citation data covered the period 1990–2001 and was downloaded from the CiteSeer database,
which is available at http://citeseer.ist.psu.edu/.
11 Hopcroft et al. created n = 45 networks in which 5% of nodes were removed and set f = 0.6;
they set p = 0.7 for clusters containing fewer than 1,000 papers and p = 0.5 for larger clusters.
of the different communities, they identified new, expanding, and declining fields of
research.
An important issue with the approach proposed by Hopcroft et al. is that hier-
archical trees contain clusters at several different levels and their method does not
identify the level at which the clusters are most appropriate. In addition, there
are several parameters that the user must define, including the fraction of nodes to
randomly remove from the original network to create the comparison networks; the
overlap threshold p for a pair of clusters to be considered similar; and the fraction of
perturbed networks f in which a cluster needs to match another cluster for it to be
considered a natural community. The node compositions of the natural communities
depend on all of these parameters.
node overlap with C r (t), but C t (t + 1) has a greater link weight in common with
C r (t). In this case, either community could be identified as the descendant of C r (t)
depending on whether one considers node or edge overlap to be a more important
measure of community similarity. Most methods that we describe map communities
based on node overlap; in practice, the choice between node or edge overlap is likely
to depend on which measure is most appropriate for a particular analysis.
its implications for the potential rate of growth of online communities such as social
networking websites.
Berger-Wolf and collaborators have proposed several different methods for study-
ing dynamic communities [35]. In Ref. [36], Berger-Wolf and Saia presented a tech-
nique in which it was assumed that the community partition at each time step was
already known. They created a network of these known communities in which two
communities C and C ′ were linked if the similarity
$$\frac{2\,|C \cap C'|}{|C| + |C'|} \qquad (4.6)$$
exceeded a pre-defined threshold βm. Within this framework, two communities could
be connected irrespective of the number of time steps separating the networks in which
they were observed. This is in contrast to other techniques that create networks of
communities but only allow communities to be connected if they appear in networks
separated by a specified number of time steps (e.g., Refs. [91, 93, 96, 97], which we
discuss in Section 4.6.6).
Using the network of communities, Berger-Wolf and Saia then defined a metagroup
as any connected group of at least αm communities and considered that an individual
node was a member of a metagroup if it belonged to more than γm communities in
that metagroup. They then defined the most persistent metagroup as the metagroup
that contained the most communities; the most stable metagroup as the metagroup
that contained the most links as a fraction of the number of time steps over which
the group persisted; and the largest metagroup as the metagroup that contained the
most nodes.
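A sketch of this construction is given below: communities are linked when the similarity of Eq. (4.6) exceeds βm, and the connected components with at least αm communities form the metagroups. The communities and thresholds are illustrative, not taken from the cited study.

```python
from itertools import combinations

def dice(C, Cp):
    """Eq. (4.6): similarity of two communities given as sets of nodes."""
    return 2 * len(C & Cp) / (len(C) + len(Cp))

def metagroups(communities, beta_m, alpha_m):
    """Link communities whose similarity exceeds beta_m, then return the
    connected components containing at least alpha_m communities."""
    n = len(communities)
    adj = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if dice(communities[i], communities[j]) > beta_m:
            adj[i].add(j)
            adj[j].add(i)
    seen, groups = set(), []
    for s in range(n):
        if s in seen:
            continue
        stack, comp = [s], set()      # depth-first search for the component
        while stack:
            v = stack.pop()
            if v in comp:
                continue
            comp.add(v)
            stack.extend(adj[v] - comp)
        seen |= comp
        if len(comp) >= alpha_m:
            groups.append(comp)
    return groups

# Communities (indexed 0-5) observed at various time steps.
comms = [{1, 2, 3}, {1, 2, 4}, {1, 2, 3, 4}, {7, 8}, {7, 8, 9}, {5, 6}]
groups = metagroups(comms, beta_m=0.6, alpha_m=2)
```

Note that, as in the original framework, nothing constrains linked communities to come from temporally adjacent networks.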
Berger-Wolf and Saia tested the metagroup framework on the southern women
social network [76], which is a standard benchmark in social network analysis [112].
The southern women data set consists of details of the participation of 18 women
in 14 social events in Mississippi in the 1930s; the network contains 14 communities
corresponding to the groups of women that attended each of the 14 events. Berger-
Wolf and Saia found good agreement between the membership of the most stable
metagroups and the clusters identified in previous studies of the same data. They also
considered the importance of communities and nodes to the existence of metagroups
at different similarity thresholds βm and length thresholds αm and found that the
metagroups were robust to the removal of particular communities and individuals
from the population. Such observations are of practical interest in epidemiology; for
example, targeted vaccination of individuals whose removal results in the break-up of
metagroups might help to prevent the spread of diseases [10, 62, 151].
already on the board and found that the probability of an author joining the com-
mittee increased with the number of their co-authors already on it. This result is
consistent with Backstrom et al.’s [23] observation that the probability of a person
joining a community increases with the number of people that they know in that
community.
that nodes that were only loosely connected to their community were more likely
to leave that community than nodes with strong intra-community connections. The
observation that nodes are more likely to leave a community if they are only weakly
connected to that community is complementary to the observation made by Back-
strom et al. [23] and Jdidia et al. [154] that the probability of an individual joining a
community increases with the number of people they know in that community.
Reference [233] provides answers to several of the fundamental questions relating
to community dynamics, but the results should be validated using other techniques.
This is important because there are some issues with the CPM method: for example,
there are several possible choices for the clique size, and the definition of a community
is very stringent (see Section 4.3.1). It would also be interesting to investigate whether
similar results are observed for other types of network.
time steps; a community declined if the number of members decreased and/or the
number of edges, or the edge weights, decreased between time steps. In contrast to
the grow and shrink events defined by Toyoda and Kitsuregawa in Refs. [291, 292],
the grow and decline events defined by Falkowski et al. allowed nodes to join from
existing communities or leave to join other communities; i.e., nodes joining a
community did not have to be new arrivals to the network, and nodes leaving a
community did not have to leave the network entirely. Falkowski
et al. also introduced software for visualizing community dynamics in Ref. [92], but
they did not provide details of any results found using their method.
Falkowski et al. extended the work of Ref. [92] in a series of papers [91,93,96,97] in
which they matched two communities C and C ′ at different time steps if the overlap
function
$$\frac{|C \cap C'|}{\min(|C|, |C'|)} \qquad (4.7)$$
exceeded a threshold ρm. Instead of simply comparing communities at consecutive
time steps, they matched communities between all networks within τp time steps
of each other. They then constructed a graph in which each community observed
over the full evolution of the network represented a node and they connected all
nodes that appeared in networks within τp time steps of each other for which the
community overlap exceeded ρm . Finally, they found groups of similar communities in
this network of communities using the Girvan-Newman edge betweenness algorithm.
An issue with this method is that it suffers from two of the weaknesses that we
described in Section 4.6.2: namely, the user is required to select values for the
parameters ρm and τp, and the detected communities are sensitive to these choices.
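The matching step can be sketched as follows: one node per observed community, with edges between communities within τp time steps whose overlap under Eq. (4.7) exceeds ρm. The snapshots and parameter values are illustrative.

```python
def overlap(C, Cp):
    """Eq. (4.7): intersection size relative to the smaller community."""
    return len(C & Cp) / min(len(C), len(Cp))

def community_graph(snapshots, rho_m, tau_p):
    """Build the network of communities: one node per community observed at
    any time step; an edge links communities observed within tau_p time
    steps of each other whose overlap exceeds rho_m."""
    nodes = [(t, i) for t, comms in enumerate(snapshots) for i in range(len(comms))]
    edges = []
    for (t, i) in nodes:
        for (s, j) in nodes:
            if t < s <= t + tau_p and overlap(snapshots[t][i], snapshots[s][j]) > rho_m:
                edges.append(((t, i), (s, j)))
    return nodes, edges

# One community per snapshot; the community dissolves at the final step.
snapshots = [[{1, 2, 3}], [{1, 2, 3, 4}], [{1, 2}], [{5, 6}]]
nodes, edges = community_graph(snapshots, rho_m=0.5, tau_p=2)
```

In the full method, the Girvan-Newman algorithm would then be run on this network of communities to group similar communities across time.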
Using this method to track dynamic communities, in Ref. [97] Falkowski et al.
defined several quantities that described the relationships between nodes and their
communities. These included the involvement of a node, which measured the number
of intra-community interactions of nodes at a single time step; the participation, which
measured the involvement of nodes in a particular community over all time steps as
a fraction of the total interactions of the node; and the relevance, which measured
the involvement of nodes in a community as a fraction of the total interactions over a
specified period. Falkowski et al. used this framework to perform a “preliminary set
of proof-of-concept experiments” on a social network.19
19 Although the network is not specified in Ref. [97], this paper extends the work presented in
Ref. [93] in which an online social network of students from the University of Magdeburg is studied,
so it seems likely that this network is also used in Ref. [97].
where win is the number of edges connecting two nodes in C, wout is the number of
edges with only one end connected to a node in C, and µ is a user-defined parameter.
Goldberg et al. then considered a community C(t + 1) to be the descendant of a
community C(t) if the similarity function
exceeded a threshold αt, which they set to 1/3 because it was found to produce
meaningful results. They discovered that the number of blogger communities that
persisted for more than a few weeks was very small, which is unsurprising given that
they also observed that 25% of the nodes changed between some time steps. However,
the persistence of communities determined using this method is affected by the value
of the threshold αt , so the results might have been different for other values of this
parameter. The dependence of the results on αt highlights why it can be undesirable
for dynamic community detection methods to incorporate user-defined parameters.
Asur et al. also defined four types of events involving nodes: appear (a node joined
the network), disappear (a node left the network), join (a node joined a community),
and leave (a node left a community). Based on these events, they defined four indices
that measured the behaviour of the nodes in relation to the communities: the stability
index (the tendency of a node to interact with the same nodes over time), the socia-
bility index (the number of different interactions in which a node was involved), the
popularity index (the number of nodes attracted by a community in a time interval),
and the influence index (the number of nodes that left or joined a community at the
same time as a node).
Asur et al. used these measures to describe the community evolution of a co-authorship
network taken from the DBLP database and a network of clinical drug trial
patients in which the weights of the edges were based on the correlation in patients’
liver toxicity levels during the trial. For the drug network, they identified several
patients with a low stability index and suggested that this implied that these patients
were suffering from side effects from the drugs. The reasoning behind this suggestion
was that patients with a low stability index regularly switched between communities
because of large variations in their response to the trial drugs, and these variations
implied that they were responding badly to the treatment. Asur et al. also used the
influence index to predict the appearance of links within clusters.
In another method utilizing random walkers, Lin et al. [190] proposed a technique
that identified communities by finding the partition of the network that minimized
the number of steps that the walkers needed to take to reach other nodes within
the same community and maximized the number of steps required to move between
communities. In contrast to many of the methods that we discuss, instead of mapping
communities between time steps based on node or edge overlap, Lin et al. compared
vectors representing the interactions between members of a community at different
time steps. They then defined five types of community evolution: one-to-one mapping,
merge, split, extinct, and emerge. Each of these events corresponds to one of the five
events defined in Ref. [22] by Asur et al.
Lin et al. tested their method on a blog network in which each node represented
a blogger and there was a directed edge from the author of any comment to the
author of the blog where the comment was posted. Based on this network, they then
constructed weighted networks for several different search queries in which the edge
weights in a network for a particular query were based on the relevancy of a post to
that query. For example, they studied the community evolution in a network based on
the query “London bombing” and found that there were initially two distinct political
blog communities, with different political interests, which joined together when their
interests converged – in this case, when both communities began to discuss terrorist-
related issues, such as the investigation of terrorist suspects.
The major weakness of the method proposed by Lin et al., which is common to several
of the methods that we discuss, is that the user is required to input the number of
communities to identify at each time step. In almost all situations, it is desirable that
the appropriate number of communities is determined by the community-detection
algorithm from the structure of the data, rather than being specified by the user.
Random walkers have also been used to investigate communities in dynamic net-
works using methods based on modularity maximization (see Section 4.3.2). In
Ref. [177], Lambiotte et al. demonstrated the equivalence of modularity-like qual-
ity functions and Laplacian dynamics of populations of random walkers. Recently,
Mucha et al. extended this framework to study the community structure of multislice
networks, which are combinations of individual networks coupled through additional
links that connect each node in one network slice to itself in other slices [214].20
Their generalization includes an additional parameter that controls the strength of
the coupling between slices. The different connections between the network slices
are flexible and can represent connections between time slices, connections between
networks including different types of links, or connections across different resolutions.
The method allows one to simultaneously identify communities at different time steps
and to systematically track the development of communities through time.
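A simplified quality function in the spirit of Ref. [214] can be sketched as follows: within each slice the usual null model is applied (with resolution γ), and each node is coupled to itself in adjacent slices with a uniform strength ω. The normalization and coupling conventions here are our own simplification rather than the published method, and the example data are illustrative.

```python
def multislice_Q(slices, partition, gamma=1.0, omega=1.0):
    """Simplified multislice quality: intra-slice modularity terms plus a
    reward omega whenever a node keeps its community in an adjacent slice.
    partition maps each (node, slice) pair to a community label."""
    T = len(slices)
    n = len(slices[0])
    two_mu = 0.0   # normalization: total strength plus inter-slice coupling
    Q = 0.0
    for s in range(T):
        A = slices[s]
        k = [sum(row) for row in A]
        two_m = sum(k)
        for i in range(n):
            two_mu += k[i] + omega * ((s > 0) + (s < T - 1))
            for j in range(n):
                if partition[(i, s)] == partition[(j, s)]:
                    Q += A[i][j] - gamma * k[i] * k[j] / two_m
            for r in (s - 1, s + 1):   # ordinal coupling to adjacent slices
                if 0 <= r < T and partition[(i, s)] == partition[(i, r)]:
                    Q += omega
    return Q / two_mu

# Two identical slices of the two-triangle network, with a partition that
# keeps each triangle together in both slices.
A = [[0.0] * 6 for _ in range(6)]
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i][j] = A[j][i] = 1.0
part = {(i, s): (0 if i < 3 else 1) for i in range(6) for s in range(2)}
```

With ω = 0 the slices decouple and the quality reduces to the single-slice modularity of the partition; increasing ω rewards assigning a node to the same community in adjacent slices, which is how the coupling parameter controls the persistence of communities across time.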
Mucha et al. tested their method on an evolving roll call voting network for the
U.S. Senate over the period 1789–2008 (see Refs. [241, 306] and Chapter 6). They
uncovered details about the group voting dynamics of U.S. Senators that would not be
captured by simply considering the union of the partitions for the different networks.
In particular, their analysis identified several important changes in U.S. politics, such
as the formation of political parties and the beginning of the Civil War. The method
proposed by Mucha et al. is particularly appealing because it can simultaneously deal
with multiple time steps, multiple types of edge, and multiple resolutions. However,
the user is again required to make a parameter choice; in this case to choose an
appropriate value for the parameter controlling the coupling between different network
slices.
20 This approach bears similarities to the method proposed by Jdidia et al. in Ref. [154] (see
Section 4.6.4).
networks in a segment were then considered to represent the same stage of the evo-
lution of the network and were characterized by the same partition of the nodes into
communities. Furthermore, the boundaries between segments were considered to cor-
respond to “change points” in the evolution of the network. Sun et al. determined
the partitions of the sequence of networks into segments and of the segments into
communities using the minimum description length principle [136, 257], which is a
formalization of Occam’s Razor [80] in which the best description of a data set is the
one that results in the largest compression of the data. They calculated an encoding
cost for the description of the communities in each segment and summed this value
over all segments to give a total encoding cost for the sequence of networks. The
optimal partition of the sequence of networks into segments and of the segments into
communities was the partition that minimized the total encoding cost. Sun et al.
applied this technique to several data sets, including the Enron e-mail network, for
which they identified a change point at the time when the investigation into document
shredding at the company began and the CEO resigned.
Duan et al. used a different approach to partition evolving, directed, weighted
networks into segments in which all of the networks had similar optimal partitions
into communities [83]. They began by partitioning the first two instances G(t = 1) and
G(t = 2) of the network into community partitions P(1) and P(2), respectively, by
maximizing modularity for each network.21 They compared the two partitions P(1)
and P(2) using a similarity function based on the intersection of the communities at
the two time steps. They then proceeded in different ways depending on the value of
this similarity:
1. If the similarity exceeded a pre-defined threshold, the two networks were consid-
ered to belong to the same segment. Duan et al. then constructed an aggregate
network G(1, 2), containing all of the nodes and the total edge weight in G(1)
and G(2), and identified communities in this new network. They compared the
community partition P(3) of the third network G(3) with the partition for the
aggregate network G(1, 2) to determine if this network was sufficiently similar to
also join the segment. If it was, they integrated G(3) into the aggregate graph
G(1, 2); if it was not, there was a change point and G(3) began a new segment.
2. If the similarity of the first two partitions P(1) and P(2) did not exceed the
threshold, they were considered to belong to different segments, so there was a
21 See Ref. [184] for details of the generalization of modularity to directed networks.
change point between time steps 1 and 2 and the community partition P(3) of
the third network was then only compared to the second partition P(2).
They repeated this process for each of the time steps of the evolving network. Duan
et al. also applied their method to the Enron e-mail network and identified stable and
fluctuating periods during which there were few and many change points, respectively.
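The change-point logic can be sketched as a greedy segmentation of a sequence of partitions. This sketch compares stored partitions directly, whereas the published method re-clusters an aggregate network for each segment; the similarity function and example data are our own illustrations.

```python
def segment(partitions, sim, threshold):
    """Greedy change-point detection: each new partition joins the current
    segment if it is sufficiently similar to the segment's most recent
    partition; otherwise a change point is declared and a segment begins."""
    segments = [[0]]
    for t in range(1, len(partitions)):
        ref = segments[-1][-1]
        if sim(partitions[ref], partitions[t]) >= threshold:
            segments[-1].append(t)
        else:
            segments.append([t])   # change point between t-1 and t
    return segments

def partition_similarity(P, Q):
    """Mean best-match Jaccard index between the communities of P and Q."""
    scores = [max(len(C & D) / len(C | D) for D in Q) for C in P]
    return sum(scores) / len(scores)

partitions = [
    [{1, 2}, {3, 4}], [{1, 2}, {3, 4}],   # stable period
    [{1, 3}, {2, 4}], [{1, 3}, {2, 4}],   # after a change point
]
segments = segment(partitions, partition_similarity, 0.5)
```

Stable periods correspond to long segments; fluctuating periods produce many short ones.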
The positive value of the growth coefficient for social networks is in agreement
with the findings of Palla et al. [233] (see Section 4.6.5). However, Palla et al. only
investigated social networks, so it would be interesting to test whether their methods
also find that the age of a community is negatively correlated with its size in other
types of network. If this difference between the two types of network is verified using
other methods, a further question of interest is which properties of the networks lead
to these differences.
There are, however, several problems with Wang et al.’s approach. For example,
it is possible that a community could split in half such that one half contains all of
the core nodes and the other none. In this case, although the two new communities
contain the same fraction of nodes of the original community, only one will be labelled
as a descendant. A second possibility is that a community C(t) with three core nodes
could split into three communities, two of which contain significantly fewer nodes than
the third community. In this case, all three new communities would be considered
to be descendants of the original community; however, it would perhaps be more
reasonable to only consider the largest community as the descendant of C(t).
In another node-centric approach, Asur and Parthasarthy [20, 21] tracked the
evolution of groups of nodes in the local neighbourhood of individual nodes. For each
node i, they defined a viewpoint neighbourhood as the network of nodes rooted at i
that contained only nodes (and their connections) with some degree of importance
to i. To identify the viewpoint neighbourhood of a node, Asur and Parthasarthy
proposed an activation spread model defined as follows. They began at a node i
with an amount M of some resource to allocate; they then distributed this resource
amongst the immediate neighbours of i, assigning the proportion of the resource to
each node based on an activation function. They described any node that received
some of the resource as activated. The activated nodes retained some amount of
the resource for themselves and assigned the rest to their neighbours. This activation
process proceeded with the amount of resource to be allocated decaying as the number
of steps from i increased until a minimum threshold was reached, at which point the
resource was considered indivisible and the activation ceased. Asur and Parthasarthy
defined the viewpoint neighbourhood of node i as the set of all activated nodes. They
considered three activation functions for determining the fraction of the resource
allocated to the nodes at each level: one based on the degree of a node, another
based on edge betweenness, and a third based on the semantic similarity of keywords
with which the nodes were annotated.
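The activation process can be sketched with a deliberately simple activation function: each activated node keeps a fixed fraction of what it receives and splits the remainder equally among its neighbours (a stand-in for the degree-, betweenness-, or keyword-based functions considered by the authors). The graph and parameter values are illustrative.

```python
def viewpoint_neighbourhood(adj, root, M=1.0, retain=0.5, threshold=0.05):
    """Activation-spread sketch: a resource M spreads outwards from 'root';
    spreading stops once the amount to pass on falls below 'threshold'.
    The viewpoint neighbourhood is the set of all activated nodes."""
    activated = {root}
    frontier = {root: M}
    while frontier:
        nxt = {}
        for v, amount in frontier.items():
            passed = amount * (1 - retain)   # amount v passes to neighbours
            if passed < threshold or not adj[v]:
                continue
            share = passed / len(adj[v])
            for u in adj[v]:
                nxt[u] = nxt.get(u, 0.0) + share
        activated |= set(nxt)
        frontier = nxt
    return activated

# A path graph: activation starting at node 0 dies out before reaching node 4.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
neighbourhood = viewpoint_neighbourhood(adj, 0)
```

Because the resource decays with distance from the root, the neighbourhood captures only the nodes with some degree of importance to the root node.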
Asur and Parthasarthy used this technique to investigate the dynamics of a DBLP
co-authorship network and a network of Wikipedia pages. They identified six evolution
events for viewpoint neighbourhoods, which are similar to the events identified in
Ref. [22] by Asur et al. These events are growth, shrinkage, continuity (the nodes in
a viewpoint neighbourhood remained unchanged, but there were possibly changes in
the edges), mutate (a growth or shrinkage event in which more than half of the nodes
in a viewpoint neighbourhood changed between consecutive time steps), κ-attraction
(κ% of the nodes in a viewpoint neighbourhood moved closer), and κ-repulsion (κ%
of the nodes in a viewpoint neighbourhood moved further apart). These events are
not mutually exclusive so, for example, a viewpoint neighbourhood could grow and
attract at the same time. Asur and Parthasarthy found that, for both the DBLP and
Wikipedia networks, growth and shrinkage events were frequent, while continuity
events were rare in the DBLP network, but quite frequent in the Wikipedia network.
Asur and Parthasarthy also defined four indices (which are again similar to the
indices in Ref. [22]) for measuring the changes in a node’s viewpoint neighbourhood:
stability (how much a neighbourhood changed over time), sociability (how many dif-
ferent nodes were affected by a particular node over time), popularity (how many
nodes were attracted to a node’s neighbourhood), and impact (for identifying nodes
that had the highest impact on most viewpoint neighbourhoods). Perhaps unsurpris-
ingly, the authors in the DBLP network with the highest impact scores were those that
were regarded as most influential within their field. Finally, Asur and Parthasarthy
identified stable and short-lived subgraphs within viewpoint neighbourhoods.
Kampis et al. [162] also proposed a method for tracking communities from the
point of view of individual nodes. In Ref. [162], they suggested a community tracking
method based on identifier nodes, which they defined as the nodes within communities
with the highest betweenness centrality. A community was then considered to evolve
from another community if the communities shared the same identifier node. However,
Ref. [162] is a work in progress, and Kampis et al. have not yet produced any results
using this framework.
A primary objective of node-centric techniques for investigating community dy-
namics is to make the mapping of communities between consecutive time steps un-
equivocal. However, as we highlighted at the beginning of this section when discussing
the method proposed by Wang et al. in Ref. [303], this mapping can still be ambiguous
for community mappings based on the properties of individual nodes. In Chapter 5,
we introduce an alternative node-centric method for tracking communities in dynam-
ical networks.
Chapter 4 | Community Structure in Networks
where ν is a tunable parameter that adjusts the relative weights of the snapshot
quality and history cost. This framework is flexible and one can define quality func-
tions and history costs that are appropriate for a particular problem. In Ref. [65],
Chakrabarti et al. derived evolutionary versions of hierarchical clustering and k-means
clustering [84] and tested their algorithm under different parameter settings on a bi-
partite network of photos and photo-tags from the photo-sharing website flickr.com.
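The snapshot-plus-history trade-off that this framework optimizes can be sketched as follows; the quadratic cost terms and the function name are illustrative choices, not the exact costs used in Ref. [65]:

```python
import numpy as np

def evolutionary_kmeans_cost(X, centroids, prev_centroids, labels, nu):
    """Total cost of an evolutionary k-means assignment: a snapshot cost
    (how well the current centroids fit the current data) plus nu times a
    history cost (how far the centroids moved since the previous time
    step).  The quadratic form of both terms is an illustrative choice."""
    snapshot = sum(np.sum((X[labels == k] - c) ** 2)
                   for k, c in enumerate(centroids))
    history = np.sum((centroids - prev_centroids) ** 2)
    return snapshot + nu * history
```

Setting ν = 0 recovers ordinary per-snapshot clustering, while a large ν forces the clustering to change slowly between consecutive time steps.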
Using a similar approach, in Refs. [69] and [68], Chi et al. proposed two evolu-
tionary spectral algorithms. In both algorithms, the optimal clustering was identified
[Footnote 22: Chakrabarti et al. defined the similarity of a pair of nodes in a bipartite
network as a linear combination of their cosine similarity and the correlation between
time series associated with each node. Their method, however, does not depend on this
choice, so other similarity measures can be used.]
4.6 | Community dynamics
by minimizing a cost function with both a snapshot and a history cost, similar to
Eq. 4.10 (see footnote 23). The two approaches proposed by Chi et al. differed in the data that was
used to minimize the history cost. In the first approach, the clustering P(t) was com-
pared to the similarity data at t − 1, whereas in the second approach the clustering
P(t) was compared to the clustering P(t − 1). Chi et al. tested their methods on an
evolving blog network and found that, for certain values of the parameter weighting
the snapshot and history costs, the clusters were stable in the short-term, but evolved
over longer time horizons.
The evolutionary spectral framework proposed by Chi et al. allowed for changing
numbers of clusters, and they also extended it to allow for changing numbers of
nodes. However, the number of clusters required at each time step is not determined
automatically in this method, but must be input by the user. This is also true of the
evolutionary k-means and hierarchical clustering algorithms proposed by Chakrabarti
et al. [65]. Another issue with these evolutionary clustering techniques is that they
do not include a method for automatically selecting the parameter weighting the
snapshot and history costs; instead, the value of this parameter again needs to be
chosen by the user.
In another paper on evolutionary clustering, Lin et al. [188] proposed an algorithm
called FacetNet, which produces a soft assignment of nodes to communities, i.e.,
instead of each node belonging to a single community at each time step, the algo-
rithm returns a probability that a node belongs to a community, so nodes can be
considered to belong to more than one community. Lin et al. defined the snapshot
cost as the Kullback-Leibler divergence (see footnote 24) [173] between the similarity matrix at time
t and the matrix describing the community structure of the network at t; they defined
the history cost as the Kullback-Leibler divergence between the matrices describing
the community structures at t and t − 1. They also extended the framework to deal
with the insertion and deletion of nodes and changing numbers of communities. To
determine the number of communities at each time step, they defined a soft mod-
ularity and selected the partition that maximized this function. However, the user
is still required to input an appropriate range of candidate values for the number of
[Footnote 23: Note that Chi et al. [69] minimized a cost function whereas Chakrabarti
et al. [65] maximized a quality function. Of course, the two optimizations are closely
related: the higher the quality, the lower the cost.]
[Footnote 24: The Kullback-Leibler divergence is a non-symmetric measure of the distance
between two probability distributions. For two matrices X and Y with elements $x_{ij}$
and $y_{ij}$, respectively, the Kullback-Leibler divergence is given by
$D(X \| Y) = \sum_{i,j} \left( x_{ij} \log \frac{x_{ij}}{y_{ij}} - x_{ij} + y_{ij} \right)$.]
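A direct transcription of this divergence, as a sketch; the element-wise convention 0 log 0 = 0 is assumed:

```python
import numpy as np

def generalized_kl(X, Y):
    """Generalized Kullback-Leibler divergence D(X||Y) between two
    non-negative matrices, per footnote 24:
    D(X||Y) = sum_ij ( x_ij * log(x_ij / y_ij) - x_ij + y_ij ),
    with the convention 0 * log 0 = 0."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        log_term = np.where(X > 0, X * np.log(X / Y), 0.0)
    return float(np.sum(log_term - X + Y))
```

The divergence vanishes when X = Y and is not symmetric in its arguments in general.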
communities because evaluating the modularity over all possible community numbers
is computationally expensive.
Lin et al. tested the method on a blog network and a DBLP co-authorship network
and concluded that the identified communities were robust with respect to data noise.
In Ref. [189], they extended this work and provided more detailed algorithms and
proofs. The main problems with FacetNet are that it is not able to deal with the
appearance and disintegration of communities and that the computation time needed
to find communities does not scale well with network size. These issues were addressed
in a recent paper by Kim et al. [167].
Evolutionary clustering algorithms have also been developed for multipartite net-
works. In Ref. [284], Tang et al. developed an evolutionary spectral clustering algo-
rithm for multipartite networks and applied this technique to a tripartite represen-
tation of the Enron e-mail network (in which nodes corresponded to users, e-mails
and words, and edges connected users to e-mails and e-mails to words) and to a four-
mode DBLP co-authorship network (in which nodes corresponded to papers, authors,
words in the title, and conferences/journals). For the co-authorship network, they
found that the method was able to successfully detect community changes. For ex-
ample, it identified the shift in focus of the Neural Information Processing Systems
conference from neural networks to machine learning between 1995 and 2004. The
main shortcomings of this approach are that the user must provide weightings indicating
the relative importance of each edge type in the community detection and the
number of communities to identify in each mode.
Finally, in Ref. [319], Zhou et al. proposed an evolutionary spectral clustering
algorithm for identifying dynamic communities in tripartite networks. The algorithm
allows the nodes in the network to change through time, but the user is required to input
the number of communities to identify. They applied the method to a co-authorship
network derived from the CiteSeer website in which nodes represented authors, words,
and venues, and they successfully identified communities corresponding to scientific
disciplines.
4.6.13 Summary
Although the study of community dynamics is still in its infancy, many methods have
already been proposed for detecting and tracking communities in dynamic networks.
In this section, we have described several techniques and discussed their application
to different networks. We have also highlighted the problems with some of these
methods; in particular, the difficulty of mapping communities between different time
steps. A problem that we have not discussed is the issue of using these methods to
provide insights into data. With a few notable exceptions, the studies that we have
described in this section present a method for detecting dynamic communities and
then validate the method by checking that the communities it identifies in a real-
world network are reasonable. However, many of the studies stop there. They do
not then go on to investigate the mechanisms driving the community evolution or
try to answer some of the fundamental questions that we posed at the beginning of
this section, such as what properties of a community lead to stability. For many of
the studies that we discussed, answering such questions was probably not the authors'
objective, but it seems that this is where real insights into these evolving systems can
be gained, and this is an obvious direction for future research.
In Chapter 5, we try to answer some of these questions for the FX market. We
begin by presenting a node-centric method for tracking dynamic communities which
side-steps the issue of mapping communities between different time steps. We then
investigate some of the properties of these evolving communities and use them to
provide insights into the changing structure of the FX market.
Chapter 5
Dynamic Communities in the FX Market
The work described in this chapter has been published in reference [P1] and a further
paper that extends this publication is under review [P3]. The techniques we present
are complementary to the methods described in Chapter 3 and provide an alternative
approach for investigating evolving correlation matrices. In this chapter, we consider
an FX market network in which each node represents an exchange rate and each
weighted edge represents a time-dependent correlation between the rates, and we use
community detection to study the temporal evolution of these correlations.
5.1 Introduction
An investigation of a financial market can be formulated as a network problem. In
the most common network description of a market, each node represents an asset,
and each weighted link is a function (the same function for all links) of the pairwise
temporal correlations between the two assets it connects [198]. A wide range of
financial assets have been investigated using network techniques, including equities,
e.g., [197, 198, 229], currencies, e.g., [133, 204, 205], commodities, e.g., [272], bonds,
e.g., [41], and interest rates, e.g., [79]. However, because the network adjacency matrix
is a function of the correlation matrix, a network for N assets contains N(N − 1)/2
links (i.e., the network is fully-connected) so, as with the correlation matrices we
analyzed in Chapter 3, simultaneous investigation of the interactions is difficult for
even moderate N and some simplification is necessary to attain an understanding of
the market system.
The most prevalent method for reducing the complexity of a financial network is
to construct a minimum spanning tree (MST), e.g., [53, 197, 198, 229, 231]. The MST
Chapter 5 | Dynamic Communities in the FX Market
is generated using a hierarchical clustering algorithm (see Section 4.3) and reduces
the network to N − 1 of its most important microscopic interactions. This approach
has resulted in many useful financial applications, including the construction of a
visualization tool for portfolio optimization [229] and a means for identifying the
effect of news and major events on market structure [205]. Nevertheless, the MST
approach has a number of limitations which we discuss in Section 5.6.
An alternative simplification method is to coarse-grain the network and consider
it at various mesoscopic scales. The properties of the market can then be understood
by considering the dynamics of small groups of similar nodes (communities). From a
financial perspective, communities correspond to groups of closely-related assets, so
this treatment has the potential to suggest possible formulations for coarse-grained
models of markets.
Most prior studies of financial networks find groups of closely-related assets using
traditional hierarchical clustering techniques, e.g., [198, 204, 229] or by thresholding
to create an unweighted network, e.g., [99]. In contrast, in this chapter, we identify
communities using the Potts method described in Section 4.3.3. To the best of my
knowledge, other studies that use similar approaches to study financial networks have
either not examined longitudinal networks or have considered networks of equities
rather than exchange rates, e.g., [148].
To provide insights into the clustering of the exchange rate time series, we intro-
duce a new approach for investigating dynamic communities in networks. Community
detection in fully-connected networks of the type studied here is equivalent to the
problem of clustering multivariate time series [187]. We propose a method to track
communities from the perspective of individual nodes, which removes the undesirable
requirement, discussed in Section 4.6, of determining which community at each time
step represents the descendant of a community at the previous time step.
We demonstrate that exchange rate community dynamics provide insights into the
correlation structures within the FX market and uncover the most important ex-
change rate interactions. Although we focus on the FX market, the techniques that
we present in this chapter are general and can be applied to other systems for which
an evolving similarity between the constituent elements can be defined.
5.2 Data
The FX networks we construct have N = 110 nodes, each of which represents an ex-
change rate of the form XXX/YYY (with XXX ≠ YYY), where XXX, YYY ∈ {AUD,
CAD, CHF, GBP, DEM, JPY, NOK, NZD, SEK, USD, XAU}, and we note that
DEM→EUR after 1998. Other authors have recently studied the FX market by con-
structing networks in which all nodes represent exchange rates with the same base
currency, implying that each node can then be considered to represent a single cur-
rency [133]. Exchange rate networks formed with reference to a single base currency
are somewhat akin to ego-centred networks studied in the social networks litera-
ture [304]. Ego-centred networks include links between a number of nodes that all
have ties to an ego, which is the focal node of the network. However, this approach has
two major problems for FX networks. First, it neglects a large number of exchange
rates that can be formed from the set of currencies studied and consequently also
ignores the interactions between these rates. Second, the network properties depend
strongly on the choice of base currency and this currency is, in effect, excluded from
the analysis. We therefore construct a network including all exchange rates that can
be formed from the studied set of currencies.
5.2.1 Returns
We take the price $p_i(t)$ at discrete time t as the mid-price of the bid and ask prices,
so that

$$p_i(t) = \frac{1}{2}\left[\,p_i^{\mathrm{bid}}(t) + p_i^{\mathrm{ask}}(t)\,\right]. \qquad (5.1)$$

We define the logarithmic return of an exchange rate with price $p_i(t)$ as (see Eq. 3.3)

$$z_i(t) = \ln \frac{p_i(t)}{p_i(t-1)},$$

and we use the last posted price within an hour to represent the price for that hour.
To calculate a return at time t, one needs to know the price at both t and t − 1. To
minimize the possibility of a price not being posted in a given hour, we focus on the
FX market’s most liquid period: 07:00-18:00 U.K. time. Nevertheless, there are still
hours for which we do not have price data (this usually occurs as a result of problems
with the data feed). One can calculate a return for hours with missing price data by
assuming the last posted price or interpolating between prices at the previous and
next time step [73]. However, to ensure that all time steps included in the study are
ones at which a trade can actually be made, we take the stricter approach of omitting
all returns for which one of the prices is not known. In order to ensure that the time
series of exchange rates are directly comparable, we consequently remove a return
from all exchange rates if it is missing from any rate.
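The return construction above can be sketched as follows; the (time × rate) array layout and the use of NaN to mark hours with no posted price are illustrative conventions:

```python
import numpy as np

def mid_prices(bid, ask):
    """Mid-price of the bid and ask prices, as in Eq. (5.1)."""
    return 0.5 * (np.asarray(bid) + np.asarray(ask))

def log_returns(prices):
    """Hourly log-returns z_i(t) = ln(p_i(t) / p_i(t-1)) for a
    (time x rate) price array, with np.nan marking hours in which no
    price was posted.  As in the text, a time step is dropped from
    *all* rates if the return is undefined for any rate, so the
    return series remain directly comparable."""
    prices = np.asarray(prices, dtype=float)
    z = np.log(prices[1:] / prices[:-1])
    keep = ~np.isnan(z).any(axis=1)
    return z[keep]
```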
For the period 1991–2003, we derive each exchange rate XXX/YYY with XXX,
YYY ≠ USD from two USD rates. For example, we find the CAD/CHF price at
each time step by dividing the USD/CHF price by the USD/CAD price. For the
period 2005–2008, we derive each exchange rate not included in the set {AUD/USD,
EUR/NOK, EUR/SEK, EUR/USD, GBP/USD, NZD/USD, USD/CAD, USD/CHF,
USD/JPY, USD/XAU} from pairs of exchange rates in this set. For example, we
find the USD/NOK price at each time step by dividing the EUR/NOK price by the
EUR/USD price. Although this approach appears somewhat artificial, it matches
the way in which many exchange rates are calculated in the actual FX market. For
example, a bank customer wishing to convert CAD to NZD (or vice versa) will need
to be quoted the CAD/NZD prices. Because this is not a standard conversion, the
bank will not be able to quote a direct market price but will instead calculate a price
using the more widely traded USD/NZD and USD/CAD exchange rates. Calculating
the exchange rates in this way implies that there is some intrinsic structure inherent
in the FX market. However, as shown in Ref. [204] and demonstrated further in
Sections 5.5.2 and 5.5.3 of this chapter, this “triangle effect” does not dominate the
results.
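The triangle construction described above amounts to a single division; in this sketch, the layout of the quote container is an assumption:

```python
def cross_rate(quotes, xxx, yyy, via="USD"):
    """Derive the XXX/YYY price through a common base currency, as in
    the text: e.g. CAD/CHF = (USD/CHF) / (USD/CAD).  `quotes` maps
    (base, quote) currency pairs to prices (an illustrative layout)."""
    return quotes[(via, yyy)] / quotes[(via, xxx)]
```

For the 2005–2008 data the same construction applies with EUR-based quotes, e.g. USD/NOK = (EUR/NOK) / (EUR/USD).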
and j. For example, two exchange rates i and j whose return time series are perfectly
correlated will be connected by a link of unit weight.
We exclude self-edges in order to deal with simple graphs. This approach was
also taken in a previous study of a stock network derived from a correlation matrix
[148]. We note that if we include self-edges, the node compositions of the identified
communities are identical if one makes a small parameter change in the community
detection algorithm. We discuss the effect of including self-edges in Sections 5.4
and 5.2.2.
Similarly to Section 3.2.3, we create a longitudinal sequence of networks by consec-
utively displacing the time windows by ∆t = 20 hours (approximately 2 trading days)
and fix T = 200 hours (approximately 1 month of data). This choice of T , motivated
in part by the example data in Fig. 5.1, represents a trade-off between over-smoothing
for long time windows and overly-noisy correlation coefficients for small T [227, 229].
Figure 5.2 demonstrates that the choice of ∆t has a similar, but less pronounced, ef-
fect on the standard deviation of the edge weights and we again select a compromise
value. The time windows we use to construct the networks overlap, so the single-time
networks are not independent but rather form an evolving sequence through time.
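The sliding-window construction of the network sequence can be sketched as follows, with T = 200 and ∆t = 20 as the default values used in the text (the function name is an illustrative choice):

```python
import numpy as np

def rolling_correlation_matrices(z, T=200, dt=20):
    """Sequence of correlation matrices from a (time x rate) array of
    returns z, using windows of length T displaced by dt.  Consecutive
    windows overlap whenever dt < T, so the resulting networks form an
    evolving (non-independent) sequence."""
    return [np.corrcoef(z[s:s + T], rowvar=False)
            for s in range(0, z.shape[0] - T + 1, dt)]
```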
Figure 5.1: The standard deviation of the edge weights Aij as a function of time
for the period 1991–1998. For each panel, ∆t = 20 (approximately 2 days), and (a)
T = 100 hours, (b) T = 200 hours, (c) T = 400 hours, and (d) T = 1200 hours
(approximately 0.5, 1, 2, and 6 months, respectively).
Figure 5.2: The standard deviation of the edge weights Aij as a function of time for
the period 1991–1998. For each panel, T = 200 hours, and (a) ∆t = 10, (b) ∆t = 20,
(c) ∆t = 50, and (d) ∆t = 200 (approximately 1 day, 2 days, 5 days, and 2 weeks,
respectively).
where Ci is the state of spin i and Jij is the interaction energy between spins i and j.
The coupling strength Jij is given by Jij = Aij − λPij , where Pij denotes the expected
weight of the link with which nodes i and j are connected in a null model and λ is a
resolution parameter. We employ the standard null model of random link assignment
Pij = ki kj /2m, which preserves the degree distribution of the network.
We construct FX networks by calculating a correlation coefficient between every
pair of exchange rates, resulting in a weighted, fully-connected network. We include
each exchange rate XXX/YYY and its inverse rate YYY/XXX in the network, be-
cause one cannot infer a priori whether a rate XXX/YYY will form a community with
a rate WWW/ZZZ or its inverse ZZZ/WWW. However, the return of an exchange rate
XXX/YYY is related to the return of its inverse YYY/XXX by
$z_{\mathrm{XXX/YYY}} = -z_{\mathrm{YYY/XXX}}$.

5.4 | Robust community partitions

This implies that the correlation coefficients between these rates and a rate WWW/ZZZ
are related by $r_{\mathrm{XXX/YYY},\,\mathrm{WWW/ZZZ}} = -r_{\mathrm{YYY/XXX},\,\mathrm{WWW/ZZZ}}$.
Consequently, every node has the same strength

$$k_i = \sum_j A_{ij} = \frac{1}{2}(N - 2), \qquad (5.4)$$

so the probability of connection in the standard null model $P_{ij} = k_i k_j / 2m$ is also
constant and is given by

$$P_{ij} = \frac{N - 2}{2N}. \qquad (5.5)$$
In the case of the FX network, the standard null model Pij = ki kj /2m and the uniform
null model are thus equivalent. However, the methods we present are general and can
be applied to networks with non-uniform strength distributions.
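These identities are easy to verify numerically. In the sketch below, the edge weight A_ij = (1 + r_ij)/2 is an assumed mapping from correlations to weights, chosen to be consistent with the earlier remark that perfectly correlated rates are connected by a link of unit weight:

```python
import numpy as np

def strengths_and_null_model(z):
    """Node strengths and standard null model for a return array that
    contains each rate together with its inverse (negated) series.
    The weight A_ij = (1 + r_ij)/2 is an assumed mapping from the
    correlation r_ij to an edge weight; self-edges are excluded."""
    r = np.corrcoef(z, rowvar=False)
    A = (1.0 + r) / 2.0
    np.fill_diagonal(A, 0.0)
    k = A.sum(axis=1)             # strengths: every node has (N - 2)/2
    P = np.outer(k, k) / k.sum()  # k_i k_j / 2m, with 2m = sum_i k_i
    return k, P
```

Because each rate's inverse contributes exactly the negated correlations, the strengths collapse to the constant of Eq. (5.4) and the null model to the constant of Eq. (5.5).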
If we include self-edges in the network, the strength of each node increases by
one. This, in turn, leads to a constant increase in the expected edge weight in
the null model. For a network with self-edges, the expected edge weight is given by
$P_{ij}^{s} = N/[2(N + 2)]$, a shift by a constant value of
$P_{ij}^{s} - P_{ij} = 2/[N(N + 2)] \approx 1.62 \times 10^{-4}$
relative to the network in which self-edges are excluded. Self-edges always occur
within a community, so they will always contribute to the summation in Eq. 4.3
irrespective of exactly how the nodes are partitioned into communities. This implies
that self-edges play no role when determining the community partition that minimizes
the interaction energy at a particular resolution.
Additionally, every community has an equivalent inverse community. For ex-
ample, if there is a community consisting of the three exchange rates XXX/YYY,
XXX/WWW, and ZZZ/WWW in one half of the network, there must be an equiva-
lent community formed of YYY/XXX, WWW/XXX, and WWW/ZZZ in the other
half. The existence of an equivalent inverse community for each community means
that at each time step, the network is composed of two equivalent halves. However,
the exchange rates residing in each half change in time as the correlations evolve.
$$I(\mathcal{C}, \mathcal{C}') = \sum_{k=1}^{\eta} \sum_{k'=1}^{\eta'} q(k, k') \log \frac{q(k, k')}{q(k)\, q(k')}, \qquad (5.7)$$

where $q(k, k') = |C^k \cap C'^{k'}|/N$. The mutual information is the amount by which
knowledge of a node's community in P reduces the uncertainty about its community
membership in P′ (averaged over all nodes). The normalized variation of information
V̂ between P and P′ is then given by

$$\hat{V}(\mathcal{P}, \mathcal{P}') = \frac{S(\mathcal{P}) + S(\mathcal{P}') - 2\, I(\mathcal{C}, \mathcal{C}')}{\log N}. \qquad (5.8)$$
The factor log N normalizes V̂ (P, P ′ ) to the interval [0, 1], with 0 indicating identi-
cal partitions and 1 indicating that all nodes are in individual communities in one
partition and in a single community in the other. We will use Eq. 5.8 to compare
partitions in networks with the same number of nodes and remark that one should
not normalize by log N when comparing the variation of information in data sets with
different sizes [207].
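For concreteness, V̂ can be computed directly from two hard partitions; representing each partition as a list of community labels is an illustrative choice:

```python
import math
from collections import Counter

def normalized_variation_of_information(a, b):
    """Normalized variation of information between two partitions of the
    same N nodes, each given as a list of community labels:
    V-hat = [S(P) + S(P') - 2 I(P, P')] / log N,
    where S is the partition entropy and I the mutual information of
    Eq. (5.7)."""
    n = len(a)
    pa, pb, joint = Counter(a), Counter(b), Counter(zip(a, b))
    s_a = -sum(c / n * math.log(c / n) for c in pa.values())
    s_b = -sum(c / n * math.log(c / n) for c in pb.values())
    mi = sum(c / n * math.log(c * n / (pa[x] * pb[y]))
             for (x, y), c in joint.items())
    return (s_a + s_b - 2.0 * mi) / math.log(n)
```

Identical partitions give 0, and comparing all-singleton communities with a single all-encompassing community gives 1, matching the interpretation of the normalization above.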
[Footnote 1: In order to find equivalent communities in the network in which self-edges
are included, it is necessary to decrease the resolution parameter to compensate for the
increase in the constant expected edge weight in the null model. If we identify communities
in the network in which self-edges are excluded using the resolution parameter λ, then we
find identical communities in the corresponding network with self-edges using a resolution
parameter $\lambda^{s} = \lambda P_{ij}/P_{ij}^{s} = \lambda (N + 2)(N - 2)/N^2$. For
example, if we identify communities in the network without self-edges using a resolution
of λ = 1.4500, then we identify equivalent communities in the network with self-edges
with a resolution parameter of $\lambda^{s} = 1.4495$.]
[Footnote 2: Recall that the quantity $C^k$ represents the community indexed by k but
that $C_i$ is the set of nodes in the same community as node i.]
Figure 5.3: (a) The quantities η, S, Q, and dH/dλ (defined in the text), normalized
by their maximum values, versus the resolution parameter λ for a single time window
beginning on 17/03/1992. The shaded grey area highlights the main plateau. The
bottom curve gives the normalized variation of information between partitions at
resolutions separated by ∆λ = 0.015. (b) The position of the main plateau at each
time step for the full period 1991–2008. Main plateaus (blue) containing the fixed
resolution (set to λ = 1.41 for 1991–2003 and λ = 1.45 for 2005–2008) and (red)
not containing the fixed resolution. The grey block corresponds to 2004, for which
we do not have data.
can also identify robust communities by examining summary statistics that describe
the community structure as a function of the resolution parameter. We consider the
number of communities η, the modularity Q (see Eq. 4.1), the entropy S (see Eq. 5.6),
and the rate of change of the energy with resolution dH/dλ. Robust communities
correspond to plateaus (constant values) in curves of any of these quantities as a
function of the resolution parameter. In Fig. 5.3(a), we plot curves for each of the
summary statistics as a function of λ.
Figure 5.3(a) contains four principal plateaus, corresponding to partitions of the
network into η = 1, 2, 20, and 110 communities. The first and last plateaus, re-
spectively, represent all nodes in a single community and all nodes in individual
communities. The second plateau represents one community of exchange rates and
a corresponding community of inverse rates. The η = 20 plateau occurs over the
interval λ = [1.34, 1.57], in which there is a single plateau in the η plot and a few
smaller plateaus in each of the other plots. In contrast to the other plateaus, this
one was not expected, so the robust communities over this interval can potentially
provide new insights into the correlation structure of the FX market. Although the
community configuration over this interval does not have maximal Q (i.e., it is not
the community configuration corresponding to the maximum value of the traditional
modularity, which is the scaled energy [see Eq. 4.4] with λ = 1), it provides an
appropriate resolution at which to investigate community dynamics due to its resolution
robustness and the financially-interesting features of the detected communities. For
the remainder of this chapter, we will refer to this plateau as the “main” plateau.
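Locating such a plateau amounts to finding maximal runs of sampled resolutions over which a summary statistic such as η is constant; a minimal sketch, in which the minimum-width threshold is illustrative:

```python
def plateaus(lambdas, etas, min_width=3):
    """Maximal runs of consecutive resolution values with a constant
    number of communities eta.  Returns (eta, lambda_first, lambda_last)
    for each run spanning at least min_width sampled resolutions."""
    found, start = [], 0
    for i in range(1, len(etas) + 1):
        if i == len(etas) or etas[i] != etas[start]:
            if i - start >= min_width:
                found.append((etas[start], lambdas[start], lambdas[i - 1]))
            start = i
    return found
```

In practice one would run this on each of the summary statistics (η, Q, S, dH/dλ) and look for intervals where their plateaus coincide.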
5.5 | Community detection in dynamic networks
In order to demonstrate the validity of this technique, we show in Fig. 5.4(b) the
distribution of the λ distance from the fixed resolution to the main plateau and in
Fig. 5.4(c) the distribution of the normalized variation of information between the
community configuration obtained at the fixed resolution and that corresponding to
the main plateau. Both distributions are peaked at zero. The fixed resolution is a
λ distance of less than 0.05 from the main plateau 91% of the time for the period
1991–1998, 93% of the time for 1999–2003, and 88% of the time for 2005–2008. The
community configurations of the main plateau and the fixed resolution differ in the
community assignments of fewer than five nodes in 78% of time steps for the period
1991–1998, in 83% of time steps for 1999–2003, and in 88% of time steps for 2005–2008.
For the majority of time steps, the community configuration at the fixed resolution is
hence identical or very similar to the configuration corresponding to the main plateau.
This supports the proposed method of investigating the community dynamics at a
fixed λ for each period.
Figure 5.4: (a) Observed fraction of time steps that the resolution λ lies on the main
plateau. The vertical lines indicate λ = 1.41, which lies in the highest number of main
plateaus for the period 1991–2003, and λ = 1.45, which lies in the highest number
of main plateaus for 2005–2008. These are the resolutions at which we investigate
the community dynamics over the two periods. For the full period 1991–2008, we
show in panel (b) the normalized sampled distribution of the main plateau width
(blue) and the normalized sampled distribution of the λ-distance between the main
plateau and the fixed resolution (red). The distance is exactly zero for 53% of the
time steps. Again for 1991-2008, we show in panel (c) the normalized variation of
information distribution between the community configuration at the fixed resolution
and the configuration corresponding to the main plateau (blue) and the normalized
variation of information distribution between consecutive time steps (red). The value
of V̂ is exactly zero for 64% of the time steps. The vertical lines give the mean V̂
when (left to right) 1, 2, 5, 10, 20, and 50 nodes are randomly reassigned to different
communities (averaged over 100 reassignments for each time step).
provides some insights into the effects of the triangle relations on the properties of
the actual data.
By inspection, Fig. 5.5(b) shows that the communities identified for the actual
data are stronger than those generated using shuffled data. The sample mean scaled
energy for the actual data is 0.011 (with a standard deviation of 0.0061) and for
shuffled data the sample mean is 0.0039 (with a standard deviation of 0.0013). The
communities observed for the actual data are therefore stronger than the commu-
nities for randomized data in which the structure results from the triangle effect.
This provides strong evidence that the communities represent meaningful structures
within the FX market, so these communities can provide insights into the correlation
structure of the market. We now consider properties of these communities in detail.
Figure 5.5: (a) The fraction of time steps at which a community of a given size
is observed for 1991–1998 (blue), 1999–2003 (red), 2005–2008 (green), and shuffled
data (grey). The shuffled data distribution combines the results for the period 1991–
2003 and for 2005–2008. The distributions were almost identical for the two periods.
The inset shows the fraction of time steps at which η communities are observed
for 1991–1998 (blue), 1999-2003 (red), and 2005–2008 (green). (b) Comparison of
the distribution of the scaled energy for 1991–2003 for market data (blue) and 100
realizations of shuffled data (red).
5.6 | Minimum spanning trees
effects of important news and events on individual currencies. The relative size of
different base-currency communities can provide similar information. For example, if
we observe a community of ten CHF/YYY exchange rates and a community of three
DEM/YYY rates, the larger size of the CHF/YYY community suggests that the CHF is
more dominant than the DEM in the market at this time.
It is also worth noting that the most frequently observed community of ten ex-
change rates with the same base currency is the gold (XAU) community. We include
gold because there are many similarities between it and a currency. However, gold
also tends to be more volatile than most currencies, so the gold exchange rates tend
to have relatively high correlations and strong links in the network. Given this, it is
unsurprising that the gold rates often form their own community; the absence of a
large gold community at a time step is often a good indication that another currency
is particularly influential.
Importantly, the identified communities do not always contain exchange rates with
the same base currency, providing insights into changes in the inherent values of differ-
ent currencies. For example, consider a community containing several exchange rates
with CHF as the base currency and several rates with DEM as the base currency. The
fact that the exchange rates are in the same community suggests that they are corre-
lated. The structure of this community also provides information about the inherent
values of the CHF and DEM. Exchange rates of the form XXX/YYY quote the value
of one currency in terms of another currency, so if the price of XXX/YYY increases it
is not clear whether this is because XXX has become more valuable or because YYY
has become less valuable. However, if one observes that the price of XXX increases
against a range of different YYY over the same period, then one expects that the
value of XXX has increased. Therefore, returning to the example, if one observes a
community of several CHF/YYY and DEM/YYY exchange rates for many different
YYY, then this suggests that these rates are positively correlated. If the values of
CHF and DEM have increased against a range of other currencies, then we expect that
the inherent values of both CHF and DEM are increasing.
Table 5.1: Examples of frequently-observed communities for the pre-euro period 1991–
1998 and for the two post-euro periods (1999–2003 and 2005–2008). The quantity Fr
denotes the fraction of time steps at which each community is observed. The notation
XXX/{YYY,ZZZ} indicates the set of exchange rates {XXX/YYY, XXX/ZZZ}.
Period Community Fr
USD/CAD 0.62
DEM/CHF 0.45
NZD/{CAD,USD} 0.33
AUD/{CAD,NZD,USD} 0.32
1991–1998 XAU/{AUD,CAD,CHF,DEM,GBP,JPY,NOK,NZD,SEK,USD} 0.28
SEK/{AUD,CAD,CHF,DEM,GBP,JPY,NOK,NZD,USD,XAU} 0.17
DEM/NOK 0.16
AUD/{CAD,NZD,USD,XAU} 0.14
GBP/{CHF,DEM,NOK} 0.12
EUR/CHF 0.88
USD/CAD 0.67
XAU/{AUD,CAD,CHF,EUR,GBP,JPY,NOK,NZD,SEK,USD} 0.64
NOK/{CHF,EUR} 0.59
1999–2003 SEK/{CHF,EUR,NOK} 0.51
GBP/{CAD,USD} 0.24
NZD/{AUD,CAD,CHF,EUR,GBP,JPY,NOK,SEK,USD} 0.21
JPY/{CAD,GBP,USD} 0.17
AUD/{CAD,CHF,EUR,GBP,JPY,NOK,SEK,USD} 0.14
XAU/{AUD,CAD,CHF,EUR,GBP,JPY,NOK,NZD,SEK,USD} 0.91
EUR/CHF 0.65
AUD/NZD 0.39
CAD/{AUD,CHF,EUR,GBP,JPY,NOK,NZD,SEK,USD} 0.39
2005–2008 GBP/{CHF,EUR} 0.35
SEK/{CHF,EUR} 0.33
NZD/{AUD,CAD,CHF,EUR,GBP,JPY,NOK,SEK,USD} 0.26
NOK/{CHF,EUR,SEK} 0.21
GBP/{CHF,EUR,NOK,SEK} 0.20
the limitations of this approach for community detection and describe the additional
information that the Potts method can provide.
MSTs are constructed using the agglomerative hierarchical clustering technique
known as single-linkage clustering [84, 244]. Agglomerative methods start with N
singleton clusters and create a hierarchy by sequentially linking clusters based on
their similarity. At the first step, the two nodes separated by the smallest distance
are joined in a cluster. At each subsequent step, the distance between the new cluster
and each of the old clusters is recomputed, and the two closest clusters are again joined. This
can be repeated until all clusters are connected. The similarity of clusters C and
C ′ is usually expressed as a distance that is determined by considering the distance
dij between each node i ∈ C and each node j ∈ C ′ . In single-linkage clustering, the
distance between clusters is given by the minimum distance between their members,

d(C, C ′ ) = min { dij : i ∈ C, j ∈ C ′ }.
For financial networks, the standard measure used for dij is the nonlinear transformation
of the correlation coefficient ρ(i, j) given by [197, 198]

dij = √( 2 [1 − ρ(i, j)] ). (5.11)
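The pipeline described above — transform correlations into the distances of Eq. (5.11) and then apply single-linkage clustering — can be sketched with standard scientific-Python tools. This is an illustration on synthetic data under that reading, not the thesis code:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

def correlation_distances(returns):
    """d_ij = sqrt(2 * (1 - rho_ij)), the metric of Eq. (5.11)."""
    rho = np.corrcoef(returns)
    return np.sqrt(np.clip(2.0 * (1.0 - rho), 0.0, None))

# Synthetic data: four series forming two correlated pairs.
rng = np.random.default_rng(0)
base = rng.normal(size=(2, 500))
returns = np.vstack([base[0], base[0] + 0.1 * rng.normal(size=500),
                     base[1], base[1] + 0.1 * rng.normal(size=500)])

d = correlation_distances(returns)
np.fill_diagonal(d, 0.0)  # remove numerical noise on the diagonal
# Single linkage: cluster distance = minimum pairwise d_ij.
Z = linkage(squareform(d, checks=False), method='single')
labels = fcluster(Z, t=2, criterion='maxclust')
# The two correlated pairs are recovered as the two clusters.
```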
A common representation of the output of a hierarchical clustering algorithm, which shows the full hierarchical structure, is a dendrogram (or hierarchical
tree) [84, 244]. At the first level of the dendrogram, there are N singleton clusters.
As one climbs the vertical distance scale of the dendrogram, clusters are combined
consecutively until all nodes are contained in a single community at the top of the
dendrogram.
Figure 5.6: The minimum spanning tree for the network formed from a time window
of returns beginning on 18/09/1991. The tree is split into two identical halves, which
are connected via the edge (shown in red) between the
XAU/USD and USD/AUD exchange rates. For each community of exchange rates,
there is an equivalent community of inverse rates in the other half of the tree. We
colour each node according to its community membership determined using the Potts
method with λ = 1.41, and we show each community of exchange rates in the same
colour as the corresponding community of inverse rates.
In Fig. 5.6, we colour each node according to its community membership as determined using the Potts method. The MST
is partitioned into two halves with communities of exchange rates in one half and
equivalent communities of inverse exchange rates in the other. In this example, nodes
belonging to the same community are always linked in the MST, but this is not always
the case.
The main problem with single-linkage clustering (and, as a consequence, with
MSTs) is that clusters can be joined as a result of single pairs of elements being close
to each other even though many of the elements in the two clusters have large separa-
tions. The MST then contains weak links that might be misinterpreted as being more
financially meaningful than they actually are [231]. It is also difficult to determine
where the community boundaries lie on the MST. For example, a branch of an MST
might include nodes belonging to a single community or the nodes might belong to
several communities. As an example of this phenomenon, and of the additional clus-
tering information provided by the Potts method, consider the branch at the far right
of the tree shown in Fig. 5.6. By simply considering the MST, one might have inferred
the existence of a cluster that includes all of the NOK/YYY rates and USD/CAD.
However, the Potts method highlights the fact that USD/CAD forms a singleton
community and that NOK/XAU belongs to a community with the XXX/XAU rates.
This observation might provide information as to the relative importance of NOK
and XAU in the market over this period.
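The weak-link caveat can be seen in a small example. In the following sketch (hypothetical distances), two tight pairs of nodes are joined into one tree through a single comparatively long edge, which single-linkage clustering would then use to merge the pairs:

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

# Hypothetical symmetric distance matrix: two tight pairs {0,1} and {2,3}
# connected only by comparatively long edges.
d = np.array([[0.0, 0.1, 1.2, 1.3],
              [0.1, 0.0, 1.1, 1.4],
              [1.2, 1.1, 0.0, 0.2],
              [1.3, 1.4, 0.2, 0.0]])

mst = minimum_spanning_tree(d).toarray()
edges = sorted(tuple(sorted(map(int, e))) for e in zip(*np.nonzero(mst)))
# The MST keeps N - 1 = 3 edges: (0,1), (2,3), and the single
# "weak link" (1,2) with weight 1.1 that bridges the two pairs.
```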
In Fig. 5.7(a), we show the dendrogram generated using the same single-linkage
clustering algorithm used to produce the MST in Fig. 5.6. If the distances between
different dendrogram levels are reasonably uniform, then no clustering appears more
“natural” than any other [84]. However, large distances between levels (i.e., the same
clusters persist over a large range of distances) might indicate the most appropriate
level at which to view the clusters. This is analogous to investigating communities
that are robust over a range of resolutions. The clusterings observed at some levels of
Fig. 5.7(a) correspond quite closely with the communities identified using the Potts
method, but there is no level at which they correspond exactly. The levels are also rea-
sonably evenly distributed along the distance axis. In the dendrogram in Fig. 5.7(b),
which was generated using average-linkage clustering, there is a range of distances
over which the clustering does not change. The clustering observed over this interval
is identical to the community configuration corresponding to the main plateau found
using the Potts method. Therefore, in this case, average-linkage clustering and the
Potts method identify the same robust communities.
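The difference between the two linkage rules can be seen directly on a toy distance matrix (hypothetical values): single linkage merges the two tight pairs at the minimum cross-pair distance, whereas average linkage merges them at the mean of the cross-pair distances.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# Condensed distances for 4 points, ordered (0,1),(0,2),(0,3),(1,2),(1,3),(2,3).
d = np.array([0.1, 1.2, 1.3, 1.1, 1.4, 0.2])

Z_single = linkage(d, method='single')
Z_average = linkage(d, method='average')
# Final merge heights: min{1.2, 1.3, 1.1, 1.4} = 1.1 for single linkage,
# versus the mean (1.2 + 1.3 + 1.1 + 1.4)/4 = 1.25 for average linkage.
```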
Figure 5.7: Dendrograms showing the hierarchical clustering of exchange rates for
one half of the network for a time window of returns beginning on 18/09/1991. We
colour each exchange rate according to its community membership determined using
the Potts method with λ = 1.41. We generated the dendrograms using (a) single-
linkage clustering and (b) average-linkage clustering. The dashed grey lines in panel
(b) highlight the range over which the communities correspond to the communities
of the main plateau identified using the Potts method.
5.7 | Exchange rate centralities and community persistence
The betweenness centrality of node i is defined as

bi = Σ_{s≠i≠t} gst / Gst ,
where Gst is the total number of shortest paths from node s to node t and gst is the
number of shortest paths from s to t passing through i. Betweenness centrality is
widely used in social network analysis to quantify the extent to which people lie on
paths that connect others. Nodes with high betweenness can be considered to be im-
portant for facilitating communication between others in the network, so betweenness
is used to help measure the importance of nodes for the spread of information around
the network [299].
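The definition above — the sum over node pairs of the fraction of shortest paths passing through i — can be sketched with a brute-force computation on a small hypothetical graph (this sums over ordered pairs, so values are twice the undirected convention; the comparison between nodes is unaffected):

```python
from collections import deque

def bfs_counts(adj, s):
    """BFS from s: shortest-path distances and path counts to every node."""
    dist, sigma = {s: 0}, {s: 1}
    queue = deque([s])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[v] + 1:
                sigma[w] += sigma[v]
    return dist, sigma

def betweenness(adj):
    """b_i = sum over ordered pairs (s, t), s != i != t, of g_st(i)/G_st,
    where G_st counts shortest s-t paths and g_st(i) counts those through i."""
    dist, sigma = {}, {}
    for s in adj:
        dist[s], sigma[s] = bfs_counts(adj, s)
    b = {i: 0.0 for i in adj}
    for s in adj:
        for t in adj:
            if s == t:
                continue
            for i in adj:
                if i in (s, t):
                    continue
                # i lies on a shortest s-t path iff the distances add up.
                if dist[s][i] + dist[i][t] == dist[s][t]:
                    b[i] += sigma[s][i] * sigma[i][t] / sigma[s][t]
    return b

# Two triangles bridged by the edge (2, 3): every shortest path between
# the triangles passes through nodes 2 and 3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
       3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
b = betweenness(adj)
# Bridge nodes 2 and 3 have the highest betweenness; corner nodes have 0.
```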
We also consider the community centrality of each node [220]. We employ the
scaled energy matrix J, with components Jij = Aij − λPij , where we again set Pij =
ki kj /2m = (N − 2)/2N. Following the notation in Ref. [220], the energy matrix can
be expressed as J = UDUT , where U = (u1 |u2 | . . .) is the matrix of eigenvectors of
J and D = diag(β1 , β2 , . . .) is the diagonal matrix of its eigenvalues. Each node i is
assigned a node vector xi with components [xi ]j = √βj Uij , defined over the q positive
eigenvalues of J,
where [xi ]j indicates the jth (j ∈ 1, . . . , q) element of the node vector of node i. The
magnitude |xi | is the community centrality. Nodes with high community centrality
play an important role in their local neighbourhood, irrespective of the community
boundaries.
One can also define a community vector

wk = Σ_{i∈C k} xi (5.15)
for each community k with members C k . Nodes with high community centrality
are strongly attached to their community if their node vector is also aligned with
their community vector. Continuing to use the definitions in Ref. [220], a projected
community centrality yi is defined by

yi = xi · wk /|wk | = |xi | cos θik ,

where θik is the angle between the node vector xi and the community vector wk .
We refer to the quantity cos θik as the community alignment.
alignment is near 1 when a node is at the centre of its community and near 0 when
it is on the periphery. Nodes with high community alignment are located near the
centre of their community and have a high projected community centrality, so they are
strongly attached to their community and can be considered to be highly influential
within it. The number of positive eigenvalues of J can vary between time steps, so
we normalize |xi | and yi by their maximum value at each time step.
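The node-vector construction reduces to a few lines of linear algebra. The following is an illustrative sketch on a small hypothetical adjacency matrix, using λ = 1 and the usual null model Pij = ki kj /2m rather than the (N − 2)/2N form used for the fully connected FX networks:

```python
import numpy as np

# Hypothetical small graph: a triangle {0, 1, 2} with a pendant node 3.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
lam = 1.0
k = A.sum(axis=1)
m = A.sum() / 2.0
P = np.outer(k, k) / (2.0 * m)
J = A - lam * P                      # energy matrix J_ij = A_ij - lambda * P_ij

evals, U = np.linalg.eigh(J)         # J = U D U^T
pos = evals > 0                      # keep the q positive eigenvalues
X = U[:, pos] * np.sqrt(evals[pos])  # rows are the node vectors x_i
community_centrality = np.linalg.norm(X, axis=1)    # |x_i|
community_centrality /= community_centrality.max()  # normalize by the maximum
```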
Figure 5.8: (a) Mean community centrality versus the size of the community to which
the node belongs. (b) Mean community alignment versus the betweenness central-
ity of nodes. (c) Mean community autocorrelation versus the projected community
centrality. (All error bars indicate the standard error [42].)
the pre-euro period. In contrast, XAU exchange rates are important for information
transfer for the pre-euro period but not after the euro was introduced.
In Fig. 5.8(c), we show the mean community autocorrelation versus the projected
community centrality. We calculate the mean autocorrelation by splitting the range
of y into 15 bins containing equal numbers of data points and then averaging over
all autocorrelations falling within these bins. (Again, the observed relationships are
robust with respect to variations in the number of bins.) As one would expect, the
community autocorrelation for the projected community centrality of a given node is
smaller for larger τ . More interesting is that for all values of τ , the mean community
autocorrelation increases with y. This suggests that nodes that are strongly connected
to their community are likely to persistently share that community membership with
the same subset of nodes. In contrast, exchange rates with a low y experience regular
changes in the set of rates with which they are clustered. This result agrees with the
observation in Ref. [233] for social networks that nodes that are only loosely connected
to their community are more likely to leave that community than nodes with strong
intra-community connections (see Section 4.6.5).
Table 5.2 shows the exchange rates with the highest projected community centrality,
which in turn reveals the most persistent communities. For 1991–2003, approximately
half of the ten exchange rates with the highest projected community centrality
also appear in the list of the ten rates with the highest community centrality. For
2005–2008, however, the lists of exchange rates with the highest community centrality
and projected community centrality are dominated by the same set of XAU exchange
rates (though the rankings differ). For 1991–2003, the exchange rates with the
highest projected community centrality again include rates formed of DEM (and EUR)
or CHF and one of the commodity currencies. However, there are also a number of
USD exchange rates with high projected community centrality that do not have high
community centrality. This suggests that these USD rates do not have strong links
with a large number of other exchange rates, but that they strongly influence the
rates within their community.
Table 5.2: The ten exchange rates with the highest betweenness centrality b, community
centrality |x|, and projected community centrality y for each of the three periods. We
rank the exchange rates for each centrality according to their average rank over all
time steps. For each exchange rate XXX/YYY, the equivalent inverse rate YYY/XXX
had the same betweenness centrality, community centrality, and projected community
centrality.

Rank | 1991–1998 (b, |x|, y) | 1999–2003 (b, |x|, y) | 2005–2008 (b, |x|, y)
8 | DEM/SEK, SEK/XAU, DEM/AUD | USD/JPY, EUR/NZD, NOK/XAU | USD/AUD, NOK/XAU, JPY/XAU
9 | NZD/CAD, NOK/AUD, USD/AUD | GBP/JPY, SEK/NZD, EUR/NZD | CHF/NOK, CHF/NZD, AUD/XAU
10 | DEM/NOK, DEM/NZD, DEM/NZD | CHF/SEK, NOK/NZD, CHF/NZD | GBP/AUD, AUD/XAU, NOK/XAU
5.8 | Major community changes
Figure 5.9: (a) Normalized distribution of the link weights at each time step. (b)
Scaled energy Qs (black line) and standard deviation of the link weights (blue line).
(c) Normalized variation of information V̂ between the community configurations at
consecutive time steps. The horizontal lines show (from bottom to top) the mean of
V̂ and 1, 2, 3, 4, 5, and 6 standard deviations above the V̂ mean. The green vertical
line in panels (b) and (c) separates the pre- and post-euro periods. The red vertical
lines show the time steps when 22/12/94, 07/02/97, and 15/08/07 enter the rolling
time window. These dates correspond, respectively, to the flotation of the Mexican
peso following its sudden devaluation during the tequila crisis, the devaluation of the
Thai baht during the Asian currency crisis, and significant unwinding of the carry
trade during the 2007–2008 credit crisis. The grey blocks mark 2004 (for which we
have no data).
The most important effect of the credit crisis on the FX market during the period
2005–2008 was its impact on the carry trade. The carry trade consists of selling
low interest rate funding currencies such as the JPY and CHF and investing in high
interest rate investment currencies such as the AUD and NZD. It yields a profit if the
interest rate differential between the funding and investment currencies is not offset
by a commensurate depreciation of the investment currency [59]. The carry trade is
one of the most commonly used FX trading strategies and requires a strong appetite
for risk, so the trade tends to “unwind” during periods in which there is a decrease in
available credit. A trader unwinds a carry trade position by selling his/her holdings
in investment currencies and buying funding currencies.
One approach to quantifying carry trade activity is to consider the returns that
can be achieved using a carry trade strategy. In Fig. 5.11(b) we show the cumulative
return index Υ for a common carry trade strategy. We consider a strategy in which
one buys equal weights of the three major currencies with the highest interest rates
and sells equal weights of the three currencies with the lowest interest rates. This
is a dynamic trading strategy because the relative interest rates of currencies change
over time. For example, consider the situation in which the interest rate of currency
A (which initially has the third highest interest rate) decreases below the rate of
currency B (which initially has the fourth highest interest rate). In order to maintain
the strategy of only holding the three currencies with the highest interest rates at any
time, one would re-balance the carry portfolio by selling the holding of currency A
and buying currency B. The frequency at which such re-balances occur will depend
on the frequency at which the relative interest rates change. The returns from a carry
strategy like this are widely seen by market participants to provide a good gauge of
carry trade activity. Large negative returns result in large decreases in Υ, which are
therefore likely to indicate significant unwinding of the carry trade.
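The re-balancing rule described above can be sketched as follows. The interest rates and the helper name are hypothetical, and the sketch only selects the long and short baskets; the actual strategy in the text also weights positions and accrues returns, which we omit:

```python
def carry_baskets(rates):
    """Long the three currencies with the highest interest rates and short
    the three with the lowest; re-ranking each period re-balances."""
    ranked = sorted(rates, key=rates.get, reverse=True)
    return ranked[:3], ranked[-3:]

# Hypothetical interest rates (percent).
rates = {'NZD': 7.0, 'AUD': 6.5, 'GBP': 5.0, 'USD': 4.5,
         'EUR': 3.5, 'CHF': 1.8, 'JPY': 0.5}
long_side, short_side = carry_baskets(rates)
# Funding currencies (JPY, CHF) end up on the short side.

# If AUD's rate falls below USD's, re-ranking swaps AUD out of the basket.
rates['AUD'] = 4.0
long_after, _ = carry_baskets(rates)
```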
In Fig. 5.11(a) we focus on the period 2005–2008 from Fig. 5.9(c). Again, large
spikes indicate significant changes in the community configuration over a single time
step. Figure 5.11(a) shows that a significant community reorganization occurred on
15/08/07 and in Fig. 5.10(c) we show the observed communities before and after this
date. This community change is a result of massive unwinding of the carry trade.
Figure 5.11(b) shows that leading up to 15/08/07 there was some unwinding of the
carry trade, so the initial configuration includes a community containing exchange
rates of the form AUD/YYY, NZD/YYY, and XXX/JPY (which all involve one of
the key carry-trade currencies). In Fig. 5.11(b) it is also clear that following this
date there is a sharp increase in carry unwinding. The second community partition
in Fig. 5.10(c) highlights this increase as the carry community increases in size by
incorporating other XXX/JPY rates as well as some XXX/CHF and XXX/USD
rates. The presence of a large number of exchange rates involving one of the key
carry-trade currencies in a single community clearly demonstrates the significance of
the trade over this period. Importantly, some of the exchange rates included in the
carry community are also somewhat surprising and provide insights into the range of
currencies used in the carry trade over this period.
Figure 5.11: (a) Normalized variation of information between the community con-
figuration at consecutive time steps for 2005–2008. The horizontal lines show (from
bottom to top) the mean of V̂ and 1, 2, 3, and 4 standard deviations above the mean.
The red vertical line in (a) shows 15/08/07, when there was a marked increase in
unwinding of the carry trade. (b) Carry trade index Υ. The vertical line again shows
15/08/07, and the shaded blocks mark (from left to right) Q3 2007, Q4 2007, Q1 2008,
and Q4 2008.
The analysis above demonstrates that one can identify major changes in the cor-
relation structure of the FX market by finding large values of V̂ between time steps.
Having identified significant changes, one can gain a better understanding of the na-
ture of these changes and potentially also gain insights into trading changes taking
place in the market by investigating the adjustments in specific communities. We have
discussed three examples in which the observed changes are obviously attributable
to a major FX market event. However, there are also a number of time steps when
significant community reorganizations occur for which the cause is much less obvious,
and the analysis of dynamic communities might help shed light on related market
changes.
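The variation of information used to flag these reorganizations can be computed directly from two community assignments. A minimal sketch, normalizing by log N so that the result lies in [0, 1] as for V̂ (the function name is illustrative):

```python
import math
from collections import Counter

def normalized_vi(part1, part2):
    """V = H(X) + H(Y) - 2 I(X; Y) between two labelings of the same
    N nodes, divided by log N so the result lies in [0, 1]."""
    n = len(part1)
    sizes1, sizes2 = Counter(part1), Counter(part2)
    joint = Counter(zip(part1, part2))
    vi = 0.0
    for (a, b), n_ab in joint.items():
        # p_ab * [log(p_a / p_ab) + log(p_b / p_ab)]
        vi += (n_ab / n) * (math.log(sizes1[a] / n_ab)
                            + math.log(sizes2[b] / n_ab))
    return vi / math.log(n)

same = normalized_vi([0, 0, 1, 1], [0, 0, 1, 1])   # identical partitions
split = normalized_vi([0, 0, 1, 1], [0, 1, 0, 1])  # maximally different
```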
κbi = (bi − b̄Ci ) / σCb i , (5.19)
where b̄Ci is the average of b over all nodes in Ci and σCb i is the standard deviation
of b in Ci . The quantity κbi indicates the importance of node i to the spread of
information compared with other nodes in its community.5 The positions of nodes
in the (κb , κy ) plane thereby illuminate the roles of the associated exchange rates in
the FX market and provide information that cannot be gained by simply considering
individual exchange rate time series.
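Both axes of the (κb , κy ) plane are within-community z-scores of the form of Eq. (5.19), which can be sketched as follows (the helper name and toy values are illustrative):

```python
import numpy as np

def community_zscore(values, communities):
    """kappa_i = (v_i - mean of v over i's community) / (std of v over
    i's community). Communities with zero spread get a z-score of 0."""
    values = np.asarray(values, dtype=float)
    kappa = np.zeros_like(values)
    for c in set(communities):
        idx = [i for i, ci in enumerate(communities) if ci == c]
        mu, sd = values[idx].mean(), values[idx].std()
        if sd > 0:
            kappa[idx] = (values[idx] - mu) / sd
    return kappa

# Toy example: nodes 0-2 share a community; node 3 is a singleton.
kappa_b = community_zscore([1.0, 2.0, 3.0, 10.0], [0, 0, 0, 1])
# Node 0 lies below its community mean, node 2 above; the singleton gets 0.
```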
We remark that the methods are robust with respect to the choice of measures used
to construct the parameter plane: we obtain similar results using other notions, such
as dynamical importance [256] instead of the betweenness centrality and the within-
community strength z-score [139] instead of the projected community centrality.
5.9 | Visualizing changes in exchange rate roles
USD/DEM (and then EUR/USD after the introduction of the euro) regularly had
the strongest connection to its community from 1991–2003, but EUR/XAU was more
strongly connected to its community from 2005–2008. The importance of USD/DEM
and EUR/USD is unsurprising given that these rates had the highest daily trading
volume [117]. This provides a reality check that the methods uncover useful
information about the roles of minor exchange rates. Other exchange rates, such as
NOK/SEK and AUD/NZD, were less influential within their communities but were
very important for the transfer of information around the network.
The (κb , κy ) plots also highlight exchange rates that play similar roles in the
FX market. For example, exchange rates formed from one of the major European
currencies—DEM or CHF—and one of the commodity currencies—AUD, CAD, and
NZD (or the commodity XAU)—are located close together in the upper left quadrant
of the (κb , κy ) plane for 1991–2003. This prominent similarity is not present for
2005–2008.
Figure 5.12: Node positions in the (κb , κy ) plane averaged over all time steps for the
periods (a) 1991–1998, (b) 1999–2003, and (c) 2005–2008. The radii of each elliptical
marker equal the standard deviations of the parameters for the corresponding node,
scaled by a factor of 1/15 for visual clarity.
Figure 5.13: Annual node role evolutions in the (κb , κy ) plane for the full period
1991–2008.
being more central to this community. This period of higher influence coincides closely
with the period of financial turmoil during 2007–2008. The CHF is widely regarded as
a “safe haven” currency [248], so one might expect USD/CHF to behave in a similar
manner to USD/XAU. However, the CHF is also a key carry trade currency. Because
CHF is used both as a safe haven and as a carry trade currency, the USD/CHF node
does not move in the same direction as USD/XAU in the (κb , κy ) plane. Instead, the
USD/CHF exchange rate is an important information carrier during the 2007–2008
credit crisis. Over the same period, the AUD/JPY and NZD/JPY exchange rates
change from being important for information transfer to being influential within their
communities. The AUD/JPY and NZD/JPY were most influential within their com-
munity during Q3 and Q4 2007 and during Q1 and Q4 2008. Figure 5.11(b) shows
that over all of these periods there was significant carry trade activity, so it is
unsurprising that two exchange rates that are widely used for this trade should increase in
importance. This, however, is a further demonstration that the positions of exchange
rates in the (κb , κy ) parameter plane can provide important insights into the role of
exchange rates in the FX market.
Figure 5.14: Quarterly node role evolutions in the (κb , κy ) plane for the period 1995–
1998.
5.10 | Robustness of results
Figure 5.15: Quarterly node role evolutions in the (κb , κy ) plane for the period 2005–
2008.
conclusions are the same. We identify the same changes taking place in the FX market
whether we use the greedy algorithm or simulated annealing to minimize energy. The
fact that we obtain very similar results using different optimization techniques, despite
these techniques sampling different regions of the energy landscape, gives confidence
that the effects that we uncover are genuine and that the results are robust. We
present the results of the comparison of the computational heuristics in detail in
Appendix B.
5.11 Summary
We have demonstrated that a network analysis of the FX market is useful for visualiz-
ing and providing insights into the correlation structure of the market. In particular,
we investigated community structure at different times to provide insights into the
clustering dynamics of exchange rate time series. We focused on a node-centric com-
munity analysis that allows one to follow the time dynamics of the functional role of
exchange rates within the market, demonstrating that there is a relationship between
an exchange rate’s functional role and its position within its community. We indi-
cated that exchange rates that are located on the edges of communities are important
for information transfer in the FX market, whereas exchange rates that are located
in the centre of their community have a strong influence on other rates within that
community. We also demonstrated that the community structure of the market can
be used to determine which exchange rates dominate the market at each time step
and identified exchange rates that experienced significant changes in market role.
Our analysis successfully uncovered significant structural changes that occurred in
the FX market, including ones that resulted from major market events that did not
impact the studied exchange rates directly. We further demonstrated that community
reorganizations at specific time steps can provide insights into changes in trading
behaviour and highlighted the prevalence of the carry trade during the 2007–2008
credit and liquidity crisis. Although we focused on networks of exchange rates, the
methodology should be similarly insightful for multivariate time series of other asset
classes. Importantly, the results are robust with respect to the choice of optimization
heuristic.
Chapter 6
A Taxonomy of Networks
We now expand the scope of our analysis beyond financial markets and investigate
the community structure of networks from a wide variety of other fields. We have
submitted a paper based on this work for publication [P4].
6.1 Introduction
The study of networks is an interdisciplinary endeavour, with contributions from
researchers in the natural, social, and information sciences [9, 60, 217, 223]. Often
the questions asked by researchers in these different fields are quite similar; however,
techniques have sometimes had difficulty penetrating disciplinary boundaries, perhaps
because the relevance or applicability of the methods from other disciplines is not
always clear. In this chapter, we attempt to connect disciplines by introducing a
framework for clustering networks and using this technique to establish a network
taxonomy. The clustering scheme and taxonomy serve several purposes. First, if
networks from two different fields are close together in the taxonomy, this implies that
there are some similarities in their structure and that techniques from one field might
be insightful in the other field. Second, a taxonomy might help to determine how an
unstudied network might best be analyzed once its position within the taxonomy is
known. Third, the taxonomy can be used to highlight differences within disciplines.
For example, if a network that is expected to be similar to a group of networks based
on their origin is not clustered with those networks, this suggests that the network is
not typical of its family.
In aiming to cluster networks, it is necessary to consider a scale at which to
investigate their structure. Much prior research on networks has focused on micro-
scopic properties (such as the degree) or macroscopic properties (such as the average
path length), typically finding that most empirically observed networks have heavy-
tailed degree distributions and possess the small-world property (see Section 1.1 and
Refs. [9, 60, 217, 223]). The interpretations of both approaches often implicitly as-
sume that networks are homogeneous and have no mesoscopic structure; however,
as discussed in detail in Chapter 4 and demonstrated in Chapter 5, many networks
possess community structures. The ubiquity of heavy-tailed degree distributions and
the small-world property across a wide variety of networks from different domains
suggests that mesoscopic heterogeneities should be exploited to differentiate effec-
tively between networks. This has led some researchers to try to cluster networks
based on mesoscopic structures, e.g., [142, 210]. In Ref. [210], Milo et al. compare
the statistics of a priori specified subgraphs in networks from different fields, while
in Ref. [142] Guimerà et al. investigate the relationships between individual nodes
and communities (see Section 4.5 for a detailed discussion of these methods). Both
approaches focus on a single mesoscopic scale.
In this chapter, we introduce a framework for clustering networks by identifying
communities at multiple mesoscopic scales and create a taxonomy of networks using
this approach. In addition to the fact that mesoscopic heterogeneities enable us to
differentiate between networks, we focus on mesoscopic scales because structures at
this scale have been shown to have a strong bearing on functional units in many
networks [105, 244]. In contrast to Ref. [210], we do not specify the topology or sizes
of the structure that we investigate a priori. Networks from different domains can
possess very different sizes and connectivities, which makes comparison difficult [210].
The technique that we propose therefore involves a normalization that enables us
to compare networks of significantly different sizes and connectivities.1 Using this
approach, we create a taxonomy of 714 networks from a variety of disciplines including
sociology, biology, politics, technology, and finance, and include many synthetic model
and benchmark networks. As well as creating an aggregate taxonomy, we also create
taxonomies for subsets of networks that represent multiple realizations of the same
type of system and temporal snapshots of time-dependent systems. In the latter case,
we demonstrate that the framework we propose can detect changes in time-ordered
sequences of evolving networks.
1 We study networks that range in size from 34 to over 40,000 nodes and possess from 0.1% to 100% of possible edges.
6.2 | Multi-resolution community detection
2 In Chapter 5, we summed over all i and j in the Potts Hamiltonian (see Eq. 4.3). In contrast, in this chapter, we only sum over i ≠ j. We explain the reason for this difference in Section D.1.
where ε > 0 is a small number that ensures that all links are antiferromagnetic at
resolution λ = Λmax . In other words, Λmin is the largest value of the resolution
parameter λ for which the network still forms a single community. However, note
that this is not necessarily the minimum non-zero value of Λij . We do not simply
sweep over the interval [Λ∗ , Λmax ] because for some sparse networks η > 1 at Λ∗ ,
whereas for fully-connected networks η does not become greater than one until a
resolution λ ≫ Λ∗ . By sweeping λ from Λmin to Λmax , and exploring the full range
of partitions from η = 1 to η = N, we ensure that the profiles are comparable for
different networks.
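This sweep can be made concrete for an unweighted network. Assuming the standard Newman–Girvan null model Pij = ki kj/(2m), the interaction Jij = Aij − λPij on a link changes sign at λ = Λij = Aij/Pij. The following sketch (the toy graph and the helper name are our own) computes these thresholds:

```python
def link_thresholds(edges):
    """For each link, the resolution Lambda_ij = A_ij / P_ij at which the
    interaction J_ij = A_ij - lambda * P_ij turns antiferromagnetic,
    assuming an unweighted network and the Newman-Girvan null model
    P_ij = k_i * k_j / (2m)."""
    m = len(edges)
    deg = {}
    for i, j in edges:
        deg[i] = deg.get(i, 0) + 1
        deg[j] = deg.get(j, 0) + 1
    return {(i, j): 2.0 * m / (deg[i] * deg[j]) for i, j in edges}

# Toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
Lam = link_thresholds(edges)
Lam_star = min(Lam.values())  # smallest resolution at which any link can flip
```

In this toy example the bridge between the two triangles has the smallest threshold, so it is the first link to become antiferromagnetic as λ increases, consistent with the intuition that inter-community links flip before intra-community ones.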
plateaus is different for different networks; if one wishes to use plots of summary
statistics versus resolution to compare networks, the value of f ought to be constant.
Therefore, this seems like an unsuitable solution.
Figure 6.1: A demonstration using the Caltech Facebook network [295] of the prob-
lem of using the resolution parameter for networks with low-degree nodes that are
connected to each other. The left figure shows the cumulative distribution of Λij
values P (Λij ≤ x); the majority of Λij are less than 100, but there are a few nodes
with Λij > 8000. The right figure shows the number of communities as a function
of λ and demonstrates that the presence of low-strength nodes that are connected to
each other results in a long plateau. Only a few interactions then dominate the plot
and obscure the structure at smaller λ.
The effective number of antiferromagnetic interactions is defined as

ξ(λ) = [ℓA(λ) − ℓA(Λmin)] / [ℓA(Λmax) − ℓA(Λmin)],

where ℓA(λ) is the total number of antiferromagnetic interactions (Jij < 0) in the
system for the given value of λ and ℓA(Λmin) is the largest number of antiferromagnetic
interactions for which the network still forms a single community. The effective
number of antiferromagnetic interactions ξ(λ) is therefore the number of
antiferromagnetic interactions in excess of ℓA(Λmin) (normalized to the unit interval)
and is a monotonically increasing function of λ.
To simplify the discussion of the fraction of antiferromagnetic links, it is also
useful to divide the elements of the adjacency matrix A into links (Aij > 0) and non-
links (Aij = 0). Based on the values Λij , we further distinguish between two types
of links: links with 0 < Λij ≤ Λmin are called Λ− -links, and links with Λij > Λmin
are called Λ+ -links. The sum of the number of Λ− -links and Λ+ -links is then equal
to L, the number of links in the network.5 When λ = Λmin all of the Λ− -links are
antiferromagnetic, but the network nevertheless consists of a single community.
We illustrate the differences between the different types of links in Fig. 6.2 in which
we show examples of the cumulative distributions of links P (Λij ≤ x) for a fully-
connected, weighted network and for an unweighted network. For the unweighted
network, Λmin < Λ∗ , so the network does not possess any Λ− -links. As one increases
the resolution parameter, the network begins to break up into communities before
any of the Λ+ -links become antiferromagnetic. In contrast, for the fully-connected,
weighted network Λ∗ < Λmin .
In the definition of ξ that we select, we sweep over the values λ ∈ [Λmin , Λmax ], so that
the number of communities varies between 1 and N. Although the regime λ < Λmin
affects the energy H(λ) (see Eq. 6.1), there are no further changes in the partition into
communities and, consequently, only the region λ ∈ [Λmin, Λmax ] is interesting. The
normalization in our definition of ξ accounts for the existence of antiferromagnetic
Λ− -links, which do not cause the network to break up into communities and ensures
that 0 ≤ ξ ≤ 1. Note that ξ is equal to the fraction of antiferromagnetic Λ+ -links.
By working in terms of ξ rather than λ, we ensure that interactions that require
a large resolution to become antiferromagnetic do not dominate plots of community
summary statistics. The existence of such interactions also implies that we do not
necessarily sweep λ uniformly over the interval [Λmin, Λmax ]. As demonstrated in
Fig. 6.1, some networks have several orders of magnitude between Λmin and Λmax and
most interactions become antiferromagnetic at λ ≈ Λmin . To ensure that ξ is sampled
5 In an unweighted network L = m, the total link weight in the network.
6.3 | Mesoscopic response functions
Figure 6.2: The cumulative distribution P (Λij ≤ x) for: (left) the 100th U.S. House
of Representatives roll-call voting network [203,241,306], (right) the dolphins network
[192]. We show the proportions of Λ+ and Λ− links in the distributions and highlight
the position of Λ∗ , Λmin , and Λmax . For the House network, Λmin > Λ∗ , so there are
both Λ+ and Λ− links. For the dolphins network, Λmin < Λ∗ , so all links are Λ+ links
and the network begins to break up into communities at a resolution λ ≤ Λmin.
uniformly, we select the values of λ using the cumulative distribution of Λij for
Λ+-links, instead of simply sampling λ uniformly over the interval [Λmin, Λmax].6
In Appendix D we provide further technical details of the Potts Hamiltonian and
of the structure of the networks that we study.
by the resolution λ, and Λmax depends strongly on the link structure of the network.
As a result, for it to be possible to compare networks using profiles of the summary
statistics versus ξ, we need to normalize H, S, and η. Therefore, we define an effective
energy as
Heff(λ) = [H(λ) − Hmin] / [Hmax − Hmin] = 1 − H(λ)/Hmin,    (6.6)
where Hmin = H(Λmin) and Hmax = H(Λmax). Similarly, we define an effective entropy
Seff(λ) = [S(λ) − Smin]/[Smax − Smin], where Smin = S(Λmin) and Smax = S(Λmax),
and an effective number of communities ηeff(λ) = [η(λ) − ηmin]/[ηmax − ηmin],
where ηmin = η(Λmin) and ηmax = η(Λmax). In sweeping ξ from 0 to 1, the number
of communities increases from η(ξ = 0) = 1 to η(ξ = 1) = N, producing a signature
that we call the mesoscopic response function (MRF). Because Heff ∈ [0, 1], Seff ∈
[0, 1], ηeff ∈ [0, 1], and ξ ∈ [0, 1] for any network, we can compare the response
functions across networks and use the MRFs to identify groups of networks with
similar mesoscopic structures.
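As a sketch, the normalization of all three summary statistics onto [0, 1] is a single transform (the toy sweep values below are our own, not taken from any of the networks studied):

```python
import math

def mrf(values, vmin, vmax):
    """Normalize a summary statistic onto [0, 1]:
    X_eff = (X - X_min) / (X_max - X_min), as in Eq. 6.6."""
    return [(v - vmin) / (vmax - vmin) for v in values]

# Toy sweep for a network with N = 4 nodes: H runs from H_min up to
# H_max = 0, eta from 1 to N, and S from 0 up to log N (all singletons).
N = 4
H = [-6.0, -3.0, -1.0, 0.0]
S = [0.0,                                   # one community: zero entropy
     math.log(2),                           # sizes {2, 2}
     -(0.5 * math.log(0.5) + 0.5 * math.log(0.25)),  # sizes {2, 1, 1}
     math.log(N)]                           # all singletons
eta = [1, 2, 3, 4]

H_eff = mrf(H, H[0], H[-1])        # equals 1 - H/H_min because H_max = 0
S_eff = mrf(S, 0.0, math.log(N))   # equals S / log N
eta_eff = mrf(eta, 1, N)
```

All three effective quantities then run from 0 to 1 regardless of the network's size or density, which is what makes the response functions comparable across networks.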
For a given network, at each resolution, Heff , Seff , and ηeff respectively provide
a measure of the frustration level of the spin system, the disorder in the associated
community size distribution (whether most nodes are in a few large communities or
are spread across many small communities), and the number of communities. The
MRFs indicate the way in which these quantities change as the resolution parameter
is increased – at higher resolutions there is a larger incentive for nodes to belong to
smaller communities, so communities fragment. The shapes of the MRFs (gradient,
concavity/convexity, points of inflection etc.) are the non-trivial result of many fac-
tors, including the fraction of possible edges in the network; the relative weights of
inter- versus intra-community edges; the edge weights compared with the expected
edge weights in the random null model; the number of edges that need to become
antiferromagnetic for a community to fragment; and the way in which the commu-
nities fragment (e.g., whether a single node leaves a community if an edge becomes
antiferromagnetic or a community splits in half). The effects of some of these factors
on the shapes of the MRFs can be better understood by considering some examples.
[Figure panels: the Zachary karate club network at ξ = 0 (η = 1), ξ = 0.2 (η = 8), ξ = 0.4 (η = 12), ξ = 0.6 (η = 17), ξ = 0.8 (η = 24), and ξ = 1 (η = 34), with the Heff, Seff, and ηeff MRFs plotted versus ξ.]
Figure 6.3: The upper half of the figure shows the Zachary karate club network
[315] for different values of the effective fraction of antiferromagnetic links ξ. The
edges are coloured blue if the corresponding interaction is ferromagnetic or red if the
interaction is antiferromagnetic. The nodes are coloured according to their community
membership. In the lower half of the figure, we show the Heff , Seff , and ηeff MRFs
and the interaction matrix J for different values of ξ. We have coloured elements of
the interaction matrix corresponding to non-links white and elements corresponding
to ferromagnetic and antiferromagnetic links blue and red, respectively.
The Biogrid D. melanogaster [280] and the Garfield Scientometrics citation [119]
MRFs demonstrate this effect [Fig. 6.4(e) and Fig. 6.4(m), respectively].
The MRFs for the voting network of the U.K. House of Commons over the period
2001–2005 [104] [Fig. 6.4(g)] and the roll-call voting network for the 108th U.S. House
of Representatives (2003–2004) [203, 241, 306] [Fig. 6.4(q)] also reveal that sharp in-
creases in Heff can be accompanied by small changes in ηeff and Seff . This observation
can also be explained by considering the distribution of Λij . If the Λij distribution
is multi-modal, there can be a large difference between consecutive Λij values. A
large increase in λ is then needed to increase ξ, which leads to a large change in Heff .
However, because this only results in a single additional antiferromagnetic interac-
tion, the change in ηeff is small. We discuss the distribution of Λij for U.S. House of
Representative roll-call voting networks in more detail in Section 6.9.1.3.
Figure 6.4: Example MRFs. The curves show Heff (red), Seff (blue), and ηeff (green)
as a function of the effective fraction of antiferromagnetic links ξ for the following
networks: (a) DIP: C. elegans [262,310] (b) New York Stock Exchange (NYSE): 1980–
1999 [229] (c) STRING: C. elegans [155] (d) Barabási-Albert (BA): (500,1) [26] (e)
Biogrid: D. melanogaster [280] (f) Human brain cortex: participant C [144] (g) U.K.
House of Commons Voting: 2001–2005 [104] (h) Dolphins [192] (i) Erdős-Rényi (ER):
(500,75) [86] (j) LF benchmark: (1000,15,50,05,01,2,2,2) [180] (k) Fractal: (10,2,8)
[279] (l) Fungal: (17,8) [33, 113, 114, 288] (m) Garfield: Scientometrics citations [119]
(n) Zachary karate club [315] (o) Metabolic: DR [156] (p) U.S. airlines [29, 77] (q)
Roll call: U.S. House 108 [203, 241, 306] (r) Watts-Strogatz (WS): (100,4,10) [305].
See Table C.1 for more details on the networks.
Figure 6.5 shows that for a fixed number of nodes N, as the fraction of possible
edges fe is increased, the maximum value of ηeff at ξ = 0 decreases. This can be
explained by considering the number of non-links for each network (i.e., the number
of elements of the adjacency matrix for which Aij = 0). For any resolution λ > 0,
the interaction strength Jij between pairs of spins joined by non-links is less than
zero, so the spins seek to align in different spin-states (join different communities).
For some networks, as the resolution λ is increased, the negative interaction strength
between nodes joined by non-links can become so strong that the network breaks up
into communities before any of the links (Aij > 0) become antiferromagnetic (i.e., for
some networks η(ξ = 0) ≥ 2). This effect explains the different levels of ηeff (ξ = 0)
for different values of fe . For a set of ER networks with the same number of nodes
N, networks with smaller fractions of possible edges fe possess more non-links than
networks with higher fe ; this results in more negative elements in the interaction
matrix J for λ > 0, which in turn causes networks with lower fe to break up into
more communities at ξ = 0 than networks with higher fe . Hence, ηeff (ξ = 0) reaches
a higher level for networks with smaller fractions of possible edges. For example, in
Fig. 6.5, for networks with N = 50 nodes, when fe = 0.25 the maximum value of
η(ξ = 0) is 15, and when fe = 0.75 the maximum value is 4; similarly, for networks
with N = 1000 nodes, when fe = 0.25 the maximum value of η(ξ = 0) is 176, and
when fe = 0.75 the maximum value is 48.
The number of communities at ξ = 0 also affects the energy and entropy MRFs:
for larger values of fe , because there are fewer communities at ξ = 0, the Seff and Heff
MRFs reach lower levels.
8 Strictly, we generate ER networks with different probabilities for connecting each pair of nodes, but this probability is generally equal (or almost equal) to the fraction of possible edges present.
Figure 6.5: A comparison of the Heff , Seff , and ηeff MRFs for Erdős-Rényi networks
with different numbers of nodes N and fractions of possible edges fe .
Figure 6.5 also shows that as the number of nodes increases, for the same fraction of
possible edges, the maximum value of ηeff at ξ = 0 decreases. For larger values of N,
the network breaks up into more communities at ξ = 0 than for smaller N; however,
the number of communities as a fraction of the number of nodes in the network is
lower, hence ηeff (ξ = 0) is lower. For example, for networks with fe = 0.25, when
N = 50 the maximum value of η(ξ = 0) is 15, which corresponds to η/N = 0.3, and
when N = 1000 the maximum value is 176, which corresponds to η/N = 0.18.
In contrast to the ηeff MRFs, for larger values of N at the same value of fe, Seff
reaches a higher value at ξ = 0. The entropy can be considered as the uncertainty in
the community membership of a particular node. For a larger number of communities
there is generally greater uncertainty in the community membership of a randomly
chosen node, which results in a higher entropy and is consistent with the observation
that Seff (ξ = 0) is higher for larger N. More precisely, because we normalize the
entropy using the transformation Seff = (S − Smin)/(Smax − Smin) = S/log N (see
Eq. 6.7), the higher value of Seff at ξ = 0 for higher N indicates that there is
greater uncertainty in the community membership of a node relative to the maximum
possible uncertainty (which occurs when all nodes are in singleton communities and
is given by Smax = log N).
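Taking S to be the Shannon entropy of the community-size distribution (consistent with the statements that S measures the uncertainty in the community membership of a randomly chosen node and that Smax = log N for singleton communities), a small sketch with our own helper name:

```python
import math

def partition_entropy(sizes):
    """Shannon entropy of the community-size distribution: the uncertainty
    in the community membership of a randomly chosen node,
    S = -sum_c (n_c / N) * log(n_c / N)."""
    N = sum(sizes)
    return -sum(n / N * math.log(n / N) for n in sizes)

N = 8
S_single = partition_entropy([N])        # one community: no uncertainty
S_max = partition_entropy([1] * N)       # all singletons: S_max = log N
S_eff = partition_entropy([4, 2, 1, 1]) / math.log(N)  # normalized (cf. Eq. 6.7)
```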
Finally, Fig. 6.5 shows that for larger N the maximum value of Heff at ξ = 0
is higher. This is again explained in part by the fact that networks with larger
numbers of nodes fragment into more communities at ξ = 0 than smaller net-
works. Recall from Eq. 4.3 that the Hamiltonian of the Potts spin glass is given
P P
by H = i6=j Jij δ(Ci , Cj ) = − i6=j (Aij − λPij ) δ(Ci , Cj ). The Kronecker delta
δ(Ci , Cj ) means that the interaction energies Jij = (Aij − λPij ) are only summed
over nodes that belong to the same community. Therefore, as the networks fragment,
the summation includes fewer terms so H becomes progressively less negative and
Heff increases. The higher values of Heff at ξ = 0 for networks with more nodes can be
further explained by considering the distribution of Λij for the different networks. In
Fig. 6.6, we show that for larger N the distribution of Λij values is concentrated in a
sharper peak, which means that the resolution λ only needs to be swept over a small
range of values for all of the links to become antiferromagnetic. The small range of λ
also means that there is a relatively small change in energy as the network breaks up
into N communities, which explains the small difference in energy that we observe
between ξ = 0 and ξ = 1 in the Heff MRFs for larger networks in Fig. 6.5.
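This concentration effect can be reproduced with a quick numerical experiment (a sketch with our own helper; exact numbers depend on the random seed): for an unweighted ER network with the Newman–Girvan null model, Λij = 2m/(ki kj) on each link, and the ratio between the largest and smallest thresholds shrinks as N grows because the degrees concentrate around their mean.

```python
import random

def er_lambda_spread(N, p, seed=0):
    """Sample an Erdos-Renyi G(N, p) network and return the smallest and
    largest link thresholds Lambda_ij = 2m / (k_i * k_j) (unweighted
    network, Newman-Girvan null model)."""
    rng = random.Random(seed)
    edges = [(i, j) for i in range(N) for j in range(i + 1, N)
             if rng.random() < p]
    deg = [0] * N
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    m = len(edges)
    lam = [2.0 * m / (deg[i] * deg[j]) for i, j in edges]
    return min(lam), max(lam)

lo_small, hi_small = er_lambda_spread(50, 0.75)
lo_large, hi_large = er_lambda_spread(1000, 0.75)
spread_small = hi_small / lo_small  # relative spread of thresholds, N = 50
spread_large = hi_large / lo_large  # much tighter for N = 1000
```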
[Figure panels include ER: (50,75), ER: (100,75), ER: (500,75), and ER: (1000,75); the axes show normalized frequency versus Λij.]
Figure 6.6: A comparison of the distribution of Λij values for Erdős-Rényi networks
with different numbers of nodes N and fractions of possible edges fe .
2. We examine the case in which each increase in η results from the largest com-
munity splitting in half. For example, at η = 2 we assume that each community
contains N/2 nodes; at η = 3, we assume that there are two communities con-
taining N/4 nodes and a third community containing N/2 nodes; at η = 4, we
assume that each community contains N/4 nodes; and so on.10
To plot the MRFs we assume that the ξ are uniformly distributed over the interval
[0, 1], such that the ith value is given by

ξ(i) = (i − 1)/(N − 1),    (6.11)
where i = 1, . . . , N. We show in Section 6.3.1 that the number of communities
increases to η > 1 at ξ = 0 for many networks, so for each splitting regime we
examine two behaviours for the MRFs at ξ = 0: one in which η only begins to increase
for ξ > 0, and another in which η initially increases at ξ = 0 without an increase in
the effective number of antiferromagnetic interactions. We create MRFs that represent
the second type of behaviour by setting the first ι elements of the ξ vector to zero;
increasing ι results in the MRFs reaching higher values at ξ = 0.
In Fig. 6.7, we show synthetic MRFs for networks with N = 500 nodes and
ι = 20.11 For all of the curves where we assume that each increase in η results
from a single node leaving the largest community, the Seff MRF closely tracks the ηeff
MRF. For each example in which increases in η result from the largest community
splitting in half, the entropy increases faster than in the equivalent MRF for single
nodes splitting from the largest community. This is because in the former case there
is greater uncertainty in the community membership of individual nodes. Figure 6.7
also demonstrates that for the fragmentation mechanism in which communities split in
10 When splitting the kth community into two, if nk/2 is not an integer (where nk is the number of nodes in the kth community), we assume that one of the communities contains ⌊nk/2⌋ nodes and that the other community contains ⌈nk/2⌉ nodes. If two communities contain the same number of nodes, we choose one at random to split – this choice has no effect on the resulting MRF.
11 We have also investigated networks with different numbers of nodes N and observe similar differences in the MRFs for different sized networks to those described in Section 6.3.2 for ER networks.
half, the MRFs have very different shapes for the different assumptions. For example,
there is a plateau in some, but not all, of the Seff MRFs and there is a large variation
in the amount by which the Seff MRFs increase at ξ = 0.
This is just a simple demonstration of how different fragmentation processes lead
to different shaped MRFs. For real-world networks, the community splitting mech-
anism is likely to be somewhere between these two extreme cases, with single nodes
leaving communities for some changes in ξ and communities splitting more equally
at other values.
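The two extreme fragmentation mechanisms can be simulated directly (a sketch; the helper names are our own). Tracking the entropy of the community-size sequence reproduces the qualitative difference described above: splitting the largest community in half raises the entropy much faster than peeling off single nodes.

```python
import math

def entropy(sizes):
    """Shannon entropy of a community-size distribution."""
    N = sum(sizes)
    return -sum(n / N * math.log(n / N) for n in sizes)

def split_half(sizes):
    """The largest community splits into two (near-)halves."""
    s = sorted(sizes, reverse=True)
    big = s.pop(0)
    return s + [big // 2, big - big // 2]

def leave_one(sizes):
    """A single node leaves the largest community."""
    s = sorted(sizes, reverse=True)
    s[0] -= 1
    return s + [1]

N = 500
halving, peeling = [N], [N]
S_halving, S_peeling = [], []
for _ in range(N - 1):  # eta grows from 2 up to N
    halving, peeling = split_half(halving), leave_one(peeling)
    S_halving.append(entropy(halving))
    S_peeling.append(entropy(peeling))
```

At η = 2 the halving mechanism already yields the maximal two-community entropy log 2, whereas a {N − 1, 1} split contributes almost nothing; both sequences meet again at S = log N once every node is a singleton.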
Figure 6.7: Synthetic MRFs for ηeff and Seff . We assume that the ηeff response
functions are either linear, quadratic, or cubic. We also consider that: (A) Each
increase in the number of communities η results from a single node leaving the largest
community. (B) Again, each increase in η results from a single node leaving the
largest community, but we make the additional assumption that, as the resolution is
increased, η initially increases without there being an increase in the effective number
of antiferromagnetic interactions ξ. (C) Each increase in the number of communities η
results from the largest community splitting in half. (D) Each increase in the number
of communities η results from the largest community splitting in half and there is
an initial increase in η without an increase in ξ. We assume that each network has
N = 500 nodes.
Similarly, for the effective entropy and effective number of communities, the distances
are given by

dSij = ∫₀¹ |Sieff − Sjeff| dξ,    (6.13)

and

dηij = ∫₀¹ |ηieff − ηjeff| dξ.    (6.14)
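In practice these integrals can be approximated from the sampled MRFs, for example by the trapezoidal rule (a sketch with toy curves; the thesis does not specify the quadrature scheme used):

```python
def mrf_distance(f, g, xi):
    """Trapezoidal approximation to the distance between two MRFs
    sampled on a common grid xi:  d = integral_0^1 |f - g| dxi."""
    diff = [abs(a - b) for a, b in zip(f, g)]
    return sum(0.5 * (diff[k] + diff[k + 1]) * (xi[k + 1] - xi[k])
               for k in range(len(xi) - 1))

n = 101
xi = [k / (n - 1) for k in range(n)]
f = list(xi)              # a linear toy MRF
g = [x * x for x in xi]   # a convex toy MRF; integral of |x - x^2| is 1/6
d = mrf_distance(f, g, xi)
```

The measure is symmetric and vanishes for identical MRFs, as a distance should.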
6.4 | Distance measures
Figure 6.8: A comparison of the MRFs for the NYSE (1984–1987) [229] and fungal
(4,8) networks. For each figure, the shaded area between the curves equals the
distance between the networks for the corresponding measure. In this example, the
distances are: dH ≐ 0.1594, dS ≐ 0.5496, and dη ≐ 0.4524.
distances dHij, dSij, and dηij. From these plots, it is clear that the most significant
correlation is between dSij and dηij, and the linear correlation between these two
measures is only r(dSij, dηij) ≐ 0.58. The correlations between the other pairs of
distances are r(dHij, dSij) ≐ 0.36 and r(dHij, dηij) ≐ 0.24. None of these correlations
is sufficiently high to justify excluding one of the summary statistics, so we use all
three.
Figure 6.9: Scatter density plots showing the correlation between the distance
measures dHij, dSij, and dηij. The linear correlations r between the distances are:
r(dHij, dSij) ≐ 0.36, r(dSij, dηij) ≐ 0.58, and r(dHij, dηij) ≐ 0.24.
to the unit interval. The PC coefficients are wH ≐ 0.24, wS ≐ 0.79, and wη ≐ 0.57.
The first component accounts for 69% of the variance of the system, so the distances
Dp provide a reasonable single-variable representation of the distances DH, DS, and
Dη.
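A sketch of how such a PCA distance can be obtained from the three distances (the triples are toy values and the power-iteration helper is our own; the thesis does not describe its numerical procedure):

```python
def leading_pc(rows):
    """Leading principal component of a set of 3-vectors, via power
    iteration on the 3x3 covariance matrix."""
    n = len(rows)
    mean = [sum(r[k] for r in rows) / n for k in range(3)]
    X = [[r[k] - mean[k] for k in range(3)] for r in rows]
    C = [[sum(x[a] * x[b] for x in X) / n for b in range(3)]
         for a in range(3)]
    w = [1.0, 1.0, 1.0]
    for _ in range(200):
        w = [sum(C[a][b] * w[b] for b in range(3)) for a in range(3)]
        norm = sum(v * v for v in w) ** 0.5
        w = [v / norm for v in w]
    return w

# Toy (d^H, d^S, d^eta) triples for five pairs of networks.
rows = [(0.10, 0.5, 0.4), (0.20, 0.6, 0.5), (0.15, 0.1, 0.1),
        (0.30, 0.7, 0.6), (0.05, 0.2, 0.2)]
w = leading_pc(rows)
dp = [sum(wk * rk for wk, rk in zip(w, r)) for r in rows]  # combined distances
```

Here the component with the largest variance (the entropy distance in these toy numbers) receives the largest weight, mirroring the dominant wS coefficient reported above.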
which weights each matrix element with its distance to the diagonal (X ∈ {dH , dS , dη , dp }).
Figure 6.11 also suggests that the distance dSij better separates the networks than the
distances dHij and dηij, and it appears from this figure that dSij might separate the
networks better than the PCA distance dpij. However, we demonstrate in Section 6.5.4
that this is not the case: the PCA distance dpij provides the best measure that we
have investigated for separating the networks into categories.
6.5 | Clustering networks
Figure 6.10: Distributions of the distances dHij, dSij, dηij, and dpij.
The block structure of the distance matrices in Fig. 6.11 suggests the existence
of different clusters of networks. However, for these clusters to be meaningful it is
important that the distance measures we define for comparing networks are robust
to small perturbations in network structure. Because many networks are obtained
empirically, it is expected that the network data will contain false positive and nega-
tive links; that is, there will be links erroneously included in the network that do not
exist, and links that do actually exist will be erroneously omitted from the network.
To test the robustness of our distance measures with respect to such false posi-
tive and negatives, we recalculate the MRFs for a subset of unweighted networks in
which some percentage of the links have been rewired. We investigate two rewiring
mechanisms: one in which the degree distribution and connectivity of each network
is maintained and another in which only the connectivity is maintained. We provide
details of the analysis of the sensitivity of the distance measures to false positive and
negative links in Appendix E. We find in both cases that the block structure in the
distance matrices is robust to random perturbations of the networks. This implies
that the MRF distance measures we define in Section 6.4 are robust and can be used
to identify networks with similar mesoscopic structures across multiple scales.
[Figure 6.11: the distance matrices DH, DS, Dη, and Dp.]
Table 6.1: Network categories and the number of networks assigned to each category.
See Table C.1 for the identities of the networks.
tij = dave(C, C′) = (1/(|C||C′|)) Σi∈C Σj∈C′ dpij,    (6.17)
The cophenetic correlation, which measures how well each dendrogram preserves the
pairwise distances between the networks, is defined as [276]

ζ = Σi<j (dpij − d̄p)(tij − t̄) / √[ Σi<j (dpij − d̄p)² Σi<j (tij − t̄)² ],    (6.19)

where d̄p is the mean of the distances dpij and t̄ is the mean value of the tij.
The cophenetic correlations for the different linkage clustering algorithms are:
ζsing ≐ 0.65, ζave ≐ 0.78, and ζcomp ≐ 0.62. This implies that dendrograms
constructed using average linkage clustering preserve the distances in Dp better than
those constructed using the other clustering techniques, so we use average linkage
clustering to construct all dendrograms in the remainder of this chapter.
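Since ζ in Eq. 6.19 is a Pearson correlation between the original distances dpij and the cophenetic distances tij, it can be computed directly once the merge heights are known. A toy sketch (four hypothetical networks, with the average-linkage cophenetic distances worked out by hand):

```python
def pearson(x, y):
    """Pearson correlation, used here as the cophenetic correlation zeta."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# PCA distances d^p_ij for four hypothetical networks, in pair order
# (1,2), (1,3), (1,4), (2,3), (2,4), (3,4):
d = [0.1, 0.8, 0.9, 0.9, 1.0, 0.2]
# Average linkage merges {1,2} at 0.1 and {3,4} at 0.2, then joins the two
# clusters at mean(0.8, 0.9, 0.9, 1.0) = 0.9, giving cophenetic distances:
t = [0.1, 0.9, 0.9, 0.9, 0.9, 0.2]
zeta = pearson(d, t)
```

Because this dendrogram reproduces the pairwise distances almost exactly, ζ is close to 1 in this example.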
[Figure legend (network categories): Synthetic; Social; Facebook; Political: voting; Political: co-sponsorship; Political: committee; Protein interaction; Metabolic; Brain; Fungal; Financial; Language; Collaboration; WWW; Electronic circuit; Citation; Trade; Other.]
Figure 6.12: Dendrogram for the 270 networks constructed using the distance DH
and average linkage clustering. We order the leaves of the dendrogram to minimize the
distance between adjacent nodes and colour the leaves to indicate the type of network.
The vertical scale in the dendrogram is set to the interval [0, 0.55] to facilitate a visual
comparison with the DS and Dη dendrograms in Figs. 6.13 and 6.14. The distance
dHave at which clusters combine is given by Eq. 6.17.
Figure 6.13: Dendrogram for the 270 networks constructed using the distance DS and
average linkage clustering. We order the leaves of the dendrogram to minimize the
distance between adjacent nodes and colour the leaves to indicate the type of network.
The vertical scale in the dendrogram is set to the interval [0, 0.55] to facilitate a visual
comparison with the DH and Dη dendrograms in Figs. 6.12 and 6.14. The distance
dSave at which clusters combine is given by Eq. 6.17.
Figure 6.14: Dendrogram for the 270 networks constructed using the distance Dη and
average linkage clustering. We order the leaves of the dendrogram to minimize the
distance between adjacent nodes and colour the leaves to indicate the type of network.
The vertical scale in the dendrogram is set to the interval [0, 0.55] to facilitate a visual
comparison with the DH and DS dendrograms in Figs. 6.12 and 6.13. The distance
dηave at which clusters combine is given by Eq. 6.17.
A visual comparison is a good starting point for assessing the effectiveness of differ-
ent distance measures at clustering networks, but it is a subjective assessment. We
therefore introduce a metric to quantify how effectively each distance measure clusters
networks of the same type.
The assignment of networks to family categories is also subjective and some of
the categories include networks of very different types (see Section 6.5.1), so it is
inappropriate to assess the effectiveness of a distance measure based on how well it
clusters all types of networks. Instead, we focus only on groups of networks that are
clustered together in any one of the dendrograms in Figs. 6.12–6.15. This includes
the following 8 categories: Facebook, metabolic, political co-sponsorship, political
committee, political voting, financial, brain, and fungal networks.
For each distance measure, we construct a dendrogram and for each level of the
dendrogram we calculate the maximum fraction of networks of a particular type that
appear in the same cluster. That is, for a particular level of the dendrogram, we
take a network category and find all clusters that contain at least one network from
that category. We then calculate the fraction of networks from that category in each
of the identified clusters and find the cluster that contains the maximum fraction of
networks of that type. We repeat this calculation for each network category and sum
the maximum fraction over all categories. We perform similar calculations for each
level of the dendrogram and use the value of the summation at each level to assess
the effectiveness of the different distance measures at clustering the networks. For
example, at the root of the dendrogram all of the networks are in a single cluster,
so for every type of network the maximum fraction of networks in the same cluster
is 1 and the value of the metric is 8 (the number of categories). However, as one
moves to lower levels of the dendrogram (i.e., towards the leaves of the dendrogram)
the clusters break up, so the maximum fraction of networks of each type in the same
cluster decreases. If one compares the same level for dendrograms constructed using
different distance measures, the sum of the maximum fraction of networks of each
type in the same cluster will be larger for the more effective distance measure.
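The metric described above can be sketched as follows (toy category labels and cluster assignments; the function name is our own):

```python
from collections import Counter

def clustering_score(categories, clusters):
    """Sum over categories of the largest fraction of that category's
    members sharing a single cluster; equals the number of categories
    when every category sits wholly inside one cluster."""
    score = 0.0
    for cat in set(categories):
        members = [cl for c, cl in zip(categories, clusters) if c == cat]
        score += Counter(members).most_common(1)[0][1] / len(members)
    return score

cats = ["Facebook", "Facebook", "Facebook", "metabolic", "metabolic"]
perfect = [0, 0, 0, 1, 1]  # each category wholly inside one cluster
strayed = [0, 0, 1, 1, 1]  # one Facebook network ends up elsewhere
```

With two categories, the score is 2 for the perfect assignment and 2/3 + 1 when one of the three Facebook networks strays into another cluster.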
[Figure 6.16: aggregate fraction of networks clustered together versus dendrogram level for DH, DS, Dη, and Dp.]
In Fig. 6.16, we compare the total fraction of networks clustered together at each
level of the dendrogram for DH , DS , Dη , and Dp . Over most of the dendrogram levels,
the PCA-distance Dp is the most effective at clustering networks of the same type,
which agrees with our visual assessment of the different distances in Section 6.5.4.1.13
We therefore focus on PCA-distance dendrograms for the remainder of this chapter.
Figure 6.17: Dendrogram for the 714 networks constructed using the distance Dp and average linkage clustering. We order
the leaves of the dendrogram to minimize the distance between adjacent nodes and colour the leaves to indicate the type of
network.
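The construction described in the caption — average-linkage clustering followed by leaf ordering that minimizes the distance between adjacent leaves — can be sketched with SciPy (the five-network distance matrix below is invented for illustration):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, optimal_leaf_ordering, leaves_list
from scipy.spatial.distance import squareform

# Hypothetical symmetric distance matrix for five networks: 0 and 1 are
# close to each other, as are 2, 3, and 4.
Dp = np.array([[0.00, 0.10, 0.80, 0.90, 0.85],
               [0.10, 0.00, 0.75, 0.95, 0.90],
               [0.80, 0.75, 0.00, 0.20, 0.15],
               [0.90, 0.95, 0.20, 0.00, 0.10],
               [0.85, 0.90, 0.15, 0.10, 0.00]])

y = squareform(Dp)                 # condensed distance vector
Z = linkage(y, method='average')   # average-linkage dendrogram
Z = optimal_leaf_ordering(Z, y)    # minimize distances between adjacent leaves
print(leaves_list(Z))              # leaf order: a permutation of 0..4
```

Because leaves of a merged cluster remain contiguous in any dendrogram ordering, networks 0 and 1 end up adjacent regardless of the optimal ordering chosen.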
United Nations General Assembly. The clustering of these voting networks suggests
that there are some common features in the MRFs that represent universal properties
of the different parliaments and legislatures. We also obtain single blocks that consist
of all political committee networks and all metabolic networks.
The political cosponsorship networks are split into two blocks that are separated by
a cluster of financial and trade networks. The leftmost block of political cosponsorship
networks consists of all of the House of Representatives cosponsorship networks, and
the rightmost cluster consists of all of the Senate cosponsorship networks. There are
also several categories for which all except one or two networks are clustered together
in a contiguous block. For example, all but two of the fungal networks appear in the
same block and all but one of the Facebook networks are clustered together. The
isolated Facebook network is the Caltech network (which is the smallest network of
this type), and it appears in a cluster next to the cluster containing all of the other
Facebook networks.
There are other categories of network that do not appear in near-contiguous blocks.
For example, protein interaction networks appear in several clusters. These networks
represent interactions within several different organisms, so one might not expect
all of them to be clustered together. However, there are also examples of protein
interaction networks for the same organism in which the interactions were identified
using different experimental techniques that are not clustered together. This supports
previous work that suggests that the properties of protein interaction networks are
extremely sensitive to the experimental procedure used to identify the interactions,
e.g., [145, 313].
Social networks are also distributed throughout the dendrogram. This is unsur-
prising given the extremely broad nature of the category: the social network category
includes networks of very different sizes with links representing a rather diverse range
of social interactions (see Table C.1 for details of the different networks). The leftmost
outlying social network is the network of Marvel comic book characters, which is ar-
guably not a typical social network. Finally, the synthetic networks appear in clusters
in different regions of the dendrogram, which is again unsurprising given that many
of these graphs were developed to model systems with very different characteristics.
It is also worth highlighting the initial split of the dendrogram into two clusters.
One of the clusters contains only three networks, whereas the other cluster contains
all of the other networks. This suggests that the three networks in the smaller cluster
possess unusual mesoscopic structures. The three networks are the NCAA football
schedule network [61] and two fractal networks [279]. The key feature that distinguishes these networks from most of the other networks is their degree distributions.
For each of the networks in the small cluster all of the nodes have one of a limited
number of possible degrees; for example, in the NCAA football networks, all but one
of the nodes have a degree of 12 or 13. Such degree distributions result in MRFs that
contain several plateaus (see Section 6.3.1) and lead to large distances between these
networks and the other networks in the taxonomy.
The observations in this section demonstrate that the MRF framework is able to
cluster categories of networks that are known to have similar structures, which verifies
the effectiveness of this technique. However, sometimes networks that one might
expect to be similar are not clustered together. These outliers might correspond to
anomalous members of a class of networks, so understanding the differences in their
structure is potentially insightful.
to any of the other categories. This observation is unsurprising given the tree-like
structure of fungal networks [33]. The other categories of networks within this cluster
appear to be more closely related. For example, protein interaction networks cluster
with collaboration networks, Facebook networks cluster with language networks, and
metabolic networks cluster with social networks.
It is tempting to speculate on the reasons for these similarities, but the taxonomy
needs to be treated with some caution because of the differences in the intra-class
MRFs highlighted in Fig. 6.18. Nonetheless, Fig. 6.19 does suggest that networks
that are clustered together have some similarities in their mesoscopic structures; con-
sequently, a detailed comparison of the properties of networks from related categories
might help to identify common structures that support the functions of the different types of networks. In addition, the taxonomy in Fig. 6.19 might also help to identify network analysis techniques that could fruitfully be applied to a particular network. For example, if biologists have developed techniques that provide insights into protein interaction networks, the same techniques might successfully be applied to collaboration networks.
Figure 6.18: MRFs for all of the network categories containing 8 or more networks
(see Table 6.1). At each value of ξ, the upper curve shows the maximum value of
Heff , Seff , or ηeff for all networks in the category, the dashed curve shows the mean,
and the lower curve shows the minimum value.
[Figure 6.19: dendrogram leaves, from left to right: Fungal, Protein interaction, Collaboration, Language, Metabolic, Social, Political: committee, Brain, Financial, Political: voting, Political: co-sponsorship; horizontal axis: dpave, from 0.05 to 0.55.]
Figure 6.19: Taxonomy of network categories based on the average intra-class MRFs.
The dendrogram is constructed using the distance Dp and average linkage clustering.
We order the leaves of the dendrogram to minimize the distance between adjacent
nodes.
[Figure 6.20: leaf colour-bar legends. Network categories as in Fig. 6.17; synthetic network types: ER, WS, BA, fractal, random fully-connected, weighted, LFR benchmark, LF benchmark, H13-4 benchmark, LF-NG benchmark, non-synthetic.]
Figure 6.20: Dendrogram for the 270 networks constructed using the distance dpij and
average linkage clustering. We order the leaves of the dendrogram to minimize the
distance between adjacent nodes. The upper colour bar indicates the type of network
and the lower colour bar shows all non-synthetic networks in grey and highlights the
position in the dendrogram of different types of synthetic network (see Appendix C
for a description of the networks).
6.7 | Clustering networks using other properties
Figure 6.21: Dendrogram for the 270 networks constructed using the distance dpij and
average linkage clustering. We order the leaves of the dendrogram to minimize the
distance between adjacent nodes. The three colour bars below the dendrogram indi-
cate (top): whether the network corresponding to each leaf is weighted or unweighted,
(middle) the number of nodes in the networks, and (bottom) the fraction of possible
edges that are present. These colour bars clearly demonstrate that although some
of the clustering in the dendrogram is attributable to these simple properties, they
cannot explain much of the observed structure.
(or nearly all) possible edges.17 Again, however, this property alone cannot explain
the observed clusters, as several of the weighted networks that possess nearly all
possible links do not appear in the leftmost cluster of the dendrogram. In fact,
there are many clusters in the dendrogram that contain networks with very different
fractions of possible edges. The third property we consider, the number of nodes,
again explains some of the clustering as networks with similar numbers of nodes
are clustered together in some regions of the dendrogram; however, there are also
numerous examples in which networks with the same number of nodes appear in
different clusters. Therefore, none of these simple network metrics can explain the
observed clustering.
We also cluster the networks using the distance between their strength distributions,

d^d_ij = sup_x |P_i(x) − P_j(x)| ,

where P_i(x) denotes the empirical cumulative distribution of node strengths in network i, sup denotes the supremum, and we represent these distances in matrix form as Dd.
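A minimal sketch of this supremum distance, assuming it is the Kolmogorov–Smirnov statistic between the empirical cumulative strength distributions of two networks (the function name and the strength sequences are illustrative, not from the thesis):

```python
import numpy as np

def strength_distance(s_i, s_j):
    """sup_x |F_i(x) - F_j(x)| between the empirical cumulative
    distributions of two networks' node strengths (a hypothetical
    reading of the strength-distribution distance)."""
    xs = np.union1d(s_i, s_j)   # the supremum is attained at a jump point
    F_i = np.searchsorted(np.sort(s_i), xs, side='right') / len(s_i)
    F_j = np.searchsorted(np.sort(s_j), xs, side='right') / len(s_j)
    return float(np.max(np.abs(F_i - F_j)))

# Invented strength sequences for two pairs of small networks.
print(strength_distance(np.array([1, 2, 3, 4]), np.array([1, 2, 3, 4])))  # 0.0
print(strength_distance(np.array([1, 1, 1]), np.array([5, 5, 5])))        # 1.0
```

Identical strength sequences give distance 0, and non-overlapping sequences give the maximum distance 1.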
In Fig. 6.22, we show the dendrogram constructed using Dd . In Fig. 6.23, we use
the metric described in Section 6.5.4.2 to compare the effectiveness of Dd and the
PCA-distance Dp at clustering groups of networks belonging to the same category
(see Section 6.5.4.2). Figure 6.23 demonstrates that the PCA-distance performs better
than the strength distribution distance. In addition, the large number of branches with similar heights in the dendrogram in Fig. 6.22 indicates that the strength distribution clustering is not very robust, because very slight methodological differences might lead to large differences in the clusters.
17 This cluster contains the networks constructed from similarity measures that we discussed in Sections 6.6.5 and 6.6.3.
Figure 6.22: Dendrogram for the 270 networks constructed using the distance be-
tween the strength distributions of the networks and average-linkage clustering. We
order the leaves of the dendrogram to minimize the distance between adjacent nodes
and colour them to indicate the type of network. The dendrogram demonstrates that
clustering using the strength distribution does not separate the networks into their
groups as effectively as the distance matrix Dp that we obtained using MRFs. The
distance ddave at which clusters combine is given by Eq. 6.17.
Figure 6.23: Comparison of the effectiveness of the PCA-distance Dp and the strength
distribution distance Dd at clustering networks of the same type. For each level of
the dendrograms shown in Figs. 6.15 and 6.22, we calculate the maximum fraction of
networks of a certain type that appear in a single cluster and we sum this fraction for
networks from the following groups: Facebook, Metabolic, Political: co-sponsorship,
Political: committee, Political: voting, Financial, Brain, and Fungal.
6.9 | Case studies
435 members of the House. We analyze roll-call voting for the 1st –110th Congresses,
covering the period 1789–2008. We construct networks from the roll call data [241] for
each two-year Congress as follows [306]. The roll calls for each chamber are encoded
in an N × b matrix M, in which each element Mik equals 1 if legislator i voted
yea on bill k, −1 if the legislator voted nay, and 0 otherwise. We are interested in
characterizing the similarities between legislators, so we transform M into an N × N
adjacency matrix A with elements
Aij = (1/bij) Σk αijk ,    (6.21)
where αijk = 1 if legislators i and j voted the same on bill k and 0 otherwise, and bij is
the total number of bills on which i and j both voted [243, 306]. The elements Aii all equal 1, indicating the perfect similarity of each legislator's voting record with his/her own record. We set all Aii = 0 to remove self-edges. The matrix A, with elements
Aij in the interval [0, 1], then represents a network of weighted ties between legislators,
where the weights are determined by the similarity of their roll-call voting over a single
two-year Congress. Following Ref. [241], we only consider “non-unanimous” roll call
votes, where a roll call vote is considered “non-unanimous” if more than 3% of the
legislators are in the minority.
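The construction of Eq. 6.21 can be sketched as follows (an illustrative reimplementation; the small roll-call matrix is invented):

```python
import numpy as np

def rollcall_adjacency(M):
    """Weighted agreement network of Eq. 6.21 from an N x b roll-call
    matrix with entries +1 (yea), -1 (nay), and 0 (did not vote)."""
    N = M.shape[0]
    A = np.zeros((N, N))
    for i in range(N):
        for j in range(N):
            if i == j:
                continue                          # A_ii set to 0 (no self-edges)
            both = (M[i] != 0) & (M[j] != 0)      # bills on which both voted
            b_ij = both.sum()
            if b_ij > 0:
                # alpha_ijk = 1 when i and j cast the same vote on bill k
                A[i, j] = np.sum(M[i][both] == M[j][both]) / b_ij
    return A

# Invented roll calls: three legislators, three bills.
M = np.array([[ 1,  1, -1],
              [ 1, -1,  0],
              [-1, -1,  1]])
A = rollcall_adjacency(M)
print(A[0, 1])   # agree on 1 of the 2 bills both voted on -> 0.5
print(A[0, 2])   # disagree on all 3 shared bills -> 0.0
```

Normalizing by bij, the number of bills on which both legislators voted, keeps the weights in [0, 1] even when legislators miss votes.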
For each Congress, we calculate MRFs for both the House and Senate and cluster
the Congresses for each chamber by comparing the MRFs. In Figs. 6.24(a) and
6.24(b), we show dendrograms for the House of Representatives and Senate, respec-
tively. Much work on the U.S. Congress has been devoted to the extent of partisan
polarization, the influence of party on roll-call voting, and the degree to which this
has varied over time (see Refs. [203, 306] and references therein). In highly-polarized
legislatures, representatives tend to vote along party lines, so there are strong simi-
larities in the voting patterns of members of the same party, and strong differences
between members of different parties. In contrast, during periods of low polarization,
the party lines become blurred and there are greater similarities in the voting patterns
of members of different parties.
We use the notion of party polarization to understand the taxonomy of Con-
gresses shown in Figs. 6.24(a) and 6.24(b). We consider two measures of polarization.
The first uses DW-Nominate scores, a multi-dimensional scaling technique that is
very popular among political scientists [203, 241]. The DW-Nominate polarization is
given by the absolute value of the difference between the mean first-dimension DW-
Nominate scores for members of one party and the same mean for members of the
other party (see Refs. [203, 241] for a detailed description of DW-Nominate scores).
The problem with the DW-Nominate polarization is that it assumes a competitive
two-party system and therefore cannot be calculated prior to the 46th Congress. The
second measure we consider is the modularity Q, which was recently shown to be a
good measure of polarization [306], even for Congresses without a clear division into
parties. Modularity is given, in terms of the energy H in Eq. 6.1, as

Q = −H(λ = 1)/(2m) .    (6.22)
The two measures agree fairly closely on the level of polarization of each Congress for
which they can both be calculated, although there are some differences [306].
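Assuming that H in Eq. 6.1 is the usual Potts energy with a Newman–Girvan null model — under which Q = −H(λ = 1)/(2m) reduces to the standard modularity — the polarization measure can be sketched as:

```python
import numpy as np

def modularity(A, groups):
    """Standard (Newman-Girvan) modularity Q of a partition of a weighted
    network with adjacency matrix A."""
    k = A.sum(axis=1)                        # node strengths
    two_m = A.sum()                          # twice the total edge weight
    same = np.equal.outer(groups, groups)    # same-community indicator
    return float(np.sum((A - np.outer(k, k) / two_m) * same) / two_m)

# A "polarized" toy legislature: two 3-cliques joined by a single edge.
A = np.zeros((6, 6))
for block in ([0, 1, 2], [3, 4, 5]):
    for i in block:
        for j in block:
            if i != j:
                A[i, j] = 1.0
A[2, 3] = A[3, 2] = 1.0
print(modularity(A, np.array([0, 0, 0, 1, 1, 1])))   # 5/14, about 0.357
```

A high Q for the partition into the two parties corresponds to the highly polarized regime described in the text, in which members vote largely along party lines.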
In Figs. 6.24(a) and 6.24(b), we include bars under the dendrograms that represent
these polarization measures (we have normalized both measures to the interval [0, 1]).
The bars demonstrate that (for both the House and Senate) Congresses with similar
levels of polarization, as measured using both modularity and DW-Nominate, usually
appear in the same cluster. This suggests that our MRF clustering technique groups
Congresses based on the polarization of roll call votes. We have also coloured branches
in the same cluster in the dendrogram according to the level of polarization of the
corresponding Congresses, where brown indicates highly-polarized Congresses and
blue less polarized Congresses.
We now consider in more detail the Congresses lying within each cluster. In Figs. 6.25(a)
and 6.25(b), we show the variation in the polarization measured using both DW-
Nominate scores and modularity as a function of time. For each Congress, the height
of each stem indicates the level of polarization measured using modularity. The colour
of each stem indicates the cluster membership of each Congress in the dendrograms.
The black curve, running from the 46th Congress onwards, shows the DW-Nominate
polarization. The DW-Nominate and modularity curves suggest that the periods of
maximal polarization in the House and Senate do not correspond exactly. In both
chambers, the 104th − 110th Congresses are highly polarized (the 104th immediately
followed the 1994 “Republican Revolution” in which the Republicans earned major-
ity status in the House for the first time in more than 40 years [203]). However, the
House has a second polarization peak from the 55th − 58th Congresses, and the Senate
has a high polarization for the 46th − 51st Congresses. The MRF clustering scheme is
(b) Senate
Figure 6.24: Dendrograms for (a) House of Representatives and (b) Senate roll-call
voting networks for the 1st –110th Congresses (covering the period 1789–2008). Each
leaf in the dendrograms represents a single Congress. The upper colour bar below the
dendrograms show the polarization of the Congresses measured using modularity. The
lower colour bar shows the polarization measured using DW-Nominate scores. The
DW-Nominate polarization assumes a competitive two-party system and therefore
cannot be calculated prior to the 46th Congress. We do not include rectangles at the
leaf nodes corresponding to these earlier Congresses in the DW-Nominate colour bar.
We also colour groups of branches in the dendrogram that correspond to periods of
similar polarization (see the discussion in the text).
able to detect these differences. In Figs. 6.25(a) and 6.25(b), the brown stems high-
light a group of Congresses that lie within the same cluster in the dendrogram. The
clusters for both the House and Senate closely match the periods of high polarization
identified using modularity and DW-Nominate.
For both the House and Senate, the 104th − 110th and 55th Congresses are iden-
tified in the high-polarization cluster. As mentioned above, this first set corresponds
to a period of high polarization following the 1994 elections. The 55th Congress cor-
responds to a period when a third party known as the Populist Party was strong.
There are also several other Congressional sessions that are part of the highly po-
larized cluster for the House but not the Senate (and vice versa). For example, the
House was also highly polarized for the 5th − 7th Congresses, which is a period following George Washington's retirement during which John Adams headed a divided
Federalist Party. The same cluster includes the 38th Congress, which occurred during
the Civil War, and the 56th − 58th Congresses, when the Populist party was again
strong. The highly-polarized cluster for the Senate includes the 26th −29th Congresses.
The 25th Congress saw the emergence of the Whigs and the Democratic Party and
during this period, the abolitionist movement was also prevalent, with the Amistad
seizure occurring in 1839 during the 26th Congress. The cluster also includes the
46th − 51st Congresses (1879-1891), which occurred during the period immediately
following Reconstruction.
The MRF clustering also identifies periods during which polarization was low. We
highlight these periods in green in Figs. 6.24(a)–6.25(b). The 75th − 95th Congresses
are recognized as a period of party decline, during which fractionalization decreased
[71]. For the House, we find that the 77th − 97th Congresses are all grouped within
the same cluster. For the Senate, there is a cluster that includes the 68th − 102nd
Congresses. Although this includes all of the Congresses during the period of party
decline, interestingly, the cluster spans a much longer period. The House cluster also
includes most of the 15th − 21st Congresses, which corresponds to the period 1817–
1825 and is known as the “Era of Good Feeling” because of the decline of partisan
politics. One can make similar observations for each of the other clusters identified
in the dendrogram.
For each legislative chamber, it is also worth commenting on the Congresses that
are not assigned to a cluster. For the House, the 17th Congress forms an isolated
cluster. Using modularity, this House, which occurred during the Era of Good Feel-
ing, seems to have an extremely low polarization. For the Senate, the 2nd and 20th
Congresses form isolated clusters. The 20th Congress took place from 1827–1829 and
(b) Senate
Figure 6.25: Variation in the polarization of (a) the U.S. House of Representatives
(b) the Senate as a function of time. The height of each stem indicates the level of
polarization measured using modularity. The colour of each stem indicates the cluster
membership of each Congress in the dendrograms in Fig. 6.24(a) and 6.24(b). The
black curve shows the DW-Nominate polarization. We have rescaled the modularity
and the DW-Nominate polarization to the interval [0, 1].
included the election of 1828, which was a significant benchmark in the trend towards
broader voter participation.
Figure 6.26: (a) MRFs for all Houses lying in each of the main clusters in the dendrogram in Fig. 6.24(a). (b) MRFs for all Senates lying in each of the main clusters in the dendrogram in Fig. 6.24(b).
Figure 6.27: Comparison of the 85th (low polarization) and 108th (high polarization)
House of Representatives. The upper curves show the H MRFs, and the lower curves
show the cumulative distribution of Λij values P (Λij ≤ x).
with some assemblies from the 1970s and a cluster consisting of 1946, 1948, and 1950,
which we highlight in magenta. These assemblies are all noteworthy: 1946 was the
first assembly; during the 1948 assembly, the Universal Declaration of Human Rights was adopted; and in 1950, the “Uniting for Peace” resolution was passed. At the
right of the dendrogram, we highlight in black a group consisting of all assemblies
from 1979–1991 (excluding 1980). The end of this period marks the end of the Cold
War; the beginning marks the end of the period of Détente between the Soviet Union
and the U.S. following the former’s invasion of Afghanistan at the end of 1979. The
large blue cluster at the left of the dendrogram consists primarily of sessions from
before 1971, but also includes the sessions in 1977 and 1995.
6.9.3 Facebook
We now consider networks of the online social networking site Facebook for 100 U.S.
universities [295]. The nodes in the network represent users of the site, and the
links represent reciprocated “friendships” between users at a single-time snapshot in
September 2005. We consider only links between students at the same university,
which allows us to compare the structure of the networks at the different institutions.
These networks represent complete data sets that we obtained from Facebook. We
provide details of these networks in Table C.1. In contrast to the previous examples,
we are not comparing snapshots of the same network at different times but rather
are comparing multiple realizations of the same type of network that have evolved
independently.
Figure 6.29: Dendrogram for 100 Facebook networks of U.S. universities at a single-
time snapshot in September 2005. We order the leaves of the dendrogram to minimize
the distance between adjacent nodes. The colour bars below the dendrogram indicate:
(top) the number of nodes in the networks and (bottom) the fraction of possible edges
that are present.
In Fig. 6.29, we show the dendrogram for Facebook networks that we produced by
comparing MRFs. The two colour bars below the dendrogram indicate the number
of nodes in each network and the fraction of possible edges that are present. It
is clear that, in this case, these two simple network properties explain most of the
observed cluster structure. In Fig. 6.30, we show the distribution of MRFs for all
of the networks. For each of the properties Heff , Seff , and ηeff , the MRFs are very
similar in shape and lie within a narrow range. If we consider that the Facebook
networks range in size from 762 to 41,536 nodes and that the fraction of possible
edges present varies from 0.2% to 6%, this similarity is surprising and implies that all
of the networks have very similar mesoscopic features. However, it is quite possible
that there are heterogeneities in the mesoscopic structures of the Facebook networks
that we do not uncover using the MRF framework. Equally, there might be other
differences in these networks at the microscopic and macroscopic scales that we do
not detect.
Figure 6.30: Distribution of MRFs for 100 Facebook networks of U.S. universities at
a single-time snapshot in September 2005. At each value of ξ, the upper curve shows
the maximum value of Heff , Seff , or ηeff for all of the networks, the lower curve shows
the minimum value, and the dashed line shows the mean.
too small the correlation coefficients can be noisy; on the other hand, large values
of T can mask interesting market changes. For example, a value of T = 250 days
corresponds to one year, but a single year might include a market crash followed by a
period of recovery. We set N = 100 and construct networks for each half year period
over 1985–2008. This corresponds to time windows of T ≃ 125 returns, yielding
Θ = T/N = 1.25.19 In contrast to the U.S. Congress and U.N. voting networks, the
evolving NYSE network always has the same number of nodes, which always represent
the same stocks.
[Figure 6.31: dendrogram leaves, from left to right: H1 2007, H1 2004, H1 1998, H2 1993, H2 1996, H1 1994, H1 1995, H1 1992, H2 1992, H2 1985, H2 1994, H1 1985, H1 1986, H2 1986, H1 1989, H2 1988, H1 1987, H1 1988, H1 1990, H2 1991, H1 1991, H2 1997, H1 1997, H2 2003, H2 1995, H1 1996, H1 1993, H1 2003, H2 2005, H2 1999, H1 2006, H1 2005, H2 1990, H2 1998, H2 2006, H2 1989, H2 2008, H1 1999, H2 2004, H2 2002, H1 2002, H2 2007, H2 2000, H1 2008, H1 2000, H2 2001, H1 2001, H2 1987.]
Figure 6.31: Dendrogram for 48 networks of the NYSE over the period 1985–2008.
Each network represents the correlations between the returns of 100 stocks over each
half year during this period. We order the leaves of the dendrogram to minimize the
distance between adjacent nodes. The dendrogram is clearly split into two clusters
(see the discussion in the text).
In Fig. 6.31, we show a dendrogram for the NYSE networks constructed using the
MRF method. There are two clear clusters (which we highlight in red and blue).
Networks in the red cluster appear to correspond to periods of market turmoil. For
example, the cluster contains the networks for the second half of 1987 (July–December
19 We note that we have return data for 235 stocks over this period, but we select 100 stocks at
random for our analysis so that Θ > 1. This ensures that the correlation coefficients are not overly
noisy (see Sections 3.2.3 and 5.2.2). We have reproduced the analysis that we present in this section
using 10 other random stock selections and find similar results for the different selections.
1987), which includes the Black Monday stock market crash of October 1987; all of
2000–2002, following the bursting of the dot-com bubble; and the second half of 2007
and all of 2008, which includes the recent credit and liquidity crisis.
We provide support for the hypothesis that the constituents of the red cluster
are networks for periods of market turmoil by considering the NYSE composite index
(NYSECI). The NYSECI measures the performance of all common stocks listed on the
NYSE by calculating the changes in their aggregate market value adjusted to eliminate
the effects of capitalization changes, new listings, and delistings. In Fig. 6.32, we
show the NYSECI as a function of time over the period 1985–2008 and highlight the
time periods that correspond to networks in the red cluster in the dendrogram in
Fig. 6.31. In Fig. 6.33, we show the volatility of the NYSECI over each half year
period. Volatility is usually high during periods of market turmoil. If we let χ(t)
represent the value of the index at time t, we can define a log-return zχ for the index
as zχ (t) = ln[χ(t)/χ(t − 1)]. We then define the volatility νχ of the NYSECI over a
window of T returns as [73]
νχ = (1/T) Σ_{t=1}^{T} |zχ(t)| .    (6.24)
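Eq. 6.24 amounts to the mean absolute log-return over the window; a sketch (with invented price series):

```python
import numpy as np

def volatility(index_values):
    """Average absolute log-return of an index series (Eq. 6.24)."""
    chi = np.asarray(index_values, dtype=float)
    z = np.log(chi[1:] / chi[:-1])   # log-returns z_chi(t)
    return float(np.abs(z).mean())   # nu_chi = (1/T) sum_t |z_chi(t)|

# Invented price series: a calm and a turbulent half year.
calm = [100, 100.5, 101, 100.8, 101.2]
turbulent = [100, 90, 104, 88, 99]
print(volatility(calm) < volatility(turbulent))   # True
```

Large absolute daily moves, regardless of direction, raise νχ, which is why this measure peaks during periods of market turmoil.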
Figure 6.33 demonstrates that the networks assigned to the red cluster in Fig. 6.31
correspond (with one or two exceptions) to the periods of highest volatility. As we
noted above, the time window for a single network might include periods of both high
and low volatility, so it is unsurprising that the networks assigned to the red cluster
do not correspond exactly to the half year periods with the highest volatilities. This
remains true for all choices of T .
Although we study the same stocks over the full period 1985–2008, it is worth not-
ing that many of the companies might have changed significantly during these years.
For example, some of the companies might have expanded through acquisitions, while
others might have grown organically (i.e., through increased output, sales, or both).
Both of these processes could have altered the industries in which these companies
operate and led to significant changes in the nature of the stocks that we investigate.
The fact that we uncover a cluster corresponding to periods of market turmoil (and
do not simply uncover clusters corresponding to similar time periods) despite such
non-stationarities in the data is a testament to the effectiveness of the MRF clustering
framework.
Figure 6.32: The NYSE composite index over the period 1985–2008. The grey
blocks indicate the time periods corresponding to networks in the red cluster in the
dendrogram in Fig. 6.31.
Figure 6.33: The average daily volatility of the NYSE composite index over each half
year period from 1985–2008. We have coloured each stem according to the cluster
membership of the corresponding network in the dendrogram in Fig. 6.31.
20 We exclude 2003 and 2004 because we do not have data for these years.
Figure 6.34: Dendrogram for 15 foreign exchange market networks over the period
1992–2008. Each network represents the correlations between the returns of 110 ex-
change rates over each year during this period. We order the leaves of the dendrogram
to minimize the distance between adjacent nodes. The dendrogram is split into two
clusters. We highlight one of these clusters in red, and we highlight in green a
sub-cluster of particularly closely related networks within it. The colour bar under
the dendrogram shows the volatility in the carry trade return index νΥ.
6.10 Summary
We have developed a framework based on MRFs for comparing and clustering net-
works using their mesoscopic structures. We used this framework to create a taxon-
omy of networks and to identify groups of closely related networks and anomalous
networks that were not grouped with members of their class. In addition to clustering
networks of different types, we also created taxonomies for subsets of networks that
represent multiple realizations of the same type of system (Facebook networks) and
temporal snapshots of time-dependent systems (Congressional roll-call voting, U.N.
resolution, NYSE, and FX networks). For each example of time-ordered sequences of
networks, we identified periods during which there were significant changes in meso-
scopic structure.
Chapter 7
Conclusions
We began this thesis by investigating triangular arbitrage in the spot foreign exchange
market in Chapter 2. The aim of Chapter 2 was twofold: first, to answer a question of
interest to market practitioners, namely whether triangular arbitrage opportunities
exist; second, to demonstrate that when investigating financial markets it is essential
to ensure that one is using data that is appropriate to the question being posed. Using
executable price data, we demonstrated that although triangular arbitrage opportu-
nities appear to exist in the foreign exchange market, most of these opportunities
are very short in duration and represent very small profit opportunities. We further
showed that, when one also considers the strong competition for each arbitrage and
trading and technology costs, a trader would need to beat other market participants
to an unfeasibly large proportion of arbitrage opportunities for triangular arbitrage
to remain profitable in the long term. These results provide a limited verification of
foreign exchange market efficiency.
In Chapter 3, we extended our analysis of financial markets beyond the foreign
exchange market and considered a variety of other markets, including equities, bonds,
and commodities. We investigated these markets by constructing correlation matri-
ces for the time series of asset returns and analyzed the temporal changes in the
structure of these correlations. The number of correlations scales as the square of the
number of assets N, so some simplification was necessary to identify the key relation-
ships. We coarse-grained the correlation matrix using principal component analysis
to identify the important relationships and analyzed the evolution of the system by
considering the changes in the principal components. We found that the percentage
of the variance in market returns accounted for by the first principal component rose
steadily between 1999 and 2008 but then increased sharply following the 2008 collapse
of Lehman Brothers. We further found that during this period the number of signifi-
cant components decreased and the number of assets making significant contributions
to the first principal component increased. To gain some insights into the relation-
ships between the different assets, we considered the correlations between the asset
return time series and the principal components. Initially, the first few components
appeared to represent different asset classes. For example, the first component corre-
sponded to bonds, the second to equities, and the third to currencies. However, later
in the studied period these relationships began to break down. In fact, by January
2010 nearly all of the studied assets were significantly correlated with the first prin-
cipal component and relatively few assets were significantly correlated with the other
components. This suggests that there are more common features across markets in
2010 than there were in 1999.
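The quantity tracked here, the fraction of the total variance accounted for by a principal component, follows directly from the eigenvalues of the correlation matrix. A minimal sketch with synthetic returns (the single common factor below is illustrative, not the thesis data):

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic returns: 50 assets, 500 days, sharing one common factor
common = rng.standard_normal(500)
returns = 0.5 * common[:, None] + rng.standard_normal((500, 50))

C = np.corrcoef(returns, rowvar=False)   # 50 x 50 correlation matrix
eigvals = np.linalg.eigvalsh(C)[::-1]    # eigenvalues, largest first

# The k-th principal component accounts for a fraction eigvals[k] / N of the
# total variance (the trace of a correlation matrix equals N)
explained = eigvals / eigvals.sum()
print(f"first PC explains {explained[0]:.1%} of the variance")
```

With a stronger common factor, the leading eigenvalue grows and the first component absorbs more of the variance, mirroring the increase in co-movement observed after the Lehman Brothers collapse.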
In Chapter 4, we described some of the most widely used methods for detecting
communities in networks and presented a relatively comprehensive review of the liter-
ature on communities in dynamic networks. In Chapter 5, we presented an alternative
method for investigating evolving correlation matrices in which we represented the
correlation matrices as time-dependent networks and investigated the dynamics of
network communities. We proposed a method for tracking communities through time
from the perspective of individual nodes, which removes the undesirable requirement
of determining which community at each time step represents the descendant of a
community at the previous time step. We applied this technique to foreign exchange
market networks in which each node represents an exchange rate and each edge rep-
resents a time-dependent correlation between the rates. Our analysis successfully
uncovered significant structural changes that occurred in the foreign exchange mar-
ket, including ones that resulted from major market events, and we demonstrated that
community reorganizations at specific time steps can provide insights into changes in
trading behaviour. We also considered the evolving relationship between individual
nodes and their community and demonstrated that an exchange rate's functional
role is related to its position within its community: exchange rates that are im-
portant for information transfer are located on the edges of communities, whereas
exchange rates that have a strong influence on other rates in their community are
located at its centre.
Finally, in Chapter 6, we introduced mesoscopic response functions to characterize
the community structures of networks. Importantly, the response functions are nor-
malized so that it is possible to compare networks of significantly different sizes and
connectivities. We used the response functions to compare and cluster networks and
created a taxonomy of networks from a wide variety of different fields. The taxon-
omy contained many clusters that included networks from the same family, but there
were also examples of networks not clustered with networks that were nominally of
the same type. In this way, the framework that we propose can be used to identify
anomalous members of a family of networks.
As well as creating a taxonomy of networks from different fields, we also created
taxonomies for subsets of networks that represent multiple realizations of the same
type of system and temporal snapshots of time-dependent systems. For example, we
created a taxonomy for a time-sequence of New York Stock Exchange networks which
was split into two clusters based on the volatility of the market over the corresponding
period. We also created taxonomies of roll-call voting in the U.S. Senate and House
of Representatives in which the cluster membership of the different Congressional
sessions was determined by the levels of party polarization. Although we can explain
many of the observed clusters by properties that are unique to each type of network
in the intra-family taxonomies, the mesoscopic response function framework provides
insights in all of the different case studies.
7.1 Outlook
For much of this thesis we focused on financial networks and network communities, so
we conclude with some comments on the current state of both fields and we discuss
potential directions for future research.
In Chapter 5, we studied communities in foreign exchange market networks, but
the techniques that we presented are general and can be applied to networks for other
asset classes. In fact, the Potts method has already been used to study communities
in a network of equities [148]. However, in Ref. [148] Heimo et al. only examine com-
munities in a static network of stocks traded on the New York Stock Exchange and do
not study longitudinal networks. A potential area for future research is, therefore, to
investigate community dynamics in evolving networks of equities. In equity markets,
stocks are usually assigned to industry sectors based on the business activities of the
companies; these sectors provide a useful tool for sorting and comparing different
companies. For example, it is often insightful to compare the performance of stocks
within the same sector to find out if any are under-performing (or over-performing)
compared with the rest of the sector. An interesting avenue for future research would
be to compare the communities identified using the Potts method with the sector
classifications to see if they line up well or if there are periods during which there are
significant differences in the classifications.
One issue with using the Potts method to investigate communities in longitudinal
equity networks is that in previous work [148] plateaus have not been observed in plots
of the number of communities as a function of the resolution parameter (see Section
5.4).1 The absence of a plateau means that it is not possible to use the approach
described in Section 5.4 to select the resolution at which to investigate community
dynamics, so other methods will need to be developed.
We have focused on financial networks constructed from correlation matrices of
asset price time series, but other types of financial networks have also been studied
(see Ref. [11] for an overview of the applications of networks in finance). For example,
networks have been used to analyze the trade relationships between nations, e.g.,
[266,275], and the credit relationships between financial institutions, e.g., [31,52,116].
The latter subject has received particular attention in the last few years as a result
of the 2007–2008 credit and liquidity crisis. During the credit crisis, difficulties in
the U.S. subprime mortgage market soon spread to debt markets all over the world,
and eventually resulted in credit drying up as banks became unwilling to lend as
freely [11]. The market turmoil clearly demonstrated the interconnectedness of the
global financial system and how this connectivity can lead to outcomes that are
difficult to forecast. As a consequence, the study of credit networks is currently one
of the most active areas of research in financial networks and is an important direction
for future research.
In the standard representation of a credit network, the nodes represent financial
institutions and the edges represent credit relationships. A key concern in the study
of such networks is how the insolvency of particular institutions affects the network;
in particular, whether the failure of individual banks is “contagious” and leads to
a systemic crisis in which a large fraction of the firms in the system fail at the
same time [31]. One of the most important questions for understanding this risk is
how connectivity affects network robustness [31, 116, 202], which is closely related to
questions asked in ecology [202]. In Ref. [12], Allen and Gale studied how the banking
system responds to contagion when banks are connected under different network
structures. They found that networks with higher connectivities are more resilient
and have a lower likelihood of widespread default because the losses sustained by one
bank are transferred to more banks through interbank agreements. However, Gai and
Kapadia [116] reached a different conclusion. They agreed that greater connectivity
1 Recall that community partitions that are robust across a range of resolutions are significant
because the communities do not break up despite an increasing incentive to do so. Communities
that persist over a large range of resolutions thus potentially represent important substructures.
reduces the likelihood of contagion, but it also means that if a problem does occur
the shocks could be on a significantly larger scale. Battiston et al. [31] make similar
observations. They found that although increased connectivity can reduce the risk
of the collapse of an individual node through risk sharing, it can also result in crises
being more severe and more frequent.
The lack of a consensus on the most robust structure for banking networks means
that this is a particularly crucial area for future research. One of the major questions
in this context is whether networks organized into communities have lower systemic
risk [31]. Ecologists have suggested that decoupling a system into relatively discrete
components can promote robustness [202]; however, this is not a simple question
because some partitions of financial networks into communities could potentially pre-
clude stabilizing effects such as mechanisms for maintaining liquidity [161]. Some
steps have been taken to investigate communities in credit networks, e.g., [195], but
there are still many unanswered questions. An interesting direction for future re-
search would be to apply the methods that we presented in Chapter 5 to directed
networks representing the evolving credit relationships between financial institutions
during the 2007–2008 credit crisis.2
In the wider study of community structure in networks, significant technical
advances have been made in recent years, and it remains a thriving area of re-
search [105,244]. Typically, the first stage of community analysis is the identification
of the communities; indeed, this is the subject of many of the papers in the literature,
and a wealth of different techniques has been proposed for the algorithmic detection
of communities. However, researchers have not yet agreed on which methods
are most appropriate or reliable or when particular methods should be adopted or
avoided. The problem of assessing the reliability of the output of different algorithms
is exacerbated by the fact that there is no rigorous definition of a community. The
most rigorous approach that is currently available is to identify communities using
different detection algorithms and only to consider structures that are similar across
multiple methods as meaningful [244, 289]. In this way, one can be more sure that
the identified structures are genuine features of the data and not simply byproducts
of the detection algorithm.
Even with the focus on developing community detection algorithms, very little
attention has been paid to validating the output of the detection algorithms and
trying to understand what the communities mean, what they actually look like or
2 Although we focused on undirected networks in Chapters 5 and 6, the techniques that we
introduced in these chapters can easily be extended to directed networks.
how they can be used [105, 181, 222, 244]. Some steps have now been taken towards
answering some of these questions. For example, in Ref. [295] the composition of
communities is related to the demographic characteristics of nodes and in Ref. [181]
the structure and properties of communities in a range of different networks, including
biological, social, and communication networks, have been studied and compared. In
this thesis, we have also contributed to this endeavour. For example, in Chapter 5
we used the composition of different communities to uncover changes taking place
in trading behaviour within the foreign exchange market and in Chapter 6 we used
communities to create a taxonomy of networks. However, much still remains to be
done in this direction and this is arguably the most pressing area of research if we
are to gain real insights from studies of network community structures.
There are similar open questions in the study of dynamic communities (see Sec-
tion 4.6.13). With a few exceptions, the studies of dynamic communities present a
method for detecting communities, check that the identified communities make sense,
and then stop. There are very few studies that investigate the mechanisms that drive
the community evolution or try to answer some of the fundamental questions relating
to dynamic communities, such as what community properties result in stable commu-
nities and what features of a community determine whether an individual will join (or
leave) that community. In Chapter 5, we attempted to answer some of these questions
for the foreign exchange market. For example, we demonstrated that nodes that have
strong connections with their communities tend to have more stable community rela-
tionships than nodes with weak connections. The limited progress in the study of the
properties of dynamic communities is perhaps unsurprising given the relative infancy
of the field; hence, it represents an important direction for future research [318].
Another direction for future research is the development of algorithms that can
identify overlapping communities. Most existing community detection algorithms
generate partitions in which each node is assigned to exactly one community. How-
ever, this does not reflect the structure of many real-world systems. For example,
in social networks people can belong to communities of friends, communities of work
colleagues, and family communities. Some methods have been developed that can
identify overlapping communities, e.g., [32, 234]. Perhaps the most widely used is
the clique percolation method [234] that we described in Section 4.3.1; however, as
we highlighted, this approach has limitations, so alternative techniques need to be
developed.
The methods in Refs. [32,234] allow nodes to belong to more than one community,
but do not give any indication of the community with which a node is most closely
associated.
3 Central pattern generators are the neuronal circuits that give rise to repetitive or oscillatory
patterns of muscle activity that produce rhythmic movements, such as locomotion, breathing, and
chewing [54].
Appendix A
Table A.1: Details of all of the financial assets studied in Chapter 3. The data that we
use in this chapter were downloaded from Bloomberg. See http://www.bloomberg.com/
for more information on the different financial instruments.
Details of Financial Assets
Appendix B
Robustness of FX Communities to
Alternative Heuristics
In this appendix, we demonstrate that the results described in Chapter 5 are robust
with respect to the choice of computational heuristic used to minimize the Hamiltonian
in Eq. 4.3.
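The thesis heuristics themselves are not reproduced here; as an illustration of the general approach, networkx's greedy agglomerative algorithm maximizes modularity, which corresponds (up to sign and scale) to minimizing the resolution-1 Potts Hamiltonian. The karate-club graph below is a stand-in network, not one of the FX networks:

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities, modularity

G = nx.karate_club_graph()                    # stand-in for an FX network
partition = greedy_modularity_communities(G)  # greedy agglomerative heuristic

# Maximizing modularity Q is equivalent, up to sign and scale, to minimizing
# the Potts Hamiltonian at resolution 1
Q = modularity(G, partition)
print(len(partition), round(Q, 3))
```

Simulated annealing explores the same energy landscape stochastically and typically reaches lower-energy partitions at a much higher computational cost, which is the trade-off examined in this appendix.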
Figure B.1: Distribution of the energy H of the optimal partition for networks over
the period 2005–2008 for different optimization algorithms (spectral, greedy, and
simulated annealing).
We first compare the community partitions identified using the greedy and simulated
annealing algorithms. The two methods identify identical partitions for 19% of the networks;
for 83% of the networks, the partitions differ in their assignment of nodes to com-
munities by fewer than 10 nodes. Therefore, there is strong agreement between the
partitions obtained by the two heuristics, but there are also differences that warrant
further investigation.
In Section 5.8, we identified significant changes in the community configuration
by comparing changes in the scaled energy Qs (see Eq. 4.4) between consecutive time
steps and by calculating the variation of information between community partitions
at consecutive time steps (see Fig. 5.9). The correlation between Qs as a function
of time for the two heuristics is 0.99 and the correlation between the changes in Qs
is 0.93. The correlation between the variation of information between partitions at
consecutive time steps is 0.36. The scaled energy correlations are clearly extremely
high. However, there are differences in the timings of some major reorganizations
identified by the variation of information. To compare the timings of major events,
we identify time steps at which the variation of information between consecutive
partitions is more than a certain number of standard deviations larger than the mean
variation of information between consecutive partitions. We find that the algorithms
identify 40% of one standard deviation events at the same time steps and 33% of
2.5 standard deviation events. The methods therefore agree reasonably well, with
one in three 2.5 standard deviation events identified at exactly the same time step.
However, the differences also suggest that one should be cautious using variation of
information to identify major community reorganizations.
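The event-detection procedure above, computing the variation of information between consecutive partitions and flagging values more than a given number of standard deviations above the mean, can be sketched as follows (the partition sequence is illustrative, not the FX data):

```python
from collections import Counter
from math import log
import statistics

def variation_of_information(p1, p2):
    """VI = H(X|Y) + H(Y|X) for two community assignments (label lists)."""
    n = len(p1)
    c1, c2 = Counter(p1), Counter(p2)
    vi = 0.0
    for (a, b), nab in Counter(zip(p1, p2)).items():
        pab = nab / n
        vi -= pab * (log(pab / (c1[a] / n)) + log(pab / (c2[b] / n)))
    return vi

# Hypothetical sequence of partitions at consecutive time steps
partitions = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],   # no change
    [0, 0, 1, 1, 0, 1],   # large reorganization
    [0, 0, 1, 1, 0, 1],   # no change
]
vis = [variation_of_information(partitions[t], partitions[t + 1])
       for t in range(len(partitions) - 1)]
mu, sd = statistics.mean(vis), statistics.stdev(vis)
events = [t for t, v in enumerate(vis) if v > mu + 1.0 * sd]  # 1-sigma events
print(events)  # -> [1]
```

Because identical consecutive partitions give a variation of information of exactly zero, only the genuine reorganization at the second time step is flagged.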
[Figure B.2: normalized frequency distribution of V̂.]
two other exchange rates also join the community. The resulting merged community
is very similar to the largest community identified at the same time step using the
greedy algorithm.
Figure B.3 therefore illustrates that there are only small differences in the com-
munity configurations that are identified by the two heuristics. In fact, as Fig. B.2
shows, the two algorithms agree in the assignment of all but about ten nodes approx-
imately 80% of the time. Importantly, Fig. B.3 highlights that, even when there are
differences in the exact community configurations, the communities that are identi-
fied by the two heuristics nonetheless indicate the same changes taking place in the
FX market.
Figure B.3: Comparison of the change in community structure in one half of the
FX market network over the same period for different optimization heuristics. We
show a schematic of the communities for the period following 15/08/07, when there
was significant unwinding of the carry trade during the 2007–2008 credit and liquidity
crisis. We identified communities using (a) a greedy algorithm [44] and (b) a simulated
annealing algorithm [141]. The node colours after the community reorganization
correspond to their community before the change. If the parent community of a
community after the reorganization is obvious, we draw it using the same colour as
its parent. The nodes drawn as triangles resided in the opposite half of the network
before the community reorganization.
[Figure B.4 panels: USD/CHF, USD/XAU, AUD/JPY, and NZD/JPY; quarterly points
from Q1 2005 to Q4 2008.]
Figure B.4: Comparison of the quarterly node role evolutions in the (κb , κy ) plane for
the period 2005–2008 for communities identified using a greedy algorithm [44] and
simulated annealing [141]. The white/grey shading plots show results for the greedy
algorithm and the pink/dark pink plots show results for simulated annealing.
The similarity of the results produced by the two heuristics, despite these techniques
sampling different regions of the energy landscape, gives confidence that the effects
that we uncover are genuine and that the results are robust.
In practice, the greedy algorithm is preferable to simulated annealing because of the
computational cost of the latter.
Appendix C
Network Details
Fractal: We generate fractal networks using the method described in Ref. [279]. We
begin by generating an isolated group of 2^m fully-connected nodes, where m sets
the size of the clusters. These groups correspond to the hierarchical level h = 0.
We then create a second identical group and we link the two groups with a link
density of E^(-h) (h = 1), where the link density is the number of links out of all
possible links between the groups and E gives the connection-density fall-off per
hierarchical level. We then duplicate this network and connect the two duplicates
at the level h = 2 with a link density E^(-2). We repeat this until we reach
the desired network size N = 2^n, where n is the number of hierarchical levels.
At each step the connection density is decreased, resulting in progressively
sparser interconnectivity at higher hierarchical levels. The resulting network
exhibits self-similar properties. We label each network “Fractal: (n, m, E)”.
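The construction can be sketched as follows, under the reading that the base cliques contain 2^m nodes and the network is doubled until it has N = 2^n nodes; the Bernoulli link sampling and the connectivity fallback are implementation choices, not details taken from Ref. [279]:

```python
import random
import networkx as nx

def fractal_network(n, m, E, seed=0):
    """Sketch of the hierarchical construction described above: base cliques
    of 2**m nodes, doubled until the network has N = 2**n nodes."""
    rng = random.Random(seed)
    G = nx.complete_graph(2 ** m)     # hierarchical level h = 0
    for h in range(1, n - m + 1):
        size = G.number_of_nodes()
        G = nx.disjoint_union(G, G)   # duplicate the current network
        density = E ** (-h)           # connection density falls off per level
        cross = [(u, size + v) for u in range(size) for v in range(size)]
        links = [e for e in cross if rng.random() < density]
        if not links:                 # guarantee the two halves are connected
            links = [rng.choice(cross)]
        G.add_edges_from(links)
    return G

G = fractal_network(n=5, m=2, E=3)
print(G.number_of_nodes())  # 2**5 = 32
```

Each doubling adds a sparser layer of cross-links, so the higher hierarchical levels are progressively more weakly interconnected, as the text describes.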
LFR: The node degrees are taken from a power law distribution with exponent γ,
where the extremes of the distribution kmin and kmax are chosen such that the
mean degree is ⟨k⟩, and the nodes are connected using the configuration model [212]
to maintain their degree distribution. Each node shares a fraction µ of its links
with nodes in other communities and 1 − µ with nodes in its own community. The
community sizes are taken from a power law distribution with exponent β, subject
to the constraint that the sum of all of the community sizes equals the number of
nodes N in the network. The minimum and maximum community sizes (qmin and qmax)
are then chosen to satisfy the additional constraint that qmin > kmin and
qmax > kmax, which ensures that each node is included in at least one community.
We label each network “LFR: (N, ⟨k⟩, kmax, γ, β, µ, qmin, qmax)”.
1 In this case, all of the nodes have the same degree and each community is constrained to contain
the same number of nodes, so the values of the exponent γ of the degree distribution and the
exponent β of the community size distribution are unimportant.
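networkx ships a generator for this benchmark; a minimal usage sketch with illustrative parameter values taken from the networkx documentation rather than from the thesis (tau1 and tau2 play the roles of γ and β):

```python
from networkx.generators.community import LFR_benchmark_graph

# Illustrative parameters (from the networkx documentation, not the thesis):
# tau1 is the degree exponent (gamma), tau2 the community-size exponent (beta)
n, tau1, tau2, mu = 250, 3, 1.5, 0.1
G = LFR_benchmark_graph(n, tau1, tau2, mu,
                        average_degree=5, min_community=20, seed=10)

# Each node stores the node set of its community as a node attribute
communities = {frozenset(G.nodes[v]["community"]) for v in G}
print(G.number_of_nodes(), len(communities))
```

The mixing parameter mu controls how sharply defined the planted communities are: small values give well-separated communities, while values near one make the community structure essentially invisible.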
Table C.1: Network summary statistics. We symmetrize all networks, remove self-edges, and only consider the largest connected
component. We give the network category, whether it is weighted or unweighted, the number of nodes N in the largest connected
component, the number of edges L, the fraction of possible edges present fe = 2L/[N(N − 1)], and a reference providing details
of the data source. We highlight in bold all of the networks included in the subset of 25 networks described in Appendix E
and we colour red all of the networks included in the subset of 270 networks used to produce the results described in several
sections in Chapter 6. We indicate with an asterisk ∗ all networks used in Appendix F to test the robustness of the taxonomy
to different optimization heuristics.
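The preprocessing and statistics described in the caption (symmetrizing, removing self-edges, restricting to the largest connected component, and computing fe = 2L/[N(N − 1)]) can be sketched with networkx; the karate-club graph is a stand-in example:

```python
import networkx as nx

def summary_stats(G):
    # Symmetrize (treat all edges as undirected) and remove self-edges
    G = nx.Graph(G)
    G.remove_edges_from(nx.selfloop_edges(G))
    # Restrict to the largest connected component
    giant = max(nx.connected_components(G), key=len)
    G = G.subgraph(giant)
    N, L = G.number_of_nodes(), G.number_of_edges()
    fe = 2 * L / (N * (N - 1))   # fraction of possible edges present
    return N, L, fe

N, L, fe = summary_stats(nx.karate_club_graph())
print(N, L, round(fe, 4))
```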
ID Name Category Weighted N L fe References
1 Human brain cortex: participant A1 Brain Y 994 13,520 0.0274 [144]
2 Human brain cortex: participant A2 Brain Y 987 14,865 0.0305 [144]
3 Human brain cortex: participant B Brain Y 980 14,222 0.0296 [144]
4 Human brain cortex: participant D Brain Y 996 14,851 0.0300 [144]
5 Human brain cortex: participant E Brain Y 992 14,372 0.0292 [144]
6 Human brain cortex: participant C Brain Y 996 14,933 0.0301 [144]
7 Cat brain: cortical∗ Brain Y 52 515 0.3884 [264]
53 Facebook: Haverford Facebook N 1,446 59,589 0.0570 [295]
54 Facebook: Howard Facebook N 4,047 204,850 0.0250 [295]
55 Facebook: Indiana Facebook N 29,732 1,305,757 0.0030 [295]
56 Facebook: JMU Facebook N 14,070 485,564 0.0049 [295]
57 Facebook: Johns Hopkins Facebook N 5,157 186,572 0.0140 [295]
58 Facebook: Lehigh Facebook N 5,073 198,346 0.0154 [295]
59 Facebook: Maine Facebook N 9,065 243,245 0.0059 [295]
60 Facebook: Maryland Facebook N 20,829 744,832 0.0034 [295]
61 Facebook: Mich Facebook N 3,745 81,901 0.0117 [295]
62 Facebook: Michigan Facebook N 30,106 1,176,489 0.0026 [295]
63 Facebook: Middlebury Facebook N 3,069 124,607 0.0265 [295]
64 Facebook: Mississippi Facebook N 10,519 610,910 0.0110 [295]
65 Facebook: MIT Facebook N 6,402 251,230 0.0123 [295]
66 Facebook: MSU Facebook N 32,361 1,118,767 0.0021 [295]
67 Facebook: MU Facebook N 15,425 649,441 0.0055 [295]
68 Facebook: Northeastern Facebook N 13,868 381,919 0.0040 [295]
69 Facebook: Northwestern Facebook N 10,537 488,318 0.0088 [295]
70 Facebook: Notre Dame Facebook N 12,149 541,336 0.0073 [295]
71 Facebook: NYU Facebook Y 21,623 715,673 0.0031 [295]
72 Facebook: Oberlin Facebook N 2,920 89,912 0.0211 [295]
73 Facebook: Oklahoma Facebook N 17,420 892,524 0.0059 [295]
74 Facebook: Penn Facebook N 41,536 1,362,220 0.0016 [295]
75 Facebook: Pepperdine Facebook N 3,440 152,003 0.0257 [295]
76 Facebook: Princeton Facebook N 6,575 293,307 0.0136 [295]
77 Facebook: Reed Facebook N 962 18,812 0.0407 [295]
123 Facebook: William Facebook N 6,472 266,378 0.0127 [295]
124 Facebook: Williams Facebook N 2,788 112,985 0.0291 [295]
125 Facebook: Wisconsin Facebook N 23,831 835,946 0.0029 [295]
126 Facebook: Yale Facebook N 8,561 405,440 0.0111 [295]
127 NYSE: 1980-1999 Financial Y 477 113,526 1.0000 [229]
128 NYSE: 1980-1983 Financial Y 477 113,526 1.0000 [229]
129 NYSE: 1984-1987 Financial Y 477 113,526 1.0000 [229]
130 NYSE: 1988-1991 Financial Y 477 113,526 1.0000 [229]
131 NYSE: 1992-1995 Financial Y 477 113,526 1.0000 [229]
132 NYSE: 1996-1999 Financial Y 477 113,526 1.0000 [229]
133 NYSE: H1 1985 Financial Y 100 4,950 1.0000 [1]
134 NYSE: H2 1985 Financial Y 100 4,950 1.0000 [1]
135 NYSE: H1 1986 Financial Y 100 4,950 1.0000 [1]
136 NYSE: H2 1986 Financial Y 100 4,950 1.0000 [1]
137 NYSE: H1 1987 Financial Y 100 4,950 1.0000 [1]
138 NYSE: H2 1987 Financial Y 100 4,950 1.0000 [1]
139 NYSE: H1 1988 Financial Y 100 4,950 1.0000 [1]
140 NYSE: H2 1988 Financial Y 100 4,950 1.0000 [1]
141 NYSE: H1 1989 Financial Y 100 4,950 1.0000 [1]
142 NYSE: H2 1989 Financial Y 100 4,950 1.0000 [1]
143 NYSE: H1 1990 Financial Y 100 4,950 1.0000 [1]
144 NYSE: H2 1990 Financial Y 100 4,950 1.0000 [1]
145 NYSE: H1 1991 Financial Y 100 4,950 1.0000 [1]
146 NYSE: H2 1991 Financial Y 100 4,950 1.0000 [1]
147 NYSE: H1 1992 Financial Y 100 4,950 1.0000 [1]
193 FX: 2006 Financial Y 110 5,995 1.0000 [204, 205]
194 FX: 2007 Financial Y 110 5,995 1.0000 [204, 205]
195 FX: 2008 Financial Y 110 5,995 1.0000 [204, 205]
196 Fungal: (11,11)∗ Fungal Y 823 954 0.0028 [33, 113, 114, 288]
197 Fungal: (11,2) Fungal Y 117 136 0.0200 [33, 113, 114, 288]
198 Fungal: (11,5) Fungal Y 526 588 0.0043 [33, 113, 114, 288]
199 Fungal: (11,8) Fungal Y 721 821 0.0032 [33, 113, 114, 288]
200 Fungal: (17,11) Fungal Y 1,205 1,469 0.0020 [33, 113, 114, 288]
201 Fungal: (17,2) Fungal Y 232 240 0.0090 [33, 113, 114, 288]
202 Fungal: (17,5) Fungal Y 816 874 0.0026 [33, 113, 114, 288]
203 Fungal: (17,8) Fungal Y 1,113 1,303 0.0021 [33, 113, 114, 288]
204 Fungal: (4,11) Fungal Y 2,190 2,431 0.0010 [33, 113, 114, 288]
205 Fungal: (4,2) Fungal Y 461 490 0.0046 [33, 113, 114, 288]
206 Fungal: (4,5) Fungal Y 1,380 1,476 0.0016 [33, 113, 114, 288]
207 Fungal: (4,8) Fungal Y 1,869 2,061 0.0012 [33, 113, 114, 288]
208 AIDS blogs∗ WWW N 146 180 0.0170 [132]
209 Political blogs WWW Y 1,222 16,714 0.0224 [3]
210 WWW (Stanford) WWW N 8,929 26,320 0.0007 [124]
211 Online Dictionary of Computing Language Y 13,356 91,471 0.0010 [30]
212 Online Dictionary Of Information Science Language Y 2,898 16,376 0.0039 [77, 255]
213 Reuters 9/11 news Language Y 13,308 148,035 0.0017 [158]
214 Roget’s thesaurus Language N 994 3,640 0.0074 [77, 168]
215 Word adjacency: English Language N 7,377 44,205 0.0016 [210]
216 Word adjacency: French Language N 8,308 23,832 0.0007 [210]
217 Word adjacency: Japanese Language N 2,698 7,995 0.0022 [210]
263 Power grid Other N 4,941 6,594 0.0005 [305]
264 Slovenian magazine co-purchase Other Y 124 5,972 0.7831 [28]
265 Transcription: E. coli Other N 328 456 0.0085 [196]
266 Transcription: Yeast Other N 662 1,062 0.0049 [211]
267 U.S. airlines Other Y 324 2,081 0.0398 [29, 77]
268 2008 NCAA football schedule∗ Other Y 121 764 0.1052 [61]
269 Internet: autonomous systems Other N 22,963 48,436 0.0002 [219]
270 Protein: serine protease inhibitor (1EAW)∗ Other N 53 123 0.0893 [210]
271 Protein: immunoglobulin (1A4J)∗ Other N 95 213 0.0477 [210]
272 Protein: oxidoreductase (1AOR)∗ Other N 97 212 0.0455 [210]
273 Bill cosponsorship: U.S. House 96 Political: cosponsorship Y 438 95,529 0.9982 [108, 109]
274 Bill cosponsorship: U.S. House 97 Political: cosponsorship Y 435 94,374 0.9998 [108, 109]
275 Bill cosponsorship: U.S. House 98 Political: cosponsorship Y 437 95,256 0.9999 [108, 109]
276 Bill cosponsorship: U.S. House 99 Political: cosponsorship Y 437 94,999 0.9972 [108, 109]
277 Bill cosponsorship: U.S. House 100 Political: cosponsorship Y 439 96,125 0.9998 [108, 109]
278 Bill cosponsorship: U.S. House 101 Political: cosponsorship Y 437 95,263 1.0000 [108, 109]
279 Bill cosponsorship: U.S. House 102 Political: cosponsorship Y 437 95,051 0.9977 [108, 109]
280 Bill cosponsorship: U.S. House 103 Political: cosponsorship Y 437 95,028 0.9975 [108, 109]
281 Bill cosponsorship: U.S. House 104 Political: cosponsorship Y 439 95,925 0.9978 [108, 109]
282 Bill cosponsorship: U.S. House 105 Political: cosponsorship Y 442 97,373 0.9991 [108, 109]
283 Bill cosponsorship: U.S. House 106 Political: cosponsorship Y 436 94,820 0.9999 [108, 109]
284 Bill cosponsorship: U.S. House 107 Political: cosponsorship Y 442 97,233 0.9977 [108, 109]
285 Bill cosponsorship: U.S. House 108 Political: cosponsorship Y 439 96,104 0.9996 [108, 109]
286 Bill cosponsorship: U.S. Senate 96 Political: cosponsorship Y 101 5,050 1.0000 [108, 109]
287 Bill cosponsorship: U.S. Senate 97 Political: cosponsorship Y 101 5,050 1.0000 [108, 109]
288 Bill cosponsorship: U.S. Senate 98 Political: cosponsorship Y 101 5,050 1.0000 [108, 109]
289 Bill cosponsorship: U.S. Senate 99 Political: cosponsorship Y 101 5,049 0.9998 [108, 109]
290 Bill cosponsorship: U.S. Senate 100 Political: cosponsorship Y 101 5,050 1.0000 [108, 109]
291 Bill cosponsorship: U.S. Senate 101 Political: cosponsorship Y 100 4,950 1.0000 [108, 109]
292 Bill cosponsorship: U.S. Senate 102 Political: cosponsorship Y 102 5,142 0.9983 [108, 109]
293 Bill cosponsorship: U.S. Senate 103 Political: cosponsorship Y 101 5,050 1.0000 [108, 109]
294 Bill cosponsorship: U.S. Senate 104 Political: cosponsorship Y 102 5,151 1.0000 [108, 109]
295 Bill cosponsorship: U.S. Senate 105 Political: cosponsorship Y 100 4,950 1.0000 [108, 109]
296 Bill cosponsorship: U.S. Senate 106 Political: cosponsorship Y 102 5,151 1.0000 [108, 109]
297 Bill cosponsorship: U.S. Senate 107 Political: cosponsorship Y 101 5,049 0.9998 [108, 109]
298 Bill cosponsorship: U.S. Senate 108 Political: cosponsorship Y 100 4,950 1.0000 [108, 109]
299 Committees: U.S. House 101, comms. Political: committee N 159 3,610 0.2874 [242, 243]
300 Committees: U.S. House 102, comms. Political: committee N 163 4,093 0.3100 [242, 243]
301 Committees: U.S. House 103, comms. Political: committee N 141 2,983 0.3022 [242, 243]
302 Committees: U.S. House 104, comms. Political: committee N 106 1,839 0.3305 [242, 243]
303 Committees: U.S. House 105, comms. Political: committee N 108 1,997 0.3456 [242, 243]
304 Committees: U.S. House 106, comms. Political: committee N 107 2,031 0.3581 [242, 243]
305 Committees: U.S. House 107, comms. Political: committee N 113 2,429 0.3838 [242, 243]
306 Committees: U.S. House 108, comms. Political: committee N 118 2,905 0.4208 [242, 243]
307 Committees: U.S. House 101, Reps. Political: committee N 434 18,714 0.1992 [242, 243]
308 Committees: U.S. House 102, Reps. Political: committee N 436 20,134 0.2123 [242, 243]
309 Committees: U.S. House 103, Reps. Political: committee N 437 18,212 0.1912 [242, 243]
310 Committees: U.S. House 104, Reps. Political: committee N 432 17,130 0.1840 [242, 243]
311 Committees: U.S. House 105, Reps. Political: committee N 435 18,297 0.1938 [242, 243]
312 Committees: U.S. House 106, Reps. Political: committee N 435 18,832 0.1995 [242, 243]
313 Committees: U.S. House 107, Reps. Political: committee N 434 19,824 0.2110 [242, 243]
314 Committees: U.S. House 108, Reps. Political: committee N 437 21,214 0.2227 [242, 243]
315 Roll call: U.S. House 1 Political: voting Y 66 2,122 0.9893 [203, 241, 306]
316 Roll call: U.S. House 2 Political: voting Y 71 2,428 0.9771 [203, 241, 306]
317 Roll call: U.S. House 3 Political: voting Y 108 5,669 0.9811 [203, 241, 306]
318 Roll call: U.S. House 4 Political: voting Y 114 6,342 0.9846 [203, 241, 306]
319 Roll call: U.S. House 5 Political: voting Y 117 6,600 0.9726 [203, 241, 306]
320 Roll call: U.S. House 6 Political: voting Y 113 6,222 0.9832 [203, 241, 306]
321 Roll call: U.S. House 7 Political: voting Y 110 5,921 0.9877 [203, 241, 306]
322 Roll call: U.S. House 8 Political: voting Y 149 10,888 0.9875 [203, 241, 306]
323 Roll call: U.S. House 9 Political: voting Y 147 10,582 0.9861 [203, 241, 306]
324 Roll call: U.S. House 10 Political: voting Y 149 10,857 0.9847 [203, 241, 306]
325 Roll call: U.S. House 11 Political: voting Y 153 11,482 0.9874 [203, 241, 306]
326 Roll call: U.S. House 12 Political: voting Y 146 10,535 0.9953 [203, 241, 306]
327 Roll call: U.S. House 13 Political: voting Y 195 18,723 0.9898 [203, 241, 306]
328 Roll call: U.S. House 14 Political: voting Y 195 18,540 0.9802 [203, 241, 306]
329 Roll call: U.S. House 15 Political: voting Y 195 18,666 0.9868 [203, 241, 306]
330 Roll call: U.S. House 16 Political: voting Y 197 19,118 0.9903 [203, 241, 306]
331 Roll call: U.S. House 17 Political: voting Y 199 19,429 0.9862 [203, 241, 306]
332 Roll call: U.S. House 18 Political: voting Y 221 23,812 0.9795 [203, 241, 306]
333 Roll call: U.S. House 19 Political: voting Y 220 23,993 0.9960 [203, 241, 306]
334 Roll call: U.S. House 20 Political: voting Y 219 23,666 0.9914 [203, 241, 306]
335 Roll call: U.S. House 21 Political: voting Y 220 23,985 0.9956 [203, 241, 306]
336 Roll call: U.S. House 22 Political: voting Y 217 23,404 0.9986 [203, 241, 306]
337 Roll call: U.S. House 23 Political: voting Y 257 32,502 0.9880 [203, 241, 306]
338 Roll call: U.S. House 24 Political: voting Y 255 32,062 0.9900 [203, 241, 306]
339 Roll call: U.S. House 25 Political: voting Y 256 32,366 0.9916 [203, 241, 306]
340 Roll call: U.S. House 26 Political: voting Y 255 32,067 0.9902 [203, 241, 306]
341 Roll call: U.S. House 27 Political: voting Y 257 32,743 0.9953 [203, 241, 306]
342 Roll call: U.S. House 28 Political: voting Y 234 26,788 0.9826 [203, 241, 306]
343 Roll call: U.S. House 29 Political: voting Y 236 27,562 0.9939 [203, 241, 306]
344 Roll call: U.S. House 30 Political: voting Y 236 27,669 0.9978 [203, 241, 306]
345 Roll call: U.S. House 31 Political: voting Y 241 28,804 0.9960 [203, 241, 306]
346 Roll call: U.S. House 32 Political: voting Y 239 28,318 0.9957 [203, 241, 306]
347 Roll call: U.S. House 33 Political: voting Y 240 28,570 0.9962 [203, 241, 306]
348 Roll call: U.S. House 34 Political: voting Y 236 27,545 0.9933 [203, 241, 306]
349 Roll call: U.S. House 35 Political: voting Y 245 29,630 0.9913 [203, 241, 306]
350 Roll call: U.S. House 36 Political: voting Y 243 29,312 0.9969 [203, 241, 306]
351 Roll call: U.S. House 37 Political: voting Y 197 18,735 0.9704 [203, 241, 306]
352 Roll call: U.S. House 38 Political: voting Y 187 17,326 0.9963 [203, 241, 306]
353 Roll call: U.S. House 39 Political: voting Y 199 19,593 0.9945 [203, 241, 306]
354 Roll call: U.S. House 40 Political: voting Y 233 26,605 0.9843 [203, 241, 306]
355 Roll call: U.S. House 41 Political: voting Y 256 32,109 0.9837 [203, 241, 306]
356 Roll call: U.S. House 42 Political: voting Y 253 31,626 0.9921 [203, 241, 306]
357 Roll call: U.S. House 43 Political: voting Y 302 45,151 0.9934 [203, 241, 306]
358 Roll call: U.S. House 44 Political: voting Y 308 46,723 0.9883 [203, 241, 306]
359 Roll call: U.S. House 45 Political: voting Y 302 45,315 0.9970 [203, 241, 306]
360 Roll call: U.S. House 46 Political: voting Y 301 44,987 0.9964 [203, 241, 306]
361 Roll call: U.S. House 47 Political: voting Y 306 46,214 0.9903 [203, 241, 306]
362 Roll call: U.S. House 48 Political: voting Y 338 56,484 0.9918 [203, 241, 306]
363 Roll call: U.S. House 49 Political: voting Y 330 54,160 0.9977 [203, 241, 306]
364 Roll call: U.S. House 50 Political: voting Y 326 52,907 0.9987 [203, 241, 306]
365 Roll call: U.S. House 51 Political: voting Y 347 59,303 0.9879 [203, 241, 306]
366 Roll call: U.S. House 52 Political: voting Y 340 57,285 0.9940 [203, 241, 306]
367 Roll call: U.S. House 53 Political: voting Y 376 69,943 0.9921 [203, 241, 306]
368 Roll call: U.S. House 54 Political: voting Y 368 67,085 0.9934 [203, 241, 306]
369 Roll call: U.S. House 55 Political: voting Y 371 68,270 0.9947 [203, 241, 306]
370 Roll call: U.S. House 56 Political: voting Y 369 67,059 0.9877 [203, 241, 306]
371 Roll call: U.S. House 57 Political: voting Y 371 67,383 0.9818 [203, 241, 306]
372 Roll call: U.S. House 58 Political: voting Y 397 75,891 0.9655 [203, 241, 306]
373 Roll call: U.S. House 59 Political: voting Y 397 76,299 0.9707 [203, 241, 306]
374 Roll call: U.S. House 60 Political: voting Y 398 77,921 0.9863 [203, 241, 306]
375 Roll call: U.S. House 61 Political: voting Y 402 80,174 0.9947 [203, 241, 306]
376 Roll call: U.S. House 62 Political: voting Y 408 82,442 0.9929 [203, 241, 306]
377 Roll call: U.S. House 63 Political: voting Y 452 101,498 0.9958 [203, 241, 306]
378 Roll call: U.S. House 64 Political: voting Y 441 96,780 0.9975 [203, 241, 306]
379 Roll call: U.S. House 65 Political: voting Y 454 102,108 0.9930 [203, 241, 306]
380 Roll call: U.S. House 66 Political: voting Y 453 101,199 0.9885 [203, 241, 306]
381 Roll call: U.S. House 67 Political: voting Y 452 101,482 0.9956 [203, 241, 306]
382 Roll call: U.S. House 68 Political: voting Y 442 96,885 0.9941 [203, 241, 306]
383 Roll call: U.S. House 69 Political: voting Y 437 95,226 0.9996 [203, 241, 306]
384 Roll call: U.S. House 70 Political: voting Y 443 97,497 0.9959 [203, 241, 306]
385 Roll call: U.S. House 71 Political: voting Y 455 102,502 0.9924 [203, 241, 306]
386 Roll call: U.S. House 72 Political: voting Y 447 99,028 0.9934 [203, 241, 306]
387 Roll call: U.S. House 73 Political: voting Y 445 98,647 0.9986 [203, 241, 306]
388 Roll call: U.S. House 74 Political: voting Y 440 96,170 0.9958 [203, 241, 306]
389 Roll call: U.S. House 75 Political: voting Y 445 98,474 0.9968 [203, 241, 306]
390 Roll call: U.S. House 76 Political: voting Y 456 102,495 0.9880 [203, 241, 306]
391 Roll call: U.S. House 77 Political: voting Y 450 99,956 0.9894 [203, 241, 306]
392 Roll call: U.S. House 78 Political: voting Y 450 100,513 0.9949 [203, 241, 306]
393 Roll call: U.S. House 79 Political: voting Y 448 99,246 0.9912 [203, 241, 306]
394 Roll call: U.S. House 80 Political: voting Y 448 99,902 0.9977 [203, 241, 306]
395 Roll call: U.S. House 81 Political: voting Y 444 98,054 0.9970 [203, 241, 306]
396 Roll call: U.S. House 82 Political: voting Y 447 99,281 0.9960 [203, 241, 306]
397 Roll call: U.S. House 83 Political: voting Y 440 96,506 0.9992 [203, 241, 306]
398 Roll call: U.S. House 84 Political: voting Y 437 95,253 0.9999 [203, 241, 306]
399 Roll call: U.S. House 85 Political: voting Y 444 97,955 0.9960 [203, 241, 306]
400 Roll call: U.S. House 86 Political: voting Y 443 97,377 0.9946 [203, 241, 306]
401 Roll call: U.S. House 87 Political: voting Y 449 99,774 0.9920 [203, 241, 306]
402 Roll call: U.S. House 88 Political: voting Y 443 97,842 0.9994 [203, 241, 306]
403 Roll call: U.S. House 89 Political: voting Y 442 97,139 0.9967 [203, 241, 306]
404 Roll call: U.S. House 90 Political: voting Y 437 95,251 0.9998 [203, 241, 306]
405 Roll call: U.S. House 91 Political: voting Y 448 99,815 0.9969 [203, 241, 306]
406 Roll call: U.S. House 92 Political: voting Y 443 97,579 0.9967 [203, 241, 306]
407 Roll call: U.S. House 93 Political: voting Y 443 97,848 0.9994 [203, 241, 306]
408 Roll call: U.S. House 94 Political: voting Y 441 96,837 0.9981 [203, 241, 306]
409 Roll call: U.S. House 95 Political: voting Y 441 96,493 0.9946 [203, 241, 306]
410 Roll call: U.S. House 96 Political: voting Y 440 96,379 0.9979 [203, 241, 306]
411 Roll call: U.S. House 97 Political: voting Y 442 96,761 0.9928 [203, 241, 306]
412 Roll call: U.S. House 98 Political: voting Y 439 95,922 0.9977 [203, 241, 306]
413 Roll call: U.S. House 99 Political: voting Y 439 95,875 0.9972 [203, 241, 306]
414 Roll call: U.S. House 100 Political: voting Y 440 96,544 0.9996 [203, 241, 306]
415 Roll call: U.S. House 101 Political: voting Y 440 96,505 0.9992 [203, 241, 306]
416 Roll call: U.S. House 102 Political: voting Y 441 96,811 0.9978 [203, 241, 306]
417 Roll call: U.S. House 103 Political: voting Y 441 96,348 0.9931 [203, 241, 306]
418 Roll call: U.S. House 104 Political: voting Y 445 98,720 0.9993 [203, 241, 306]
419 Roll call: U.S. House 105 Political: voting Y 443 97,841 0.9994 [203, 241, 306]
420 Roll call: U.S. House 106 Political: voting Y 440 96,557 0.9998 [203, 241, 306]
421 Roll call: U.S. House 107 Political: voting Y 443 97,816 0.9991 [203, 241, 306]
422 Roll call: U.S. House 108 Political: voting Y 440 96,561 0.9998 [203, 241, 306]
423 Roll call: U.S. House 109 Political: voting Y 440 96,549 0.9997 [203, 241, 306]
424 Roll call: U.S. House 110 Political: voting Y 448 99,603 0.9948 [203, 241, 306]
425 Roll call: U.S. Senate 1 Political: voting Y 29 393 0.9680 [203, 241, 306]
426 Roll call: U.S. Senate 2 Political: voting Y 31 449 0.9656 [203, 241, 306]
427 Roll call: U.S. Senate 3 Political: voting Y 32 472 0.9516 [203, 241, 306]
428 Roll call: U.S. Senate 4 Political: voting Y 43 760 0.8416 [203, 241, 306]
429 Roll call: U.S. Senate 5 Political: voting Y 44 808 0.8541 [203, 241, 306]
430 Roll call: U.S. Senate 6 Political: voting Y 37 644 0.9670 [203, 241, 306]
431 Roll call: U.S. Senate 7 Political: voting Y 35 537 0.9025 [203, 241, 306]
432 Roll call: U.S. Senate 8 Political: voting Y 44 864 0.9133 [203, 241, 306]
433 Roll call: U.S. Senate 9 Political: voting Y 37 645 0.9685 [203, 241, 306]
434 Roll call: U.S. Senate 10 Political: voting Y 37 660 0.9910 [203, 241, 306]
435 Roll call: U.S. Senate 11 Political: voting Y 44 855 0.9038 [203, 241, 306]
436 Roll call: U.S. Senate 12 Political: voting Y 37 663 0.9955 [203, 241, 306]
437 Roll call: U.S. Senate 13 Political: voting Y 46 947 0.9150 [203, 241, 306]
438 Roll call: U.S. Senate 14 Political: voting Y 44 898 0.9493 [203, 241, 306]
439 Roll call: U.S. Senate 15 Political: voting Y 46 977 0.9440 [203, 241, 306]
440 Roll call: U.S. Senate 16 Political: voting Y 51 1,249 0.9796 [203, 241, 306]
441 Roll call: U.S. Senate 17 Political: voting Y 52 1,294 0.9759 [203, 241, 306]
442 Roll call: U.S. Senate 18 Political: voting Y 52 1,304 0.9834 [203, 241, 306]
443 Roll call: U.S. Senate 19 Political: voting Y 59 1,589 0.9287 [203, 241, 306]
444 Roll call: U.S. Senate 20 Political: voting Y 53 1,343 0.9746 [203, 241, 306]
445 Roll call: U.S. Senate 21 Political: voting Y 54 1,339 0.9357 [203, 241, 306]
446 Roll call: U.S. Senate 22 Political: voting Y 53 1,348 0.9782 [203, 241, 306]
447 Roll call: U.S. Senate 23 Political: voting Y 54 1,378 0.9630 [203, 241, 306]
448 Roll call: U.S. Senate 24 Political: voting Y 61 1,732 0.9464 [203, 241, 306]
449 Roll call: U.S. Senate 25 Political: voting Y 58 1,627 0.9843 [203, 241, 306]
450 Roll call: U.S. Senate 26 Political: voting Y 60 1,689 0.9542 [203, 241, 306]
451 Roll call: U.S. Senate 27 Political: voting Y 59 1,662 0.9714 [203, 241, 306]
452 Roll call: U.S. Senate 28 Political: voting Y 57 1,575 0.9868 [203, 241, 306]
453 Roll call: U.S. Senate 29 Political: voting Y 63 1,895 0.9703 [203, 241, 306]
454 Roll call: U.S. Senate 30 Political: voting Y 72 2,320 0.9077 [203, 241, 306]
455 Roll call: U.S. Senate 31 Political: voting Y 70 2,341 0.9694 [203, 241, 306]
456 Roll call: U.S. Senate 32 Political: voting Y 73 2,511 0.9555 [203, 241, 306]
457 Roll call: U.S. Senate 33 Political: voting Y 70 2,308 0.9557 [203, 241, 306]
458 Roll call: U.S. Senate 34 Political: voting Y 64 2,002 0.9931 [203, 241, 306]
459 Roll call: U.S. Senate 35 Political: voting Y 73 2,542 0.9673 [203, 241, 306]
460 Roll call: U.S. Senate 36 Political: voting Y 70 2,370 0.9814 [203, 241, 306]
461 Roll call: U.S. Senate 37 Political: voting Y 70 2,051 0.8493 [203, 241, 306]
462 Roll call: U.S. Senate 38 Political: voting Y 54 1,402 0.9797 [203, 241, 306]
463 Roll call: U.S. Senate 39 Political: voting Y 59 1,610 0.9410 [203, 241, 306]
464 Roll call: U.S. Senate 40 Political: voting Y 69 2,274 0.9693 [203, 241, 306]
465 Roll call: U.S. Senate 41 Political: voting Y 80 3,084 0.9759 [203, 241, 306]
466 Roll call: U.S. Senate 42 Political: voting Y 75 2,773 0.9993 [203, 241, 306]
467 Roll call: U.S. Senate 43 Political: voting Y 79 3,041 0.9870 [203, 241, 306]
468 Roll call: U.S. Senate 44 Political: voting Y 82 3,261 0.9819 [203, 241, 306]
469 Roll call: U.S. Senate 45 Political: voting Y 82 3,265 0.9831 [203, 241, 306]
470 Roll call: U.S. Senate 46 Political: voting Y 81 3,219 0.9935 [203, 241, 306]
471 Roll call: U.S. Senate 47 Political: voting Y 83 3,362 0.9880 [203, 241, 306]
472 Roll call: U.S. Senate 48 Political: voting Y 78 2,998 0.9983 [203, 241, 306]
473 Roll call: U.S. Senate 49 Political: voting Y 81 3,210 0.9907 [203, 241, 306]
474 Roll call: U.S. Senate 50 Political: voting Y 76 2,850 1.0000 [203, 241, 306]
475 Roll call: U.S. Senate 51 Political: voting Y 91 3,998 0.9763 [203, 241, 306]
476 Roll call: U.S. Senate 52 Political: voting Y 93 4,249 0.9932 [203, 241, 306]
477 Roll call: U.S. Senate 53 Political: voting Y 95 4,413 0.9884 [203, 241, 306]
478 Roll call: U.S. Senate 54 Political: voting Y 90 4,000 0.9988 [203, 241, 306]
479 Roll call: U.S. Senate 55 Political: voting Y 96 4,445 0.9748 [203, 241, 306]
480 Roll call: U.S. Senate 56 Political: voting Y 93 4,201 0.9820 [203, 241, 306]
481 Roll call: U.S. Senate 57 Political: voting Y 90 3,939 0.9835 [203, 241, 306]
482 Roll call: U.S. Senate 58 Political: voting Y 93 4,174 0.9757 [203, 241, 306]
483 Roll call: U.S. Senate 59 Political: voting Y 93 4,251 0.9937 [203, 241, 306]
484 Roll call: U.S. Senate 60 Political: voting Y 95 4,382 0.9814 [203, 241, 306]
485 Roll call: U.S. Senate 61 Political: voting Y 102 5,033 0.9771 [203, 241, 306]
486 Roll call: U.S. Senate 62 Political: voting Y 109 5,719 0.9716 [203, 241, 306]
487 Roll call: U.S. Senate 63 Political: voting Y 101 5,029 0.9958 [203, 241, 306]
488 Roll call: U.S. Senate 64 Political: voting Y 100 4,931 0.9962 [203, 241, 306]
489 Roll call: U.S. Senate 65 Political: voting Y 111 5,899 0.9663 [203, 241, 306]
490 Roll call: U.S. Senate 66 Political: voting Y 101 5,005 0.9911 [203, 241, 306]
491 Roll call: U.S. Senate 67 Political: voting Y 105 5,413 0.9914 [203, 241, 306]
492 Roll call: U.S. Senate 68 Political: voting Y 102 5,081 0.9864 [203, 241, 306]
493 Roll call: U.S. Senate 69 Political: voting Y 105 5,353 0.9804 [203, 241, 306]
494 Roll call: U.S. Senate 70 Political: voting Y 102 5,082 0.9866 [203, 241, 306]
495 Roll call: U.S. Senate 71 Political: voting Y 109 5,779 0.9818 [203, 241, 306]
496 Roll call: U.S. Senate 72 Political: voting Y 103 5,220 0.9937 [203, 241, 306]
497 Roll call: U.S. Senate 73 Political: voting Y 100 4,879 0.9857 [203, 241, 306]
498 Roll call: U.S. Senate 74 Political: voting Y 100 4,933 0.9966 [203, 241, 306]
499 Roll call: U.S. Senate 75 Political: voting Y 102 5,126 0.9951 [203, 241, 306]
500 Roll call: U.S. Senate 76 Political: voting Y 104 5,106 0.9533 [203, 241, 306]
501 Roll call: U.S. Senate 77 Political: voting Y 108 5,575 0.9649 [203, 241, 306]
502 Roll call: U.S. Senate 78 Political: voting Y 104 5,304 0.9903 [203, 241, 306]
503 Roll call: U.S. Senate 79 Political: voting Y 107 5,466 0.9639 [203, 241, 306]
504 Roll call: U.S. Senate 80 Political: voting Y 97 4,655 0.9998 [203, 241, 306]
505 Roll call: U.S. Senate 81 Political: voting Y 108 5,646 0.9772 [203, 241, 306]
506 Roll call: U.S. Senate 82 Political: voting Y 98 4,748 0.9989 [203, 241, 306]
507 Roll call: U.S. Senate 83 Political: voting Y 110 5,724 0.9548 [203, 241, 306]
508 Roll call: U.S. Senate 84 Political: voting Y 99 4,845 0.9988 [203, 241, 306]
509 Roll call: U.S. Senate 85 Political: voting Y 101 5,014 0.9929 [203, 241, 306]
510 Roll call: U.S. Senate 86 Political: voting Y 103 5,246 0.9987 [203, 241, 306]
511 Roll call: U.S. Senate 87 Political: voting Y 105 5,444 0.9971 [203, 241, 306]
512 Roll call: U.S. Senate 88 Political: voting Y 103 5,249 0.9992 [203, 241, 306]
513 Roll call: U.S. Senate 89 Political: voting Y 103 5,247 0.9989 [203, 241, 306]
514 Roll call: U.S. Senate 90 Political: voting Y 101 5,048 0.9996 [203, 241, 306]
515 Roll call: U.S. Senate 91 Political: voting Y 102 5,148 0.9994 [203, 241, 306]
516 Roll call: U.S. Senate 92 Political: voting Y 102 5,147 0.9992 [203, 241, 306]
517 Roll call: U.S. Senate 93 Political: voting Y 103 5,246 0.9987 [203, 241, 306]
518 Roll call: U.S. Senate 94 Political: voting Y 101 5,049 0.9998 [203, 241, 306]
519 Roll call: U.S. Senate 95 Political: voting Y 104 5,345 0.9979 [203, 241, 306]
520 Roll call: U.S. Senate 96 Political: voting Y 101 5,049 0.9998 [203, 241, 306]
521 Roll call: U.S. Senate 97 Political: voting Y 101 5,049 0.9998 [203, 241, 306]
522 Roll call: U.S. Senate 98 Political: voting Y 101 5,049 0.9998 [203, 241, 306]
523 Roll call: U.S. Senate 99 Political: voting Y 101 5,049 0.9998 [203, 241, 306]
524 Roll call: U.S. Senate 100 Political: voting Y 101 5,049 0.9998 [203, 241, 306]
525 Roll call: U.S. Senate 101 Political: voting Y 100 4,950 1.0000 [203, 241, 306]
526 Roll call: U.S. Senate 102 Political: voting Y 102 5,148 0.9994 [203, 241, 306]
527 Roll call: U.S. Senate 103 Political: voting Y 102 5,080 0.9862 [203, 241, 306]
528 Roll call: U.S. Senate 104 Political: voting Y 103 5,247 0.9989 [203, 241, 306]
529 Roll call: U.S. Senate 105 Political: voting Y 100 4,950 1.0000 [203, 241, 306]
530 Roll call: U.S. Senate 106 Political: voting Y 102 5,148 0.9994 [203, 241, 306]
531 Roll call: U.S. Senate 107 Political: voting Y 102 5,148 0.9994 [203, 241, 306]
532 Roll call: U.S. Senate 108 Political: voting Y 100 4,950 1.0000 [203, 241, 306]
533 Roll call: U.S. Senate 109 Political: voting Y 101 5,049 0.9998 [203, 241, 306]
534 Roll call: U.S. Senate 110 Political: voting Y 102 5,147 0.9992 [203, 241, 306]
535 U.K. House of Commons voting: 1992-1997 Political: voting Y 668 220,761 0.9909 [104]
536 U.K. House of Commons voting: 1997-2001 Political: voting Y 671 223,092 0.9925 [104]
537 U.K. House of Commons voting: 2001-2005 Political: voting Y 657 215,246 0.9988 [104]
538 U.N. resolutions 1 Political: voting Y 54 1,431 1.0000 [302]
539 U.N. resolutions 2 Political: voting Y 57 1,594 0.9987 [302]
540 U.N. resolutions 3 Political: voting Y 59 1,711 1.0000 [302]
541 U.N. resolutions 4 Political: voting Y 59 1,711 1.0000 [302]
542 U.N. resolutions 5 Political: voting Y 60 1,770 1.0000 [302]
543 U.N. resolutions 6 Political: voting Y 60 1,768 0.9989 [302]
544 U.N. resolutions 7 Political: voting Y 60 1,770 1.0000 [302]
545 U.N. resolutions 8 Political: voting Y 60 1,770 1.0000 [302]
546 U.N. resolutions 9 Political: voting Y 60 1,770 1.0000 [302]
547 U.N. resolutions 10 Political: voting Y 65 2,037 0.9793 [302]
548 U.N. resolutions 11 Political: voting Y 81 3,239 0.9997 [302]
549 U.N. resolutions 12 Political: voting Y 82 3,317 0.9988 [302]
550 U.N. resolutions 13 Political: voting Y 82 3,294 0.9919 [302]
551 U.N. resolutions 14 Political: voting Y 82 3,321 1.0000 [302]
552 U.N. resolutions 15 Political: voting Y 99 4,851 1.0000 [302]
553 U.N. resolutions 16 Political: voting Y 104 5,356 1.0000 [302]
554 U.N. resolutions 17 Political: voting Y 110 5,995 1.0000 [302]
555 U.N. resolutions 18 Political: voting Y 113 6,246 0.9870 [302]
556 U.N. resolutions 20 Political: voting Y 117 6,672 0.9832 [302]
557 U.N. resolutions 21 Political: voting Y 122 7,333 0.9935 [302]
558 U.N. resolutions 22 Political: voting Y 124 7,616 0.9987 [302]
559 U.N. resolutions 23 Political: voting Y 126 7,855 0.9975 [302]
560 U.N. resolutions 24 Political: voting Y 126 7,851 0.9970 [302]
561 U.N. resolutions 25 Political: voting Y 126 7,868 0.9991 [302]
562 U.N. resolutions 26 Political: voting Y 132 8,641 0.9994 [302]
563 U.N. resolutions 27 Political: voting Y 132 8,646 1.0000 [302]
564 U.N. resolutions 28 Political: voting Y 134 8,905 0.9993 [302]
565 U.N. resolutions 29 Political: voting Y 137 9,202 0.9878 [302]
566 U.N. resolutions 30 Political: voting Y 143 10,117 0.9965 [302]
567 U.N. resolutions 31 Political: voting Y 144 10,291 0.9995 [302]
613 Human: Ophid Protein interaction N 5,464 23,238 0.0016 [57, 58]
614 STRING: C. elegans Protein interaction N 1,762 95,227 0.0614 [155]
615 STRING: S. cerevisiae Protein interaction N 534 57,672 0.4053 [155]
616 Yeast: Oxford Statistics Protein interaction N 2,224 6,609 0.0027 [67]
617 Yeast: DIP Protein interaction N 4,906 17,218 0.0014 [4, 262, 310]
618 Yeast: DIPC Protein interaction N 2,587 6,094 0.0018 [4, 262, 310]
619 Yeast: FHC Protein interaction N 2,233 5,750 0.0023 [4, 43]
620 Yeast: FYI Protein interaction N 778 1,798 0.0059 [4, 146]
621 Yeast: PCA Protein interaction N 889 2,407 0.0061 [4, 287]
622 Corporate directors in Scotland (1904-1905)∗ Social Y 131 676 0.0794 [77, 267]
623 Corporate ownership (EVA) Social N 4,475 4,652 0.0005 [226]
624 Dolphins∗ Social N 62 159 0.0841 [192]
625 Family planning in Korea Social N 33 68 0.1288 [259]
626 Unionization in a hi-tech firm∗ Social N 33 91 0.1723 [170]
627 Communication within a sawmill on strike∗ Social N 36 62 0.0984 [208]
628 Leadership course Social N 32 80 0.1613 [210]
629 Les Miserables∗ Social Y 77 254 0.0868 [168]
630 Marvel comics Social Y 6,449 168,211 0.0081 [8]
631 Mexican political elite Social N 35 117 0.1966 [120]
632 Pretty-good-privacy algorithm users Social N 10,680 24,316 0.0004 [48]
633 Prisoners Social N 67 142 0.0642 [210]
634 Bernard and Killworth fraternity: observed Social Y 58 967 0.5850 [39, 40, 260]
635 Bernard and Killworth fraternity: recalled Social Y 58 1,653 1.0000 [39, 40, 260]
636 Bernard and Killworth HAM radio: observed Social Y 41 153 0.1866 [37, 38, 165]
637 Bernard and Killworth HAM radio: recalled Social Y 44 442 0.4672 [37, 38, 165]
638 Bernard and Killworth office: observed Social Y 40 238 0.3051 [37, 38, 165]
639 Bernard and Killworth office: recalled Social Y 40 779 0.9987 [37, 38, 165]
640 Bernard and Killworth technical: observed Social Y 34 175 0.3119 [37, 38, 165]
641 Bernard and Killworth technical: recalled Social Y 34 561 1.0000 [37, 38, 165]
642 Kapferer tailor shop: instrumental (t1) Social N 35 76 0.1277 [163]
643 Kapferer tailor shop: instrumental (t2) Social N 34 93 0.1658 [163]
644 Kapferer tailor shop: associational (t1) Social N 39 158 0.2132 [163]
645 Kapferer tailor shop: associational (t2) Social N 39 223 0.3009 [163]
646 University Rovira i Virgili (Tarragona) e-mail Social N 1,133 5,451 0.0085 [140]
647 Zachary karate club∗ Social N 34 78 0.1390 [315]
648 BA: (100,1)∗ Synthetic N 100 99 0.0200 [26]
649 BA: (100,2)∗ Synthetic N 100 197 0.0398 [26]
650 BA: (1000,1) Synthetic N 1,000 999 0.0020 [26]
651 BA: (1000,2) Synthetic N 1,000 1,997 0.0040 [26]
652 BA: (500,1) Synthetic N 500 499 0.0040 [26]
653 BA: (500,2) Synthetic N 500 997 0.0080 [26]
654 ER: (100,25)∗ Synthetic N 100 1,264 0.2554 [86]
655 ER: (100,50) Synthetic N 100 2,436 0.4921 [86]
656 ER: (100,75) Synthetic N 100 3,697 0.7469 [86]
657 ER: (1000,25) Synthetic N 1,000 124,455 0.2492 [86]
658 ER: (1000,50) Synthetic N 1,000 249,512 0.4995 [86]
659 ER: (1000,75) Synthetic N 1,000 374,846 0.7504 [86]
660 ER: (50,25) Synthetic N 50 287 0.2343 [86]
661 ER: (50,50) Synthetic N 50 589 0.4808 [86]
662 ER: (50,75) Synthetic N 50 936 0.7641 [86]
663 ER: (500,25) Synthetic N 500 31,148 0.2497 [86]
664 ER: (500,50) Synthetic N 500 62,301 0.4994 [86]
665 ER: (500,75) Synthetic N 500 93,780 0.7517 [86]
666 Fractal: (10,2,1) Synthetic N 1,024 9,256 0.0177 [279]
667 Fractal: (10,2,2) Synthetic N 1,024 16,875 0.0322 [279]
668 Fractal: (10,2,3) Synthetic N 1,024 30,344 0.0579 [279]
669 Fractal: (10,2,4) Synthetic N 1,024 53,009 0.1012 [279]
670 Fractal: (10,2,5) Synthetic N 1,024 89,812 0.1715 [279]
671 Fractal: (10,2,6) Synthetic N 1,024 147,784 0.2822 [279]
672 Fractal: (10,2,7) Synthetic N 1,024 232,794 0.4445 [279]
683 LF benchmark: (1000,15,50,0.1,0.1,1,2,1) Synthetic Y 1,000 7,680 0.0154 [178]
684 LF benchmark: (1000,15,50,0.1,0.1,1,2,2) Synthetic Y 1,000 7,791 0.0156 [178]
685 LF benchmark: (1000,15,50,0.5,0.1,1,2,1) Synthetic Y 1,000 7,657 0.0153 [178]
686 LF benchmark: (1000,15,50,0.5,0.1,2,2,2) Synthetic Y 1,000 7,912 0.0158 [178]
687 LF benchmark: (1000,15,50,0.5,0.5,1,2,1) Synthetic Y 1,000 7,693 0.0154 [178]
688 LF benchmark: (1000,15,50,0.5,0.5,1,2,2) Synthetic Y 1,000 7,906 0.0158 [178]
689 LF benchmark: (1000,25,50,0.1,0.1,1,2,1) Synthetic Y 1,000 12,660 0.0253 [178]
690 LF benchmark: (1000,25,50,0.1,0.1,2,2,2) Synthetic Y 1,000 12,641 0.0253 [178]
691 LF benchmark: (1000,25,50,0.5,0.1,1,2,1) Synthetic Y 1,000 12,771 0.0256 [178]
692 LF benchmark: (1000,25,50,0.5,0.1,2,2,2) Synthetic Y 1,000 12,772 0.0256 [178]
693 LF benchmark: (1000,25,50,0.5,0.5,1,2,1) Synthetic Y 1,000 12,962 0.0259 [178]
694 LF benchmark: (1000,25,50,0.5,0.5,2,2,2) Synthetic Y 1,000 12,881 0.0258 [178]
695 LF-NG benchmark Synthetic Y 128 1,024 0.1260 [178, 224]
696 Random fully-connected: (100) Synthetic Y 100 4,950 1.0000 [† ]
697 Random fully-connected: (500) Synthetic Y 500 124,750 1.0000 [† ]
698 WS: (100,1,0.1) Synthetic N 100 100 0.0202 [305]
699 WS: (100,1,0.5) Synthetic N 73 73 0.0278 [305]
700 WS: (100,4,0.1) Synthetic N 100 407 0.0822 [305]
701 WS: (100,4,0.5) Synthetic N 100 522 0.1055 [305]
702 WS: (1000,1,0.1) Synthetic N 850 850 0.0024 [305]
703 WS: (1000,1,0.5) Synthetic N 877 877 0.0023 [305]
704 WS: (1000,4,0.1) Synthetic N 1,000 4,053 0.0081 [305]
705 WS: (1000,4,0.5) Synthetic N 1,000 5,138 0.0103 [305]
706 KOSKK: (1000,1,10,10,5 × 10⁻⁵,1 × 10⁻³,100) Synthetic Y 519 2,096 0.0156 [174]
707 KOSKK: (1000,1,10,10,5 × 10⁻⁵,1 × 10⁻³,1000) Synthetic Y 895 7,682 0.0192 [174]
708 KOSKK: (1000,1,100,10,5 × 10⁻⁵,1 × 10⁻³,1000) Synthetic Y 870 4,725 0.0125 [174]
709 KOSKK: (1000,1,100,10,5 × 10⁻⁵,1 × 10⁻³,100) Synthetic Y 652 2,125 0.0100 [174]
710 KOSKK: (1000,1,50,10,5 × 10⁻⁵,1 × 10⁻³,100) Synthetic Y 459 1,554 0.0148 [174]
711 KOSKK: (1000,1,50,10,5 × 10⁻⁵,1 × 10⁻³,1000) Synthetic Y 851 4,960 0.0137 [174]
712 Trade product proximity Trade Y 775 283,094 0.9439 [150]
713 World trade in metal (1994): Net Trade Y 80 875 0.2769 [77, 275]
714 World trade in metal (1994): Total Trade Y 80 875 0.2769 [77, 275]
† See the description at the beginning of this appendix for details of this network.
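For reference, the column fe in these tables is the edge density: the number of edges L as a fraction of the N(N − 1)/2 possible edges, i.e., fe = 2L/[N(N − 1)]. A minimal sketch in Python (the function name is ours) that reproduces the tabulated values:

```python
def edge_fraction(n_nodes, n_edges):
    """Fraction of the N*(N-1)/2 possible edges that are present."""
    return 2.0 * n_edges / (n_nodes * (n_nodes - 1))

# Checked against rows of the tables above:
# 263 Power grid:          N = 4,941, L = 6,594 -> fe = 0.0005
# 647 Zachary karate club: N = 34,    L = 78    -> fe = 0.1390
print(round(edge_fraction(4941, 6594), 4))  # 0.0005
print(round(edge_fraction(34, 78), 4))      # 0.139
```

Fully connected networks, such as the roll-call and random fully-connected networks, have fe = 1 under this formula, as in the tables.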
Appendix D
In this appendix we provide additional technical details of the Potts Hamiltonian and
the networks that we study in Chapter 6.
In some networks, Λmax is several orders of magnitude larger than nearly all of the other Λij values, and consequently H(Λmax) is several orders of magnitude larger than the energies at most of the other sampled resolutions. For these networks, Heff is then very small over much of the range of ξ.¹ This effect is demonstrated in Fig. D.1 for the Caltech Facebook network [295].
Figure D.1: A comparison of the Heff curves for the Caltech Facebook network [295] when H(ξ) is calculated by summing only over i ≠ j (solid line) and by summing over all i, j (dashed line).
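The two curves in Fig. D.1 differ only in whether the i = j (self-edge) terms of the Potts Hamiltonian contribute. A rough numerical illustration, in which the interaction matrix J is a toy example of ours rather than the Jij actually used in Chapter 6:

```python
import numpy as np

def potts_energy(J, communities, include_self=False):
    """H = sum of J_ij * delta(C_i, C_j) over node pairs.

    include_self=False sums over i != j only (solid line in Fig. D.1);
    include_self=True sums over all i, j (dashed line), so the diagonal
    entries of J (the self-edges) also contribute.
    """
    C = np.asarray(communities)
    same = C[:, None] == C[None, :]   # delta(C_i, C_j)
    H = float(np.sum(J * same))
    if not include_self:
        H -= float(np.trace(J))       # drop the i == j terms
    return H

# Toy symmetric J with self-edges on the diagonal (values are ours):
J = np.array([[0.5, 1.0, 0.0],
              [1.0, 0.2, 1.0],
              [0.0, 1.0, 0.5]])
C = [0, 0, 1]                         # nodes 0 and 1 share a community
print(round(potts_energy(J, C), 6))                     # 2.0
print(round(potts_energy(J, C, include_self=True), 6))  # 3.2
```

In networks with heavy self-edges, the gap between the two sums can be non-negligible, which is why the two Heff curves in Fig. D.1 differ visibly.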
² For example, self-edges in the Caltech Facebook network account for 0.45% of the total edge weight.
Appendix E
In this appendix, we perform tests to check that the distance measures that we define in Section 6.4 are robust to perturbations of the network. Many networks are obtained empirically, so it is to be expected that some links that do not actually exist will be erroneously included in a network and that some links that do exist will be erroneously omitted from it. For any taxonomies derived using the MRF framework to be meaningful, it is essential that the distance measures are robust to such false-positive and false-negative links. We check the robustness of the distance measures by investigating the effect that rewiring some fraction of a network's links has on the distances between networks.
Robustness of MRFs: Network Perturbations
that the connectivity and the degree distribution must be maintained, this restricts
which edges can be rewired and results in a non-uniform rewiring probability.
To provide some insight into the fraction of edges that get rewired, we perform
1,000 randomizations of 10L rewirings for the Zachary Karate Club network [315]. For
each simulation, we find the number of different edges that exist at any stage during
the rewiring process as a fraction of the total number of possible edges [the number of possible edges is given by N(N − 1)/2]. For the case where the only constraint is
that the connectivity is maintained, on average 83% of the possible edges exist at
some stage of the rewiring process. The minimum fraction of edges that are visited
during any of the 1,000 randomizations is 79% and the maximum is 86%. For the
case where we add the additional constraint that the degree distribution must be
maintained, on average 61% of edges exist at some stage during the rewiring process,
with a minimum of 57% and a maximum of 66% during a single simulation.
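The degree-preserving randomizations described above can be sketched with the standard double-edge swap. The following is an illustrative implementation, not the code used for these experiments, and it omits the additional connectivity check applied in the thesis; the example graph is made up rather than being the Karate Club network.

```python
import random
from collections import Counter

def double_edge_swap(edges, n_swaps, seed=0):
    """Rewire an undirected graph by double-edge swaps, which preserve
    every node's degree: replace edges (a, b) and (c, d) with (a, d)
    and (c, b) whenever no self-edge or duplicate edge would result."""
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    edge_set = {frozenset(e) for e in edges}
    swaps = 0
    while swaps < n_swaps:
        i, j = rng.sample(range(len(edges)), 2)
        a, b = edges[i]
        c, d = edges[j]
        if len({a, b, c, d}) < 4:        # would create a self-edge
            continue
        if frozenset((a, d)) in edge_set or frozenset((c, b)) in edge_set:
            continue                     # would create a duplicate edge
        edge_set -= {frozenset((a, b)), frozenset((c, d))}
        edge_set |= {frozenset((a, d)), frozenset((c, b))}
        edges[i], edges[j] = (a, d), (c, b)
        swaps += 1
    return edges

def degrees(edges):
    """Degree sequence as a node -> degree counter."""
    return Counter(x for e in edges for x in e)

# A small illustrative graph.
original = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (4, 5), (0, 5)]
rewired = double_edge_swap(original, n_swaps=10)
```

Because proposed swaps are rejected when they would create self-edges or multi-edges, the effective rewiring probability is non-uniform, in line with the discussion above.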
We also calculate the number of times that each edge that exists at any stage of the rewiring process is itself rewired. In Fig. E.2, we show the distribution of the number of times each edge is rewired. Over 1,000 simulations, when we only maintain network connectivity, 96% of edges are rewired; when we also maintain the degree distribution, 98% are rewired. In the former case, each edge is rewired on average 1.7 times, and in the latter case each edge is rewired on average 2.3 times. The average number of rewirings is higher when we maintain the degree distribution because fewer edges allow the additional constraint to be satisfied, and consequently these edges exist and get rewired more frequently.
Figure E.3 shows the DH , DS , and Dη distance matrices for 10L rewirings.
The first column again shows the distance matrices for the original networks block-
diagonalized using the cost function in Eq. 6.15. The second and third columns then
show the distance matrices for randomizations in which the degree distribution is
preserved and destroyed, respectively. The node orderings of the matrices in the
second and third columns are again the same as the orderings for the matrix of the
first column of the corresponding row. The second column in Fig. E.3 demonstrates
that, when the degree distribution is maintained (even for “completely” randomized
networks), some block structure remains in the distance matrices. The third column
shows that when the degree distribution is not maintained, much of this structure
is destroyed, but that some block structure is still visible. When the networks are
“completely” randomized, with the only constraint being that the connectivity is
maintained, then one is in effect producing random graphs. These random graphs
might, however, have some common properties, such as the number of nodes and the
Figure E.2: Distribution of the number of times each edge is rewired when we perform
10L rewirings on the Zachary Karate Club network [315]. We show the distribution
when (left) only the connectivity is maintained and (right) the connectivity and the
degree distribution are maintained. The results are aggregated over 1,000 simulations.
fraction of possible edges present (see Section 6.3.2). The presence of, albeit weak,
block structure in the final column of Fig. E.3 suggests that the MRF method is able
to identify some of these fundamental network properties.
The block-diagonalized distance matrices in Fig. E.3 suggest that the MRF distance measures we propose are robust and that our approach provides a good method for identifying networks with similar mesoscopic structure across multiple scales. They also suggest that our technique can still identify similar networks even when the community structure has been destroyed, although the block structure is then not as well-defined. The MRF method also seems able to identify similar networks once the strength distribution has been destroyed, although the block structure in the distance matrices is then poorly defined.
Appendix F
In this appendix, we check that the MRFs and taxonomies described in Chapter 6 are robust with respect to the choice of computational heuristic used to minimize the Hamiltonian in Eq. 6.1.
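As a sketch of one such heuristic, the following minimizes a Potts-type energy H = Σ_{i≠j} J_ij δ(C_i, C_j) by simulated annealing with single-node moves and the Metropolis acceptance rule. The interaction matrix, starting temperature, and cooling schedule are illustrative assumptions, not the settings used in Chapter 6.

```python
import math
import random

def potts_energy(J, labels):
    """H = sum over i != j of J[i][j] * delta(C_i, C_j)."""
    n = len(labels)
    return sum(J[i][j]
               for i in range(n) for j in range(n)
               if i != j and labels[i] == labels[j])

def anneal(J, n_steps=30000, T0=1.0, cooling=0.9997, seed=1):
    """Minimize the Potts energy by simulated annealing: propose
    single-node label changes, accept with the Metropolis rule."""
    rng = random.Random(seed)
    n = len(J)
    labels = list(range(n))              # start from singleton communities
    energy = potts_energy(J, labels)
    T = T0
    for _ in range(n_steps):
        i = rng.randrange(n)
        old = labels[i]
        new = rng.randrange(n)           # candidate community label
        if new == old:
            continue
        labels[i] = new
        trial = potts_energy(J, labels)  # full recompute: fine for a sketch
        dE = trial - energy
        if dE <= 0 or rng.random() < math.exp(-dE / T):
            energy = trial               # accept the move
        else:
            labels[i] = old              # reject and restore
        T *= cooling
    return labels, energy

# Two 3-node groups: J = -1 within groups, +1 between them, so the
# minimum-energy partition places each group in its own community.
J = [[0 if i == j else (-1 if (i < 3) == (j < 3) else 1)
      for j in range(6)] for i in range(6)]
labels, energy = anneal(J)
```

Recomputing the full energy at each step keeps the sketch short; a practical implementation would compute the energy change of each move incrementally.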
[Each row of Figure F.1 plots Heff, Seff, and ceff as functions of ξ, with curves for the spectral, greedy, and simulated annealing heuristics.]
Figure F.1: Comparison of the MRFs produced using greedy, spectral, and simulated annealing optimization heuristics. We show the MRFs for the (a) Zachary Karate Club network [315], (b) Garfield: Small-world citations network [119], and (c) Roll call: U.S. Senate 102 network [203, 241, 306].
Robustness of MRFs: Alternative Heuristics
Figure F.2: Comparison of the dendrograms produced using a greedy algorithm and
simulated annealing for a subset of 15 networks.
constructed from the PCA-distance matrix D^p with elements d^p_{ij} using average-linkage clustering, the distance t_{ij} between a node i in cluster C and a node j in cluster C′ is given by

t_{ij} = d_{\mathrm{ave}}(C, C') = \frac{1}{|C||C'|} \sum_{i \in C} \sum_{j \in C'} d^p_{ij} .
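This average-linkage rule can be illustrated directly; the distance matrix below is an arbitrary example, not thesis data.

```python
def average_linkage(Dp, C, C_prime):
    """d_ave(C, C') = (1 / (|C||C'|)) * sum over i in C, j in C' of d^p_ij."""
    total = sum(Dp[i][j] for i in C for j in C_prime)
    return total / (len(C) * len(C_prime))

# Hypothetical symmetric PCA-distance matrix for four networks.
Dp = [[0.0, 2.0, 4.0, 6.0],
      [2.0, 0.0, 4.0, 6.0],
      [4.0, 4.0, 0.0, 2.0],
      [6.0, 6.0, 2.0, 0.0]]
```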
To compare dendrograms constructed using the greedy algorithm and simulated annealing, with distances t_{ij} and s_{ij} between pairs of networks, respectively, we define a dendrogram correlation coefficient ϕ as

\varphi = \frac{\sum_{i<j} \left( s_{ij} - \bar{s} \right)\left( t_{ij} - \bar{t} \right)}{\sqrt{\left[ \sum_{i<j} \left( s_{ij} - \bar{s} \right)^2 \right] \left[ \sum_{i<j} \left( t_{ij} - \bar{t} \right)^2 \right]}} , (F.1)

where s̄ is the mean of the distances s_{ij} and t̄ is the mean of the t_{ij}.2
with identical distances between clusters will have a dendrogram correlation ϕ = 1.
The dendrogram correlation for the example dendrograms shown in Fig. F.2 is 0.997.
This is clearly very high, but to judge exactly how high it is, we compare the observed correlations with those for random dendrograms.
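Equation F.1 is the Pearson correlation between the upper-triangular entries of the two matrices of inter-network distances. A minimal sketch, with a hypothetical distance matrix:

```python
import math

def dendrogram_correlation(S, T):
    """Pearson correlation (Eq. F.1) between the upper-triangular
    (i < j) entries of two symmetric distance matrices."""
    n = len(S)
    s = [S[i][j] for i in range(n) for j in range(i + 1, n)]
    t = [T[i][j] for i in range(n) for j in range(i + 1, n)]
    s_bar = sum(s) / len(s)
    t_bar = sum(t) / len(t)
    num = sum((a - s_bar) * (b - t_bar) for a, b in zip(s, t))
    den = math.sqrt(sum((a - s_bar) ** 2 for a in s)
                    * sum((b - t_bar) ** 2 for b in t))
    return num / den

# Hypothetical cophenetic-distance matrix for three networks; dendrograms
# with identical inter-cluster distances give a correlation of 1.
S = [[0.0, 1.0, 4.0], [1.0, 0.0, 4.0], [4.0, 4.0, 0.0]]
phi = dendrogram_correlation(S, S)
```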
4 We detect communities at 150 resolutions for each network because this seems like a reasonable compromise between the computational cost of detecting communities at a larger number of resolutions and the noisy MRFs that result when we find communities at only a small number of resolutions.
[Figure: normalized frequency distribution of dendrogram correlations for the actual dendrograms and for randomized dendrograms.]
References
[3] L. A. Adamic and N. Glance, The political blogosphere and the 2004 U.S.
election: divided they blog, in LinkKDD ’05: Proceedings of the 3rd international
workshop on Link discovery, 2005, pp. 36–43.
[10] R. Albert, H. Jeong, and A.-L. Barabási, Error and attack tolerance of
complex networks, Nature, 406 (2000), pp. 378–382.
[11] F. Allen and A. Babus, The Network Challenge: Strategy, Profit, and Risk
in an Interlinked World, Wharton School Publishing, Philadelphia, PA, USA,
2009, ch. Networks in Finance, pp. 367–382.
[18] W. B. Arthur, Complexity and the Economy, Science, 284 (1999), pp. 107–
109.
[49] B. Bollobás, Modern Graph Theory, Academic Press, New York, NY, USA,
2001.
[50] , Random Graphs, Cambridge University Press, Cambridge, UK, 2nd ed.,
2001.
[53] J.-P. Bouchaud and M. Potters, Theory of Financial Risk and Derivative
Pricing: From Statistical Physics to Risk Management, Cambridge University
Press, Cambridge, UK, 2003.
[78] J. Dean and M. R. Henzinger, Finding related pages in the World Wide
Web, Computer Networks, 31 (1999), pp. 1467–1479.
[80] P. Domingos, The role of Occam’s razor in knowledge discovery, Data Mining
and Knowledge Discovery, 3 (1999), pp. 409–425.
[83] D. Duan, Y. Li, Y. Jin, and Z. Lu, Community mining on dynamic weighted directed graphs, in CNIKM ’09: Proceeding of the 1st ACM international workshop on Complex networks meet information & knowledge management, 2009, pp. 11–18.
[85] N. Eagle and A. Pentland, Reality mining: sensing complex social systems,
Personal Ubiquitous Computing, 10 (2006), pp. 255–268.
[90] T. S. Evans and R. Lambiotte, Line graphs, link partitions, and overlapping
communities, Physical Review E, 80 (2009), p. 016105.
[118] A. Garas, P. Argyrakis, and S. Havlin, The structural role of weak and
strong links in a financial market network, European Physical Journal B, 63
(2008), pp. 265–271.
[122] K. S. Gleditsch, Expanded Trade and GDP Data, Journal of Conflict Reso-
lution, 46 (2002), pp. 712–724.
[132] S. Gopal, Societies and Cities in the Age of Instant Access, Springer, Berlin,
Germany, 2007.
[170] D. Krackhardt, The ties that torture: Simmelian tie analysis in organiza-
tions, Research in the Sociology of Organizations, 16 (1999), pp. 183–210.
[171] V. Krebs, 2004. Network compiled by Valdis Krebs and not previously pub-
lished. The data is available at http://www.orgnet.com/.
[182] F.-J. Lapointe and P. Legendre, A Statistical Framework to Test the Con-
sensus of Two Nested Classifications, Systematic Zoology, 39 (1990), pp. 1–13.
[191] Z. Liu, Y.-C. Lai, and N. Ye, Propagation and immunization of infection on
general networks with both homogeneous and heterogeneous components, Phys-
ical Review E, 67 (2003), p. 031911.
[196] S. Mangan and U. Alon, Structure and function of the feed-forward loop
network motif, Proceedings of the National Academy of Sciences of the U.S.A.,
100 (2003), pp. 11980–11985.
[205] , Impact of unexpected events, shocking news, and rumors on foreign exchange market dynamics, Physical Review E, 77 (2008), p. 046110.
[206] M. L. Mehta, Random Matrices, Elsevier, San Diego, CA, USA, 2004.
[209] S. Milgram, The small world problem, Psychology Today, 1 (1967), pp. 60–67.
[212] M. Molloy and B. Reed, A critical point for random graphs with a given
degree sequence, Random Structures and Algorithms, 6 (1995), pp. 161–180.
[219] , 2006. Network compiled by Mark Newman and not previously published.
The data is available at http://www-personal.umich.edu/~mejn/netdata/.
[227] J.-P. Onnela, Taxonomy of Financial Assets, Master’s thesis, Helsinki Uni-
versity of Technology, Finland, 2002.
[228] , Complex Networks in the Study of Financial and Social Systems, Master’s
thesis, Helsinki University of Technology, 2006.
[233] G. Palla, A.-L. Barabási, and T. Vicsek, Quantifying social group evo-
lution, Nature, 446 (2007), pp. 664–667.
[240] P. Pons and M. Latapy, Computer and Information Sciences - ISCIS 2005,
vol. 3733 of Lecture Notes in Computer Science, Springer, Berlin/Heidelberg,
[249] A. Rapoport, Contribution to the theory of random and biased nets, Bulletin
of Mathematical Biophysics, 19 (1957), pp. 257–277.
[255] J. M. Reitz, 2000. ODLIS is maintained by Joan Reitz and was converted into
the network used here by A. Mrvar and V. Batagelj. The data is available at
http://vlado.fmf.uni-lj.si/pub/networks/data/dic/odlis/Odlis.htm.
[267] J. Scott and M. Hughes, The anatomy of Scottish capital: Scottish compa-
nies and Scottish capital, 1900-1979, Croom Helm, London, UK, 1980.
[292] , Web Community Chart: A Tool for Navigating the Web and Observing
Its Evolution, IEICE Transactions on Information and Systems, E86-D
(2003), pp. 1024–1031.
[293] , A system for visualizing and analyzing the evolution of the web with a
time series of graphs, in HYPERTEXT ’05: Proceedings of the sixteenth ACM
conference on Hypertext and hypermedia, 2005, pp. 151–160.
[300] S. van Dongen, A New Cluster Algorithm for Graphs, tech. rep., National
Research Institute for Mathematics and Computer Science in the Netherlands,
1998.
[304] S. Wasserman and K. Faust, Social Network Analysis: Methods and Appli-
cations, Cambridge University Press, Cambridge, UK, 1994.
[318] Y. Zhao, E. Levina, and J. Zhu, Community extraction for social networks,
arXiv:1005.3265, (2010).