Taurus: Towards a Unified Force Representation and Universal Solver for Graph Layout
Abstract—Over the past few decades, a large number of graph layout techniques have been proposed for visualizing graphs from
various domains. In this paper, we present a general framework, Taurus, for unifying popular techniques such as the spring-electrical
model, stress model, and maxent-stress model. It is based on a unified force representation, which formulates most existing
techniques as a combination of quotient-based forces that combine power functions of graph-theoretical and Euclidean distances. This
representation enables us to compare the strengths and weaknesses of existing techniques, while facilitating the development of new
methods. Based on this, we propose a new balanced stress model (BSM) that is able to lay out graphs with superior quality. In addition,
we introduce a universal augmented stochastic gradient descent (SGD) optimizer that efficiently finds proper solutions for all layout
techniques. To demonstrate the power of our framework, we conduct a comprehensive evaluation of existing techniques on a large
number of synthetic and real graphs. We release an open-source package, which facilitates easy comparison of different graph layout
methods for any graph input as well as effectively creating customized graph layout techniques.
Index Terms—Graph Layout, Gradient Descent, Framework
• Mingliang Xue, Zhi Wang, Fahai Zhong, and Yunhai Wang are with the Department of Computer Science, Shandong University, China. E-mail: {xml95007, wangzizi2020, zhongfahai, cloudseawang}@gmail.com
• Yong Wang is with School of Computing and Information Systems, Singapore Management University, Singapore. E-mail: [email protected]
• Oliver Deussen is with Computer and Information Science, University of Konstanz, Konstanz, Germany. E-mail: [email protected]
• Yunhai Wang is the corresponding author
Manuscript received xx xxx. 201x; accepted xx xxx. 201x. Date of Publication xx xxx. 201x; date of current version xx xxx. 201x. For information on obtaining reprints of this article, please send e-mail to: [email protected]. Digital Object Identifier: xx.xxxx/TVCG.201x.xxxxxxx
1 INTRODUCTION
Graphs are commonly used for modeling complex data in many domains such as social media, finance and biology. The most commonly used graph visualization technique, the node-link diagram, depicts nodes as points in a plane and edges as lines connecting these points. In past decades, various graph layout methods [25, 36] have been developed for producing aesthetically pleasing drawings while maintaining the underlying graph structures.
Rather than directly optimizing aesthetic criteria [34] (e.g., even node distribution and minimal edge crossings), most methods simulate one of two kinds of physical systems as a basis for laying out graphs: the spring-electrical model or the stress model. The spring-electrical model [8, 10] regards edges as springs that use attractive forces to pull connected nodes close to each other, while treating nodes as electrically charged particles that repel each other with repulsive forces. Based on this model, many variants of force-directed placement (FDP) algorithms have been developed for better revealing different structures and features of graphs. For example, FM3 [18] and SFDP [20] use a multilevel scheme for overcoming local minima, while the extended models of LinLog [31] and ForceAtlas2 [22] better reveal clusters and local structures, respectively. While the spring-electrical model produces good layouts for many graphs, it does not encode the target (data-space) edge lengths between every pair of nodes.
This is the focus of stress models [14, 23, 40], which assume a spring between each pair of nodes with an ideal length equal to the graph-theoretical distance between the nodes. By minimizing the stress energy of the spring system, a layout is obtained. For efficiently solving such models, which involve considerably more interactions between the nodes, a few optimization strategies have been incorporated, such as stress majorization [14] and stochastic gradient descent (SGD) [42]. To alleviate the involved computational costs, sparse stress models [13, 27, 32] have been proposed, which only impose springs for a subset of node pairs.
Because of the divergent mechanisms, these models have different characteristics when creating graph layouts. For example, FDP performs better in preserving neighborhood structures for many graphs, while the stress model tends to maintain the overall structures, especially for mesh-like graphs. However, it is still unclear why the models have such differences and how they are connected conceptually. Moreover, it is difficult to make a fair quantitative comparison because different optimization strategies are used. This not only hinders researchers from developing new methods but also makes it hard for practitioners to choose a proper method for visualizing their graphs.
In this paper, we present a general framework that we call Taurus (Towards a Unified force Representation and Universal Solver for graph layout), which offers a unified view for understanding and comparing most of the popular graph layout algorithms. It relies on two novel components: a unified force representation and a universal solver. The unified force representation allows us to show that all existing methods can be formulated as a combination of quotient-based forces, using a quotient between power functions of graph-theoretical and Euclidean distances. This unified representation enables us to compare the strengths and weaknesses of different methods. The universal solver combines the advantage of SGD [30] in escaping local minima and the effectiveness of the Barnes & Hut approximation [3] in reducing computational cost, which allows us to solve different existing layout methods with the same optimizer.
Moreover, our framework can also be used as a general platform for developing new graph layout methods. In particular, we propose a balanced stress model, which combines the advantages of spring-electrical and stress models. Specifically, it exerts attractive and repulsive forces on all node pairs, where the attractive force is inversely proportional and the repulsive force is proportional to the graph-theoretical distance. In doing so, the model avoids extremely large repulsive and attractive forces for nearby nodes, while pulling neighboring nodes close to each other.
We implement Taurus as a graph visualization package in C++, which allows users to define their own attractive and repulsive forces. To demonstrate its effectiveness, we comprehensively evaluate it by comparing various spring and stress layout methods on a large number of synthesized graphs with different structures such as lattices, trees and clusters. The evaluation includes two parts: verifying whether Taurus can produce similar results to the original implementations of
existing methods, and examining how different methods behave on graphs with different characteristics. The results show that our solver enables all methods to perform as well as or even better than the original implementations, while our proposed balanced stress model makes a good trade-off between distance preservation and maintaining neighborhoods as well as cluster structures. In addition, we show that Taurus allows users to flexibly customize graph layout methods to meet specific requirements.
The main contributions of this paper can be summarized as follows:
• We propose a general framework for graph visualization based on a novel quotient-based force representation and an augmented SGD optimizer, which offers a unified view for understanding and comparing existing graph layout methods;
• We present a new graph layout method based on our framework and conduct a systematic analysis and extensive evaluation of our framework on different graph datasets through quantitative comparisons; and
• We release a library with the proposed general framework that enables rapid implementation and design of graph layout methods for any graph input.
2 RELATED WORK
Related work can be categorized into three parts: graph layout methods, graph layout solvers and graph layout packages.
2.1 Graph Layout
Various graph layout methods have been proposed to visualize network data as node-link diagrams. Among them, the most common methods use virtual physical models to represent the relationships between objects. Following the taxonomy of Gansner et al. [13], we classify such methods into three types: spring-electrical models, stress models and hybrid models.
Spring-electrical models [8, 10] regard nodes as electrically charged particles that push each other away and edges as springs that pull nodes close to each other; these effects are referred to as repulsive and attractive forces, respectively. A graph layout is obtained when attractive and repulsive forces strike a balance. For a complete review of the graph layout methods developed from this model, please refer to Kobourov [28] and Gouvêa et al. [16]. Here, we briefly review some widely used models. Hu et al. [20] improve the repulsive force designed by Fruchterman and Reingold [10] and use a repulsive force that decays rapidly, avoiding edge-length distortion at the periphery of a layout. Noack [31] introduces the LinLog model, which employs a constant attractive force and sets the repulsive force to the inverse of the distance. As a result, this model can generate graph layouts with clearly separated node clusters. Kermarrec and Moin [26] further extend the LinLog model for revealing cluster structures at different levels. Inspired by these studies, the attractive and repulsive forces of ForceAtlas2 [22] were designed to be proportional and inversely proportional to the distance between nodes, respectively, obtaining graph layouts with a good preservation of local structures and cluster separation.
Stress models [14, 23] also use a spring analogy but assume that there are springs connecting every pair of nodes in the graph. Spring forces are defined to create a layout in which the distances between nodes are as close as possible to the graph-theoretical distances. Many variants of the stress model aim to improve its efficiency through sparse approximations. For example, progressive multidimensional scaling [5] and low-rank stress majorization [27] have been used to approximate the shortest path distances of all node pairs in a graph. The sparse stress model [32] speeds up the stress model by aggregating the terms of the objective function. Wang et al. [40] improved the stress model by imposing constraints on edge vectors and edge lengths, further enhancing the expressiveness of the stress model.
Hybrid models combine both models for overcoming their drawbacks. For example, Hu and Koren [21] resolve the warping effect of spring-electrical models by integrating attractive forces into the stress model. To reduce the cost of computing graph-theoretical distances, the Maxent-stress model (Maxent) [13] imposes stress constraints on pairs of neighboring nodes and entropy-based constraints on the remaining node pairs; the latter can be regarded as repulsive forces between all node pairs.
Noack [31] shows that energy-based layout methods like LinLog can be formulated as force representations. Similarly, Gansner et al. [13] represent the repulsive force as an entropy term and incorporate it into the stress-based energy model. However, there is still a lack of an inherent representation for unifying existing layout methods. In this work, we demonstrate that almost all methods from spring-electrical and stress models can be formulated as a combination of our proposed quotient-based forces. Moreover, we show that this unified view not only facilitates the understanding and comparison of different methods but also allows the development of new methods.
2.2 Graph Layout Solvers
Most graph layout methods need an optimization solver to create desirable drawings. Solving a spring-electrical model has a time complexity of O(n^2) at each iteration, where n is the number of nodes in the graph. To improve the computational efficiency of such models, several multilevel methods [3, 11, 12, 18–20, 39] have been proposed. Among them, the Barnes-Hut (BH) approximation [3] is the most commonly used acceleration method. It uses hyper nodes to approximate repulsive forces, resulting in a time complexity of O(n log n). The method has been used by different spring-electrical model algorithms, such as [20, 22]. Another method is to use random vertex sampling (RVS) [17] to accelerate the computation of repulsive forces. This method generates layouts similar to Barnes-Hut.
There are also many algorithms to optimize solutions for the stress model. The earliest stress model [23] employs gradient descent to find the optimal graph layout; however, it is often trapped in a local minimum. Gansner et al. [14] adapt stress majorization to the stress model, which is rooted in solving multidimensional scaling. Ensuring a monotonic decrease of the stress, the method has advantages over the original implementation. Recently, stochastic gradient descent (SGD), a powerful optimization solver widely used in machine learning, has also been applied to graph drawing [42]. It converges fast and achieves layouts with a lower stress error. Ahmed et al. [1] further proposed an SGD-based graph drawing approach, (SGD)², that can handle multiple readability criteria of graph drawing simultaneously. We propose an augmented SGD solver for finding optimal layouts at minimal computational cost.
2.3 Graph Layout Packages
A number of open-source packages facilitate an easy implementation of different graph layout techniques. For example, Graphviz [15], Tulip [2] and OGDF [6] are C++ libraries that implement customized graph data structures and many graph drawing techniques. Data-Driven Documents (D3) [4], the most popular web-based visualization toolkit, incorporates some graph drawing techniques (e.g., the spring-electrical model [10]). All packages allow users to directly use different graph layout methods without implementing them from scratch. Because of the underlying models, however, these packages often expose different APIs and parameters for different methods, resulting in cumbersome parameter tuning for the user and the need to understand different approaches. Building upon our unified force representation and universal solver, our graph drawing package is much more generic and easier to use. Different solutions can be compared and the right method for the desired layout can be selected.
3 PROPOSED FRAMEWORK
As mentioned above, our general framework aims to unify existing graph layout methods. It consists of a quotient-based force model to describe the relationship among nodes, and a universal optimization solver to achieve optimal graph layouts. In this section, we first show how the proposed framework originates from the observations of prior graph layout approaches. Then, we present our quotient-based force model as well as the guidelines for using it. Finally, we introduce our proposed balanced stress model.
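To make the solver discussion in Section 2.2 more concrete, the following is a minimal sketch of plain SGD applied to the stress model, in the spirit of [42]. It is not the augmented solver proposed in this paper. The function names, the weights w_ij = d_ij^-2, the step-size cap of 1, and the exponentially decaying step-size schedule are assumptions made for illustration, and the graph is assumed to be connected and unweighted.

```python
import math
import random
from collections import deque


def bfs_distances(n, edges):
    """All-pairs hop distances of a connected, unweighted graph (assumption)."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    dist = [[math.inf] * n for _ in range(n)]
    for s in range(n):
        dist[s][s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if dist[s][v] == math.inf:
                    dist[s][v] = dist[s][u] + 1
                    queue.append(v)
    return dist


def stress(pos, dist):
    """Stress energy with weights w_ij = d_ij^-2, summed over all node pairs."""
    n = len(pos)
    return sum(
        (math.dist(pos[i], pos[j]) - dist[i][j]) ** 2 / dist[i][j] ** 2
        for i in range(n) for j in range(i + 1, n)
    )


def sgd_stress_layout(init_pos, dist, iterations=30, eta_min=0.1, seed=0):
    """Pairwise SGD on the stress model; returns a new list of positions."""
    rng = random.Random(seed)
    pos = [list(p) for p in init_pos]
    n = len(pos)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    # Assumed schedule: decay the step size exponentially from d_max^2 to eta_min.
    eta_max = max(dist[i][j] for i, j in pairs) ** 2
    decay = math.log(eta_max / eta_min) / max(iterations - 1, 1)
    for t in range(iterations):
        eta = eta_max * math.exp(-decay * t)
        rng.shuffle(pairs)  # visit the pairs in random order
        for i, j in pairs:
            d = dist[i][j]
            norm = math.dist(pos[i], pos[j])
            if norm == 0.0:
                continue  # direction undefined for coincident nodes
            mu = min(eta / d ** 2, 1.0)  # capped step: never overshoot the target
            shift = mu * (norm - d) / (2.0 * norm)
            for k in range(len(pos[i])):
                delta = pos[i][k] - pos[j][k]
                pos[i][k] -= shift * delta  # move i toward j if the pair is too far apart
                pos[j][k] += shift * delta  # and j toward i by the same amount


    return pos


if __name__ == "__main__":
    edges = [(0, 1), (1, 2), (2, 3), (3, 4)]  # a path with five nodes
    dist = bfs_distances(5, edges)
    rng = random.Random(1)
    start = [[rng.random(), rng.random()] for _ in range(5)]
    final = sgd_stress_layout(start, dist)
    print("stress before:", stress(start, dist))
    print("stress after: ", stress(final, dist))
```

Each update moves one node pair toward its target distance, which illustrates the per-pair randomness referred to above when discussing SGD's ability to escape local minima. The all-pairs loop costs O(n^2) per iteration, the cost that sparse models and the Barnes-Hut approximation aim to reduce.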
3.1 Revisiting Existing Graph Layout Methods
For a graph G(V, E) with V² representing the set of node pairs, graph layout methods aim to map the graph nodes V to coordinates in 2D or 3D space and often require a model to represent the relationship between them. Depending on the underlying mechanism of building the model, Hu et al. [13, 21] classified layout methods into two types: spring-electrical models and stress models. They propose to use hybrid models, which integrate spring-electrical and stress models. Spring-electrical models often use force modeling, while stress and hybrid models are built on energy modeling to specify the graph layout. Since the force on an object is the negative derivative of the energy with respect to the distance [41], we rewrite all energy-based layout methods in the form of a force model to establish a unified representation. In the following, we take one representative method of each model type as an example.
Force-Directed Placement. As a typical instance of the spring-electrical model, FDP [10] aims to meet the principles that connected nodes should be drawn near each other and that nodes should not be drawn too close to each other. It computes the position of each node $x_i$ by exerting the attractive force $F^a_{i,j}$ and the repulsive force $F^r_{i,j}$ between the node and its neighbours and all other nodes, respectively.
$$e_{i,j} = \frac{x_j - x_i}{\lVert x_i - x_j \rVert}, \tag{1}$$
$$F^a_{i,j} = \lVert x_i - x_j \rVert^2 \, e_{i,j}, \quad \forall \{i,j\} \in E, \tag{2}$$
$$F^r_{i,j} = -\frac{1}{\lVert x_i - x_j \rVert} \, e_{i,j}, \quad \forall \{i,j\} \in V^2, \tag{3}$$
where $e_{i,j}$ is a unit vector. By successively moving each node along the resultant force $F_i$,
Maxent-Stress Model. Instead of specifying springs for all node pairs, the maxent-stress model [13] is a hybrid model that defines a stress model constraint on a subset of node pairs (typically, the set of graph edges E), while imposing an entropy-based constraint on the rest of the node pairs. Hence, the energy function is defined as follows:
Table 1. Quotient-based force functions and their corresponding parameters for different layout methods: ω is the weight, α and β are the exponents of the Euclidean distance and the graph-theoretical distance between two nodes, respectively, and Ω is the force range. P is a set of pivot nodes [27], and $k_{fa}$ is defined as −(deg(i) + 1)(deg(j) + 1) [22], where deg(i) is the degree of node i. V² refers to all node pairs, E to node pairs connected by an edge, and S to a k-ring neighborhood graph.
| Method | Attractive Force | {ω1, α1, β1, Ω1} | Repulsive Forces | {ω2, α2, β2, Ω2} |
|---|---|---|---|---|
| FDP [10] | $\sum_{(i,j)\in E} \lVert x_i - x_j \rVert^2 e_{ij}$ | {1, 2, 0, E} | $\sum_{\{i,j\}\in V^2} \frac{-1}{\lVert x_i - x_j \rVert} e_{ij}$ | {−1, −1, 0, V²} |
| FA2 [22] | $\sum_{(i,j)\in E} \lVert x_i - x_j \rVert e_{ij}$ | {1, 2, 0, E} | $\sum_{\{i,j\}\in V^2} \frac{k_{fa}}{\lVert x_i - x_j \rVert} e_{ij}$ | {$k_{fa}$, −1, 0, V²} |
| LinLog [31] | $\sum_{(i,j)\in E} 1 \cdot e_{ij}$ | {1, 1, 0, E} | $\sum_{\{i,j\}\in V^2} \frac{-1}{\lVert x_i - x_j \rVert} e_{ij}$ | {−1, −1, 0, V²} |
| SM [14] | $\sum_{\{i,j\}\in V^2} \frac{2\lVert x_i - x_j \rVert}{d_{ij}^2} e_{ij}$ | {2, 1, 2, V²} | $\sum_{\{i,j\}\in V^2} \frac{-2}{d_{ij}} e_{ij}$ | {−2, 0, 1, V²} |
| MARS [27] | $\sum_{(i,j)\in P\times V} \frac{2\lVert x_i - x_j \rVert}{d_{ij}} e_{ij}$ | {2, 1, 1, P × V} | $\sum_{(i,j)\in P\times V} -2 e_{ij}$ | {−2, 0, 0, P × V} |
| SSM [32] | $\sum_{(i,j)\in P\times V\cup E} \frac{2\lVert x_i - x_j \rVert}{d_{ij}^2} e_{ij}$ | {2, 1, 2, P × V ∪ E} | $\sum_{(i,j)\in P\times V\cup E} \frac{-2}{d_{ij}} e_{ij}$ | {−2, 0, 1, P × V ∪ E} |
| Maxent [13] | $\sum_{\{i,j\}\in S} \frac{2\lVert x_i - x_j \rVert}{d_{ij}^2} e_{ij}$ | {2, 1, 2, S} | $\left(\sum_{\{i,j\}\in S} \frac{-2}{d_{ij}} + \sum_{\{i,j\}\in V^2} \frac{-\alpha\,\mathrm{sgn}(q)}{\lVert x_i - x_j \rVert^{q}}\right) e_{ij}$ | {−2, 0, 1, S}, {−α sgn(q), −q, 0, V²} |
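The parameter tuples in Table 1 can be read as inputs to a single force routine. The sketch below illustrates this reading under stated assumptions: the general form ω · ||x_i − x_j||^α / d_ij^β · e_ij is inferred from the FDP and SM rows of Table 1 (α taken as the exponent of the Euclidean distance, β as the exponent of the graph-theoretical distance in the denominator), and the function and constant names are hypothetical rather than part of the Taurus package. The force range Ω is left to the caller, who decides which node pairs a term applies to.

```python
import math


def quotient_force(x_i, x_j, d_ij, omega, alpha, beta):
    """Force on node i due to node j:
    omega * ||x_i - x_j||^alpha / d_ij^beta * e_ij, with e_ij pointing from i to j.
    """
    norm = math.dist(x_i, x_j)
    if norm == 0.0:
        raise ValueError("coincident nodes have no defined direction")
    magnitude = omega * norm ** alpha / d_ij ** beta
    return [magnitude * (b - a) / norm for a, b in zip(x_i, x_j)]


def pair_force(x_i, x_j, d_ij, terms):
    """Sum several (omega, alpha, beta) terms acting on the same node pair."""
    total = [0.0] * len(x_i)
    for omega, alpha, beta in terms:
        force = quotient_force(x_i, x_j, d_ij, omega, alpha, beta)
        total = [t + f for t, f in zip(total, force)]
    return total


# (omega, alpha, beta) tuples copied from the FDP and SM rows of Table 1.
FDP_ATTRACTIVE = (1, 2, 0)    # range: edges E
FDP_REPULSIVE = (-1, -1, 0)   # range: all node pairs V^2
SM_ATTRACTIVE = (2, 1, 2)     # range: all node pairs V^2
SM_REPULSIVE = (-2, 0, 1)     # range: all node pairs V^2


if __name__ == "__main__":
    x_i, x_j = (0.0, 0.0), (2.0, 0.0)
    # FDP on an edge: attraction grows with the squared Euclidean distance.
    print(quotient_force(x_i, x_j, 1, *FDP_ATTRACTIVE))
    print(quotient_force(x_i, x_j, 1, *FDP_REPULSIVE))
    # Stress model: the two terms cancel when Euclidean and graph distance agree.
    print(pair_force(x_i, x_j, 2, [SM_ATTRACTIVE, SM_REPULSIVE]))
```

Under this reading, the two stress-model terms sum to 2(||x_i − x_j|| − d_ij)/d_ij² · e_ij, which vanishes when the layout distance matches the graph-theoretical distance. This matches the spring analogy described for stress models.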
Fig. 1. (a) Influence of the parameters {α, β} on the force. Each plot shows the force magnitude as a function of the graph-theoretical distance between two nodes in the graph and the pairwise Euclidean distance in the layout for the given combination of α and β. Yellow represents a force magnitude close to zero and orange a large force magnitude. The red and blue boxes cover the parameter settings satisfying the criteria G1 and G2, respectively. (b) shows the resultant forces for FDP, (c) for the stress model and (d) for the maxent-stress model.
3.2 Quotient-based Force Function
After systematically comparing and analyzing the various forces used in different graph layout methods (see Table 1), we identify common components that appear in most methods and propose a quotient-based representation to unify them. Given l forces, the resultant force $F_i$ exerted on a node i is:
For simplicity, we divide all k forces within the graph into attractive and repulsive forces. To meet the above principle, two nodes with a larger graph-theoretical distance should experience a larger repulsive force (β < 0) and a smaller attractive force (β > 0). To prevent the layout from diverging to infinity or collapsing into a point, for two nodes with a fixed graph-theoretical distance, the repulsive force should decrease as the Euclidean distance between the two nodes increases (α < 0), and the attractive force should decrease as the Euclidean distance between the two nodes decreases (α > 0). To yield a clustering (dispersing) effect, we can also use a constant repulsive (attractive) force with a large attractive (repulsive) force by setting α = 0 (β = 0). Therefore, we identify the following two guidelines for choosing α and β:
• G1: For the attractive force, the exponent parameters are suggested to satisfy: α ≥ 0, β ≥ 0; and