Evaluating Representation Learning and Graph Layout Methods for Visualization
DEPARTMENT: APPLICATIONS
Graphs and other structured data have come to the forefront in machine learning over
the past few years due to the efficacy of novel representation learning methods
boosting the prediction performance in various tasks. Representation learning
methods embed the nodes in a low-dimensional real-valued space, enabling the
application of traditional machine learning methods on graphs. These representations
have been widely presumed to also be suitable for graph visualization. However, no
benchmarks or encompassing studies on this topic exist. We present an empirical
study comparing several state-of-the-art representation learning methods with two
recent graph layout algorithms, using readability and distance-based measures as
well as the link prediction performance. Generally, no method consistently
outperformed the others across quality measures. The graph layout methods provided
qualitatively superior layouts when compared to representation learning methods.
Embedding graphs in a higher dimensional space and applying t-distributed
stochastic neighbor embedding for visualization improved the preservation of local
neighborhoods, albeit at substantially higher computational cost.
Visualization of data is useful both for understanding data and cross-checking models trained on that data. Consequently, it may lead to new insights, better models, the detection of outliers, etc. Visualization is frequently used in data science. However, it is not straightforward to visualize a graph. The vertices and edges typically do not have a real-valued representation; hence, the primary problem in graph visualization is to find a good representation of the vertices in two-dimensional space.
Traditionally, this area was known as graph drawing, and methods that produce a representation of the nodes of a graph in a two-dimensional real-valued space were referred to as graph-layout algorithms. Early research considered problems such as finding planar embeddings (no edge crossings), and later research also considered embeddings of large graphs, leading to the development of the still popular force-directed approaches. In contrast, graph representation learning methods (sometimes also referred to as network embedding methods) were not developed specifically for visualization; rather, they aim to embed graphs into a low-dimensional space (depending on the method, between 8 and 128 dimensions in general), to enable subsequent use of common machine learning techniques.
There exist many graph-layout algorithms, each with a different objective or optimization procedure. Someone who wants to inspect or analyze a graph visually is then already faced with a difficult choice of which method would be most appropriate for their goal. This difficulty is amplified by the introduction of dozens of representation learning methods for graphs, which have been presumed to also produce good visualizations (see, e.g., the works of Perozzi et al.1 and Tsitsulin et al.2).
It is not clear which method to use, or even which aspects vary by choosing one method instead of
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Digital Object Identifier 10.1109/MCG.2022.3160104
Date of current version 8 June 2022.
May/June 2022 Published by the IEEE Computer Society IEEE Computer Graphics and Applications 19
FIGURE 1. Visualizations of the netscience collaboration graph using t-SNE and different embedding methods: (a) AROPE,
(b) CNE, (c) DeepWalk, (d) GAE. Color corresponds to edge length: yellow (long) to dark blue (short).
FIGURE 2. Placement of low-degree nodes (dark) around a high-degree hub node (yellow, marked with red arrow) for the Twitter
graph. While nodes are clustered according to their degree in AROPE128 and GAE16, and to some extent in CNE16, leaf nodes
are close to their parent in DW128. (a) AROPE128. (b) CNE16. (c) DW128. (d) GAE16.
side and leaf nodes on the other side of the embedding for karate club, can96, and powergrid. The nodes for the other three graphs are mostly aligned along two axes, thus hiding most of their connections. Embeddings by AROPE128 reveal more details of the graph structure. In the Twitter graph layout in Figure 2(a), we observe a clustering according to node degree.
CNE: The native two-dimensional embeddings of CNE2 exhibit a proximity-based arrangement of the nodes where hub nodes are placed in the center of their connections. The embeddings by CNE16 are similar to AROPE128, DW128, and GAE16. For the Twitter graph, we find the t-SNE embedding is more readable as it shows cluster structure. We see in Figure 2(b) that CNE16 also clusters nodes by degree.
DeepWalk: The embeddings by DW2 conceal the underlying shape of the graph as nodes are mostly arranged on a curved line. DW128 produces readable embeddings with a clear cluster structure. In Figure 2(c), we note that DW128 embeds low-degree nodes close to their connections, resulting in a star shape around the hub node.
DRGraph: The graph layouts by DRGraph for netscience and powergrid are appealing and very similar to the layouts by DW128. The different communities of the Facebook graph are clearly visible, but in the node-link diagram (see the supplementary material, available online, Table 12), we notice some long edges that have a leaf node on one end. Long edges also dominate the visualization of the Twitter graph and make it difficult to observe any structure in the center of the layout. We do not know whether the parameter settings or the fact that DRGraph only preserves first-order graph distances causes the “hairball” structure of this visualization.
FR: Both implementations result in graph layouts with similar global structure but different local node arrangement. For FR, we observe that clusters are more compact and nodes are pushed away from the center. For FR-RTX, nodes are distributed more evenly around a shared connection, which hinders the formation of visible clusters for powergrid but improves the readability of Twitter. We assume this is the effect of approximating the repulsive forces in FR-RTX, causing only the closest nodes to repel each other.
GAE: The embeddings from GAE2 have a distinct circular shape due to the inner product decoder. High-degree node embeddings have large coordinates and low-degree nodes are placed near the origin. In this graph layout, we can easily identify the most central hub nodes (e.g., for Facebook or Twitter). The embeddings by GAE16 and AROPE128 have a similar local structure, as shown in Figure 2(d). In both embeddings, nodes from the same cluster are arranged by degree.
Readability Measures
In Figure 3, we show the scores for crosslessness, edge-length uniformity, minimum angle, and Gabriel shape, averaged over four runs. Averaging over all graph datasets, the graph layout methods outperform the representation learning methods (see the supplementary material, available online, Table 10). DW128 achieves high scores for the layouts of larger graphs. The high minimum-angle scores of the layouts by FR-RTX and DW128 stem from the star-shaped arrangement of hub nodes and their connections [see Figure 2(c)]. The layouts by AROPE2, DW2, and GAE2 generally have small angles between incident edges as they optimize the dot-product similarity of connected nodes. Furthermore, we observe that AROPE2, DW2, and GAE2 score poorly for the Gabriel shape measure. These methods optimize the node embeddings for dot-product similarity whereas the Gabriel shape retrieves neighbors based on Euclidean distance.
FIGURE 3. Average readability scores for each dataset, ordered by number of nodes. Crosslessness and edge length uniformity
are scaled to better show the relative differences between methods. (a) Crosslessness. (b) Edge length uniformity. (c) Minimum
angle. (d) Gabriel shape similarity.
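The first three measures in Figure 3 can be sketched in a few lines. The formulas below follow common definitions from the graph drawing literature and are assumptions for illustration; they are not necessarily the exact variants or scalings used in the study. The function names and the tiny example graph are hypothetical, and the sketch assumes a simple graph with at least two edges of nonzero mean length.

```python
import math
from itertools import combinations


def edge_length_uniformity(pos, edges):
    """1 minus the coefficient of variation of edge lengths, normalized by
    its upper bound sqrt(|E| - 1)."""
    lengths = [math.dist(pos[u], pos[v]) for u, v in edges]
    mean = sum(lengths) / len(lengths)
    std = math.sqrt(sum((x - mean) ** 2 for x in lengths) / len(lengths))
    return 1.0 - (std / mean) / math.sqrt(len(lengths) - 1)


def minimum_angle(pos, edges):
    """1 minus the mean relative deviation of each node's smallest angle
    between incident edges from the ideal angle 2*pi / degree.
    Assumption: nodes with fewer than two incident edges are skipped."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, []).append(v)
        neighbors.setdefault(v, []).append(u)
    deviations = []
    for node, nbrs in neighbors.items():
        if len(nbrs) < 2:
            continue
        x0, y0 = pos[node]
        angles = sorted(math.atan2(pos[n][1] - y0, pos[n][0] - x0) for n in nbrs)
        gaps = [b - a for a, b in zip(angles, angles[1:])]
        gaps.append(2 * math.pi - (angles[-1] - angles[0]))  # wrap-around gap
        ideal = 2 * math.pi / len(nbrs)
        deviations.append(abs(ideal - min(gaps)) / ideal)
    return 1.0 - sum(deviations) / len(deviations) if deviations else 1.0


def _orient(a, b, c):
    """Signed area test: positive if a, b, c turn counterclockwise."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def _properly_cross(p1, p2, p3, p4):
    """True if segments p1p2 and p3p4 intersect in their interiors."""
    return (_orient(p3, p4, p1) * _orient(p3, p4, p2) < 0
            and _orient(p1, p2, p3) * _orient(p1, p2, p4) < 0)


def crosslessness(pos, edges):
    """1 minus the number of edge crossings divided by an upper bound that
    excludes pairs of edges sharing an endpoint."""
    crossings = sum(
        1
        for (a, b), (c, d) in combinations(edges, 2)
        if len({a, b, c, d}) == 4 and _properly_cross(pos[a], pos[b], pos[c], pos[d])
    )
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    m = len(edges)
    max_crossings = m * (m - 1) / 2 - sum(d * (d - 1) for d in degree.values()) / 2
    return 1.0 - crossings / max_crossings if max_crossings > 0 else 1.0


# Hypothetical example: a 4-cycle drawn as a unit square.
square = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (1.0, 1.0), 3: (0.0, 1.0)}
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(edge_length_uniformity(square, cycle))
print(minimum_angle(square, cycle))
print(crosslessness(square, cycle))
```

All three scores lie in [0, 1] with higher being better. The pairwise crossing test is quadratic in the number of edges, so a sketch like this is only practical for small graphs.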
Distance-Based Measures
We show the scores for second-order neighborhood preservation and stress in Figure 4(a) and (b), and the scores of the first-order neighborhood preservation measure in the supplementary material, available online; they are highly similar to the Gabriel shape measure. DW128 and DRGraph preserve the second-order neighborhood best. Considering all methods, the neighborhoods of the Facebook graph are better preserved than the neighborhoods of powergrid or Twitter. We presume that the community structure of the graph aligns well with the second-order neighborhoods. FR places these communities far apart and, thus, achieves the highest score. The differences in the stress measure are subtle. Averaged over all datasets, FR and FR-RTX result in layouts with the lowest stress. The t-SNE-based embeddings from AROPE128, DW128, and GAE16 are more distance faithful than their native two-dimensional node embeddings, but the opposite holds for CNE2 and CNE16.
Link-Prediction
Results are presented in Figure 4(c). No single method achieves the highest link prediction score on more than two datasets. Averaging over all datasets, the graph layouts based on the FR algorithm result in the highest link prediction scores. While it is difficult to identify much structure in the hairball-shaped Twitter graph layouts by FR and DRGraph, they score highly on the link-prediction task.
Runtime
The average runtime to embed the whole graphs using an Intel Xeon CPU E5-2620 v4 @ 2.10 GHz with one GeForce GTX 1080Ti is depicted in Figure 4(d). We ran the experiments for GAE on Twitter on a different machine with 256 GB RAM, resulting in mean runtimes of 5020 s for GAE2 and 4635 s for GAE16.
We notice that AROPE2 is the fastest method and that the runtime increase for AROPE128 is mainly caused by t-SNE, which took about 22, 43, and 660 s to reduce the dimensionality from 128 to 2 for Facebook, powergrid, and Twitter, respectively. FR-RTX has a slightly larger runtime than the other methods on the smaller graphs, possibly due to a small startup cost for a user interface. Notably, on the Twitter dataset, FR-RTX runs in less than a minute, while FR takes almost 24 h. CNE16 has lower runtime than CNE2 on the larger graphs as the optimization of the 16-dimensional node representations stabilizes earlier than the two-dimensional representations.
FIGURE 4. Average neighborhood preservation, stress, AUC-ROC, and runtime over four experiment repetitions. We scale the AUC values to better show the relative differences. (a) Second-order neighborhood preservation. (b) Stress. (c) Link prediction AUC-ROC. (d) Runtime in seconds on a logarithmic axis.
DISCUSSION
In this study, we have shown that visualizations by graph layout methods scored higher on the chosen quality measures than the native two-dimensional node embeddings by representation learning methods. The combination of DeepWalk with t-SNE resulted in visualizations with the best local neighborhood preservation and highest scores in Gabriel shape but does not scale to larger graphs. We believe that there is great potential in comparing a wider range of scalable graph layout and representation learning methods on real-world graphs with millions of nodes.
The standard definitions of Gabriel shape, neighborhood preservation, and stress all assume that the graph-theoretical distances are reflected by the Euclidean distances in the low-dimensional embedding. Methods such as AROPE, DeepWalk, and GAE optimize the embedding based on dot-product similarity. From this perspective, it is not surprising that the graph layouts by AROPE2, DW2, and GAE2 score worse on these measures. Although distances other than Euclidean may be interpretable, it is not obvious to what extent that may be the case. In addition to the quality measures, the standard version of t-SNE also defines high- and low-dimensional neighborhoods based on Euclidean distance. To retain the graph neighborhoods, we would have to adjust the similarity definition of t-SNE.
Recommendations for Practical Use
It is important to note that the choice of method is not universal and depends on the task for which the visualization is used. The designer of the visualization should ask which quality measure is most important for judging the effectiveness of the visualization for that task. For example, out-of-sample quality measures like link prediction performance may not be important when analyzing a static graph that has no missing edges. In other contexts, generalizability of the proximity of nodes, for which link prediction performance is a proxy, could be desirable.
No single winner emerged from the comparisons of graph representation learning with graph layout
15. T. N. Kipf and M. Welling, “Variational graph auto-encoders,” in Proc. NeurIPS Workshop Bayesian Deep Learn., 2016.
16. T. Fruchterman and E. Reingold, “Graph drawing by force-directed placement,” Softw.-Pract. Exp., vol. 21, no. 11, pp. 1129–1164, 1991.
17. A. C. Mara, J. Lijffijt, and T. De Bie, “Benchmarking network embedding models for link prediction: Are we making progress?,” in Proc. IEEE 7th Int. Conf. Data Sci. Adv. Anal., 2020, pp. 138–147.
18. L. Van der Maaten and G. Hinton, “Visualizing data using t-SNE,” J. Mach. Learn. Res., vol. 9, no. 11, pp. 2579–2605, 2008.
19. A. Hagberg et al., “Exploring network structure, dynamics, and function using networkx,” in Proc. SciPy, 2008, pp. 11–15.
20. M. Jacomy et al., “ForceAtlas2, a continuous graph layout algorithm for handy network visualization designed for the gephi software,” PLoS One, vol. 9, no. 6, Jun. 2014, Art. no. e98679.
EDITH HEITER is a Ph.D. student with the IDLab, Ghent University, Ghent, Belgium. Heiter received the M.Sc. degree in computer science from Saarland University, Saarbrücken, Germany. She is the corresponding author of this article. Contact her at [email protected].
BO KANG is a postdoctoral researcher with the IDLab, Ghent University, Ghent, Belgium. Kang received the Ph.D. degree in computer science engineering from Ghent University, in 2019. Contact him at [email protected].
TIJL DE BIE is a full professor with Ghent University, Ghent, Belgium, having previously been affiliated with the KU Leuven, Leuven, Belgium, the University of Southampton, Southampton, U.K., University of California (UC) Berkeley, Berkeley, CA, USA, UC Davis, Davis, CA, and Bristol University, Bristol, U.K. Contact him at [email protected].
JEFREY LIJFFIJT is a professor with Ghent University, Ghent, Belgium. Lijffijt received the Doctor of Science degree in technology from Aalto University, Espoo, Finland, in 2013. Contact him at jefrey.lijffi[email protected].
Contact department editor Mike Potel at [email protected].