0% found this document useful (0 votes)

79 views9 pages

F G R L P T G: AST Raph Epresentation Earning With Y Orch Eometric

This document introduces PyTorch Geometric (PyG), a library for deep learning on graph-structured data built on PyTorch. PyG implements many recent graph neural network models through a message passing interface that allows efficient computation on GPUs via sparse kernels. It supports learning on various graph types including homogeneous and heterogeneous graphs, point clouds, and manifolds. PyG includes operators for convolution, pooling, and readout and allows flexible modeling through its high-level interface.

Uploaded by

Ilie Mihai

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

79 views9 pages

F G R L P T G: AST Raph Epresentation Earning With Y Orch Eometric

Uploaded by

Ilie Mihai

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 9

Published as a workshop paper at ICLR 2019

FAST G RAPH R EPRESENTATION L EARNING WITH

PYTORCH G EOMETRIC
Matthias Fey & Jan E. Lenssen
Department of Computer Graphics
TU Dortmund University
44227 Dortmund, Germany
{matthias.fey,janeric.lenssen}@udo.edu
arXiv:1903.02428v3 [cs.LG] 25 Apr 2019

A BSTRACT
We introduce PyTorch Geometric, a library for deep learning on irregularly struc-
tured input data such as graphs, point clouds and manifolds, built upon PyTorch. In
addition to general graph data structures and processing methods, it contains a va-
riety of recently published methods from the domains of relational learning and 3D
data processing. PyTorch Geometric achieves high data throughput by leveraging
sparse GPU acceleration, by providing dedicated CUDA kernels and by introduc-
ing efficient mini-batch handling for input examples of different size. In this work,
we present the library in detail and perform a comprehensive comparative study of
the implemented methods in homogeneous evaluation scenarios.

1 INTRODUCTION
Graph Neural Networks (GNNs) recently emerged as a powerful approach for representation learning
on graphs, point clouds and manifolds (Bronstein et al., 2017; Kipf & Welling, 2017). Similar to the
concepts of convolutional and pooling layers on regular domains, GNNs are able to (hierarchically)
extract localized embeddings by passing, transforming, and aggregating information (Bronstein et al.,
2017; Gilmer et al., 2017; Battaglia et al., 2018; Ying et al., 2018; Morris et al., 2019).
However, implementing GNNs is challenging, as high GPU throughput needs to be achieved on
highly sparse and irregular data of varying size. Here, we introduce PyTorch Geometric (PyG), a
geometric deep learning extension library for PyTorch (Paszke et al., 2017) which achieves high
performance by leveraging dedicated CUDA kernels. Following a simple message passing API, it
bundles most of the recently proposed convolutional and pooling layers into a single and unified
framework. All implemented methods support both CPU and GPU computations and follow an
immutable data flow paradigm that enables dynamic changes in graph structures through time. PyG
is released under the MIT license and is available on GitHub.1 It is thoroughly documented and
provides accompanying tutorials and examples as a first starting point.2

2 OVERVIEW
In PyG, we represent a graph  = (X, (I, E)) by a node feature matrix X ∈ ℝ𝑁×𝐹 of 𝑁 nodes and
a sparse adjacency tuple (I, E) of 𝐸 edges, where I ∈ ℕ2×𝐸 encodes edge indices in COOrdinate
(COO) format and E ∈ ℝ𝐸×𝐷 (optionally) holds 𝐷-dimensional edge features. All user facing APIs,
e.g., data loading routines, multi-GPU support, data augmentation or model instantiations are heavily
inspired by PyTorch to keep them as familiar as possible.

Neighborhood Aggregation. Generalizing the convolutional operator to irregular domains is typ-

ically expressed as a neighborhood aggregation or message passing scheme (Gilmer et al., 2017)
( )
( )
𝑥⃗′𝑖 = 𝛾 𝑥⃗𝑖 , ⬚ 𝜙 𝑥⃗𝑖 , 𝑥⃗𝑗 , 𝑒⃗𝑗,𝑖 (1)
𝑗∈ (𝑖)
1
GitHub repository: https://github.com/rusty1s/pytorch_geometric
2
Documentation: https://rusty1s.github.io/pytorch_geometric

1
Published as a workshop paper at ICLR 2019

( )
𝑥⃗4 𝜙 𝑥⃗1 , 𝑥⃗2 , 𝑒⃗2,1
( )
𝑥⃗2 𝜙 𝑥⃗1 , 𝑥⃗3 , 𝑒⃗3,1
( )
𝑥⃗1 𝜙 𝑥⃗1 , 𝑥⃗4 , 𝑒⃗4,1 𝑥⃗′1
gather (𝑰) scatter_ ⬚ (𝑰)
𝛾
𝑥⃗3

Figure 1: Computation scheme of a GNN layer by leveraging gather and scatter methods based on
edge indices I, hence alternating between node parallel space and edge parallel space.

where ⬚ denotes a differentiable, permutation invariant function, e.g., sum, mean or max, and 𝛾
and 𝜙 denote differentiable functions, e.g., MLPs. In practice, this can be achieved by gathering
and scattering of node features and vectorized element-wise computation of 𝛾 and 𝜙, as visualized in
Figure 1. Although working on irregularly structured input, this scheme can be heavily accelerated by
the GPU. In contrast to implementations via sparse matrix multiplications, the usage of gather/scatter
proves to be advantageous for low-degree graphs and non-coalesced input (cf. Appendix A), and
allows for the integration of central node and multi-dimensional edge information while aggregating.
We provide the user with a general MessagePassing interface to allow for rapid and clean proto-
typing of new research ideas. In order to use, users only need to define the methods 𝜙, i.e., message,
and 𝛾, i.e., update, as well as chosing an aggregation scheme ⬚. For implementing 𝜙, node features
are automatically mapped to the respective source and target nodes.
Almost all recently proposed neighborhood aggregation functions can be lifted to this interface, in-
cluding (but not limited to) the methods already integrated into PyG: For learning on arbitrary graphs
we have implemented GCN (Kipf & Welling, 2017) and its simplified version (SGC) from Wu et al.
(2019), the spectral chebyshev and ARMA filter convolutions (Defferrard et al., 2016; Bianchi et al.,
2019), GraphSAGE (Hamilton et al., 2017), the attention-based operators GAT (Veličković et al.,
2018) and AGNN (Thekumparampil et al., 2018), the Graph Isomorphism Network (GIN) from Xu
et al. (2019), the Approximate Personalized Propagation of Neural Predictions (APPNP) operator
(Klicpera et al., 2019), the Dynamic Neighborhood Aggregation (DNA) operator (Fey, 2019) and
the signed operator for learning in signed networks (Derr et al., 2018).
For learning on point clouds, manifolds and graphs with multi-dimensional edge features, we pro-
vide the relational GCN operator from Schlichtkrull et al. (2018), PointNet++ (Qi et al., 2017),
PointCNN (Li et al., 2018), and the continuous kernel-based methods MPNN (Gilmer et al., 2017;
Simonovsky & Komodakis, 2017), MoNet (Monti et al., 2017), SplineCNN (Fey et al., 2018) and
the edge convolution operator (EdgeCNN) from Wang et al. (2018b).
In addition to these operators, we provide high-level implementations of, e.g., maximizing mutual in-
formation (Veličković et al., 2019), autoencoding graphs (Kipf & Welling, 2016; Pan et al., 2018), ag-
gregating jumping knowledge (Xu et al., 2018), and predicting temporal events in knowledge graphs
(Jin et al., 2019).

Global Pooling. PyG also supports graph-level outputs as opposed to node-level outputs by pro-
viding a variety of readout functions such as global add, mean or max pooling. We additionaly offer
more sophisticated methods such as set-to-set (Vinyals et al., 2016), sort pooling (Zhang et al., 2018)
or the global soft attention layer from Li et al. (2016).

Hierarchical Pooling. To further extract hierarchical information and to allow deeper GNN mod-
els, various pooling approaches can be applied in a spatial or data-dependent manner. We currently
provide implementation examples for Graclus (Dhillon et al., 2007; Fagginger Auer & Bisseling,
2011) and voxel grid pooling (Simonovsky & Komodakis, 2017), the iterative farthest point sam-
pling algorithm (Qi et al., 2017) followed by 𝑘-NN or query ball graph generation (Qi et al., 2017;
Wang et al., 2018b), and differentiable pooling mechanisms such as DiffPool (Ying et al., 2018) and
top𝑘 pooling (Gao & Ji, 2018; Cangea et al., 2018).

2
Published as a workshop paper at ICLR 2019

Table 1: Semi-supervised node classification with both fixed and random splits.
Cora CiteSeer PubMed
Method
Fixed Random Fixed Random Fixed Random
Cheby 81.4 ± 0.7 77.8 ± 2.2 70.2 ± 1.0 67.7 ± 1.7 78.4 ± 0.4 75.8 ± 2.2
GCN 81.5 ± 0.6 79.4 ± 1.9 71.1 ± 0.7 68.1 ± 1.7 79.0 ± 0.6 77.4 ± 2.4
GAT 83.1 ± 0.4 81.0 ± 1.4 70.8 ± 0.5 69.2 ± 1.9 78.5 ± 0.3 78.3 ± 2.3
SGC 81.7 ± 0.1 80.2 ± 1.6 71.3 ± 0.2 68.7 ± 1.6 78.9 ± 0.1 76.5 ± 2.4
ARMA 82.8 ± 0.6 80.7 ± 1.4 72.3 ± 1.1 68.9 ± 1.6 78.8 ± 0.3 77.7 ± 2.6
APPNP 83.3 ± 0.5 82.2 ± 1.5 71.8 ± 0.5 70.0 ± 1.4 80.1 ± 0.2 79.4 ± 2.2

Mini-batch Handling. Our framework supports batches of multiple graph instances (of potentially
different size) by automatically creating a single (sparse) block-diagonal adjacency matrix and con-
catenating feature matrices in the node dimension. Therefore, neighborhood aggregation methods
can be applied without modification, since no messages are exchanged between disconnected graphs.
In addition, an automatically generated assignment vector ensures that node-level information is not
aggregated across graphs, e.g., when executing global aggregation operators.

Processing of Datasets. We provide a consistent data format and an easy-to-use interface for the
creation and processing of datasets, both for large datasets and for datasets that can be kept in memory
during training. In order to create new datasets, users just need to read/download their data and
convert it to the PyG data format in the respective process method. In addition, datasets can be
modified by the use of transforms, which take in separate graphs and transform them, e.g., for
data augmentation, for enhancing node features with synthetic structural graph properties (Cai &
Wang, 2018), to automatically generate graphs from point clouds or to sample point clouds from
meshes.
PyG already supports a lot of common benchmark datasets often found in literature which are au-
tomatically downloaded and processed on first instantiation. In detail, we provide over 60 graph
kernel benchmark datasets3 (Kersting et al., 2016), e.g., PROTEINS or IMDB-BINARY, the cita-
tion graphs Cora, CiteSeer, PubMed and Cora-Full (Sen et al., 2008; Bojchevski & Günnemann,
2018), the Coauthor CS/Physics and Amazon Computers/Photo datasets from Shchur et al. (2018),
the molecule datasets QM7b (Montavon et al., 2013) and QM9 (Ramakrishnan et al., 2014), the
protein-protein interaction graphs from Hamilton et al. (2017), and the temporal datasets Bitcoin-
OTC (Kumar et al., 2016), ICEWS (Boschee et al., 2015) and GDELT (Leetaru & Schrodt, 2013). In
addition, we provide embedded datasets like MNIST superpixels (Monti et al., 2017), FAUST (Bogo
et al., 2014), ModelNet10/40 (Wu et al., 2015), ShapeNet (Chang et al., 2015), COMA (Ranjan et al.,
2018) and the PCPNet dataset from Guerrero et al. (2018).

3 E MPIRICAL E VALUATION

We evaluate the correctness of the implemented methods by performing a comprehensive compara-

tive study in homogeneous evaluation scenarios. Descriptions and statistics of all used datasets can
be found in Appendix B. For all experiments, we tried to follow the hyperparameter setup of the
respective papers as closely as possible. The individual experimental setups can be derived and all
experiments can be replicated from the code provided at our GitHub repository.4

Semi-supervised Node Classification. We perform semi-supervised node classification (cf. Ta-

ble 1) by reporting average accuracies of (a) 100 runs for the fixed train/val/test split from Kipf &
Welling (2017), and (b) 100 runs of randomly initialized train/val/test splits as suggested by Shchur
et al. (2018), where we additionally ensure uniform class distribution on the train split.
Nearly all experiments show a high reproducibility of the results reported in the respective papers.
However, test performance is worse for all models when using random data splits. Among the ex-
periments, the APPNP operator (Klicpera et al., 2019) generally performs best, while the ARMA
3
Kernel datasets: http://graphkernels.cs.tu-dortmund.de
4
https://github.com/rusty1s/pytorch_geometric/benchmark

3
Published as a workshop paper at ICLR 2019

Table 2: Graph classification.

IMDB- REDDIT-
Method MUTAG PROTEINS COLLAB
BINARY BINARY
GCN 74.6 ± 7.7 73.1 ± 3.8 80.6 ± 2.1 72.6 ± 4.5 89.3 ± 3.3
SAGE 74.9 ± 8.7 73.8 ± 3.6 79.7 ± 1.7 72.4 ± 3.6 89.1 ± 1.9
Flat

GIN-0 85.7 ± 7.7 72.1 ± 5.1 79.3 ± 2.7 72.8 ± 4.5 89.6 ± 2.6
GIN-𝜖 83.4 ± 7.5 72.6 ± 4.9 79.8 ± 2.4 72.1 ± 5.1 90.3 ± 3.0
Graclus 77.1 ± 7.2 73.0 ± 4.1 79.6 ± 2.0 72.2 ± 4.2 88.8 ± 3.2
Hier.

top𝑘 76.3 ± 7.5 72.7 ± 4.1 79.7 ± 2.2 72.5 ± 4.6 87.6 ± 2.4
DiffPool 85.0 ± 10.3 75.1 ± 3.5 78.9 ± 2.3 72.6 ± 3.9 92.1 ± 2.6
SAGE w/o JK 73.7 ± 7.8 72.7 ± 3.6 79.6 ± 2.4 72.1 ± 4.4 87.9 ± 1.9
Global

GlobalAttention 74.6 ± 8.0 72.5 ± 4.5 79.6 ± 2.2 72.3 ± 3.8 87.4 ± 2.5
Set2Set 73.7 ± 6.9 73.6 ± 3.7 79.6 ± 2.3 72.2 ± 4.2 89.6 ± 2.4
SortPool 77.3 ± 8.9 72.4 ± 4.1 77.7 ± 3.1 72.4 ± 3.8 74.9 ± 6.7

Table 4: Training runtime comparison.

DGL DGL
Table 3: Point cloud classification. Dataset Method PyG
DB GS
Method ModelNet10 GCN 4.19s 0.32s 0.25s
Cora
MPNN 92.07 GAT 6.31s 5.36s 0.80s
PointNet++ 92.51 GCN 3.78s 0.34s 0.30s
EdgeCNN 92.62 CiteSeer
GAT 5.61s 4.91s 0.88s
SplineCNN 92.65
PointCNN 93.28 GCN 12.91s 0.36s 0.32s
PubMed
GAT 18.69s 13.76s 2.42s
MUTAG RGCN 18.81s 2.40s 2.14s

(Bianchi et al., 2019), SGC (Wu et al., 2019), GCN (Kipf & Welling, 2017) and GAT (Veličković
et al., 2018) operators follow closely behind.

Graph Classification. We report the average accuracy of 10-fold cross validation on a number of
common benchmark datasets (cf. Table 2) where we randomly sample a training fold to serve as a
validation set. We only make use of discrete node features. In case they are not given, we use one-hot
encodings of node degrees as feature input. For all experiments, we use the global mean operator
to obtain graph-level outputs. Inspired by the Jumping Knowledge framework (Xu et al., 2018), we
compute graph-level outputs after each convolutional layer and combine them via concatenation.
For evaluating the (global) pooling operators, we use the GraphSAGE operator as our baseline. We
omit Jumping Knowledge when comparing global pooling operators, and hence report an additional
baseline based on global mean pooling. For each dataset, we tune (1) the number of hidden units
∈ {16, 32, 64, 128} and (2) the number of layers ∈ {2, 3, 4, 5} with respect to the validation set.
Due to standardized evaluations and network architectures, not all results are aligned with their offi-
cial reported values. For example, except for DiffPool (Ying et al., 2018), (global) pooling operators
do not perform as benefically as expected to their respective (flat) counterparts, especially when
baselines are enhanced by Jumping Knowledge (Xu et al., 2018). However, the potential of more
sophisticated approaches may not be well-reflected on these simple benchmark tasks (Cai & Wang,
2018). Among the flat GNN approaches, the GIN layer (Xu et al., 2019) generally achieves the best
results.

Point Cloud Classification. We evaluate various point cloud methods on ModelNet10 (Wu et al.,
2015) where we uniformly sample 1,024 points from mesh surfaces based on face area (cf. Table 3).
As hierarchical pooling layers, we use the iterative farthest point sampling algorithm followed by a
new graph generation based on a larger query ball (PointNet++ (Qi et al., 2017), MPNN (Gilmer
et al., 2017; Simonovsky & Komodakis, 2017) and SplineCNN (Fey et al., 2018)) or based on a fixed

4
Published as a workshop paper at ICLR 2019

number of nearest neighbors (EdgeCNN (Wang et al., 2018b) and PointCNN (Li et al., 2018)). We
have taken care to use approximately the same number of parameters for each model.
All approaches perform nearly identically with PointCNN (Li et al., 2018) taking a slight lead. We
attribute this to the fact that all operators are based on similar principles and might have the same
expressive power for the given task.

Runtime Experiments. We conduct several experiments on a number of dataset-model pairs to

report the runtime of a whole training procedure for 200 epochs obtained on a single NVIDIA GTX
1080 Ti (cf. Table 4). As it shows, PyG is very fast despite working on sparse data. Compared to the
Degree Bucketing (DB) approach of the Deep Graph Library (DGL) v0.2 (Wang et al., 2018a), PyG
trains models up to 40 times faster. Although runtimes are comparable when using gather and scatter
optimizations (GS) inside DGL, we could further improve runtimes of GAT (Veličković et al., 2018)
by up to 7 times by providing our own optimized sparse softmax kernels.

4 ROADMAP AND CONCLUSION

We presented the PyTorch Geometric framework for fast representation learning on graphs, point
clouds and manifolds. We are actively working to further integrate existing methods and plan to
quickly integrate future methods into our framework. All researchers and software engineers are
invited to collaborate with us in extending its scope.

ACKNOWLEDGMENTS
This work has been supported by the German Research Association (DFG) within the Collaborative
Research Center SFB 876, Providing Information by Resource-Constrained Analysis, projects A6
and B2. We thank Moritz Ludolph and all other contributors for their amazing involvement in this
project. Last but not least, we thank Christopher Morris for fruitful discussions, proofreading and
helpful advice.

R EFERENCES
P. W. Battaglia, J. B. Hamrick, V. Bapst, A. Sanchez-Gonzalez, V. F. Zambaldi, M. Malinowski,
A. Tacchetti, D. Raposo, A. Santoro, R. Faulkner, Ç. Gülçehre, F. Song, A. J. Ballard, J. Gilmer,
G. E. Dahl, A. Vaswani, K. Allen, C. Nash, V. Langston, C. Dyer, N. Heess, D. Wierstra, P. Kohli,
M. Botvinick, O. Vinyals, Y. Li, and R. Pascanu. Relational inductive biases, deep learning, and
graph networks. CoRR, abs/1806.01261, 2018.
F. M. Bianchi, D. Grattarola, L. Livi, and C. Alippi. Graph neural networks with convolutional
ARMA filters. CoRR, abs/1901.01343, 2019.
F. Bogo, J. Romero, M. Loper, and M. J. Black. FAUST: Dataset and evaluation for 3D mesh regis-
tration. In CVPR, 2014.
A. Bojchevski and S. Günnemann. Deep gaussian embedding of attributed graphs: Unsupervised
inductive learning via ranking. In ICLR, 2018.
E. Boschee, J. Lautenschlager, S. O’Brien, S. Shellman, J. Starz, and M. Ward. ICEWS coded event
data. Harvard Dataverse, 2015.
M. M. Bronstein, J. Bruna, Y. LeCun, A. Szlam, and P. Vandergheynst. Geometric deep learning:
Going beyond euclidean data. In Signal Processing Magazine, 2017.
C. Cai and Y. Wang. A simple yet effective baseline for non-attribute graph classification. CoRR,
abs/1811.03508, 2018.
C. Cangea, P. Veličković, N. Jovanović, T. N. Kipf, and P. Liò. Towards sparse hierarchical graph
classifiers. In NeurIPS-W, 2018.
A. X. Chang, T. Funkhouser, L. J. Guibas, P. Hanrahan, Q. Huang, Z. Li, S. Savarese, M. Savva,
S. Song, H. Su, J. Xiao, L. Yi, and F. Yu. ShapeNet: An information-rich 3D model repository.
CoRR, abs/1512.03012, 2015.

5
Published as a workshop paper at ICLR 2019

M. Defferrard, X. Bresson, and P. Vandergheynst. Convolutional neural networks on graphs with

fast localized spectral filtering. In NIPS, 2016.
T. Derr, Y. Ma, and J. Tang. Signed graph convolutional networks. In ICDM, 2018.
I. S. Dhillon, Y. Guan, and B. Kulis. Weighted graph cuts without eigenvectors: A multilevel ap-
proach. In TPAMI, 2007.
B. O. Fagginger Auer and R. H. Bisseling. A GPU algorithm for greedy graph matching. In Facing
the Multicore - Challenge II - Aspects of New Paradigms and Technologies in Parallel Computing,
2011.
M. Fey. Just jump: Dynamic neighborhood aggregation in graph neural networks. In ICLR-W, 2019.
M. Fey, J. E. Lenssen, F. Weichert, and H. Müller. SplineCNN: Fast geometric deep learning with
continuous B-spline kernels. In CVPR, 2018.
H. Gao and S. Ji. Graph U-Net. https://openreview.net/forum?id=HJePRoAct7,
2018. Submitted to ICLR.
J. Gilmer, S. S. Schoenholz, P. F. Riley, O. Vinyals, and G. E. Dahl. Neural message passing for
quantum chemistry. In ICML, 2017.
P. Guerrero, Y. Kleiman, M. Ovsjanikov, and N. J. Mitra. PCPNet: Learning local shape properties
from raw point clouds. Computer Graphics Forum, 37, 2018.
W. L. Hamilton, R. Ying, and J. Leskovec. Inductive representation learning on large graphs. In
NIPS, 2017.
W. Jin, C. Zhang, P. Szekely, and X. Ren. Recurrent event network for reasoning over temporal
knowledge graphs. In ICLR-W, 2019.
K. Kersting, N. M. Kriege, C. Morris, P. Mutzel, and M. Neumann. Benchmark data sets for graph
kernels. http://graphkernels.cs.tu-dortmund.de, 2016.
T. N. Kipf and M. Welling. Variational graph auto-encoders. In NIPS-W, 2016.
T. N. Kipf and M. Welling. Semi-supervised classification with graph convolutional networks. In
ICLR, 2017.
J. Klicpera, A. Bojchevski, and S. Günnemann. Predict then propagate: Graph neural networks meet
personalized PageRank. In ICLR, 2019.
S. Kumar, F. Spezzano, V. Subrahmanian, and C. Faloutsos. Edge weight prediction in weighted
signed networks. In ICDM, 2016.
K. Leetaru and P. A. Schrodt. GDELT: Global data on events, location, and tone. ISA Annual
Convention, 2013.
Y. Li, D. Tarlow, M. Brockschmidt, and R. Zemel. Gated graph sequence neural networks. In ICLR,
2016.
Y. Li, R. Bu, M. Sun, W. Wu, X. Di, and B. Chen. PointCNN: Convolution on -transformed points.
In NeurIPS, 2018.
G. Montavon, M. Rupp, V. Gobre, A. Vazquez-Mayagoitia, K. Hansen, A. Tkatchenko, K. Müller,
and O. A. von Lilienfeld. Machine learning of molecular electronic properties in chemical com-
pound space. New Journal of Physics, 2013.
F. Monti, D. Boscaini, J. Masci, E. Rodolà, J. Svoboda, and M. M. Bronstein. Geometric deep
learning on graphs and manifolds using mixture model CNNs. In CVPR, 2017.
C. Morris, M. Ritzert, M. Fey, W. L. Hamilton, J. E. Lenssen, G. Rattan, and M. Grohe. Weisfeiler
and Leman go neural: Higher-order graph neural networks. In AAAI, 2019.

6
Published as a workshop paper at ICLR 2019

S. Pan, R. Hu, G. Long, J. Jiang, L. Yao, and C. Zhang. Adversarially regularized graph autoencoder
for graph embedding. In IJCAI, 2018.
A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, Z. Lin, A. Desmaison, L. Antiga,
and A. Lerer. Automatic differentiation in PyTorch. In NIPS-W, 2017.
C. R. Qi, L. Yi, H. Su, and L. J. Guibas. PointNet++: Deep hierarchical feature learning on point
sets in a metric space. In NIPS, 2017.
R. Ramakrishnan, P. O. Dral, M. Rupp, and O. A. von Lilienfeld. Quantum chemistry structures and
properties of 134 kilo molecules. Scientific Data, 2014.
A. Ranjan, T. Bolkart, S. Sanyal, and M. J. Black. Generating 3D faces using convolutional mesh
autoencoders. In ECCV, 2018.
M. S. Schlichtkrull, T. N. Kipf, P. Bloem, R. van den Berg, I. Titov, and M. Welling. Modeling
relational data with graph convolutional networks. In ESWC, 2018.
G. Sen, G. Namata, M. Bilgic, and L. Getoor. Collective classification in network data. AI Magazine,
29, 2008.
O. Shchur, M. Mumme, A. Bojchevski, and S. Günnemann. Pitfalls of graph neural network evalu-
ation. In NeurIPS-W, 2018.
M. Simonovsky and N. Komodakis. Dynamic edge-conditioned filters in convolutional neural net-
works on graphs. In CVPR, 2017.
K. K. Thekumparampil, C. Wang, S. Oh, and L. Li. Attention-based graph neural network for semi-
supervised learning. CoRR, abs/1803.03735, 2018.
P. Veličković, G. Cucurull, A. Casanova, A. Romero, P. Liò, and Y. Bengio. Graph attention net-
works. In ICLR, 2018.
P. Veličković, W. Fedus, W. L. Hamilton, P. Liò, Y. Bengio, and R. D. Hjeml. Deep graph infomax.
In ICLR, 2019.
O. Vinyals, S. Bengio, and M. Kudlur. Order matters: Sequence to sequence for sets. In ICLR, 2016.
M. Wang, L. Yu, A. Gan, D. Zheng, Y. Gai, Z. Ye, M. Li, J. Zhou, Q. Huang, J. Zhao, H. Lin,
C. Ma, D. Deng, Q. Guo, H. Zhang, J. Li, A. J. Smola, and Z. Zhang. Deep graph library. http:
//dgl.ai, 2018a.
Y. Wang, Y. Sun, Z. Liu, S. E. Sarma, M. M. Bronstein, and J. M. Solomon. Dynamic graph CNN
for learning on point clouds. CoRR, abs/1801.07829, 2018b.
F. Wu, T. Zhang, A. H. de Souza Jr., C. Fifty, T. Yu, and K. Q. Weinberger. Simplifying graph
convolutional networks. CoRR, abs/1902.07153, 2019.
Z. Wu, S. Song, A. Khosla, F. Yu, L. Zhang, X. Tang, and J. Xiao. 3D ShapeNets: A deep represen-
tation for volumetric shapes. In CVPR, 2015.
K. Xu, C. Li, Y. Tian, T. Sonobe, K. Kawarabayashi, and S. Jegelka. Representation learning on
graphs with jumping knowledge networks. In ICML, 2018.
K. Xu, W. Hu, J. Leskovec, and S. Jegelka. How powerful are graph neural networks? In ICLR,
2019.
R. Ying, J. You, C. Morris, X. Ren, W. Hamilton, and J. Leskovec. Hierarchical graph representation
learning with differentiable pooling. In NeurIPS, 2018.
M. Zhang, Z. Cui, M. Neumann, and Y. Chen. An end-to-end deep learning architecture for graph
classification. In AAAI, 2018.

7
Published as a workshop paper at ICLR 2019

Coalesced GS Non-coalesced GS Coalesced SpMM Non-coalesced SpMM

2
Forward
Seconds for 1 000 runs
Backward
1.5

0.5

0
1 2 4 8 16 32 64 128 256
Average degree of 10 000 nodes

Figure 2: Forward and backward runtimes of 1 000 runs of gather and scatter operations (GS) in
comparison to sparse-matrix multiplications (SpMM) on Erdős RÃľnyi graphs with 10 000 nodes
and varying average node degrees. Runtimes are capped at two seconds due to visualization. We
report runtimes for both coalesced (i.e. ordered by row index) and non-coalesced sparse layout.

A GATHER AND SCATTER O PERATIONS

PyTorch Geometric makes heavy usage of gather and scatter operations to map node and edge in-
formation into edge and node parallel space, respectively. Despite inevitable non-coalesced memory
access, these operations achieve high data throughput by leveraging parallelization over all elements
and making use of atomic operations with approximately constant runtime in practice. Following
upon the PyTorch scatter_add implementation, we provide our own mean and max operations
to allow for all different kinds of aggregation.5
Figure 2 compares the runtime of gather and scatter operations (GS) to the frequently used alternative
of using sparse-matrix multiplications (SpMM). It shows that atomic operations only begin to throttle
the runtime when scattering graphs with high average node degree (≈ 128) and even exceed the
runtime of highly optimized SpMM executions, both for forward and backward passes.
Due to SpMM first converting adjacency matrices into Compressed Row Storage (CSR) format, it
expects coalesced sparse tensors (i.e. ordered by row index) which is expensive to compute on GPUs
and should be hence performed as part of the pre-processing. However, for the backward pass of
SpMM, coalescing is performed in any case due to the need of transposing the sparse tensor. In
contrast, GS is always fast, nevertheless of the input being coalesced. Additionally, it allows for
modifications of the graph connectivity (i.e. adding self-loops), allows bidirectional data flow, and
does naturally support the integration of central node and multi-dimensional edge information.
However, we do think that our GS scheme can still be improved, e.g., in highly dense graph settings
and towards reducing the memory footprint in the edge parallel space. In addition, it should be noted
that scatter operations are non-deterministic by nature on the GPU. Although we did not observe any
deviations for inference, training results can vary across the same manual seeds.

B DATASETS

We give detailed descriptions and statistics (cf. Table 5) of the datasets used in our experiments:

Citation Networks. In the citation network datasets Cora, Citeseer and Pubmed nodes represent
documents and edges represent citation links. The networks contain bag-of-words feature vectors for
each document. We treat the citation links as (undirected) edges. For training, we use 20 labels per
class.

5
GitHub repository: https://github.com/rusty1s/pytorch_scatter

8
Published as a workshop paper at ICLR 2019

Table 5: Statistics of the datasets used in the experiments.

Dataset Graphs Nodes Edges Features Classes Label rate
Cora 1 2,708 5,278 1,433 7 0.052
CiteSeer 1 3,327 4,552 3,703 6 0.036
PubMed 1 19,717 44,324 500 3 0.003
MUTAG 188 17.93 19.79 7 2 0.800
PROTEINS 1,113 39.06 72.82 3 2 0.800
COLLAB 5,000 74.49 2,457.22 — 3 0.800
IMDB-BINARY 1,000 19.77 96.53 — 2 0.800
REDDIT-BINARY 2,00 429.63 497.754 — 2 0.800
ModelNet10 4,899 1,024 ∼19,440 — 10 0.815

Social Network Datasets. COLLAB is derived from three public scientific collaboration datasets.
Each graph corresponds to an ego-network of different researchers from each field with the task to
label each graph to the field the corresponding researcher belongs to. IMDB-BINARY is a movie
collaboration dataset where each graph corresponds to an ego-network of actors/actresses. An edge
is drawn between two actors/actresses if they appear in the same movie. The task is to infer the genre
of the graph. REDDIT-BINARY is an online discussion dataset where each graph corresponds to a
thread. An edge is drawn between two users if one of them responded to another’s comment. The
task is to label each graph to the community/subreddit it belongs to.

Bioinformatic Datasets. MUTAG is a dataset consisting of mutagenetic aromatic and heteroaro-

matic nitro compounds. PROTEINS holds a set of proteins represented by graphs. Nodes represent
secondary structure elements (SSEs) which are connected whenever there are neighbors either in the
amino acid sequence or in 3D space.

3D Object Datasets. ModelNet10 is an orientation-aligned dataset of CAD models. Each model

corresponds to exactly one out of 10 object categories. Categories were chosen based on a list of the
most common object categories in the world.

Technical - Manual - For - Agilia - SP - Range - in - v1.6 - Agilia SP MC - Eng
No ratings yet
Technical - Manual - For - Agilia - SP - Range - in - v1.6 - Agilia SP MC - Eng
302 pages
Unit III GNN
No ratings yet
Unit III GNN
56 pages
GNNV4
No ratings yet
GNNV4
55 pages
Graph Neural Networks in Particle Physics: Implementations, Innovations, and Challenges
No ratings yet
Graph Neural Networks in Particle Physics: Implementations, Innovations, and Challenges
34 pages
GNN Foundations Frontiers and Applications Chapter6
No ratings yet
GNN Foundations Frontiers and Applications Chapter6
21 pages
PyTorch & PyTorch Geometric
No ratings yet
PyTorch & PyTorch Geometric
21 pages
ReGNN A Redundancy-Eliminated Graph Neural Networks Accelerator
No ratings yet
ReGNN A Redundancy-Eliminated Graph Neural Networks Accelerator
15 pages
A Graph Neural Network Accelerator
No ratings yet
A Graph Neural Network Accelerator
14 pages
GCNN
No ratings yet
GCNN
11 pages
Applying Graph Neural Networks To The Decision Ver 240324 001255
No ratings yet
Applying Graph Neural Networks To The Decision Ver 240324 001255
14 pages
Enadpool: The Edge-Node Attention-Based Differentiable Pooling For Graph Neural Networks
No ratings yet
Enadpool: The Edge-Node Attention-Based Differentiable Pooling For Graph Neural Networks
9 pages
GRL Unit 3
No ratings yet
GRL Unit 3
14 pages
Hierarchical Graph Neural Networks
No ratings yet
Hierarchical Graph Neural Networks
14 pages
Simonovsky Dynamic Edge-Conditioned Filters CVPR 2017 Paper
No ratings yet
Simonovsky Dynamic Edge-Conditioned Filters CVPR 2017 Paper
10 pages
Tutorial 3
No ratings yet
Tutorial 3
82 pages
Hierarchical Graph Pooling With Structure Learning
No ratings yet
Hierarchical Graph Pooling With Structure Learning
9 pages
A New Model For Learning in Graph Domains
No ratings yet
A New Model For Learning in Graph Domains
7 pages
Dynamic Hypergraph Neural Networks: Jianwen Jiang Yuxuan Wei Yifan Feng Jingxuan Cao Yue Gao
No ratings yet
Dynamic Hypergraph Neural Networks: Jianwen Jiang Yuxuan Wei Yifan Feng Jingxuan Cao Yue Gao
7 pages
Original Paper
No ratings yet
Original Paper
10 pages
Limitations of MP-GNN: Link Prediction
No ratings yet
Limitations of MP-GNN: Link Prediction
50 pages
QuadroUno Comp PDF
100% (7)
QuadroUno Comp PDF
29 pages
Ometric
100% (1)
Ometric
26 pages
Masked Attention Is All You Need For Graphs: Duvenaud Et Al. 2015 Kearnes Et Al. 2016 Gilmer Et Al. 2017
No ratings yet
Masked Attention Is All You Need For Graphs: Duvenaud Et Al. 2015 Kearnes Et Al. 2016 Gilmer Et Al. 2017
15 pages
Graph Convolutional Networks: A Comprehensive Review: Open Access Research
No ratings yet
Graph Convolutional Networks: A Comprehensive Review: Open Access Research
23 pages
Thura2022-05-30 - Geometric Deep Learning
No ratings yet
Thura2022-05-30 - Geometric Deep Learning
33 pages
(2020 Arxiv) A Survey On The Expressive Power of Graph Neural Networks
No ratings yet
(2020 Arxiv) A Survey On The Expressive Power of Graph Neural Networks
42 pages
Thesis Master 2022 Application of GNN For Graph Classification
No ratings yet
Thesis Master 2022 Application of GNN For Graph Classification
81 pages
Thesis 2021 Optimizaiton GNN Stathas Nistath Meng Eecs 2021 Thesis
No ratings yet
Thesis 2021 Optimizaiton GNN Stathas Nistath Meng Eecs 2021 Thesis
79 pages
Suevey On GNN
No ratings yet
Suevey On GNN
31 pages
Dirac-Bianconi Graph Neural Networks - Enabling Non-Diffusive Long-Range Graph Predictions
No ratings yet
Dirac-Bianconi Graph Neural Networks - Enabling Non-Diffusive Long-Range Graph Predictions
14 pages
Graph Convolutional Networks Adaptations and Applications
No ratings yet
Graph Convolutional Networks Adaptations and Applications
6 pages
Self Attention Graph Pooling
No ratings yet
Self Attention Graph Pooling
10 pages
Neural Networks
No ratings yet
Neural Networks
10 pages
Graph Element Networks: Adapative Structured Computation and Memory
No ratings yet
Graph Element Networks: Adapative Structured Computation and Memory
11 pages
DGCNN
No ratings yet
DGCNN
8 pages
GNN PPoPP 2021
No ratings yet
GNN PPoPP 2021
14 pages
Gnns
No ratings yet
Gnns
75 pages
Deep Learning On Graphs: A Survey: Ziwei Zhang, Peng Cui and Wenwu Zhu, Fellow, IEEE
No ratings yet
Deep Learning On Graphs: A Survey: Ziwei Zhang, Peng Cui and Wenwu Zhu, Fellow, IEEE
24 pages
A Comparison Between Recursive Neural Networks and Graph Neural Networks
No ratings yet
A Comparison Between Recursive Neural Networks and Graph Neural Networks
8 pages
GNN Hands On 02
No ratings yet
GNN Hands On 02
11 pages
Featgraph: A Flexible and Efficient Backend For Graph Neural Network Systems
No ratings yet
Featgraph: A Flexible and Efficient Backend For Graph Neural Network Systems
12 pages
DLP Grade 10 Assembly-Disassembly Lesson
No ratings yet
DLP Grade 10 Assembly-Disassembly Lesson
10 pages
Original GNN
No ratings yet
Original GNN
22 pages
The Graph Neural Network Model
No ratings yet
The Graph Neural Network Model
20 pages
Approximation - and Quantization-Aware Training For Graph Neural Networks
No ratings yet
Approximation - and Quantization-Aware Training For Graph Neural Networks
14 pages
Computing Graph Neural Networks: A Survey From Algorithms To Accelerators
No ratings yet
Computing Graph Neural Networks: A Survey From Algorithms To Accelerators
38 pages
Bacciu 2020
No ratings yet
Bacciu 2020
62 pages
B EIM GRADE 12 Q1M1 Learner Copy-TVL Final Layout
100% (2)
B EIM GRADE 12 Q1M1 Learner Copy-TVL Final Layout
25 pages
Graph Neural Networks
100% (1)
Graph Neural Networks
27 pages
Yang 20 A
No ratings yet
Yang 20 A
16 pages
A Comprehensive Survey On Graph Neural Networks
No ratings yet
A Comprehensive Survey On Graph Neural Networks
22 pages
Content Augmented Graph Neural Networks
No ratings yet
Content Augmented Graph Neural Networks
15 pages
Graph Neural Networks: A Review of Methods and Applications
No ratings yet
Graph Neural Networks: A Review of Methods and Applications
22 pages
Papers Papers PDF
No ratings yet
Papers Papers PDF
48 pages
A Gentle Introduction To Graph Neural Networks
No ratings yet
A Gentle Introduction To Graph Neural Networks
9 pages
GNNS
No ratings yet
GNNS
7 pages
Defence Transcription
No ratings yet
Defence Transcription
4 pages
SplineCNN-Fast Geometric Deep Learning With Continuous B-Spline Kernels
No ratings yet
SplineCNN-Fast Geometric Deep Learning With Continuous B-Spline Kernels
9 pages
Graph Neural Networks: A Review of Methods and Applications
No ratings yet
Graph Neural Networks: A Review of Methods and Applications
20 pages
A Survey of Graph Neural Networks in Various Learning Paradigms Methods, Applications, and Challenges
No ratings yet
A Survey of Graph Neural Networks in Various Learning Paradigms Methods, Applications, and Challenges
70 pages
GNN Review
No ratings yet
GNN Review
26 pages
A Comprehensive Survey of Graph Neural Networks PDF
No ratings yet
A Comprehensive Survey of Graph Neural Networks PDF
22 pages
ABB Auxiliary Signalling and Trippning Relays RXMA 1 RXMA 2 RXMS 1 RXSF 1 RXME 1 RXME 18 RXMH 2 RXM
No ratings yet
ABB Auxiliary Signalling and Trippning Relays RXMA 1 RXMA 2 RXMS 1 RXSF 1 RXME 1 RXME 18 RXMH 2 RXM
20 pages
ABB REB500 Commissioning Manual
No ratings yet
ABB REB500 Commissioning Manual
94 pages
CPU Scheduling
No ratings yet
CPU Scheduling
63 pages
PDF Durst Laborator 1200 Manual Eng
100% (1)
PDF Durst Laborator 1200 Manual Eng
135 pages
BENNING Tebechop 3000HD DM
No ratings yet
BENNING Tebechop 3000HD DM
8 pages
15572854118361575jy NARI MUX 2M Interface
No ratings yet
15572854118361575jy NARI MUX 2M Interface
2 pages
Subhadip Mitra Major Project
No ratings yet
Subhadip Mitra Major Project
7 pages
IC Electronic English Catalogue 2010
No ratings yet
IC Electronic English Catalogue 2010
48 pages
HP Aruba Certified Network Security Professional - HPE7-A02 Free Exam Questions (2024) - 6
No ratings yet
HP Aruba Certified Network Security Professional - HPE7-A02 Free Exam Questions (2024) - 6
4 pages
12 IP Webbrowser8
No ratings yet
12 IP Webbrowser8
3 pages
Data Communication Slide
No ratings yet
Data Communication Slide
309 pages
Types of Suspension
No ratings yet
Types of Suspension
4 pages
Apds Unified Communications Assessment
No ratings yet
Apds Unified Communications Assessment
17 pages
Valtek Beta Positioners: For Control Valves
No ratings yet
Valtek Beta Positioners: For Control Valves
8 pages
101NDXFGLUUU1
No ratings yet
101NDXFGLUUU1
2 pages
Camshaft Angle Variator
No ratings yet
Camshaft Angle Variator
2 pages
Report of Sony Picture Attack
No ratings yet
Report of Sony Picture Attack
5 pages
Indian Digital Marketing Spends by Industry
No ratings yet
Indian Digital Marketing Spends by Industry
19 pages
Concur Expense EXP - SG - Workflow - AuthAppr
No ratings yet
Concur Expense EXP - SG - Workflow - AuthAppr
38 pages
Lundahl: Tube Amplifier Interstage Transformer / Line Output Transformer LL1692A
No ratings yet
Lundahl: Tube Amplifier Interstage Transformer / Line Output Transformer LL1692A
2 pages
Rochedale PSP
No ratings yet
Rochedale PSP
123 pages
页面提取自－catalogue 2
No ratings yet
页面提取自－catalogue 2
9 pages
8 Linux Advanced File System Management m8 Slides
No ratings yet
8 Linux Advanced File System Management m8 Slides
24 pages
Data Processing
No ratings yet
Data Processing
7 pages
Data Sheet No. 01.12.01 - PR6 - 7 Inductive Pulse Module
No ratings yet
Data Sheet No. 01.12.01 - PR6 - 7 Inductive Pulse Module
1 page
04 Baterias
No ratings yet
04 Baterias
2 pages

F G R L P T G: AST Raph Epresentation Earning With Y Orch Eometric

Uploaded by

F G R L P T G: AST Raph Epresentation Earning With Y Orch Eometric

Uploaded by

Published as a workshop paper at ICLR 2019

FAST G RAPH R EPRESENTATION L EARNING WITH

Neighborhood Aggregation. Generalizing the convolutional operator to irregular domains is typ-

We evaluate the correctness of the implemented methods by performing a comprehensive compara-

Semi-supervised Node Classification. We perform semi-supervised node classification (cf. Ta-

Table 2: Graph classification.

Table 4: Training runtime comparison.

Runtime Experiments. We conduct several experiments on a number of dataset-model pairs to

4 ROADMAP AND CONCLUSION

M. Defferrard, X. Bresson, and P. Vandergheynst. Convolutional neural networks on graphs with

Coalesced GS Non-coalesced GS Coalesced SpMM Non-coalesced SpMM

A GATHER AND SCATTER O PERATIONS

Table 5: Statistics of the datasets used in the experiments.

Bioinformatic Datasets. MUTAG is a dataset consisting of mutagenetic aromatic and heteroaro-

3D Object Datasets. ModelNet10 is an orientation-aligned dataset of CAD models. Each model

You might also like