
PyTorch Geometric Temporal: Spatiotemporal Signal Processing
with Neural Machine Learning Models

Benedek Rozemberczki∗ (AstraZeneca, United Kingdom)
Paul Scherer (University of Cambridge, United Kingdom)
Yixuan He (University of Oxford, United Kingdom)
George Panagopoulos (École Polytechnique, France)
Alexander Riedel (Ernst-Abbe University for Applied Sciences, Germany)
Maria Astefanoaei (IT University of Copenhagen, Denmark)
Oliver Kiss (Central European University, Hungary)
Ferenc Beres (ELKH SZTAKI, Hungary)
Guzmán López (Tryolabs, Uruguay)
Nicolas Collignon (Pedal Me, United Kingdom)
Rik Sarkar (The University of Edinburgh, United Kingdom)

∗ The project started when the author was a doctoral student of the Center for Doctoral Training in Data Science at The University of Edinburgh.

arXiv:2104.07788v3 [cs.LG] 10 Jun 2021

ABSTRACT
We present PyTorch Geometric Temporal, a deep learning framework combining state-of-the-art machine learning algorithms for neural spatiotemporal signal processing. The main goal of the library is to make temporal geometric deep learning available for researchers and machine learning practitioners in a unified, easy-to-use framework. PyTorch Geometric Temporal was created with foundations on existing libraries in the PyTorch ecosystem, streamlined neural network layer definitions, temporal snapshot generators for batching, and integrated benchmark datasets. These features are illustrated with a tutorial-like case study. Experiments demonstrate the predictive performance of the models implemented in the library on real world problems such as epidemiological forecasting, ride-hail demand prediction and web traffic management. Our sensitivity analysis of runtime shows that the framework can potentially operate on web-scale datasets with rich temporal features and spatial structure.

ACM Reference Format:
Benedek Rozemberczki, Paul Scherer, Yixuan He, George Panagopoulos, Alexander Riedel, Maria Astefanoaei, Oliver Kiss, Ferenc Beres, Guzmán López, Nicolas Collignon, and Rik Sarkar. 2021. PyTorch Geometric Temporal: Spatiotemporal Signal Processing with Neural Machine Learning Models. In CIKM'21: ACM International Conference on Information and Knowledge Management, 1-5 November 2021, Online. ACM, New York, NY, USA, 10 pages. https://doi.org/10.1145/nnnnnnn.nnnnnnn

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
CIKM'21, 1-5 November 2021, Online
© 2021 Association for Computing Machinery.
ACM ISBN 978-x-xxxx-xxxx-x/YY/MM. . . $15.00
https://doi.org/10.1145/nnnnnnn.nnnnnnn

1 INTRODUCTION
Deep learning on static graph structured data has seen an unprecedented success in various business and scientific application domains. Neural network layers which operate on graph data can serve as building blocks of document labeling, fraud detection, traffic forecasting and cheminformatics systems [7, 45–47, 63]. This emergence and the widespread adoption of geometric deep learning was made possible by open-source machine learning libraries. The high quality, breadth, user oriented nature and availability of specialized deep learning libraries [13, 15, 46, 67] were all contributing factors to the practical success and large-scale deployment of graph machine learning systems. At the same time, the existing geometric deep learning frameworks operate on graphs which have a fixed topology, and it is also assumed that the node features and labels are static. Besides these limiting assumptions about the input data, the off-the-shelf libraries are not designed to operate on spatiotemporal data.

Present work. We propose PyTorch Geometric Temporal, an open-source Python library for spatiotemporal machine learning. We designed PyTorch Geometric Temporal with a simple and consistent API inspired by the software architecture of existing widely
used geometric deep learning libraries from the PyTorch ecosystem [15, 40]. Our framework was built by applying simple design principles consistently. The framework reuses existing neural network layers in a modular manner, models have a limited number of public methods, and hyperparameters can be inspected. Spatiotemporal signal iterators ingest data memory-efficiently in widely used scientific computing formats and return it in a PyTorch-compatible format. These design principles, in combination with the test coverage, documentation, practical tutorials, continuous integration, package indexing and frequent releases, make the framework an end-user friendly spatiotemporal machine learning system.

The experimental evaluation of the framework entails node level regression tasks on datasets released exclusively with the framework. Specifically, we compare the predictive performance of spatiotemporal graph neural networks on epidemiological forecasting, demand planning, web traffic management and social media interaction prediction tasks. Synthetic experiments show that, with the right batching strategy, PyTorch Geometric Temporal is highly scalable and benefits from GPU accelerated computing.

Our contributions. The main contributions of our work can be summarized as:

• We publicly release PyTorch Geometric Temporal, the first deep learning library for parametric spatiotemporal machine learning models.
• We provide data loaders and iterators with PyTorch Geometric Temporal which can handle spatiotemporal datasets.
• We release new spatiotemporal benchmark datasets from the renewable energy production, epidemiological reporting, goods delivery and web traffic forecasting domains.
• We evaluate the spatiotemporal forecasting capabilities of the neural and parametric machine learning models available in PyTorch Geometric Temporal on real world datasets.

The remainder of the paper has the following structure. In Section 2 we overview important preliminaries and related work about temporal and geometric deep learning and the characteristics of related open-source machine learning software. The main design principles of PyTorch Geometric Temporal are discussed in Section 3 with a practical example. We demonstrate the forecasting capabilities of the framework in Section 4, where we also evaluate the scalability of the library on various commodity hardware. We conclude in Section 5, where we summarize the results. The source code of PyTorch Geometric Temporal is publicly available at https://github.com/benedekrozemberczki/pytorch_geometric_temporal; the Python package can be installed via the Python Package Index. Detailed documentation is accessible at https://pytorch-geometric-temporal.readthedocs.io/.

2 PRELIMINARIES AND RELATED WORK
In order to position our contribution and highlight its significance, we introduce some important concepts about spatiotemporal data and discuss related literature about geometric deep learning and machine learning software.

2.1 Temporal Graph Sequences
Our framework considers specific input data types on which the spatiotemporal machine learning models operate. Input data types can differ in terms of the dynamics of the graph and that of the modelled vertex attributes. We take a discrete temporal snapshot view of this data representation problem [25, 26], and our work considers three spatiotemporal data types which can be described by the subplots of Figure 1 and the following formal definitions:

Definition 2.1. Dynamic graph with temporal signal. A dynamic graph with a temporal signal is the ordered set of graph and node feature matrix tuples $\mathcal{D} = \{(\mathcal{G}_1, \mathbf{X}_1), \ldots, (\mathcal{G}_T, \mathbf{X}_T)\}$ where the vertex sets satisfy $V_t = V, \; \forall t \in \{1, \ldots, T\}$ and the node feature matrices satisfy $\mathbf{X}_t \in \mathbb{R}^{|V| \times d}, \; \forall t \in \{1, \ldots, T\}$.

Definition 2.2. Dynamic graph with static signal. A dynamic graph with a static signal is the ordered set of graph and node feature matrix tuples $\mathcal{D} = \{(\mathcal{G}_1, \mathbf{X}), \ldots, (\mathcal{G}_T, \mathbf{X})\}$ where the vertex sets satisfy $V_t = V, \; \forall t \in \{1, \ldots, T\}$ and the node feature matrix satisfies $\mathbf{X} \in \mathbb{R}^{|V| \times d}$.

Definition 2.3. Static graph with temporal signal. A static graph with a temporal signal is the ordered set of graph and node feature matrix tuples $\mathcal{D} = \{(\mathcal{G}, \mathbf{X}_1), \ldots, (\mathcal{G}, \mathbf{X}_T)\}$ where the node feature matrices satisfy $\mathbf{X}_t \in \mathbb{R}^{|V| \times d}, \; \forall t \in \{1, \ldots, T\}$.

Representing spatiotemporal data based on these theoretical concepts allows us to create memory-efficient data structures which realize these definitions well in practice.

Figure 1: The data iterators in PyTorch Geometric Temporal can provide temporal snapshots for all of the non-static geometric deep learning scenarios: (a) dynamic graph with temporal signal; (b) dynamic graph with static signal; (c) static graph with temporal signal.
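As an illustration of Definition 2.3, the short sketch below assembles a static graph with a temporal signal from raw NumPy arrays using the library's iterator class. The constructor argument order follows the library's signal module, but the exact signature should be treated as an assumption and checked against the documentation.

import numpy as np
from torch_geometric_temporal.signal import StaticGraphTemporalSignal

T, num_nodes, d = 4, 3, 2  # time periods, |V| vertices, d features

# The edge structure is shared by every snapshot (static graph).
edge_index = np.array([[0, 1, 2],
                       [1, 2, 0]])          # 2 x |E| COO connectivity
edge_weight = np.ones(edge_index.shape[1])

# One feature matrix X_t in R^{|V| x d} and one target vector per period.
features = [np.random.uniform(size=(num_nodes, d)) for _ in range(T)]
targets = [np.random.uniform(size=(num_nodes,)) for _ in range(T)]

dataset = StaticGraphTemporalSignal(edge_index, edge_weight, features, targets)

for snapshot in dataset:  # snapshots arrive in temporal order
    print(snapshot.x.shape, snapshot.y.shape)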

2.2 Deep Learning with Time and Geometry
Our work provides deep learning models that operate on data which has both temporal and spatial aspects. These techniques are natural recombinations of existing neural network layers that operate on sequences and static graph-structured data.

2.2.1 Temporal Deep Learning. A large family of temporal deep learning models, such as the LSTM [24] and GRU [12], generates in-memory representations of data points which are iteratively updated as the model learns from new snapshots. Another family of deep learning models uses the attention mechanism [3, 35, 59] to learn representations of the data points which are adaptively recontextualized based on the temporal history. These types of models serve as templates for the temporal block of spatiotemporal deep learning models.

2.2.2 Static Graph Representation Learning. Learning representations of vertices, edges and whole graphs with graph neural networks in a supervised or unsupervised way can be described by the message passing formalism [17]. In this conceptual framework, a parametric function of the node and edge attributes generates compressed representations (messages) which are propagated between the nodes based on a message-passing rule and aggregated to form new representations (a minimal sketch follows after Table 2). Most of the existing graph neural network architectures such as GCN [30], GGCN [33], ChebyConv [14], and RGCN [50] fit perfectly into this general description of graph neural networks. Models are differentiated by assumptions about the input graph (e.g. node heterogeneity, multiplexity, presence of edge attributes), the message compression function used, the propagation scheme and the message aggregation function applied to the received messages.

2.2.3 Spatiotemporal Deep Learning. A spatiotemporal deep learning model fuses the basic conceptual ideas of temporal deep learning techniques and graph representation learning. Operating on a temporal graph sequence, these models perform message-passing at each time point with a graph neural network block, and the new temporal information is incorporated by a temporal deep learning block. This design allows for sharing salient temporal and spatial autocorrelation information across the spatial units. The temporal and spatial layers which are fused together in a single parametric machine learning model are trained jointly by exploiting the fact that the fused models are end-to-end differentiable. In Table 1 we summarize the spatiotemporal deep learning models implemented in the framework, categorized by the temporal and graph neural network layer blocks, the order of spatial proximity and the heterogeneity of the edge set.

Table 1: A comparison of spatiotemporal deep learning models in PyTorch Geometric Temporal based on the temporal and spatial block, order of proximity considered and the heterogeneity of the edges.

Model            Temporal Layer  GNN Layer  Proximity Order  Multi Type
DCRNN [32]       GRU             DiffConv   Higher           False
GConvGRU [51]    GRU             Chebyshev  Lower            False
GConvLSTM [51]   LSTM            Chebyshev  Lower            False
GC-LSTM [10]     LSTM            Chebyshev  Lower            True
DyGrAE [54, 55]  LSTM            GGCN       Higher           False
LRGCN [31]       LSTM            RGCN       Lower            False
EGCN-H [39]      GRU             GCN        Lower            False
EGCN-O [39]      LSTM            GCN        Lower            False
T-GCN [65]       GRU             GCN        Lower            False
A3T-GCN [68]     GRU             GCN        Lower            False
AGCRN [4]        GRU             Chebyshev  Higher           False
MPNN LSTM [38]   LSTM            GCN        Lower            False
STGCN [63]       Attention       Chebyshev  Higher           False
ASTGCN [22]      Attention       Chebyshev  Higher           False
MSTGCN [22]      Attention       Chebyshev  Higher           False
GMAN [66]        Attention       Custom     Lower            False
MTGNN [61]       Attention       Custom     Higher           False
AAGCN [52]       Attention       Custom     Higher           False

2.3 Graph Representation Learning Software
The current graph representation learning software ecosystem which allows academic research and industrial deployment extends open-source auto-differentiation libraries such as TensorFlow [1], PyTorch [41], MXNet [11] and JAX [16, 28]. Our work does the same, as we build on the PyTorch Geometric ecosystem. We summarize the characteristics of these libraries in Table 2, which allows for comparing frameworks based on the backend, the presence of supervised training functionalities, the presence of temporal models and GPU support. Our proposed framework is the only one to date which allows the supervised training of temporal graph representation learning models with graphics card based acceleration.

Table 2: A desiderata and automatic differentiation backend library based comparison of open-source geometric deep learning libraries.

Library            Backend   Supervised  Temporal  GPU
PT Geometric [15]  PT        ✔           ✘         ✔
Geometric2DR [49]  PT        ✘           ✘         ✔
CogDL [9]          PT        ✔           ✘         ✔
Spektral [21]      TF        ✔           ✘         ✔
TF Geometric [27]  TF        ✔           ✘         ✔
StellarGraph [13]  TF        ✔           ✘         ✔
DGL [67]           TF/PT/MX  ✔           ✘         ✔
DIG [34]           PT        ✔           ✘         ✔
Jraph [18]         JAX       ✔           ✘         ✔
Graph-Learn [62]   Custom    ✔           ✘         ✔
GEM [19]           TF        ✘           ✘         ✔
DynamicGEM [20]    TF        ✘           ✔         ✔
OpenNE [57]        Custom    ✘           ✘         ✘
Karate Club [46]   Custom    ✘           ✘         ✘
Our Work           PT        ✔           ✔         ✔
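As a minimal illustration of the message-passing formalism of Section 2.2.2, the toy layer below compresses neighbour features with a parametric function, propagates the resulting messages along the edges and mean-aggregates them at the receiving nodes. It is written against the MessagePassing base class of PyTorch Geometric; it is a sketch of the formalism, not one of the layers shipped with our framework.

import torch
from torch_geometric.nn import MessagePassing

class MeanMessageLayer(MessagePassing):
    def __init__(self, in_channels, out_channels):
        super().__init__(aggr="mean")  # message aggregation function
        self.lin = torch.nn.Linear(in_channels, out_channels)  # message compression

    def forward(self, x, edge_index):
        # Propagation scheme: messages flow along the edges in edge_index.
        return self.propagate(edge_index, x=x)

    def message(self, x_j):
        # x_j holds the source node features; compress them into messages.
        return self.lin(x_j)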

2.4 Spatiotemporal Data Analytics Software
The open-source ecosystem for spatiotemporal data processing consists of specialized database systems, basic analytical tools and advanced machine learning libraries. We summarize the characteristics of the most popular libraries in Table 3 with respect to the year of release, the purpose of the framework, the source code language and GPU support.

First, it is evident that most spatiotemporal data processing tools are fairly new and there is much space for contributions in each subcategory. Second, the database systems are written in high-performance languages, while the analytics and machine learning oriented tools have a pure Python/R design or a wrapper written in these languages. Finally, the use of GPU acceleration is not widespread, which alludes to the fact that current spatiotemporal data processing tools might have a scalability issue. Our proposed framework PyTorch Geometric Temporal is the first fully open-source GPU accelerated spatiotemporal machine learning library.

Table 3: A multi-aspect comparison of open-source spatiotemporal database systems, data analytics libraries and machine learning frameworks.

Library          Year  Purpose           Language    GPU
GeoWave [60]     2016  Database          Java        ✘
StacSpec [23]    2017  Database          Javascript  ✘
MobilityDB [69]  2019  Database          C           ✘
PyStac [44]      2020  Database          Python      ✘
StaRs [42]       2017  Analytics         R           ✘
CuSpatial [56]   2019  Analytics         Python      ✔
PySAL [43]       2017  Machine Learning  Python      ✘
STDMTMB [2]      2018  Machine Learning  R           ✘
Our work         2021  Machine Learning  Python      ✔

3 THE FRAMEWORK DESIGN
Our primary goal is to give a general theoretical overview of the framework, discuss the framework design choices, give a detailed practical example and highlight our strategy for the long term viability and maintenance of the project.

3.1 Neural Network Layer Design
The spatiotemporal neural network layers are implemented as classes in the framework. Each of the classes has a similar architecture driven by a few simple design principles.

3.1.1 Non-proliferation of classes. The framework reuses the existing high-level neural network layer classes as building blocks from the PyTorch and PyTorch Geometric ecosystems. The goal of the library is not to replace the existing frameworks. This design strategy ensures that the number of auxiliary classes in the framework is kept low and that the framework interfaces well with the rest of the ecosystem.

3.1.2 Hyperparameter inspection and type hinting. The neural network layers do not have default hyperparameter settings, as some of these have to be set in a dataset-dependent manner. In order to help with this, the layer hyperparameters are stored as public class attributes and are available for inspection. Moreover, the constructors of the neural network layers use type hinting, which helps end-users set the hyperparameters.

3.1.3 Limited number of public methods. The spatiotemporal neural network layers in our framework have a limited number of public methods for simplicity. For example, the auxiliary layer initialization methods and other internal model mechanics are implemented as private methods. All of the layers provide a forward method, and those which explicitly use the message-passing scheme in PyTorch Geometric provide a public message method.

3.1.4 Auxiliary layers. The auxiliary neural network layers which are not part of the PyTorch Geometric ecosystem, such as diffusion convolutional graph neural networks [32], are implemented as standalone neural network layers in the framework. These layers are available as individual components for the design of novel neural network architectures.

3.2 Data Structures
The design of PyTorch Geometric Temporal required the introduction of custom data structures which can efficiently store the datasets and provide temporally ordered snapshots for batching.

3.2.1 Spatiotemporal Signal Iterators. Based on the categorization of spatiotemporal signals discussed in Section 2, we implemented three types of Spatiotemporal Signal Iterators. These iterators store spatiotemporal datasets in memory efficiently and without redundancy. For example, a Static Graph Temporal Signal iterator will not store the edge indices and weights for each time period, in order to save memory. Iterating over a Spatiotemporal Signal Iterator returns, at each step, a graph snapshot which describes the graph of interest at a given point in time. Graph snapshots are returned in temporal order by the iterators. The Spatiotemporal Signal Iterators can also be indexed directly to access a specific graph snapshot – a design choice which allows the use of advanced temporal batching.

3.2.2 Graph Snapshots. The time period specific snapshots, which consist of labels, features, edge indices and weights, are stored in memory as NumPy arrays [58], but are returned as PyTorch Geometric Data object instances [15] when the Spatiotemporal Signal Iterators are iterated on. This design choice hedges against the proliferation of classes and exploits the existing and widely used compact data structures from the PyTorch ecosystem [40].

3.2.3 Train-Test Splitting. As part of the library we provide a temporal train-test splitting function which creates train and test snapshot iterators from a Spatiotemporal Signal Iterator given a test dataset ratio. This parameter of the splitting function decides the fraction of data that is separated from the end of the spatiotemporal graph snapshot sequence for testing. The returned iterators have the same type as the input iterator. Importantly, this splitting does not influence the applicability of widely used semi-supervised model training strategies such as node masking.

3.2.4 Integrated Benchmark Dataset Loaders. We provide easy-to-use practical data loader classes for widely used existing [38] and newly released benchmark datasets. These loaders return Spatiotemporal Signal Iterators which can be used for training existing and custom designed spatiotemporal neural network architectures to solve supervised machine learning problems.
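Two of these design principles can be demonstrated in a few lines. The sketch below inspects the public hyperparameter attributes of a layer (Section 3.1.2) and performs a temporal split (Section 3.2.3); the attribute names mirror the constructor arguments, and the snapshot_count attribute is an assumption that should be verified against the documentation.

from torch_geometric_temporal import ChickenpoxDatasetLoader, temporal_signal_split
from torch_geometric_temporal.nn.recurrent import DCRNN

# 3.1.2: no default hyperparameters; constructor arguments are stored
# as public attributes which can be inspected after construction.
layer = DCRNN(in_channels=8, out_channels=32, K=1)
print(layer.in_channels, layer.out_channels, layer.K)  # 8 32 1

# 3.2.3: the splitter cuts the snapshot sequence at the given ratio and
# returns two iterators of the same type as the input iterator.
dataset = ChickenpoxDatasetLoader().get_dataset()
train, test = temporal_signal_split(dataset, train_ratio=0.8)
print(train.snapshot_count, test.snapshot_count)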

3.3 Design in Practice Case Study: Cumulative Model Training on CPU
In the following we overview a simple end-to-end machine learning pipeline designed with PyTorch Geometric Temporal. These code snippets solve a practical epidemiological forecasting problem – predicting the weekly number of chickenpox cases in Hungary [47]. The pipeline consists of data preparation, model definition, training and evaluation phases.

1 from torch_geometric_temporal import ChickenpoxDatasetLoader
2 from torch_geometric_temporal import temporal_signal_split
3
4 loader = ChickenpoxDatasetLoader()
5 dataset = loader.get_dataset()
6
7 train, test = temporal_signal_split(dataset,
8                                     train_ratio=0.9)

Listings 1: Loading a default benchmark dataset and creating a temporal split with PyTorch Geometric Temporal.

3.3.1 Dataset Loading and Splitting. In Listings 1 as a first step we import the Hungarian chickenpox cases benchmark dataset loader and the temporal train-test splitter function (lines 1-2). We define the dataset loader (line 4) and use the get_dataset() class method to return a temporal signal iterator (line 5). Finally, we create a train-test split of the spatiotemporal dataset by using the splitting function and retain 10% of the temporal snapshots for model performance evaluation (lines 7-8).

1 import torch
2 import torch.nn.functional as F
3 from torch_geometric_temporal.nn.recurrent import DCRNN
4
5 class RecurrentGCN(torch.nn.Module):
6     def __init__(self, node_features, filters):
7         super(RecurrentGCN, self).__init__()
8         self.recurrent = DCRNN(node_features, filters, 1)
9         self.linear = torch.nn.Linear(filters, 1)
10
11    def forward(self, x, edge_index, edge_weight):
12        h = self.recurrent(x, edge_index, edge_weight)
13        h = F.relu(h)
14        h = F.dropout(h, training=self.training)
15        h = self.linear(h)
16        return h

Listings 2: Defining a recurrent graph convolutional neural network using PyTorch Geometric Temporal consisting of a diffusion convolutional spatiotemporal layer followed by rectified linear unit activations, dropout and a feedforward neural network layer.

3.3.2 Recurrent Graph Convolutional Model Definition. We define a recurrent graph convolutional neural network model in Listings 2. We import the base and functional programming PyTorch libraries and one of the neural network layers from PyTorch Geometric Temporal (lines 1-3). The model requires a node feature count and a convolutional filter parameter in the constructor (line 6). The model consists of a one-hop Diffusion Convolutional Recurrent Neural Network layer [32] and a fully connected layer with a single neuron (lines 8-9).

In the forward pass method of the neural network the model uses the vertex features, edges and the optional edge weights (line 11). The initial recurrent graph convolution based aggregation (line 12) is followed by a rectified linear unit activation function [37] and dropout [53] for regularization (lines 13-14). Using the fully connected layer the model outputs a single score for each spatial unit (lines 15-16).

1 model = RecurrentGCN(node_features=8, filters=32)
2
3 optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
4
5 model.train()
6
7 for epoch in range(200):
8     cost = 0
9     for time, snapshot in enumerate(train):
10        y_hat = model(snapshot.x,
11                      snapshot.edge_index,
12                      snapshot.edge_attr)
13        cost = cost + torch.mean((y_hat-snapshot.y)**2)
14    cost = cost / (time+1)
15    cost.backward()
16    optimizer.step()
17    optimizer.zero_grad()

Listings 3: Creating a recurrent graph convolutional neural network instance and training it by cumulative weight updates.

3.3.3 Model Training. Using the dataset split and the model definition we can turn our attention to training a regressor. In Listings 3 we create a model instance (line 1), transfer the model parameters to the Adam optimizer [29] which uses a learning rate of 0.01 (line 3) and set the model to be trainable (line 5). In each epoch we set the accumulated cost to be zero (line 8), iterate over the temporal snapshots in the training data (line 9), make forward passes with the model on each temporal snapshot and accumulate the spatial unit specific mean squared errors (lines 10-13). We normalize the cost, backpropagate and update the model parameters (lines 14-17).

1 model.eval()
2 cost = 0
3 for time, snapshot in enumerate(test):
4     y_hat = model(snapshot.x,
5                   snapshot.edge_index,
6                   snapshot.edge_attr)
7     cost = cost + torch.mean((y_hat-snapshot.y)**2)
8 cost = cost / (time+1)
9 cost = cost.item()
10 print("MSE: {:.4f}".format(cost))

Listings 4: Evaluating the recurrent graph convolutional neural network on the test portion of the spatiotemporal dataset using the time unit averaged mean squared error.

3.3.4 Model Evaluation. The scoring of the trained recurrent graph neural network in Listings 4 uses the snapshots in the test dataset. We set the model to be non-trainable and the accumulated squared error to zero (lines 1-2). We iterate over the test spatiotemporal snapshots, make forward passes to predict the number of chickenpox cases and accumulate the squared error (lines 3-7). The accumulated errors are normalized and we can print the mean squared error calculated on the whole test horizon (lines 8-10).

3.4 Design in Practice Case Study: Incremental Model Training with GPU Acceleration
Exploiting the power of GPU based acceleration of computations happens at the training and evaluation steps of the PyTorch Geometric Temporal pipelines. In this case study we assume that the Hungarian chickenpox cases dataset is already loaded in memory, the temporal split has happened and a model class was defined by the code snippets in Listings 1 and 2. Moreover, we assume that the machine used for training the neural network can access a single CUDA compatible GPU device [48].

1 model = RecurrentGCN(node_features=8, filters=32)
2 device = torch.device('cuda')
3 model = model.to(device)
4
5 optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
6 model.train()
7
8 for epoch in range(200):
9     for snapshot in train:
10        snapshot = snapshot.to(device)
11        y_hat = model(snapshot.x,
12                      snapshot.edge_index,
13                      snapshot.edge_attr)
14        cost = torch.mean((y_hat-snapshot.y)**2)
15        cost.backward()
16        optimizer.step()
17        optimizer.zero_grad()

Listings 5: Creating a recurrent graph convolutional neural network instance and training it by incremental weight updates on a GPU.

3.4.1 Model Training. In Listings 5 we demonstrate accelerated training with incremental weight updates. The model of interest and the device used for training are defined while the model is transferred to the GPU (lines 1-3). The optimizer registers the model parameters and the model parameters are set to be trainable (lines 5-6). We iterate over the temporal snapshot iterator 200 times and the iterator returns a temporal snapshot in each step. Importantly, the snapshots, which are PyTorch Geometric Data objects, are transferred to the GPU (lines 8-10). The use of PyTorch Geometric Data objects as temporal snapshots allows the transfer of the time period specific edges, node features and target vector with a single command. Using the input data a forward pass is made, the loss is accumulated and weight updates happen using the optimizer in each time period (lines 11-17). Compared to the cumulative backpropagation based training approach discussed in Subsection 3.3, this backpropagation strategy is slower as weight updates happen at each time step, not just at the end of training epochs.

3.4.2 Model Evaluation. During model scoring the GPU can be utilized again. The snippet in Listings 6 demonstrates that the only modification needed for accelerated evaluation is the transfer of snapshots to the GPU. In each time period we move the temporal snapshot to the device to do the forward pass (line 4). We do the forward pass with the model and the snapshot on the GPU and accumulate the loss (lines 5-8). The loss value is averaged out and detached from the GPU for printing (lines 9-11).

1 model.eval()
2 cost = 0
3 for time, snapshot in enumerate(test):
4     snapshot = snapshot.to(device)
5     y_hat = model(snapshot.x,
6                   snapshot.edge_index,
7                   snapshot.edge_attr)
8     cost = cost + torch.mean((y_hat-snapshot.y)**2)
9 cost = cost / (time+1)
10 cost = cost.item()
11 print("MSE: {:.4f}".format(cost))

Listings 6: Evaluating the recurrent graph convolutional neural network with GPU based acceleration.

3.5 Maintaining PyTorch Geometric Temporal
The viability of the project is made possible by the open-source code, version control, public releases, automatically generated documentation, continuous integration, and near 100% test coverage.

3.5.1 Open-Source Code-Base and Public Releases. The source code of PyTorch Geometric Temporal is publicly available on GitHub under the MIT license. Using an open version control system allowed us to have a large group collaborate on the project and to have external contributors who also submitted feature requests. The public releases of the library are also made available on the Python Package Index, which means that the framework can be installed via the pip command using the terminal.

3.5.2 Documentation. The source code of PyTorch Geometric Temporal and Sphinx [8] are used to generate a publicly available documentation of the library at https://pytorch-geometric-temporal.readthedocs.io/. This documentation is automatically created every time the code-base changes in the public repository. The documentation covers the constructors and public methods of neural network layers, temporal signal iterators, public dataset loaders and splitters. It also includes a list of relevant research papers, an in-depth installation guide, a detailed getting-started tutorial and a list of integrated benchmark datasets.

3.5.3 Continuous Integration. We provide continuous integration for PyTorch Geometric Temporal with GitHub Actions, which is available for free on GitHub without limitations on the number of builds. When the code is updated on any branch of the repository the build process is triggered and the library is deployed on Linux, Windows and macOS virtual machines.

3.5.4 Unit Tests and Code Coverage. The temporal graph neural network layers, custom data structures and benchmark dataset loaders are all covered by unit tests. These unit tests can be executed locally using the source code. Unit tests are also triggered by the continuous integration provided by GitHub Actions. When the master branch of the open-source GitHub repository is updated, the build is successful, and all of the unit tests pass, a coverage report is generated by CodeCov.

4 EXPERIMENTAL EVALUATION
The proposed framework is evaluated on node level regression tasks using novel datasets which we release with the paper. We also evaluate the effect of various batching techniques on the predictive performance and runtime.

4.1 Datasets
We release new spatiotemporal benchmark datasets with PyTorch Geometric Temporal which can be used to test models on node level regression tasks. The descriptive statistics and properties of these newly introduced benchmark datasets are summarized in Table 4.

Table 4: Properties and granularity of the spatiotemporal datasets introduced in the paper with information about the number of time periods (T) and spatial units (|V|).

Dataset              Signal    Graph    Frequency     T       |V|
Chickenpox Hungary   Temporal  Static   Weekly        522     20
Windmill Large       Temporal  Static   Hourly        17,472  319
Windmill Medium      Temporal  Static   Hourly        17,472  26
Windmill Small       Temporal  Static   Hourly        17,472  11
Pedal Me Deliveries  Temporal  Static   Weekly        36      15
Wikipedia Math       Temporal  Static   Daily         731     1,068
Twitter Tennis RG    Static    Dynamic  Hourly        120     1,000
Twitter Tennis UO    Static    Dynamic  Hourly        112     1,000
Covid19 England      Temporal  Dynamic  Daily         61      129
Montevideo Buses     Temporal  Static   Hourly        744     675
MTM-1 Hand Motions   Temporal  Static   1/24 Seconds  14,469  21

These newly released datasets are the following:

• Chickenpox Hungary. A spatiotemporal dataset about the officially reported cases of chickenpox in Hungary. The nodes are counties and the edges describe direct neighbourhood relationships. The dataset covers the weeks between 2005 and 2015 without missingness.
• Windmill Output Datasets. An hourly windfarm energy output dataset covering 2 years from a European country. Edge weights are calculated from the proximity of the windmills – high weights imply that two windmill stations are in close vicinity. The size of the dataset relates to the grouping of windfarms considered; the smaller datasets are more localized to a single region.
• Pedal Me Deliveries. A dataset about the number of weekly bicycle package deliveries by Pedal Me in London during 2020 and 2021. Nodes in the graph represent geographical units and edges are proximity based mutual adjacency relationships.
• Wikipedia Math. Contains Wikipedia pages about popular mathematics topics; the edges describe the links from one page to another. Features describe the number of daily visits between 2019 and March 2021.
• Twitter Tennis RG and UO. Twitter mention graphs of major tennis tournaments from 2017. Each snapshot contains the graph of popular player or sport news accounts and the mentions between them [5, 6]. Node labels encode the number of mentions received and vertex features are structural properties.
• Covid19 England. A dataset about mass mobility between regions in England and the number of confirmed COVID-19 cases from March to May 2020 [38]. Each day contains a different mobility graph and node features corresponding to the number of cases in the previous days. Mobility stems from Facebook Data For Good¹ and cases from gov.uk².
• Montevideo Buses. A dataset about the hourly passenger inflow at bus stop level for eleven bus lines in the city of Montevideo. Nodes are bus stops and edges represent connections between the stops; the dataset covers a whole month of traffic patterns.
• MTM-1 Hand Motions. A temporal dataset of Methods-Time Measurement-1 [36] motions, signalled as consecutive graph frames of 21 3D hand key points that were acquired via MediaPipe Hands [64] from original RGB video material. Node features encode the normalized xyz-coordinates of each finger joint and the vertices are connected according to the human hand structure.

¹ https://dataforgood.fb.com/
² https://coronavirus.data.gov.uk/

4.2 Predictive Performance
The forecasting experiments focus on the evaluation of the recurrent graph neural networks implemented in our framework. We compare the predictive performance under two specific backpropagation regimes which can be used to train these recurrent models (the sketch after this list contrasts the two training loops):

• Incremental: After each temporal snapshot the loss is backpropagated and the model weights are updated. This needs as many weight updates as the number of temporal snapshots.
• Cumulative: The loss from every temporal snapshot is aggregated and backpropagated, and the weights are updated with the optimizer. This requires one weight update per epoch.
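To make the two regimes concrete, the sketch below contrasts the corresponding training loops on the model and data split from Listings 1–3; the cumulative loop mirrors Listings 3 and the incremental one mirrors Listings 5 without the GPU transfer.

import torch

def train_incremental(model, train, optimizer):
    # One weight update per temporal snapshot.
    for snapshot in train:
        y_hat = model(snapshot.x, snapshot.edge_index, snapshot.edge_attr)
        cost = torch.mean((y_hat - snapshot.y) ** 2)
        cost.backward()
        optimizer.step()
        optimizer.zero_grad()

def train_cumulative(model, train, optimizer):
    # Aggregate the loss over every snapshot; one weight update per epoch.
    cost = 0
    for time, snapshot in enumerate(train):
        y_hat = model(snapshot.x, snapshot.edge_index, snapshot.edge_attr)
        cost = cost + torch.mean((y_hat - snapshot.y) ** 2)
    cost = cost / (time + 1)
    cost.backward()
    optimizer.step()
    optimizer.zero_grad()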

4.2.1 Experimental settings. Using 90% of the temporal snapshots for training, we evaluated the forecasting performance on the last 10% by calculating the average mean squared error from 10 experimental runs. We used models with a recurrent graph convolutional layer which had 32 convolutional filters. The spatiotemporal layer was followed by the rectified linear unit [37] activation function, and during training time we used a dropout of 0.5 for regularization [53] after the spatiotemporal layer. The hidden representations were fed to a fully connected feedforward layer which outputted the predicted scores for each spatial unit. The recurrent models were trained for 100 epochs with the Adam optimizer [29] which used a learning rate of 10⁻² to minimize the mean squared error.

Table 5: The predictive performance of spatiotemporal neural networks evaluated by average mean squared error. We report average performances calculated from 10 experimental repetitions with standard deviations around the average mean squared error calculated on 10% forecasting horizons. We use the incremental and cumulative backpropagation strategies.

                 Chickenpox Hungary             Twitter Tennis RG              PedalMe London                 Wikipedia Math
Model            Incremental    Cumulative      Incremental    Cumulative      Incremental    Cumulative      Incremental    Cumulative
DCRNN [32]       1.124 ± 0.015  1.123 ± 0.014   2.049 ± 0.023  2.043 ± 0.016   1.463 ± 0.019  1.450 ± 0.024   0.679 ± 0.020  0.803 ± 0.018
GConvGRU [51]    1.128 ± 0.011  1.132 ± 0.023   2.051 ± 0.020  2.007 ± 0.022   1.622 ± 0.032  1.944 ± 0.013   0.657 ± 0.015  0.837 ± 0.021
GConvLSTM [51]   1.121 ± 0.014  1.119 ± 0.022   2.049 ± 0.024  2.007 ± 0.012   1.442 ± 0.028  1.433 ± 0.020   0.777 ± 0.021  0.868 ± 0.018
GC-LSTM [10]     1.115 ± 0.014  1.116 ± 0.023   2.053 ± 0.024  2.032 ± 0.015   1.455 ± 0.023  1.468 ± 0.025   0.779 ± 0.023  0.852 ± 0.016
DyGrAE [54, 55]  1.120 ± 0.021  1.118 ± 0.015   2.031 ± 0.006  2.007 ± 0.004   1.455 ± 0.031  1.456 ± 0.019   0.773 ± 0.009  0.816 ± 0.016
EGCN-H [39]      1.113 ± 0.016  1.104 ± 0.024   2.040 ± 0.018  2.006 ± 0.008   1.467 ± 0.026  1.436 ± 0.017   0.775 ± 0.022  0.857 ± 0.022
EGCN-O [39]      1.124 ± 0.009  1.119 ± 0.020   2.055 ± 0.020  2.010 ± 0.014   1.491 ± 0.024  1.430 ± 0.023   0.750 ± 0.014  0.823 ± 0.014
A3T-GCN [68]     1.114 ± 0.008  1.119 ± 0.018   2.045 ± 0.021  2.008 ± 0.016   1.469 ± 0.027  1.475 ± 0.029   0.781 ± 0.011  0.872 ± 0.017
T-GCN [65]       1.117 ± 0.011  1.111 ± 0.022   2.045 ± 0.027  2.008 ± 0.017   1.479 ± 0.012  1.481 ± 0.029   0.764 ± 0.011  0.846 ± 0.020
MPNN LSTM [38]   1.116 ± 0.023  1.129 ± 0.021   2.053 ± 0.041  2.007 ± 0.010   1.485 ± 0.028  1.458 ± 0.013   0.795 ± 0.010  0.905 ± 0.017
AGCRN [4]        1.120 ± 0.010  1.116 ± 0.017   2.039 ± 0.022  2.010 ± 0.009   1.469 ± 0.030  1.465 ± 0.026   0.788 ± 0.011  0.832 ± 0.020

4.2.2 Experimental findings. Results are presented in Table 5, where we also report standard deviations around the test set mean squared error, and bold numbers denote the best performing model under each training regime on a dataset. Our experimental findings demonstrate multiple important empirical regularities which have important practical implications. Namely these are the following:

(1) Most recurrent graph neural networks have a similar predictive performance on these regression tasks. In simple terms, there is not a single model which acts as a silver bullet. This also postulates that the model with the lowest training time is likely to be as good as the slowest one.
(2) Results on the Wikipedia Math dataset imply that a cumulative backpropagation strategy can have a detrimental effect on the predictive performance of a recurrent graph neural network. When computation resources are not a bottleneck, an incremental strategy can be significantly better.

4.3 Runtime Performance
The evaluation of the PyTorch Geometric Temporal runtime performance focuses on manipulating the input size and measuring the time needed to complete a training epoch. We investigate the runtime under the incremental and cumulative backpropagation strategies.

4.3.1 Experimental settings. The runtime evaluation used the GConvGRU model [51] with the hyperparameter settings described in Subsection 4.2. We measured the time needed for doing a single epoch over a sequence of 100 synthetic graphs. Reference Watts-Strogatz graphs in the snapshots of the dynamic graph with temporal signal iterator had binary labels, 2¹⁰ nodes, 2⁵ edges per node and 2⁵ node features. Runtimes were measured on the following hardware:

• CPU: The machine used for benchmarking had 8 Intel 1.00 GHz i5-1035G1 processors.
• GPU: We utilized a machine with a single Tesla V100 graphics card for the experiments.

Figure 2: The average time needed for doing an epoch on a dynamic graph – temporal signal iterator of Watts-Strogatz graphs with a recurrent graph convolutional model. The panels vary the number of nodes, edges per node, node features and filters on a log₂ scale, under the incremental and cumulative strategies on CPU and GPU.

4.3.2 Experimental findings. We plotted the average runtime calculated from 10 experimental runs on Figure 2 for each input size. Our results about runtime have two important implications for the practical application of our framework:

(1) The use of a cumulative backpropagation strategy only results in marginal computation gains compared to the incremental one.
(2) On temporal sequences of large dynamically changing graphs, GPU aided training can reduce the time needed to do an epoch by a whole order of magnitude.

5 CONCLUSIONS AND FUTURE DIRECTIONS
In this paper we discussed PyTorch Geometric Temporal, the first deep learning library designed for neural spatiotemporal signal processing. We reviewed the existing geometric deep learning and machine learning techniques implemented in the framework. We gave an overview of the general machine learning framework design principles, the newly introduced input and output data structures and the long-term project viability, and discussed a case study with source code which utilized the library. Our empirical evaluation focused on (a) the predictive performance of the models available in the library on real world datasets which we released with the framework; (b) the scalability of the methods under various input sizes and structures.

Our work could be extended, and it also opens up opportunities for novel geometric deep learning and applied machine learning research. A possible direction to extend our work would be the consideration of continuous time or time differences between temporal snapshots which are not constant. Another opportunity is the inclusion of temporal models which operate on curved spaces such as hyperbolic and spherical spaces. We are particularly interested in how the spatiotemporal deep learning techniques in the framework can be deployed and used for solving high-impact practical machine learning tasks.

REFERENCES
[1] Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et al. 2016. TensorFlow: A System for Large-Scale Machine Learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16). 265–283.
[2] Sean Anderson, Eric Ward, Lewis Barnett, and Philippina English. 2018. sdmTMB: Spatial and Spatiotemporal GLMMs with TMB. https://github.com/pbs-assess/sdmTMB.
[3] Dzmitry Bahdanau, Kyung Hyun Cho, and Yoshua Bengio. 2015. Neural Machine Translation by Jointly Learning to Align and Translate. In 3rd International Conference on Learning Representations, ICLR 2015.
[4] Lei Bai, Lina Yao, Can Li, Xianzhi Wang, and Can Wang. 2020. Adaptive Graph Convolutional Recurrent Network for Traffic Forecasting. Advances in Neural Information Processing Systems 33 (2020).
[5] Ferenc Béres, Domokos M. Kelen, Róbert Pálovics, and András A. Benczúr. 2019. Node Embeddings in Dynamic Graphs. Applied Network Science 4, 64 (2019), 25.
[6] Ferenc Béres, Róbert Pálovics, Anna Oláh, and András A. Benczúr. 2018. Temporal Walk Based Centrality Metric for Graph Streams. Applied Network Science 3, 32 (2018), 26.
[7] Aleksandar Bojchevski, Johannes Klicpera, Bryan Perozzi, Amol Kapoor, Martin Blais, Benedek Rózemberczki, Michal Lukasik, and Stephan Günnemann. 2020. Scaling Graph Neural Networks with Approximate PageRank. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2464–2473.
[8] Georg Brandl. 2010. Sphinx Documentation. http://sphinx-doc.org/sphinx.pdf.
[9] Yukuo Cen, Zhenyu Hou, Yan Wang, Qibin Chen, Yizhen Luo, Xingcheng Yao, Aohan Zeng, Shiguang Guo, Peng Zhang, Guohao Dai, et al. 2021. CogDL: An Extensive Toolkit for Deep Learning on Graphs. arXiv preprint arXiv:2103.00959 (2021).
[10] Jinyin Chen, Xuanheng Xu, Yangyang Wu, and Haibin Zheng. 2018. GC-LSTM: Graph Convolution Embedded LSTM for Dynamic Link Prediction. arXiv preprint arXiv:1812.04206 (2018).
[11] Tianqi Chen, Mu Li, Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu, Chiyuan Zhang, and Zheng Zhang. 2015. MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems. arXiv preprint arXiv:1512.01274 (2015).
[12] Kyunghyun Cho, Bart van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. 2014. Learning Phrase Representations Using RNN Encoder–Decoder for Statistical Machine Translation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, 1724–1734.
[13] CSIRO's Data61. 2018. StellarGraph Machine Learning Library. https://github.com/stellargraph/stellargraph.
[14] Michaël Defferrard, Xavier Bresson, and Pierre Vandergheynst. 2016. Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering. In Advances in Neural Information Processing Systems. 3844–3852.
[15] Matthias Fey and Jan E. Lenssen. 2019. Fast Graph Representation Learning with PyTorch Geometric. In ICLR Workshop on Representation Learning on Graphs and Manifolds.
[16] Roy Frostig, Matthew James Johnson, and Chris Leary. 2018. Compiling Machine Learning Programs via High-Level Tracing. Systems for Machine Learning (2018).
[17] Justin Gilmer, Samuel S Schoenholz, Patrick F Riley, Oriol Vinyals, and George E Dahl. 2017. Neural Message Passing for Quantum Chemistry. In International Conference on Machine Learning. PMLR, 1263–1272.
[18] Jonathan Godwin, Thomas Keck, Peter Battaglia, Victor Bapst, Thomas Kipf, Yujia Li, Kimberly Stachenfeld, Petar Veličković, and Alvaro Sanchez-Gonzalez. 2020. Jraph: A Library for Graph Neural Networks in Jax. http://github.com/deepmind/jraph.
[19] Palash Goyal, Sujit Rokka Chhetri, Ninareh Mehrabi, Emilio Ferrara, and Arquimedes Canedo. 2018. DynamicGEM: A Library for Dynamic Graph Embedding Methods. arXiv preprint arXiv:1811.10734 (2018).
[20] Palash Goyal and Emilio Ferrara. 2018. GEM: A Python Package for Graph Embedding Methods. Journal of Open Source Software 3, 29 (2018), 876.
[21] Daniele Grattarola and Cesare Alippi. 2020. Graph Neural Networks in TensorFlow and Keras with Spektral. arXiv preprint arXiv:2006.12138 (2020).
[22] Shengnan Guo, Youfang Lin, Ning Feng, Chao Song, and Huaiyu Wan. 2019. Attention Based Spatial-Temporal Graph Convolutional Networks for Traffic Flow Forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33. 922–929.
[23] Matthew Hanson. 2019. The Open-Source Software Ecosystem for Leveraging Public Datasets in Spatio-Temporal Asset Catalogs (STAC). In AGU Fall Meeting Abstracts, Vol. 2019. IN23B–07.
[24] Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long Short-Term Memory. Neural Computation 9, 8 (1997), 1735–1780.
[25] Petter Holme. 2015. Modern Temporal Network Theory: A Colloquium. The European Physical Journal B 88, 9 (2015), 1–30.
[26] Petter Holme and Jari Saramäki. 2012. Temporal Networks. Physics Reports 519, 3 (2012), 97–125.
[27] Jun Hu, Shengsheng Qian, Quan Fang, Youze Wang, Quan Zhao, Huaiwen Zhang, and Changsheng Xu. 2021. Efficient Graph Deep Learning in TensorFlow with TF Geometric. arXiv preprint arXiv:2101.11552 (2021).
[28] James Bradbury, Roy Frostig, Peter Hawkins, Matthew James Johnson, Chris Leary, Dougal Maclaurin, George Necula, Adam Paszke, Jake VanderPlas, Skye Wanderman-Milne, and Qiao Zhang. 2018. JAX: Composable Transformations of Python+NumPy Programs. http://github.com/google/jax.
[29] Diederik Kingma and Jimmy Ba. 2015. Adam: A Method for Stochastic Optimization. In Proceedings of the 3rd International Conference on Learning Representations.
[30] Thomas N. Kipf and Max Welling. 2017. Semi-Supervised Classification with Graph Convolutional Networks. In International Conference on Learning Representations (ICLR).
[31] Jia Li, Zhichao Han, Hong Cheng, Jiao Su, Pengyun Wang, Jianfeng Zhang, and Lujia Pan. 2019. Predicting Path Failure in Time-Evolving Graphs. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 1279–1289.
[32] Yaguang Li, Rose Yu, Cyrus Shahabi, and Yan Liu. 2018. Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting. In International Conference on Learning Representations.
[33] Yujia Li, Richard Zemel, Marc Brockschmidt, and Daniel Tarlow. 2016. Gated Graph Sequence Neural Networks. In International Conference on Learning Representations (ICLR).
[34] Meng Liu, Youzhi Luo, Limei Wang, Yaochen Xie, Hao Yuan, Shurui Gui, Zhao Xu, Haiyang Yu, Jingtun Zhang, Yi Liu, Keqiang Yan, Bora Oztekin, Haoran Liu, Xuan Zhang, Cong Fu, and Shuiwang Ji. 2021. DIG: A Turnkey Library for Diving into Graph Deep Learning Research. arXiv preprint arXiv:2103.12608 (2021).
[35] Minh-Thang Luong, Hieu Pham, and Christopher D Manning. 2015. Effective Approaches to Attention-based Neural Machine Translation. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. 1412–1421.
[36] Harold B Maynard, G J Stegemerten, and John L Schwab. 1948. Methods-Time Measurement. McGraw-Hill.
[37] Vinod Nair and Geoffrey E Hinton. 2010. Rectified Linear Units Improve Restricted Boltzmann Machines. In Proceedings of the 27th International Conference on International Conference on Machine Learning. 807–814.
[38] George Panagopoulos, Giannis Nikolentzos, and Michalis Vazirgiannis. 2021. Transfer Graph Neural Networks for Pandemic Forecasting. In Proceedings of the 35th AAAI Conference on Artificial Intelligence.
[39] Aldo Pareja, Giacomo Domeniconi, Jie Chen, Tengfei Ma, Toyotaro Suzumura, Hiroki Kanezashi, Tim Kaler, Tao B Schardl, and Charles E Leiserson. 2020. EvolveGCN: Evolving Graph Convolutional Networks for Dynamic Graphs. In AAAI. 5363–5370.

[40] Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, et al. 2019. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In Advances in Neural Information Processing Systems. 8024–8035.
[41] Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, et al. 2019. PyTorch: An Imperative Style, High-Performance Deep Learning Library. Advances in Neural Information Processing Systems 32 (2019), 8026–8037.
[42] Edzer Pebesma. 2017. stars: Spatiotemporal Arrays: Raster and Vector Datacubes. https://github.com/r-spatial/stars.
[43] Sergio J Rey and Luc Anselin. 2010. PySAL: A Python Library of Spatial Analytical Methods. In Handbook of Applied Spatial Analysis. Springer, 175–193.
[44] Emanuele Rob. 2020. PySTAC: Python Library for Working with any SpatioTemporal Asset Catalog (STAC). https://github.com/stac-utils/pystac.
[45] Benedek Rozemberczki, Peter Englert, Amol Kapoor, Martin Blais, and Bryan Perozzi. 2020. Pathfinder Discovery Networks for Neural Message Passing. arXiv preprint arXiv:2010.12878 (2020).
[46] Benedek Rozemberczki, Oliver Kiss, and Rik Sarkar. 2020. Karate Club: An API Oriented Open-source Python Framework for Unsupervised Learning on Graphs. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management. 3125–3132.
[47] Benedek Rozemberczki, Paul Scherer, Oliver Kiss, Rik Sarkar, and Tamas Ferenci. 2021. Chickenpox Cases in Hungary: a Benchmark Dataset for Spatiotemporal Signal Processing with Graph Neural Networks. arXiv:2102.08100 [cs.LG].
[48] Jason Sanders and Edward Kandrot. 2010. CUDA by Example: An Introduction to General-Purpose GPU Programming. Addison-Wesley Professional.
[49] Paul Scherer and Pietro Lio. 2020. Learning Distributed Representations of Graphs with Geo2DR. In ICML Workshop on Graph Representation Learning and Beyond.
[50] Michael Schlichtkrull, Thomas N Kipf, Peter Bloem, Rianne Van Den Berg, Ivan Titov, and Max Welling. 2018. Modeling Relational Data with Graph Convolutional Networks. In European Semantic Web Conference. Springer, 593–607.
[51] Youngjoo Seo, Michaël Defferrard, Pierre Vandergheynst, and Xavier Bresson. 2018. Structured Sequence Modeling with Graph Convolutional Recurrent Networks. In International Conference on Neural Information Processing. Springer, 362–373.
[52] Lei Shi, Yifan Zhang, Jian Cheng, and Hanqing Lu. 2019. Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 12026–12035.
[53] Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. 2014. Dropout: A Simple Way to Prevent Neural Networks from Overfitting. The Journal of Machine Learning Research 15, 1 (2014), 1929–1958.
[54] Aynaz Taheri and Tanya Berger-Wolf. 2019. Predictive Temporal Embedding of Dynamic Graphs. In Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining. 57–64.
[55] Aynaz Taheri, Kevin Gimpel, and Tanya Berger-Wolf. 2019. Learning to Represent the Evolution of Dynamic Graphs with Recurrent Models. In Companion Proceedings of The 2019 World Wide Web Conference (WWW '19). 301–307.
[56] Paul Taylor, Christopher Harris, Thompson Comer, and Mark Harris. 2019. CUDA-Accelerated GIS and Spatiotemporal Algorithms. https://github.com/rapidsai/cuspatial.
[57] Cunchao Tu, Yuan Yao, Zhengyan Zhang, Ganqu Cui, Hao Wang, Changxin Tian, Jie Zhou, and Cheng Yang. 2018. OpenNE: An Open Source Toolkit for Network Embedding. https://github.com/thunlp/OpenNE.
[58] Stefan Van Der Walt, S Chris Colbert, and Gael Varoquaux. 2011. The NumPy Array: a Structure for Efficient Numerical Computation. Computing in Science & Engineering 13, 2 (2011), 22–30.
[59] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is All You Need. In Proceedings of the 31st International Conference on Neural Information Processing Systems. 6000–6010.
[60] Michael A Whitby, Rich Fecher, and Chris Bennight. 2017. GeoWave: Utilizing Distributed Key-Value Stores for Multidimensional Data. In International Symposium on Spatial and Temporal Databases. Springer, 105–122.
[61] Zonghan Wu, Shirui Pan, Guodong Long, Jing Jiang, Xiaojun Chang, and Chengqi Zhang. 2020. Connecting the Dots: Multivariate Time Series Forecasting with Graph Neural Networks. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 753–763.
[62] Hongxia Yang. 2019. AliGraph: A Comprehensive Graph Neural Network Platform. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 3165–3166.
[63] Bing Yu, Haoteng Yin, and Zhanxing Zhu. 2018. Spatio-Temporal Graph Convolutional Networks: a Deep Learning Framework for Traffic Forecasting. In Proceedings of the 27th International Joint Conference on Artificial Intelligence. 3634–3640.
[64] Fan Zhang, Valentin Bazarevsky, Andrey Vakunov, Andrei Tkachenka, George Sung, Chuo-Ling Chang, and Matthias Grundmann. 2020. MediaPipe Hands: On-device Real-time Hand Tracking. arXiv:2006.10214 [cs.CV].
[65] Ling Zhao, Yujiao Song, Chao Zhang, Yu Liu, Pu Wang, Tao Lin, Min Deng, and Haifeng Li. 2019. T-GCN: A Temporal Graph Convolutional Network for Traffic Prediction. IEEE Transactions on Intelligent Transportation Systems 21, 9 (2019), 3848–3858.
[66] Chuanpan Zheng, Xiaoliang Fan, Cheng Wang, and Jianzhong Qi. 2020. GMAN: A Graph Multi-Attention Network for Traffic Prediction. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 34. 1234–1241.
[67] Da Zheng, Minjie Wang, Quan Gan, Zheng Zhang, and George Karypis. 2020. Learning Graph Neural Networks with Deep Graph Library. In Companion Proceedings of the Web Conference 2020 (WWW '20). 305–306.
[68] Jiawei Zhu, Yujiao Song, Ling Zhao, and Haifeng Li. 2020. A3T-GCN: Attention Temporal Graph Convolutional Network for Traffic Forecasting. arXiv preprint arXiv:2006.11583 (2020).
[69] Esteban Zimányi, Mahmoud Sakr, and Arthur Lesuisse. 2020. MobilityDB: A Mobility Database Based on PostgreSQL and PostGIS. ACM Transactions on Database Systems (TODS) 45, 4 (2020), 1–42.
