PyTorch Geometric Temporal: Spatiotemporal Signal Processing with Neural Machine Learning Models
PyTorch Geometric Temporal builds on widely used geometric deep learning libraries from the PyTorch ecosystem [15, 40]. Our framework was built by applying simple design principles consistently. The framework reuses existing neural network layers in a modular manner, models have a limited number of public methods, and hyperparameters can be inspected. Spatiotemporal signal iterators ingest data memory-efficiently in widely used scientific computing formats and return it in a PyTorch compatible format. The design principles, in combination with the test coverage, documentation, practical tutorials, continuous integration, package indexing and frequent releases, make the framework an end-user friendly spatiotemporal machine learning system.

The experimental evaluation of the framework entails node level regression tasks on datasets released exclusively with the framework. Specifically, we compare the predictive performance of spatiotemporal graph neural networks on epidemiological forecasting, demand planning, web traffic management and social media interaction prediction tasks. Synthetic experiments show that with the right batching strategy PyTorch Geometric Temporal is highly scalable and benefits from GPU accelerated computing.

Our contributions. The main contributions of our work can be summarized as:

• We publicly release PyTorch Geometric Temporal, the first deep learning library for parametric spatiotemporal machine learning models.
• We provide data loaders and iterators with PyTorch Geometric Temporal which can handle spatiotemporal datasets.
• We release new spatiotemporal benchmark datasets from the renewable energy production, epidemiological reporting, goods delivery and web traffic forecasting domains.
• We evaluate the spatiotemporal forecasting capabilities of the neural and parametric machine learning models available in PyTorch Geometric Temporal on real world datasets.

The remainder of the paper has the following structure. In Section 2 we overview important preliminaries and the related work about temporal and geometric deep learning and the characteristics of related open-source machine learning software. The main design principles of PyTorch Geometric Temporal are discussed in Section 3 with a practical example. We demonstrate the forecasting capabilities of the framework in Section 4, where we also evaluate the scalability of the library on various commodity hardware. We conclude in Section 5, where we summarize the results. The source code of PyTorch Geometric Temporal is publicly available at https://github.com/benedekrozemberczki/pytorch_geometric_temporal; the Python package can be installed via the Python Package Index. Detailed documentation is accessible at https://pytorch-geometric-temporal.readthedocs.io/.

Spatiotemporal datasets can differ in terms of the dynamics of the graph and that of the modelled vertex attributes. We take a discrete temporal snapshot view of this data representation problem [25, 26] and our work considers three spatiotemporal data types which can be described by the subplots of Figure 1 and the following formal definitions:

Definition 2.1. Dynamic graph with temporal signal. A dynamic graph with a temporal signal is the ordered set of graph and node feature matrix tuples D = {(G_1, X_1), ..., (G_T, X_T)} where the vertex sets satisfy V_t = V, ∀t ∈ {1, ..., T} and the node feature matrices satisfy X_t ∈ R^{|V|×d}, ∀t ∈ {1, ..., T}.

Definition 2.2. Dynamic graph with static signal. A dynamic graph with a static signal is the ordered set of graph and node feature matrix tuples D = {(G_1, X), ..., (G_T, X)} where the vertex sets satisfy V_t = V, ∀t ∈ {1, ..., T} and the node feature matrix satisfies X ∈ R^{|V|×d}.

Definition 2.3. Static graph with temporal signal. A static graph with a temporal signal is the ordered set of graph and node feature matrix tuples D = {(G, X_1), ..., (G, X_T)} where the node feature matrix satisfies X_t ∈ R^{|V|×d}, ∀t ∈ {1, ..., T}.

Figure 1: (a) Dynamic graph with temporal signal. (b) Dynamic graph with static signal.

Representing spatiotemporal data based on these theoretical concepts allows us to create memory efficient data structures which realize these definitions well in practice.
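To make these definitions concrete, the following minimal sketch builds a static graph with a temporal signal (Definition 2.3) from synthetic NumPy arrays; it assumes the StaticGraphTemporalSignal iterator that ships with the library, and the shapes (3 nodes, 8 features, 4 snapshots) are made up for illustration:

import numpy as np
from torch_geometric_temporal import StaticGraphTemporalSignal

# A static graph with |V| = 3 nodes and two directed edges,
# stored as a 2 x |E| index matrix with one weight per edge.
edge_index = np.array([[0, 1], [1, 2]])
edge_weight = np.ones(2)

# T = 4 snapshots: a |V| x d node feature matrix (d = 8) and a
# node-level regression target vector for each time period.
features = [np.random.uniform(0, 1, (3, 8)) for _ in range(4)]
targets = [np.random.uniform(0, 1, (3,)) for _ in range(4)]

signal = StaticGraphTemporalSignal(edge_index, edge_weight, features, targets)

# Iterating the signal yields PyTorch Geometric Data snapshots.
for snapshot in signal:
    print(snapshot.x.shape, snapshot.y.shape)

Because the graph is static, the edge index and weights are stored once and shared across all temporal snapshots, which is what makes the iterator memory efficient.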
2.2 Deep Learning with Time and Geometry

Our work provides deep learning models that operate on data which has both temporal and spatial aspects. These techniques are natural recombinations of existing neural network layers that operate on sequences and static graph-structured data.

2.2.1 Temporal Deep Learning. A large family of temporal deep learning models, such as the LSTM [24] and GRU [12], generates in-memory representations of data points which are iteratively updated as the model learns from new snapshots. Another family of deep learning models uses the attention mechanism [3, 35, 59] to learn representations of the data points which are adaptively recontextualized based on the temporal history. These types of models serve as templates for the temporal block of spatiotemporal deep learning models.

2.2.2 Static Graph Representation Learning. Learning representations of vertices, edges and whole graphs with graph neural networks in a supervised or unsupervised way can be described by the message passing formalism [17]. In this conceptual framework, a parametric function applied to the node and edge attributes in a graph generates compressed representations (messages) which are propagated between the nodes based on a message-passing rule and aggregated to form new representations. Most of the existing graph neural network architectures such as GCN [30], GGCN [33], ChebyConv [14], and RGCN [50] fit perfectly into this general description of graph neural networks. Models are differentiated by assumptions about the input graph (e.g. node heterogeneity, multiplexity, presence of edge attributes), the message compression function used, the propagation scheme and the message aggregation function applied to the received messages.
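The message passing update can be written compactly; the following formula is our shorthand for the framework of Gilmer et al. [17], with MSG, AGG and UPDATE denoting the message compression, aggregation and update functions described above:

h_v^{(k)} = UPDATE( h_v^{(k-1)}, AGG_{u ∈ N(v)} MSG( h_u^{(k-1)}, h_v^{(k-1)}, e_{u,v} ) )

where h_v^{(k)} is the representation of node v after the k-th round of message passing, N(v) is the neighbourhood of v and e_{u,v} is the attribute of the edge between u and v.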
2.2.3 Spatiotemporal Deep Learning. A spatiotemporal deep learning model fuses the basic conceptual ideas of temporal deep learning techniques and graph representation learning. Operating on a temporal graph sequence, these models perform message-passing at each time point with a graph neural network block, and the new temporal information is incorporated by a temporal deep learning block. This design allows for sharing salient temporal and spatial autocorrelation information across the spatial units. The temporal and spatial layers which are fused together in a single parametric machine learning model are trained jointly by exploiting the fact that the fused models are end-to-end differentiable. In Table 1 we summarize the spatiotemporal deep learning models implemented in the framework, which we categorize based on the temporal and graph neural network layer blocks, the order of spatial proximity and the heterogeneity of the edge set.

Table 1: A comparison of spatiotemporal deep learning models in PyTorch Geometric Temporal based on the temporal and spatial block, order of proximity considered and the heterogeneity of the edges.

Model | Temporal Layer | GNN Layer | Proximity Order | Multi Type
DCRNN [32] | GRU | DiffConv | Higher | False
GConvGRU [51] | GRU | Chebyshev | Lower | False
GConvLSTM [51] | LSTM | Chebyshev | Lower | False
GC-LSTM [10] | LSTM | Chebyshev | Lower | True
DyGrAE [54, 55] | LSTM | GGCN | Higher | False
LRGCN [31] | LSTM | RGCN | Lower | False
EGCN-H [39] | GRU | GCN | Lower | False
EGCN-O [39] | LSTM | GCN | Lower | False
T-GCN [65] | GRU | GCN | Lower | False
A3T-GCN [68] | GRU | GCN | Lower | False
AGCRN [4] | GRU | Chebyshev | Higher | False
MPNN LSTM [38] | LSTM | GCN | Lower | False
STGCN [63] | Attention | Chebyshev | Higher | False
ASTGCN [22] | Attention | Chebyshev | Higher | False
MSTGCN [22] | Attention | Chebyshev | Higher | False
GMAN [66] | Attention | Custom | Lower | False
MTGNN [61] | Attention | Custom | Higher | False
AAGCN [52] | Attention | Custom | Higher | False

2.3 Graph Representation Learning Software

The current graph representation learning software ecosystem which enables academic research and industrial deployment extends open-source auto-differentiation libraries such as TensorFlow [1], PyTorch [41], MXNet [11] and JAX [16, 28]. Our work does the same as we build on the PyTorch Geometric ecosystem. We summarize the characteristics of these libraries in Table 2, which allows for comparing frameworks based on the backend, the presence of supervised training functionalities, the presence of temporal models and GPU support. Our proposed framework is the only one to date which allows the supervised training of temporal graph representation learning models with graphics card based acceleration.

Table 2: A desiderata and automatic differentiation backend library based comparison of open-source geometric deep learning libraries.

Library | Backend | Supervised | Temporal | GPU
PT Geometric [15] | PT | ✔ | ✘ | ✔
Geometric2DR [49] | PT | ✘ | ✘ | ✔
CogDL [9] | PT | ✔ | ✘ | ✔
Spektral [21] | TF | ✔ | ✘ | ✔
TF Geometric [27] | TF | ✔ | ✘ | ✔
StellarGraph [13] | TF | ✔ | ✘ | ✔
DGL [67] | TF/PT/MX | ✔ | ✘ | ✔
DIG [34] | PT | ✔ | ✘ | ✔
Jraph [18] | JAX | ✔ | ✘ | ✔
Graph-Learn [62] | Custom | ✔ | ✘ | ✔
GEM [20] | TF | ✘ | ✘ | ✔
DynamicGEM [19] | TF | ✘ | ✔ | ✔
OpenNE [57] | Custom | ✘ | ✘ | ✘
Karate Club [46] | Custom | ✘ | ✘ | ✘
Our Work | PT | ✔ | ✔ | ✔

2.4 Spatiotemporal Data Analytics Software

The open-source ecosystem for spatiotemporal data processing consists of specialized database systems, basic analytical tools and
CIKM’21, 1-5 November 2021, Online B. Rozemberczki, P. Scherer, Y. He, G. Panagopoulos, A. Riedel, M. Astefanoaei, O. Kiss, F. Beres, G. López, N. Collignon, and R. Sarkar
advanced machine learning libraries. We summarize the characteristics of the most popular libraries in Table 3 with respect to the year of release, the purpose of the framework, the source code language and GPU support.

First, it is evident that most spatiotemporal data processing tools are fairly new and there is much space for contributions in each subcategory. Second, the database systems are written in high-performance languages, while the analytics and machine learning oriented tools have a pure Python/R design or a wrapper written in these languages. Finally, the use of GPU acceleration is not widespread, which suggests that current spatiotemporal data processing tools might have a scalability issue. Our proposed framework, PyTorch Geometric Temporal, is the first fully open-source GPU accelerated spatiotemporal machine learning library.

3.1.3 Limited number of public methods. The spatiotemporal neural network layers in our framework have a limited number of public methods for simplicity. For example, the auxiliary layer initialization methods and other internal model mechanics are implemented as private methods. All of the layers provide a forward method, and those which explicitly use the message-passing scheme in PyTorch Geometric provide a public message method.

3.1.4 Auxiliary layers. The auxiliary neural network layers which are not part of the PyTorch Geometric ecosystem, such as diffusion convolutional graph neural networks [32], are implemented as standalone neural network layers in the framework. These layers are available for the design of novel neural network architectures as individual components.
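As an illustration of such an auxiliary layer used as an individual component, the following minimal sketch instantiates the diffusion convolutional recurrent layer directly; the tensor shapes are synthetic assumptions rather than anything prescribed by the framework:

import torch
from torch_geometric_temporal.nn.recurrent import DCRNN

# A one-hop (K = 1) diffusion convolutional recurrent layer that maps
# 8 node features to 32 hidden channels.
recurrent_layer = DCRNN(in_channels=8, out_channels=32, K=1)

x = torch.rand(20, 8)                        # 20 nodes with 8 features each
edge_index = torch.tensor([[0, 1], [1, 2]])  # two directed edges
edge_weight = torch.ones(2)

h = recurrent_layer(x, edge_index, edge_weight)  # hidden matrix of shape (20, 32)

The only public entry point used here is the forward call, in line with the design principle of Subsection 3.1.3.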
3.3 Design in Practice Case Study: Cumulative Model Training on CPU

In the following we overview a simple end-to-end machine learning pipeline designed with PyTorch Geometric Temporal. These code snippets solve a practical epidemiological forecasting problem – predicting the weekly number of chickenpox cases in Hungary [47]. The pipeline consists of data preparation, model definition, training and evaluation phases. In Listings 1 the dataset is loaded and split into temporally ordered training and test iterators, holding out the last 10% of the snapshots for testing.

1 from torch_geometric_temporal import ChickenpoxDatasetLoader
2 from torch_geometric_temporal import temporal_signal_split
3
4 loader = ChickenpoxDatasetLoader()
5 dataset = loader.get_dataset()
6 train, test = temporal_signal_split(dataset, train_ratio=0.9)

Listings 1: Loading the Hungarian chickenpox dataset and splitting it into training and test spatiotemporal signal iterators.

The model definition in Listings 2 takes the number of node features and the number of convolutional filters as parameters in the constructor (line 6). The model consists of a one-hop Diffusion Convolutional Recurrent Neural Network layer [32] and a fully connected layer with a single neuron (lines 8-9).

In the forward pass method of the neural network the model uses the vertex features, edges and the optional edge weights (line 11). The initial recurrent graph convolution based aggregation (line 12) is followed by a rectified linear unit activation function [37] and dropout [53] for regularization (lines 13-14). Using the fully-connected layer the model outputs a single score for each spatial unit (lines 15-16).
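A minimal sketch of such a model definition, consistent with the line references in the surrounding paragraphs and assuming the DCRNN layer shipped with the library:

1 import torch
2 import torch.nn.functional as F
3 from torch_geometric_temporal.nn.recurrent import DCRNN
4
5 class RecurrentGCN(torch.nn.Module):
6     def __init__(self, node_features, filters):
7         super(RecurrentGCN, self).__init__()
8         self.recurrent = DCRNN(node_features, filters, 1)
9         self.linear = torch.nn.Linear(filters, 1)
10
11     def forward(self, x, edge_index, edge_weight):
12         h = self.recurrent(x, edge_index, edge_weight)
13         h = F.relu(h)
14         h = F.dropout(h, training=self.training)
15         h = self.linear(h)
16         return h

Listings 2: Defining the recurrent graph convolutional neural network (a sketch reconstructed from the line references above).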
3.3.4 Model Evaluation. The scoring of the trained recurrent graph neural network in Listings 4 uses the snapshots in the test dataset. We set the model to be non-trainable and the accumulated squared error to zero (lines 1-2). We iterate over the test spatiotemporal snapshots, make forward passes to predict the number of chickenpox cases and accumulate the squared error (lines 3-7). The accumulated errors are normalized and we can print the mean squared error calculated on the whole test horizon (lines 8-10).
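A minimal sketch of this evaluation loop, matching the line references above; it mirrors Listings 6 below without the device transfer, and reuses the model and test iterator from the earlier snippets:

1 model.eval()
2 cost = 0
3 for time, snapshot in enumerate(test):
4     y_hat = model(snapshot.x,
5                   snapshot.edge_index,
6                   snapshot.edge_attr)
7     cost = cost + torch.mean((y_hat-snapshot.y)**2)
8 cost = cost / (time+1)
9 cost = cost.item()
10 print("MSE: {:.4f}".format(cost))

Listings 4: Evaluating the recurrent graph convolutional neural network on the CPU (a sketch matching the line references above).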
3.4 Design in Practice Case Study: Incremental Model Training with GPU Acceleration

Exploiting the power of GPU based acceleration happens at the training and evaluation steps of the PyTorch Geometric Temporal pipelines. In this case study we assume that the Hungarian chickenpox cases dataset is already loaded in memory, the temporal split has happened and a model class was defined by the code snippets in Listings 1 and 2. Moreover, we assume that the machine used for training the neural network can access a single CUDA compatible GPU device [48].

1 model = RecurrentGCN(node_features=8, filters=32)
2 device = torch.device('cuda')
3 model = model.to(device)
4
5 optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
6 model.train()
7
8 for epoch in range(200):
9     for snapshot in train:
10         snapshot = snapshot.to(device)
11         y_hat = model(snapshot.x,
12                       snapshot.edge_index,
13                       snapshot.edge_attr)
14         cost = torch.mean((y_hat-snapshot.y)**2)
15         cost.backward()
16         optimizer.step()
17         optimizer.zero_grad()

Listings 5: Creating a recurrent graph convolutional neural network instance and training it by incremental weight updates on a GPU.

3.4.1 Model Training. In Listings 5 we demonstrate accelerated training with incremental weight updates. The model of interest and the device used for training are defined, and the model is transferred to the GPU (lines 1-3). The optimizer registers the model parameters and the model parameters are set to be trainable (lines 5-6). We iterate over the temporal snapshot iterator 200 times and the iterator returns a temporal snapshot in each step. Importantly, the snapshots, which are PyTorch Geometric Data objects, are transferred to the GPU (lines 8-10). The use of PyTorch Geometric Data objects as temporal snapshots allows the transfer of the time period specific edges, node features and target vector with a single command. Using the input data a forward pass is made, loss is accumulated and weight updates happen using the optimizer in each time period (lines 11-17). Compared to the cumulative backpropagation based training approach discussed in Subsection 3.3, this backpropagation strategy is slower, as weight updates happen at each time step, not just at the end of training epochs.

3.4.2 Model Evaluation. During model scoring the GPU can be utilized again. The snippet in Listings 6 demonstrates that the only modification needed for accelerated evaluation is the transfer of snapshots to the GPU. In each time period we move the temporal snapshot to the device to do the forward pass (line 4). We do the forward pass with the model and the snapshot on the GPU and accumulate the loss (lines 5-8). The loss value is averaged out and detached from the GPU for printing (lines 9-11).

1 model.eval()
2 cost = 0
3 for time, snapshot in enumerate(test):
4     snapshot = snapshot.to(device)
5     y_hat = model(snapshot.x,
6                   snapshot.edge_index,
7                   snapshot.edge_attr)
8     cost = cost + torch.mean((y_hat-snapshot.y)**2)
9 cost = cost / (time+1)
10 cost = cost.item()
11 print("MSE: {:.4f}".format(cost))

Listings 6: Evaluating the recurrent graph convolutional neural network with GPU based acceleration.

3.5 Maintaining PyTorch Geometric Temporal

The viability of the project is made possible by the open-source code, version control, public releases, automatically generated documentation, continuous integration, and near 100% test coverage.

3.5.1 Open-Source Code-Base and Public Releases. The source code of PyTorch Geometric Temporal is publicly available on GitHub under the MIT license. Using an open version control system allowed us to have a large group collaborate on the project and have external contributors who also submitted feature requests. The public releases of the library are also made available on the Python Package Index, which means that the framework can be installed via the pip command using the terminal.

3.5.2 Documentation. The source-code of PyTorch Geometric Temporal and Sphinx [8] are used to generate a publicly available documentation of the library at https://pytorch-geometric-temporal.readthedocs.io/. This documentation is automatically created every time the code-base changes in the public repository. The documentation covers the constructors and public methods of neural network layers, temporal signal iterators, public dataset loaders and splitters. It also includes a list of relevant research papers, an in-depth installation guide, a detailed getting-started tutorial and a list of integrated benchmark datasets.

3.5.3 Continuous Integration. We provide continuous integration for PyTorch Geometric Temporal with GitHub Actions, which are available for free on GitHub without limitations on the number of builds. When the code is updated on any branch of the repository the build process is triggered and the library is deployed on Linux, Windows and macOS virtual machines.
3.5.4 Unit Tests and Code Coverage. The temporal graph neural network layers, custom data structures and benchmark dataset loaders are all covered by unit tests. These unit tests can be executed locally using the source code. Unit tests are also triggered by the continuous integration provided by GitHub Actions. When the master branch of the open-source GitHub repository is updated, the build is successful, and all of the unit tests pass, a coverage report is generated by CodeCov.

4 EXPERIMENTAL EVALUATION

The proposed framework is evaluated on node level regression tasks using novel datasets which we release with the paper. We also evaluate the effect of various batching techniques on the predictive performance and runtime.

4.1 Datasets

We release new spatiotemporal benchmark datasets with PyTorch Geometric Temporal which can be used to test models on node level regression tasks. The descriptive statistics and properties of these newly introduced benchmark datasets are summarized in Table 4.

Table 4: Properties and granularity of the spatiotemporal datasets introduced in the paper with information about the number of time periods (T) and spatial units (|V|).

Dataset | Signal | Graph | Frequency | T | |V|
Chickenpox Hungary | Temporal | Static | Weekly | 522 | 20
Windmill Large | Temporal | Static | Hourly | 17,472 | 319
Windmill Medium | Temporal | Static | Hourly | 17,472 | 26
Windmill Small | Temporal | Static | Hourly | 17,472 | 11
Pedal Me Deliveries | Temporal | Static | Weekly | 36 | 15
Wikipedia Math | Temporal | Static | Daily | 731 | 1,068
Twitter Tennis RG | Static | Dynamic | Hourly | 120 | 1,000
Twitter Tennis UO | Static | Dynamic | Hourly | 112 | 1,000
Covid19 England | Temporal | Dynamic | Daily | 61 | 129
Montevideo Buses | Temporal | Static | Hourly | 744 | 675
MTM-1 Hand Motions | Temporal | Static | 1/24 Seconds | 14,469 | 21

These newly released datasets are the following:

• Chickenpox Hungary. A spatiotemporal dataset about the officially reported cases of chickenpox in Hungary. The nodes are counties and edges describe direct neighbourhood relationships. The dataset covers the weeks between 2005 and 2015 without missingness.
• Windmill Output Datasets. An hourly windfarm energy output dataset covering 2 years from a European country. Edge weights are calculated from the proximity of the windmills – high weights imply that two windmill stations are in close vicinity. The size of the dataset relates to the grouping of windfarms considered; the smaller datasets are more localized to a single region.
• Pedal Me Deliveries. A dataset about the number of weekly bicycle package deliveries by Pedal Me in London during 2020 and 2021. Nodes in the graph represent geographical units and edges are proximity based mutual adjacency relationships.
• Wikipedia Math. Contains Wikipedia pages about popular mathematics topics; edges describe the links from one page to another. Features describe the number of daily visits between 2019 and March 2021.
• Twitter Tennis RG and UO. Twitter mention graphs of major tennis tournaments from 2017. Each snapshot contains the graph of popular player or sport news accounts and mentions between them [5, 6]. Node labels encode the number of mentions received and vertex features are structural properties.
• Covid19 England. A dataset about mass mobility between regions in England and the number of confirmed COVID-19 cases from March to May 2020 [38]. Each day contains a different mobility graph and node features corresponding to the number of cases in the previous days. Mobility data stems from Facebook Data For Good (https://dataforgood.fb.com/) and case counts from gov.uk (https://coronavirus.data.gov.uk/).
• Montevideo Buses. A dataset about the hourly passenger inflow at bus stop level for eleven bus lines from the city of Montevideo. Nodes are bus stops and edges represent connections between the stops; the dataset covers a whole month of traffic patterns.
• MTM-1 Hand Motions. A temporal dataset of Methods-Time Measurement-1 [36] motions, signalled as consecutive graph frames of 21 3D hand key points that were acquired via MediaPipe Hands [64] from original RGB-Video material. Node features encode the normalized xyz-coordinates of each finger joint and the vertices are connected according to the human hand structure.

4.2 Predictive Performance

The forecasting experiments focus on the evaluation of the recurrent graph neural networks implemented in our framework. We compare the predictive performance under two specific backpropagation regimes which can be used to train these recurrent models; a code sketch contrasting the two follows the list below:

• Incremental: After each temporal snapshot the loss is backpropagated and model weights are updated. This needs as many weight updates as the number of temporal snapshots.
• Cumulative: The losses from every temporal snapshot are aggregated, backpropagated, and the weights are updated with the optimizer. This requires one weight update per epoch.
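A minimal sketch contrasting the two regimes, reusing the model, optimizer and train iterator from the earlier listings (the epoch count is an assumption):

# Incremental: one weight update per temporal snapshot.
for epoch in range(100):
    for snapshot in train:
        y_hat = model(snapshot.x, snapshot.edge_index, snapshot.edge_attr)
        cost = torch.mean((y_hat - snapshot.y)**2)
        cost.backward()
        optimizer.step()
        optimizer.zero_grad()

# Cumulative: losses are aggregated; one weight update per epoch.
for epoch in range(100):
    cost = 0
    for time, snapshot in enumerate(train):
        y_hat = model(snapshot.x, snapshot.edge_index, snapshot.edge_attr)
        cost = cost + torch.mean((y_hat - snapshot.y)**2)
    cost = cost / (time + 1)
    cost.backward()
    optimizer.step()
    optimizer.zero_grad()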
CIKM’21, 1-5 November 2021, Online B. Rozemberczki, P. Scherer, Y. He, G. Panagopoulos, A. Riedel, M. Astefanoaei, O. Kiss, F. Beres, G. López, N. Collignon, and R. Sarkar
Table 5: The predictive performance of spatiotemporal neural networks evaluated by average mean squared error. We report
average performances calculated from 10 experimental repetitions with standard deviations around the average mean squared
error calculated on 10% forecasting horizons. We use the incremental and cumulative backpropagation strategies.
were trained for 100 epochs with the Adam optimizer [29] which • CPU: The machine used for benchmarking had 8 Intel 1.00
used a learning rate of 10−2 to minimize the mean squared error. GHz i5-1035G1 processors.
• GPU: We utilized a machine with a single Tesla V-100 graph-
4.2.2 Experimental findings. Results are presented in Table 5 where ics card for the experiments.
we also report standard deviations around the test set mean squared
error and bold numbers denote the best performing model un-
Runtime in seconds
24
0
(1) Most recurrent graph neural networks have a similar predic-
8 9 10 11 12 2 3 4 5 6
tive performance on these regression tasks. In simple terms log2 Number of nodes log2 Number of edges per node
there is not a single model which acts as silver bullet. This
Runtime in seconds
also postulates that the model with the lowest training time 24
5 CONCLUSIONS AND FUTURE DIRECTIONS

In this paper we discussed PyTorch Geometric Temporal, the first deep learning library designed for neural spatiotemporal signal processing. We reviewed the existing geometric deep learning and machine learning techniques implemented in the framework. We gave an overview of the general machine learning framework design principles, the newly introduced input and output data structures and long-term project viability, and discussed a case study with source-code which utilized the library. Our empirical evaluation focused on (a) the predictive performance of the models available in the library on real world datasets which we released with the framework; (b) the scalability of the methods under various input sizes and structures.

Our work could be extended and it also opens up opportunities for novel geometric deep learning and applied machine learning research. A possible direction to extend our work would be the consideration of continuous time or time differences between temporal snapshots which are not constant. Another opportunity is the inclusion of temporal models which operate on curved spaces such as hyperbolic and spherical spaces. We are particularly interested in how the spatiotemporal deep learning techniques in the framework can be deployed and used for solving high-impact practical machine learning tasks.

REFERENCES
[1] Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et al. 2016. TensorFlow: A System for Large-Scale Machine Learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16). 265–283.
[2] Sean Anderson, Eric Ward, Lewis Barnett, and Philippina English. 2018. sdmTMB: Spatial and Spatiotemporal GLMMs with TMB. https://github.com/pbs-assess/sdmTMB.
[3] Dzmitry Bahdanau, Kyung Hyun Cho, and Yoshua Bengio. 2015. Neural Machine Translation by Jointly Learning to Align and Translate. In 3rd International Conference on Learning Representations, ICLR 2015.
[4] Lei Bai, Lina Yao, Can Li, Xianzhi Wang, and Can Wang. 2020. Adaptive Graph Convolutional Recurrent Network for Traffic Forecasting. Advances in Neural Information Processing Systems 33 (2020).
[5] Ferenc Béres, Domokos M. Kelen, Róbert Pálovics, and András A. Benczúr. 2019. Node Embeddings in Dynamic Graphs. Applied Network Science 4, 64 (2019), 25.
[6] Ferenc Béres, Róbert Pálovics, Anna Oláh, and András A. Benczúr. 2018. Temporal Walk Based Centrality Metric for Graph Streams. Applied Network Science 3, 32 (2018), 26.
[7] Aleksandar Bojchevski, Johannes Klicpera, Bryan Perozzi, Amol Kapoor, Martin Blais, Benedek Rózemberczki, Michal Lukasik, and Stephan Günnemann. 2020. Scaling Graph Neural Networks with Approximate PageRank. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2464–2473.
[8] Georg Brandl. 2010. Sphinx Documentation. http://sphinx-doc.org/sphinx.pdf.
[9] Yukuo Cen, Zhenyu Hou, Yan Wang, Qibin Chen, Yizhen Luo, Xingcheng Yao, Aohan Zeng, Shiguang Guo, Peng Zhang, Guohao Dai, et al. 2021. CogDL: An Extensive Toolkit for Deep Learning on Graphs. arXiv preprint arXiv:2103.00959 (2021).
[10] Jinyin Chen, Xuanheng Xu, Yangyang Wu, and Haibin Zheng. 2018. GC-LSTM: Graph Convolution Embedded LSTM for Dynamic Link Prediction. arXiv preprint arXiv:1812.04206 (2018).
[11] Tianqi Chen, Mu Li, Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu, Chiyuan Zhang, and Zheng Zhang. 2015. MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems. arXiv preprint arXiv:1512.01274 (2015).
[12] Kyunghyun Cho, Bart van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. 2014. Learning Phrase Representations Using RNN Encoder–Decoder for Statistical Machine Translation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, 1724–1734.
[13] CSIRO's Data61. 2018. StellarGraph Machine Learning Library. https://github.com/stellargraph/stellargraph.
[14] Michaël Defferrard, Xavier Bresson, and Pierre Vandergheynst. 2016. Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering. In Advances in Neural Information Processing Systems. 3844–3852.
[15] Matthias Fey and Jan E. Lenssen. 2019. Fast Graph Representation Learning with PyTorch Geometric. In ICLR Workshop on Representation Learning on Graphs and Manifolds.
[16] Roy Frostig, Matthew James Johnson, and Chris Leary. 2018. Compiling Machine Learning Programs via High-Level Tracing. Systems for Machine Learning (2018).
[17] Justin Gilmer, Samuel S Schoenholz, Patrick F Riley, Oriol Vinyals, and George E Dahl. 2017. Neural Message Passing for Quantum Chemistry. In International Conference on Machine Learning. PMLR, 1263–1272.
[18] Jonathan Godwin, Thomas Keck, Peter Battaglia, Victor Bapst, Thomas Kipf, Yujia Li, Kimberly Stachenfeld, Petar Veličković, and Alvaro Sanchez-Gonzalez. 2020. Jraph: A Library for Graph Neural Networks in JAX. http://github.com/deepmind/jraph.
[19] Palash Goyal, Sujit Rokka Chhetri, Ninareh Mehrabi, Emilio Ferrara, and Arquimedes Canedo. 2018. DynamicGEM: A Library for Dynamic Graph Embedding Methods. arXiv preprint arXiv:1811.10734 (2018).
[20] Palash Goyal and Emilio Ferrara. 2018. GEM: A Python Package for Graph Embedding Methods. Journal of Open Source Software 3, 29 (2018), 876.
[21] Daniele Grattarola and Cesare Alippi. 2020. Graph Neural Networks in TensorFlow and Keras with Spektral. arXiv preprint arXiv:2006.12138 (2020).
[22] Shengnan Guo, Youfang Lin, Ning Feng, Chao Song, and Huaiyu Wan. 2019. Attention Based Spatial-Temporal Graph Convolutional Networks for Traffic Flow Forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33. 922–929.
[23] Matthew Hanson. 2019. The Open-Source Software Ecosystem for Leveraging Public Datasets in Spatio-Temporal Asset Catalogs (STAC). In AGU Fall Meeting Abstracts, Vol. 2019. IN23B–07.
[24] Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long Short-Term Memory. Neural Computation 9, 8 (1997), 1735–1780.
[25] Petter Holme. 2015. Modern Temporal Network Theory: A Colloquium. The European Physical Journal B 88, 9 (2015), 1–30.
[26] Petter Holme and Jari Saramäki. 2012. Temporal Networks. Physics Reports 519, 3 (2012), 97–125.
[27] Jun Hu, Shengsheng Qian, Quan Fang, Youze Wang, Quan Zhao, Huaiwen Zhang, and Changsheng Xu. 2021. Efficient Graph Deep Learning in TensorFlow with TF Geometric. arXiv preprint arXiv:2101.11552 (2021).
[28] James Bradbury, Roy Frostig, Peter Hawkins, Matthew James Johnson, Chris Leary, Dougal Maclaurin, George Necula, Adam Paszke, Jake VanderPlas, Skye Wanderman-Milne, and Qiao Zhang. 2018. JAX: Composable Transformations of Python+NumPy Programs. http://github.com/google/jax.
[29] Diederik Kingma and Jimmy Ba. 2015. Adam: A Method for Stochastic Optimization. In Proceedings of the 3rd International Conference on Learning Representations.
[30] Thomas N. Kipf and Max Welling. 2017. Semi-Supervised Classification with Graph Convolutional Networks. In International Conference on Learning Representations (ICLR).
[31] Jia Li, Zhichao Han, Hong Cheng, Jiao Su, Pengyun Wang, Jianfeng Zhang, and Lujia Pan. 2019. Predicting Path Failure in Time-Evolving Graphs. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 1279–1289.
[32] Yaguang Li, Rose Yu, Cyrus Shahabi, and Yan Liu. 2018. Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting. In International Conference on Learning Representations.
[33] Yujia Li, Richard Zemel, Marc Brockschmidt, and Daniel Tarlow. 2016. Gated Graph Sequence Neural Networks. In International Conference on Learning Representations (ICLR).
[34] Meng Liu, Youzhi Luo, Limei Wang, Yaochen Xie, Hao Yuan, Shurui Gui, Zhao Xu, Haiyang Yu, Jingtun Zhang, Yi Liu, Keqiang Yan, Bora Oztekin, Haoran Liu, Xuan Zhang, Cong Fu, and Shuiwang Ji. 2021. DIG: A Turnkey Library for Diving into Graph Deep Learning Research. arXiv preprint arXiv:2103.12608 (2021).
[35] Minh-Thang Luong, Hieu Pham, and Christopher D Manning. 2015. Effective Approaches to Attention-based Neural Machine Translation. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. 1412–1421.
[36] Harold B Maynard, G J Stegemerten, and John L Schwab. 1948. Methods-Time Measurement. McGraw-Hill.
[37] Vinod Nair and Geoffrey E Hinton. 2010. Rectified Linear Units Improve Restricted Boltzmann Machines. In Proceedings of the 27th International Conference on Machine Learning. 807–814.
[38] George Panagopoulos, Giannis Nikolentzos, and Michalis Vazirgiannis. 2021. Transfer Graph Neural Networks for Pandemic Forecasting. In Proceedings of the 35th AAAI Conference on Artificial Intelligence.
[39] Aldo Pareja, Giacomo Domeniconi, Jie Chen, Tengfei Ma, Toyotaro Suzumura, Hiroki Kanezashi, Tim Kaler, Tao B Schardl, and Charles E Leiserson. 2020. EvolveGCN: Evolving Graph Convolutional Networks for Dynamic Graphs. In AAAI. 5363–5370.
[40] Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, et al. 2019. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In Advances in Neural Information Processing Systems. 8024–8035.
[41] Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, et al. 2019. PyTorch: An Imperative Style, High-Performance Deep Learning Library. Advances in Neural Information Processing Systems 32 (2019), 8026–8037.
[42] Edzer Pebesma. 2017. stars: Spatiotemporal Arrays: Raster and Vector Datacubes. https://github.com/r-spatial/stars.
[43] Sergio J Rey and Luc Anselin. 2010. PySAL: A Python Library of Spatial Analytical Methods. In Handbook of Applied Spatial Analysis. Springer, 175–193.
[44] Emanuele Rob. 2020. PySTAC: Python Library for Working with Any SpatioTemporal Asset Catalog (STAC). https://github.com/stac-utils/pystac. GitHub repository.
[45] Benedek Rozemberczki, Peter Englert, Amol Kapoor, Martin Blais, and Bryan Perozzi. 2020. Pathfinder Discovery Networks for Neural Message Passing. arXiv preprint arXiv:2010.12878 (2020).
[46] Benedek Rozemberczki, Oliver Kiss, and Rik Sarkar. 2020. Karate Club: An API Oriented Open-Source Python Framework for Unsupervised Learning on Graphs. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management. 3125–3132.
[47] Benedek Rozemberczki, Paul Scherer, Oliver Kiss, Rik Sarkar, and Tamas Ferenci. 2021. Chickenpox Cases in Hungary: a Benchmark Dataset for Spatiotemporal Signal Processing with Graph Neural Networks. arXiv:2102.08100 [cs.LG]
[48] Jason Sanders and Edward Kandrot. 2010. CUDA by Example: An Introduction to General-Purpose GPU Programming. Addison-Wesley Professional.
[49] Paul Scherer and Pietro Lio. 2020. Learning Distributed Representations of Graphs with Geo2DR. In ICML Workshop on Graph Representation Learning and Beyond.
[50] Michael Schlichtkrull, Thomas N Kipf, Peter Bloem, Rianne Van Den Berg, Ivan Titov, and Max Welling. 2018. Modeling Relational Data with Graph Convolutional Networks. In European Semantic Web Conference. Springer, 593–607.
[51] Youngjoo Seo, Michaël Defferrard, Pierre Vandergheynst, and Xavier Bresson. 2018. Structured Sequence Modeling with Graph Convolutional Recurrent Networks. In International Conference on Neural Information Processing. Springer, 362–373.
[52] Lei Shi, Yifan Zhang, Jian Cheng, and Hanqing Lu. 2019. Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 12026–12035.
[53] Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. 2014. Dropout: A Simple Way to Prevent Neural Networks from Overfitting. The Journal of Machine Learning Research 15, 1 (2014), 1929–1958.
[54] Aynaz Taheri and Tanya Berger-Wolf. 2019. Predictive Temporal Embedding of Dynamic Graphs. In Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining. 57–64.
[55] Aynaz Taheri, Kevin Gimpel, and Tanya Berger-Wolf. 2019. Learning to Represent the Evolution of Dynamic Graphs with Recurrent Models. In Companion Proceedings of The 2019 World Wide Web Conference (WWW '19). 301–307.
[56] Paul Taylor, Christopher Harris, Thompson Comer, and Mark Harris. 2019. CUDA-Accelerated GIS and Spatiotemporal Algorithms. https://github.com/rapidsai/cuspatial.
[57] Cunchao Tu, Yuan Yao, Zhengyan Zhang, Ganqu Cui, Hao Wang, Changxin Tian, Jie Zhou, and Cheng Yang. 2018. OpenNE: An Open Source Toolkit for Network Embedding. https://github.com/thunlp/OpenNE.
[58] Stefan Van Der Walt, S Chris Colbert, and Gael Varoquaux. 2011. The NumPy Array: a Structure for Efficient Numerical Computation. Computing in Science & Engineering 13, 2 (2011), 22–30.
[59] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is All You Need. In Proceedings of the 31st International Conference on Neural Information Processing Systems. 6000–6010.
[60] Michael A Whitby, Rich Fecher, and Chris Bennight. 2017. GeoWave: Utilizing Distributed Key-Value Stores for Multidimensional Data. In International Symposium on Spatial and Temporal Databases. Springer, 105–122.
[61] Zonghan Wu, Shirui Pan, Guodong Long, Jing Jiang, Xiaojun Chang, and Chengqi Zhang. 2020. Connecting the Dots: Multivariate Time Series Forecasting with Graph Neural Networks. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 753–763.
[62] Hongxia Yang. 2019. AliGraph: A Comprehensive Graph Neural Network Platform. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 3165–3166.
[63] Bing Yu, Haoteng Yin, and Zhanxing Zhu. 2018. Spatio-Temporal Graph Convolutional Networks: a Deep Learning Framework for Traffic Forecasting. In Proceedings of the 27th International Joint Conference on Artificial Intelligence. 3634–3640.
[64] Fan Zhang, Valentin Bazarevsky, Andrey Vakunov, Andrei Tkachenka, George Sung, Chuo-Ling Chang, and Matthias Grundmann. 2020. MediaPipe Hands: On-device Real-time Hand Tracking. arXiv:2006.10214 [cs.CV]
[65] Ling Zhao, Yujiao Song, Chao Zhang, Yu Liu, Pu Wang, Tao Lin, Min Deng, and Haifeng Li. 2019. T-GCN: A Temporal Graph Convolutional Network for Traffic Prediction. IEEE Transactions on Intelligent Transportation Systems 21, 9 (2019), 3848–3858.
[66] Chuanpan Zheng, Xiaoliang Fan, Cheng Wang, and Jianzhong Qi. 2020. GMAN: A Graph Multi-Attention Network for Traffic Prediction. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 34. 1234–1241.
[67] Da Zheng, Minjie Wang, Quan Gan, Zheng Zhang, and George Karypis. 2020. Learning Graph Neural Networks with Deep Graph Library. In Companion Proceedings of the Web Conference 2020 (WWW '20). 305–306.
[68] Jiawei Zhu, Yujiao Song, Ling Zhao, and Haifeng Li. 2020. A3T-GCN: Attention Temporal Graph Convolutional Network for Traffic Forecasting. arXiv preprint arXiv:2006.11583 (2020).
[69] Esteban Zimányi, Mahmoud Sakr, and Arthur Lesuisse. 2020. MobilityDB: A Mobility Database Based on PostgreSQL and PostGIS. ACM Transactions on Database Systems (TODS) 45, 4 (2020), 1–42.