
A gentle introduction to graph neural networks

Seongok Ryu, Department of Chemistry @ KAIST


• Motivation

• Graph Neural Networks

• Applications of Graph Neural Networks


Motivation
Non-Euclidean data structure
Successful deep learning architectures
1. Convolutional neural networks

https://medium.com/comet-app/review-of-deep-learning-algorithms-for-object-detection-c1f3d437b852

Lee, Jason, Kyunghyun Cho, and Thomas Hofmann.


"Fully character-level neural machine translation without explicit segmentation." arXiv preprint arXiv:1610.03017 (2016).
Non-Euclidean data structure
Successful deep learning architectures
2. Recurrent neural networks

Sentiment analysis

Neural machine translation


Non-Euclidean data structure
The data structures of the examples shown above are regular.

Image – values on pixels (grids); Sentence – sequential structure

Non-Euclidean data structure
HOWEVER, there are lots of irregular data structures, …

Social graph (Facebook, Wikipedia), 3D mesh, molecular graph

All you need is GRAPH!


Graph structure

𝑮𝒓𝒂𝒑𝒉 = 𝑮(𝑿, 𝑨)

X : Node, Vertex
- Individual person in a social network
- Atoms in a molecule

Represent elements of a system


Graph structure

𝑮𝒓𝒂𝒑𝒉 = 𝑮(𝑿, 𝑨)

A : Adjacency matrix
- Edges of a graph
- Connectivity, Relationship

Represent relationship or interaction between elements of the system


Graph structure
More detail, …

Battaglia, Peter W., et al.


"Relational inductive biases, deep learning, and graph networks." arXiv preprint arXiv:1806.01261 (2018).
Graph structure
Node features

Ariana Grande
• 25 years old
• Female
• American
• Singer
• …

Donald Trump
• 72 years old
• Male
• American
• President, business man
• …

Vladimir Putin
• 65 years old
• Male
• Russian
• President
• …
Graph structure
Edge features

Aromatic bond, double bond, single bond (bond types in a molecular graph)


Learning relation and interaction
What can we do with graph neural networks?

Battaglia, Peter W., et al.


"Relational inductive biases, deep learning, and graph networks." arXiv preprint arXiv:1806.01261 (2018).
Learning relation and interaction
What can we do with graph neural networks?

• Node classification

• Link prediction

• Node2Vec, Subgraph2Vec, Graph2Vec

: Embedding node/substructure/graph structure to a vector

• Learning physics law from data

• And you can do more amazing things with GNN!


Graph neural networks
• Overall architecture of graph neural networks
• Updating node states
- Graph Convolutional Network (GCN)
- Graph Attention Network (GAT)
- Gated Graph Neural Network (GGNN)

• Readout : permutation invariance under changes of node order


• Graph Auto-Encoders
• Practical issues
- Skip connection
- Inception
- Dropout
Principles of graph neural network
Weights used in updating the hidden states of fully-connected networks, CNNs, and RNNs

Battaglia, Peter W., et al.


"Relational inductive biases, deep learning, and graph networks." arXiv preprint arXiv:1806.01261 (2018).
Overall neural network structure
– case of supervised learning

Input node features, $H_i^{(0)}$ : raw node information

Final node states, $H_i^{(L)}$ : how the NN recognizes the nodes

Graph features, $Z$
Principles of graph neural network
Updates in a graph neural network

• Edge update : relationships or interactions, sometimes called 'message passing'
  e.g. the forces of springs
• Node update : aggregates the edge updates and uses them to update the node state
  e.g. the forces acting on the ball
• Global update : an update for the global attribute
  e.g. the net force and total energy of the physical system
Battaglia, Peter W., et al.
"Relational inductive biases, deep learning, and graph networks." arXiv preprint arXiv:1806.01261 (2018).
Principles of graph neural network
Weights used in updating the hidden states of a GNN

Weights are shared for all nodes in the graph, but each node is updated differently because its individual node features $H_j^{(l)}$ are reflected.
GCN : Graph Convolutional Network

http://tkipf.github.io/ (famous for the variational autoencoder, VAE)

Kipf, Thomas N., and Max Welling.


"Semi-supervised classification with graph convolutional networks." arXiv preprint arXiv:1609.02907 (2016)
GCN : Graph Convolutional Network

What the NN learns

$X_i^{(l+1)} = \sigma\left(\sum_{j \in [i-k,\, i+k]} W_j^{(l)} X_j^{(l)} + b_j^{(l)}\right)$
GCN : Graph Convolutional Network

Example graph with four nodes, where node 2 is connected to nodes 1, 3 and 4:

$H_2^{(l+1)} = \sigma\left(H_1^{(l)} W^{(l)} + H_2^{(l)} W^{(l)} + H_3^{(l)} W^{(l)} + H_4^{(l)} W^{(l)}\right)$
GCN : Graph Convolutional Network

What the NN learns

$H_i^{(l+1)} = \sigma\left(\sum_{j \in N(i)} H_j^{(l)} W^{(l)}\right) = \sigma\left(A H^{(l)} W^{(l)}\right)$
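To make the update concrete, here is a minimal numpy sketch of one GCN layer; the function and variable names are illustrative, and the toy adjacency matrix mirrors the four-node example above.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph convolution: H^(l+1) = sigma(A H^(l) W^(l)), with ReLU as sigma.

    A : (N, N) adjacency matrix (assumed to already contain self-connections)
    H : (N, F_in) node states, W : (F_in, F_out) weight matrix shared by all nodes
    """
    return np.maximum(A @ H @ W, 0.0)

# Toy graph: node 2 (index 1) is connected to nodes 1, 3 and 4, plus self-loops
A = np.array([[1, 1, 0, 0],
              [1, 1, 1, 1],
              [0, 1, 1, 0],
              [0, 1, 0, 1]], dtype=float)
H0 = np.random.randn(4, 8)     # input node features H^(0)
W0 = np.random.randn(8, 16)    # learnable weights W^(0)
H1 = gcn_layer(A, H0, W0)      # updated node states H^(1), shape (4, 16)
```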
GCN : Graph Convolutional Network

• Classification of nodes of citation networks and a knowledge graph


• $L = -\sum_{l \in \mathcal{Y}_L} \sum_{f=1}^{F} Y_{lf} \ln Z_{lf}$ : cross-entropy loss over the labeled nodes
Kipf, Thomas N., and Max Welling.
"Semi-supervised classification with graph convolutional networks." arXiv preprint arXiv:1609.02907 (2016)
GAT : Graph Attention Network
• Attention revisited

What the NN learns

GCN : $H_i^{(l+1)} = \sigma\left(\sum_{j \in N(i)} H_j^{(l)} W^{(l)}\right) = \sigma\left(A H^{(l)} W^{(l)}\right)$
GAT : Graph Attention Network
• Attention revisited

What the NN learns : convolution weight and attention coefficient

GAT : $H_i^{(l+1)} = \sigma\left(\sum_{j \in N(i)} \alpha_{ij} H_j^{(l)} W^{(l)}\right)$
Velickovic, Petar, et al.
"Graph attention networks." arXiv preprint arXiv:1710.10903 (2017).
GAT : Graph Attention Network
• Attention mechanism in natural language processing

https://devblogs.nvidia.com/introduction-neural-machine-translation-gpus-part-3/figure6_sample_translations-2/
GAT : Graph Attention Network
• Attention mechanism in natural language processing

http://www.wildml.com/2016/01/attention-and-memory-in-deep-learning-and-nlp/
GAT : Graph Attention Network
• Attention mechanism in natural language processing

$a_t(s) = \mathrm{align}(h_t, h_s) = \mathrm{softmax}\left(\mathrm{score}(h_t, h_s)\right)$

$\mathrm{score}(h_t, h_s) \in \left\{\, h_t^{T} h_s,\;\; h_t^{T} W_a h_s,\;\; v_a^{T} \tanh\left(W_a [h_t ; h_s]\right) \,\right\}$

Luong, Minh-Thang, Hieu Pham, and Christopher D. Manning.


"Effective approaches to attention-based neural machine translation." arXiv preprint arXiv:1508.04025 (2015).
GAT : Graph Attention Network
• Attention revisited

What the NN learns : convolution weight and attention coefficient

$H_i^{(l+1)} = \sigma\left(\sum_{j \in N(i)} \alpha_{ij} H_j^{(l)} W^{(l)}\right)$, $\quad \alpha_{ij} = f(H_i W, H_j W)$

Velickovic, Petar, et al.


"Graph attention networks." arXiv preprint arXiv:1710.10903 (2017).
GAT : Graph Attention Network
• Attention revisited

What the NN learns : convolution weight and attention coefficient

$H_i^{(l+1)} = \sigma\left(\sum_{j \in N(i)} \alpha_{ij} H_j^{(l)} W^{(l)}\right)$, $\quad \alpha_{ij} = f(H_i W, H_j W)$

• Velickovic, Petar, et al. – network analysis

$\alpha_{ij} = \mathrm{softmax}_j\left(e_{ij}\right) = \dfrac{\exp\left(e_{ij}\right)}{\sum_{k \in N(i)} \exp\left(e_{ik}\right)}$, $\quad e_{ij} = \mathrm{LeakyReLU}\left(a^T \left[H_i W \,\Vert\, H_j W\right]\right)$

• Seongok Ryu, et al. – molecular applications

$\alpha_{ij} = \tanh\left(\left(H_i W\right)^T C \left(H_j W\right)\right)$
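A small numpy sketch of how the attention coefficients of Velickovic et al. can be computed for one head; the names (`gat_attention`, `a`, `adj`) are illustrative and not taken from the paper's code.

```python
import numpy as np

def gat_attention(H, W, a, adj):
    """Single-head GAT attention coefficients (a sketch with illustrative names).

    H : (N, F) node states, W : (F, F2) shared weights,
    a : (2*F2,) attention vector, adj : (N, N) adjacency including self-loops.
    """
    HW = H @ W                              # (N, F2) transformed node states
    N = HW.shape[0]
    e = np.full((N, N), -np.inf)            # e_ij is only defined for neighbors
    for i in range(N):
        for j in range(N):
            if adj[i, j] > 0:
                s = np.concatenate([HW[i], HW[j]]) @ a
                e[i, j] = s if s > 0 else 0.2 * s   # LeakyReLU(a^T [H_i W || H_j W])
    # softmax over each node's neighborhood: alpha_ij = exp(e_ij) / sum_k exp(e_ik)
    alpha = np.exp(e - e.max(axis=1, keepdims=True))
    alpha /= alpha.sum(axis=1, keepdims=True)
    return alpha

# The attended update is then H' = sigma(alpha @ (H @ W)).
```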
GAT : Graph Attention Network
• Multi-head attention

• Average

$H_i^{(l+1)} = \sigma\left(\dfrac{1}{K} \sum_{k=1}^{K} \sum_{j \in N(i)} \alpha_{ij}^{k} H_j^{(l)} W^{k,(l)}\right)$

• Concatenation

$H_i^{(l+1)} = \Big\Vert_{k=1}^{K}\; \sigma\left(\sum_{j \in N(i)} \alpha_{ij}^{k} H_j^{(l)} W^{k,(l)}\right)$

Velickovic, Petar, et al.


"Graph attention networks." arXiv preprint arXiv:1710.10903 (2017).
GGNN : Gated Graph Neural Network
• Message Passing Neural Network (MPNN) framework

The previous node state and the message state are used to update the $i$-th hidden node state:

$h_i^{(l+1)} = U\left(h_i^{(l)}, m_i^{(l+1)}\right)$

The message state is computed from the previous states of node $i$ and its neighbors, and from the edge states:

$m_i^{(l+1)} = \sum_{j \in N(i)} M\left(h_i^{(l)}, h_j^{(l)}, e_{ij}\right)$

Wu, Zhenqin, et al.


"MoleculeNet: a benchmark for molecular machine learning." Chemical science 9.2 (2018): 513-530.
GGNN : Gated Graph Neural Network
• A recurrent unit is used to update the node states, in this case a GRU.

$h_i^{(l+1)} = U\left(h_i^{(l)}, m_i^{(l+1)}\right) \;\rightarrow\; h_i^{(l+1)} = \mathrm{GRU}\left(h_i^{(l)}, m_i^{(l+1)}\right)$

$h_i^{(l+1)} = z^{(l+1)} \ast \tilde{h}_i^{(l+1)} + \left(1 - z^{(l+1)}\right) \ast h_i^{(l)}$

where $z^{(l+1)}$ is the updating rate of the temporal state, $\tilde{h}_i^{(l+1)}$ is the temporal (candidate) node state, and $h_i^{(l)}$ is the previous node state.

Li, Yujia, et al.


"Gated graph sequence neural networks." arXiv preprint arXiv:1511.05493 (2015).
Readout : permutation invariance under changes of node order

Input node features, $H_i^{(0)}$ : raw node information

Final node states, $H_i^{(L)}$ : how the NN recognizes the nodes

Graph features, $Z$
Readout : permutation invariance under changes of node order

Mapping Images to Scene Graphs with Permutation-Invariant Structured Prediction - Scientific Figure on ResearchGate. Available from:
https://www.researchgate.net/Graph-permutation-invariance-and-structured-prediction-A-graph-labeling-function-F-is_fig1_323217335 [accessed 8 Sep, 2018]
Readout : permutation invariance under changes of node order

• Graph feature

$z_G = f\left(\left\{H_i^{(L)}\right\}\right)$

• Node-wise summation

$z_G = \tau\left(\sum_{i \in G} \mathrm{MLP}\left(H_i^{(L)}\right)\right)$

• Graph gathering

$z_G = \tau\left(\sum_{i \in G} \sigma\left(\mathrm{MLP}_1\left(\left[H_i^{(L)}, H_i^{(0)}\right]\right)\right) \odot \mathrm{MLP}_2\left(H_i^{(L)}\right)\right)$

• $\tau$ : ReLU activation, $\quad \sigma$ : sigmoid activation

Gilmer, Justin, et al.


"Neural message passing for quantum chemistry." arXiv preprint arXiv:1704.01212 (2017).
Graph Auto-Encoders (GAE)

• Clustering
• Link prediction
• Matrix completion and recommendation
Kipf, Thomas N., and Max Welling.
"Variational graph auto-encoders." arXiv preprint arXiv:1611.07308 (2016).
https://github.com/tkipf/gae
Graph Auto-Encoders (GAE)

Input graph, 𝑮(𝑿, 𝑨)

Encoder

Node features, 𝑞(𝒁|𝑿, 𝑨)

Decoder

Reconstructed adjacency matrix, $\hat{A} = \sigma(Z Z^T)$

Kipf, Thomas N., and Max Welling.


"Variational graph auto-encoders." arXiv preprint arXiv:1611.07308 (2016).
https://github.com/tkipf/gae
Graph Auto-Encoders (GAE)

Encoder - Inference model

• Two-layer GCN

$\mathrm{GCN}(X, A) = \tilde{A}\, \mathrm{ReLU}\left(\tilde{A} X W_0\right) W_1$, where $\tilde{A} = D^{-1/2} A D^{-1/2}$ is the symmetrically normalized adjacency matrix.

• Variational inference

$q(Z \mid X, A) = \prod_{i=1}^{N} q\left(z_i \mid X, A\right)$, $\quad q\left(z_i \mid X, A\right) = \mathcal{N}\left(z_i \mid \mu_i, \mathrm{diag}\left(\sigma_i^2\right)\right)$

Kipf, Thomas N., and Max Welling.


"Variational graph auto-encoders." arXiv preprint arXiv:1611.07308 (2016).
https://github.com/tkipf/gae
Graph Auto-Encoders (GAE)

Decoder - Generative model

• Inner product between latent vectors

$p(A \mid Z) = \prod_{i=1}^{N} \prod_{j=1}^{N} p\left(A_{ij} \mid z_i, z_j\right)$, $\quad p\left(A_{ij} = 1 \mid z_i, z_j\right) = \sigma\left(z_i^T z_j\right)$

$A_{ij}$ : the elements of the reconstructed adjacency matrix, $\quad \sigma(\cdot)$ : sigmoid activation

$\hat{A} = \sigma\left(Z Z^T\right)$ with $Z = \mathrm{GCN}(X, A)$

• Learning

$\mathcal{L} = \mathbb{E}_{q(Z \mid X, A)}\left[\log p(A \mid Z)\right] - \mathrm{KL}\left[q(Z \mid X, A)\,\Vert\, p(Z)\right]$ : reconstruction loss and KL divergence


Kipf, Thomas N., and Max Welling. "Variational graph auto-encoders." arXiv preprint arXiv:1611.07308 (2016).
https://github.com/tkipf/gae
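A minimal numpy sketch of the (non-variational) encoder-decoder pair, assuming a precomputed normalized adjacency matrix; names are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gae_encode(A_norm, X, W0, W1):
    """Two-layer GCN encoder (non-variational sketch): Z = A_norm ReLU(A_norm X W0) W1."""
    return A_norm @ np.maximum(A_norm @ X @ W0, 0.0) @ W1

def gae_decode(Z):
    """Inner-product decoder: A_hat = sigmoid(Z Z^T)."""
    return sigmoid(Z @ Z.T)

# A_norm is the symmetrically normalized adjacency D^{-1/2} A D^{-1/2};
# training compares A_hat = gae_decode(gae_encode(...)) with the observed A.
```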
Practical Issues
: Inception
GoogLeNet – Winner of 2014 ImageNet Challenge

https://towardsdatascience.com/an-intuitive-guide-to-deep-network-architectures-65fdc477db41
Practical Issues
: Inception
Inception module

https://towardsdatascience.com/an-intuitive-guide-to-deep-network-architectures-65fdc477db41
Practical Issues
: Inception

$H_i^{(1)} = \sigma\left(A H^{(0)} W^{(0)}\right)$, $\quad H_i^{(2)} = \sigma\left(A H^{(1)} W^{(1)}\right)$
Practical Issues
: Inception

• Makes the network wider

• Helps avoid vanishing gradients
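One way to read this inception idea is to run branches with different receptive fields in parallel and concatenate them; the numpy sketch below is only an interpretation under that assumption, not the exact architecture from the slides.

```python
import numpy as np

def inception_gcn_block(A, H0, W0, W1):
    """Run one-hop and two-hop graph convolutions in parallel and concatenate
    their outputs, widening the layer instead of only deepening it."""
    H1 = np.maximum(A @ H0 @ W0, 0.0)         # H^(1) = sigma(A H^(0) W^(0))
    H2 = np.maximum(A @ H1 @ W1, 0.0)         # H^(2) = sigma(A H^(1) W^(1))
    return np.concatenate([H1, H2], axis=1)   # widened node representations
```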
Practical Issues
: Skip-connection
ResNet – Winner of 2015 ImageNet Challenge

Going Deeeeeeeeper!
Practical Issues
: Skip-connection
ResNet – Winner of 2015 ImageNet Challenge

$y = H_i^{(l+1)} + H_i^{(l)}$

https://towardsdatascience.com/an-intuitive-guide-to-deep-network-architectures-65fdc477db41
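A one-line numpy sketch of the residual form above, assuming the weight matrix preserves the feature dimension so the two terms can be added.

```python
import numpy as np

def gcn_layer_with_skip(A, H, W):
    """Graph convolution with a residual (skip) connection:
    y = H^(l+1) + H^(l) = ReLU(A H^(l) W^(l)) + H^(l).
    W must map F -> F so that the shapes of the two terms match."""
    return np.maximum(A @ H @ W, 0.0) + H
```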
Practical Issues
: Dropout
Dropout rate $p = 0.25$

Example with dense layers of 4 units each (4 × 4 = 16 weights per layer): dropping hidden units on one side leaves 16 × 0.75 = 12 effective parameters, and dropping on both sides leaves 16 × 0.75 × 0.75 = 9.

For a dense network, dropout of the hidden states reduces the effective number of parameters from $N_w$ to $N_w (1-p)^2$.
Practical Issues
: Dropout
Hidden states in a graph neural network

Graph Conv. : $H_i^{(l+1)} = A H^{(l)} W$

Example node states $H_1^{(l)}, H_2^{(l)}, H_3^{(l)}, \ldots$ :

Node 1 : Age 28, Sex M, Nationality Korean, Job Student
Node 2 : Age 40, Sex F, Nationality American, Job Politician
Node 3 : Age 36, Sex F, Nationality French, Job Medical Doctor
⋅⋅⋅
Practical Issues
: Dropout
Hidden states in a graph neural network

The same node-state table can be masked in two ways:

• Mask individual persons : zero out whole node states $H_i^{(l)}$
• Mask information (features) of node states : zero out particular features across the nodes

And many other options are possible. The proper method depends on your task.
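A numpy sketch of the two masking options, with inverted-dropout scaling; the function names are illustrative.

```python
import numpy as np

def dropout_nodes(H, p, rng):
    """Mask whole node states ('individual persons'): zero entire rows of H."""
    keep = (rng.random(H.shape[0]) > p).astype(float)
    return H * keep[:, None] / (1.0 - p)        # inverted-dropout scaling

def dropout_features(H, p, rng):
    """Mask individual features of the node states: zero single entries of H."""
    keep = (rng.random(H.shape) > p).astype(float)
    return H * keep / (1.0 - p)

rng = np.random.default_rng(0)
H = np.random.randn(4, 8)                       # hidden states H_1 ... H_4
H_node_masked = dropout_nodes(H, 0.25, rng)
H_feat_masked = dropout_features(H, 0.25, rng)
```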
Applications of graph neural networks
• Network Analysis
1. Node classification
2. Link prediction
3. Matrix completion

• Molecular Applications
1. Neural molecular fingerprint
2. Quantitative Structure-Property Relationship (QSPR)
3. Molecular generative model

• Interacting physical system


Network analysis
1. Node classification – karate club network

Figures: the karate club graph, with colors denoting communities obtained via modularity-based clustering (Brandes et al., 2008), and the GCN embedding (with random weights) for nodes in the karate club network.

• All figures and descriptions are taken from Thomas N. Kipf’s blog.
• Watch the video on his blog.
Kipf, Thomas N., and Max Welling.
"Semi-supervised classification with graph convolutional networks." arXiv preprint arXiv:1609.02907 (2016)
http://tkipf.github.io/graph-convolutional-networks/
Network analysis
1. Node classification
• Good node features → Good node classification results

Kipf, Thomas N., and Max Welling.


"Semi-supervised classification with graph convolutional networks." arXiv preprint arXiv:1609.02907 (2016)
http://tkipf.github.io/graph-convolutional-networks/
Network analysis
1. Node classification

• Semi-supervised learning – low label rate


• Citation network – Citeseer, Cora, Pubmed / Bipartite graph - NELL

Kipf, Thomas N., and Max Welling.


"Semi-supervised classification with graph convolutional networks." arXiv preprint arXiv:1609.02907 (2016)
http://tkipf.github.io/graph-convolutional-networks/
Network analysis
1. Node classification

• Outperforms classical machine learning methods

Kipf, Thomas N., and Max Welling.


"Semi-supervised classification with graph convolutional networks." arXiv preprint arXiv:1609.02907 (2016)
http://tkipf.github.io/graph-convolutional-networks/
Network analysis
1. Node classification
• Spectral graph convolution

$H^{(l+1)} = \sigma\left(\tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} H^{(l)} W^{(l)}\right)$, $\quad \tilde{A} = A + I_N$, $\quad \tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$

• Spectral graph filtering

$g_\theta \star x = U g_{\theta'}(\Lambda) U^T x$, where $U$ is the matrix of eigenvectors of the normalized graph Laplacian $L = I_N - D^{-1/2} A D^{-1/2} = U \Lambda U^T$

$g_{\theta'}(\Lambda) \approx \sum_{k=0}^{K} \theta_k' T_k(\tilde{\Lambda})$ : polynomial approximation (in this case, Chebyshev polynomials)

Kipf, Thomas N., and Max Welling.


"Semi-supervised classification with graph convolutional networks." arXiv preprint arXiv:1609.02907 (2016)
http://tkipf.github.io/graph-convolutional-networks/
Network analysis
1. Node classification
• Spectral graph convolution

$H^{(l+1)} = \sigma\left(\tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} H^{(l)} W^{(l)}\right)$, $\quad \tilde{A} = A + I_N$, $\quad \tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$

• Spectral graph filtering

$g_\theta \star x \approx \sum_{k=0}^{K} \theta_k' T_k(\tilde{L})\, x$

Linear approximation : $\; g_\theta \star x \approx \theta_0' x + \theta_1'\left(L - I_N\right) x = \theta_0' x - \theta_1' D^{-1/2} A D^{-1/2} x$

Using a single parameter $\theta = \theta_0' = -\theta_1'$ : $\; g_\theta \star x \approx \theta\left(I_N + D^{-1/2} A D^{-1/2}\right) x$

Renormalization trick : $\; I_N + D^{-1/2} A D^{-1/2} \;\rightarrow\; \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2}$

Kipf, Thomas N., and Max Welling.


"Semi-supervised classification with graph convolutional networks." arXiv preprint arXiv:1609.02907 (2016)
http://tkipf.github.io/graph-convolutional-networks/
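A numpy sketch of the renormalization trick and the resulting propagation rule, assuming a symmetric binary adjacency matrix; the function names are illustrative.

```python
import numpy as np

def renormalized_adjacency(A):
    """Renormalization trick (sketch): replace I_N + D^{-1/2} A D^{-1/2}
    by D~^{-1/2} A~ D~^{-1/2} with A~ = A + I_N and D~_ii = sum_j A~_ij."""
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))   # self-loops keep degrees > 0
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_propagate(A, H, W):
    """Layer-wise rule: H^(l+1) = ReLU(D~^{-1/2} A~ D~^{-1/2} H^(l) W^(l))."""
    return np.maximum(renormalized_adjacency(A) @ H @ W, 0.0)
```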
Network analysis
1. Node classification
• Spectral graph convolution

$H^{(l+1)} = \sigma\left(\tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} H^{(l)} W^{(l)}\right)$, $\quad \tilde{A} = A + I_N$, $\quad \tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$

Kipf, Thomas N., and Max Welling.


"Semi-supervised classification with graph convolutional networks." arXiv preprint arXiv:1609.02907 (2016)
http://tkipf.github.io/graph-convolutional-networks/
Network analysis
2. Link prediction

• Clustering
• Link prediction
• Matrix completion and recommendation
Kipf, Thomas N., and Max Welling.
"Variational graph auto-encoders." arXiv preprint arXiv:1611.07308 (2016).
https://github.com/tkipf/gae
Network analysis
2. Link prediction

Input graph, 𝑮(𝑿, 𝑨)

Encoder

Node features, 𝑞(𝒁|𝑿, 𝑨)

Decoder

Reconstructed adjacency matrix, $\hat{A} = \sigma(Z Z^T)$

Kipf, Thomas N., and Max Welling.


"Variational graph auto-encoders." arXiv preprint arXiv:1611.07308 (2016).
https://github.com/tkipf/gae
Network analysis
2. Link prediction
• Trained on an incomplete version of {Cora, Citeseer, Pubmed} datasets where parts of the citation links
(edges) have been removed, while all node features are kept.
• Form validation and test sets from previously removed edges and the same number of randomly
sampled pairs of unconnected nodes (non-edges).

Kipf, Thomas N., and Max Welling.


"Variational graph auto-encoders." arXiv preprint arXiv:1611.07308 (2016).
https://github.com/tkipf/gae
Network analysis
3. Matrix completion
• Matrix completion → can be applied to recommender systems

Berg, Rianne van den, Thomas N. Kipf, and Max Welling.


"Graph convolutional matrix completion." arXiv preprint arXiv:1706.02263 (2017).
Network analysis
3. Matrix completion

• A rating matrix $M$ of shape $N_u \times N_v$, where $N_u$ is the number of users and $N_v$ is the number of items.
• User $i$ has rated item $j$ ($M_{ij}$ = the observed rating level), or the rating is unobserved ($M_{ij} = 0$).
• The matrix completion (recommendation) problem
→ a link prediction problem on a bipartite user-item interaction graph.

Input graph : $G(\mathcal{W}, \mathcal{E}, \mathcal{R})$

• $\mathcal{W} = \mathcal{U} \cup \mathcal{V}$ : user nodes $u_i \in \mathcal{U}$, with $i \in \{1, \ldots, N_u\}$, and item nodes $v_j \in \mathcal{V}$, with $j \in \{1, \ldots, N_v\}$

• $\left(u_i, r, v_j\right) \in \mathcal{E}$ represent rating levels, such as $r \in \{1, \ldots, R\} = \mathcal{R}$.

Berg, Rianne van den, Thomas N. Kipf, and Max Welling.


"Graph convolutional matrix completion." arXiv preprint arXiv:1706.02263 (2017).
Network analysis
3. Matrix completion
• GAE for the link prediction task

Encoder : takes as input an $N \times D$ feature matrix $X$ and a graph adjacency matrix $A$, and produces an $N \times E$ node embedding matrix $Z = \left[z_1^T, \ldots, z_N^T\right]^T$ :

$Z = f(X, A)$

Decoder : $\hat{A} = g(Z)$, which takes pairs of node embeddings $\left(z_i, z_j\right)$ and predicts the respective entries $\hat{A}_{ij}$.

Berg, Rianne van den, Thomas N. Kipf, and Max Welling.


"Graph convolutional matrix completion." arXiv preprint arXiv:1706.02263 (2017).
Network analysis
3. Matrix completion
• GAE for the bipartite recommender graphs, 𝑮(𝓦, 𝓔, 𝓡)

Encoder

$[U, V] = f\left(X, M_1, \ldots, M_R\right)$, where $M_r \in \{0, 1\}^{N_u \times N_v}$ is the adjacency matrix associated with rating type $r \in \mathcal{R}$

$U, V$ : matrices of user and item embeddings with shape $N_u \times E$ and $N_v \times E$, respectively.

Decoder

$\hat{M} = g(U, V)$ : reconstructed rating matrix $\hat{M}$ of shape $N_u \times N_v$

$G(X, A)$ : $\; Z = f(X, A)$, $\; \hat{A} = g(Z)$ : GAE for the link prediction task
$G(\mathcal{W}, \mathcal{E}, \mathcal{R})$ : $\; [U, V] = f\left(X, M_1, \ldots, M_R\right)$, $\; \hat{M} = g(U, V)$ : GAE for the bipartite recommender graph

Berg, Rianne van den, Thomas N. Kipf, and Max Welling.


"Graph convolutional matrix completion." arXiv preprint arXiv:1706.02263 (2017).
Network analysis
3. Matrix completion
• GAE for the bipartite recommender graphs, 𝑮(𝓦, 𝓔, 𝓡)

$G(\mathcal{W}, \mathcal{E}, \mathcal{R})$ : $\; [U, V] = f\left(X, M_1, \ldots, M_R\right)$, $\; \hat{M} = g(U, V)$

Berg, Rianne van den, Thomas N. Kipf, and Max Welling.


"Graph convolutional matrix completion." arXiv preprint arXiv:1706.02263 (2017).
Network analysis
3. Matrix completion

Encoder

$u_i = \sigma\left(W h_i\right)$ : the final user embedding

$h_i = \sigma\left(\mathrm{accum}\left(\sum_{j \in \mathcal{N}_{i,1}} \mu_{j \to i, 1}, \ldots, \sum_{j \in \mathcal{N}_{i,R}} \mu_{j \to i, R}\right)\right)$ : intermediate node state

$\mu_{j \to i, r} = \dfrac{1}{c_{ij}} W_r x_j$ : message function from item $j$ to user $i$, with normalization constant $c_{ij} = \sqrt{\left|\mathcal{N}_i\right| \left|\mathcal{N}_j\right|}$

Berg, Rianne van den, Thomas N. Kipf, and Max Welling.


"Graph convolutional matrix completion." arXiv preprint arXiv:1706.02263 (2017).
Network analysis
3. Matrix completion

Decoder

$p\left(\hat{M}_{ij} = r\right) = \dfrac{e^{u_i^T Q_r v_j}}{\sum_{s \in R} e^{u_i^T Q_s v_j}}$ : probability that rating $\hat{M}_{ij}$ is $r$

$\hat{M}_{ij} = g\left(u_i, v_j\right) = \mathbb{E}_{p\left(\hat{M}_{ij} = r\right)}[r] = \sum_{r \in R} r \cdot p\left(\hat{M}_{ij} = r\right)$ : expected rating

Loss function

$\mathcal{L} = -\sum_{i,j} \sum_{r=1}^{R} I\left(r = M_{ij}\right) \log p\left(\hat{M}_{ij} = r\right)$, $\quad I(k = l) = 1$ when $k = l$ and 0 otherwise

Berg, Rianne van den, Thomas N. Kipf, and Max Welling.


"Graph convolutional matrix completion." arXiv preprint arXiv:1706.02263 (2017).
Network analysis
3. Matrix completion

Berg, Rianne van den, Thomas N. Kipf, and Max Welling.


"Graph convolutional matrix completion." arXiv preprint arXiv:1706.02263 (2017).
Molecular applications
: Which kinds of datasets exist?

• Bioactive molecules with drug-like properties (ChEMBL)
• ~1,828,820 compounds
• https://www.ebi.ac.uk/chembldb/

• Drugs and targets (DrugBank)
• FDA approved, investigational, experimental, …
• 7,713 (all drugs), 4,115 (targets), …
• https://drugbank.ca/

Molecular applications
: Which kinds of datasets exist?

Tox21 Data Challenge (@ Kaggle)

• 12 types of toxicity
• Molecule species (represented with SMILES)
and toxicity labels are given
• But too small to train a DL model

https://tripod.nih.gov/tox21/challenge/data.jsp
Molecular applications
1. Neural molecular fingerprint

Hash functions have been used to generate molecular fingerprints.


* Molecular fingerprint : a vector representation of molecular substructures

https://mc.ai/machine-learning-for-drug-discovery-in-a-nutshell%E2%80%8A-%E2%80%8Apart-ii/
Molecular applications
1. Neural molecular fingerprint
Such molecular fingerprints can be easily obtained with open-source packages, e.g. RDKit.

http://kehang.github.io/basic_project/2017/04/18/machine-learning-in-molecular-property-prediction/
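For example, a hash-based Morgan (ECFP-like) fingerprint can be computed in a few lines, assuming RDKit is installed; the SMILES string and the parameters below are only illustrative.

```python
# Conventional (hash-based) molecular fingerprint with RDKit, a short sketch.
from rdkit import Chem
from rdkit.Chem import AllChem

mol = Chem.MolFromSmiles("c1ccccc1O")                          # phenol, from its SMILES string
fp = AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=2048)  # radius-2 Morgan fingerprint
bits = list(fp)                                                 # 2048-dim binary substructure vector
```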
Molecular applications
1. Neural molecular fingerprint
These days, neural fingerprints generated by graph convolutional networks are widely used for more accurate molecular property prediction.

http://kehang.github.io/basic_project/2017/04/18/machine-learning-in-molecular-property-prediction/
Molecular applications
1. Neural molecular fingerprint

Duvenaud, David K., et al.


"Convolutional networks on graphs for learning molecular fingerprints." Advances in neural information processing systems. 2015.
Molecular applications
1. Neural molecular fingerprint

Duvenaud, David K., et al.


"Convolutional networks on graphs for learning molecular fingerprints." Advances in neural information processing systems. 2015.
Molecular applications
1. Neural molecular fingerprint

Duvenaud, David K., et al.


"Convolutional networks on graphs for learning molecular fingerprints." Advances in neural information processing systems. 2015.
Molecular applications
1. Neural molecular fingerprint

Duvenaud, David K., et al.


"Convolutional networks on graphs for learning molecular fingerprints." Advances in neural information processing systems. 2015.
Molecular applications
2. Quantitative Structure-Property Relationships (QSPR)

Ryu, Seongok, Jaechang Lim, and Woo Youn Kim.


"Deeply learning molecular structure-property relationships using graph attention neural network." arXiv preprint arXiv:1805.10988 (2018).
Molecular applications
2. Quantitative Structure-Property Relationships (QSPR)

Ryu, Seongok, Jaechang Lim, and Woo Youn Kim.


"Deeply learning molecular structure-property relationships using graph attention neural network." arXiv preprint arXiv:1805.10988 (2018).
Molecular applications
2. Quantitative Structure-Property Relationships (QSPR)

Learning solubility of molecules

Final node states


The neural network recognizes
several functional groups differently

Ryu, Seongok, Jaechang Lim, and Woo Youn Kim.


"Deeply learning molecular structure-property relationships using graph attention neural network." arXiv preprint arXiv:1805.10988 (2018).
Molecular applications
2. Quantitative Structure-Property Relationships (QSPR)
Learning photovoltaic efficiency (QM phenomena)

Final node states

Interestingly, the NN can also differentiate nodes according to their quantum mechanical characteristics.

Ryu, Seongok, Jaechang Lim, and Woo Youn Kim.


"Deeply learning molecular structure-property relationships using graph attention neural network." arXiv preprint arXiv:1805.10988 (2018).
Molecular applications
2. Quantitative Structure-Property Relationships (QSPR)
Learning solubility of molecules

Graph features, shown in ascending order of $\left\| Z_i - Z_j \right\|^2$ :
similar molecules are located close together in the graph latent space.
Molecular applications
3. Molecular generative model
Motivation : de novo molecular design

• Chemical space is too huge
: only about $10^8$ molecules have been synthesized as potential drug candidates, whereas it is estimated that there are $10^{23}$ to $10^{60}$ possible molecules.

• Limitation of virtual screening


Molecular applications
3. Molecular generative model
Motivation : de novo molecular design

Molecular graph ↔ Simplified Molecular-Input Line-Entry System (SMILES)

Molecules can be represented as strings according to defined rules.
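The same string can also be turned into graph inputs $(X, A)$. A sketch with RDKit (assumed installed), where the particular choice of node features is illustrative:

```python
import numpy as np
from rdkit import Chem

mol = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")          # aspirin, from its SMILES string
A = Chem.GetAdjacencyMatrix(mol)                            # (N, N) adjacency matrix of the molecular graph
symbols = [atom.GetSymbol() for atom in mol.GetAtoms()]     # node identities (atom types)
X = np.array([[atom.GetAtomicNum(), atom.GetDegree(), atom.GetIsAromatic()]
              for atom in mol.GetAtoms()], dtype=float)     # simple (N, 3) node feature matrix
```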


Molecular applications
3. Molecular generative model
Motivation : de novo molecular design
Many SMILES-based generative models exist

Segler, Marwin HS, et al. "Generating focused molecule libraries for drug discovery
with recurrent neural networks." ACS central science 4.1 (2017): 120-131.

Gómez-Bombarelli, Rafael, et al. "Automatic chemical design using a data-driven


continuous representation of molecules." ACS central science 4.2 (2018): 268-276.
Molecular applications
3. Molecular generative model
Motivation : de novo molecular design

The SMILES representation also has a fatal problem: small changes in molecular structure can lead to quite different strings.
→ It is difficult to reflect the topological information of molecules.
Molecular applications
3. Molecular generative model

Literature

• Li, Y., Vinyals, O., Dyer, C., Pascanu, R., & Battaglia, P. (2018). Learning deep generative models of graphs. arXiv preprint arXiv:1803.03324.
• Jin, Wengong, Regina Barzilay, and Tommi Jaakkola. "Junction Tree Variational Autoencoder for Molecular Graph Generation." arXiv preprint arXiv:1802.04364 (2018).
• Liu, Qi, et al. "Constrained Graph Variational Autoencoders for Molecule Design." arXiv preprint arXiv:1805.09076 (2018).
• Simonovsky, Martin, and Nikos Komodakis. "GraphVAE: Towards Generation of Small Graphs Using Variational Autoencoders." arXiv preprint arXiv:1802.03480 (2018).
• De Cao, Nicola, and Thomas Kipf. "MolGAN: An implicit generative model for small molecular graphs." arXiv preprint arXiv:1805.11973 (2018).
• …
Molecular applications
3. Molecular generative model

Li, Yujia, et al.


"Learning deep generative models of graphs." arXiv preprint arXiv:1803.03324 (2018).
Molecular applications
3. Molecular generative model

1. Sample whether to add a new node of a particular type or to terminate; if a node type is chosen,
2. add a node of this type to the graph,
3. check if any further edges are needed to connect the new node to the existing graph,
4. and if yes, select a node in the graph and add an edge connecting the new node to the selected node.

Li, Yujia, et al.


"Learning deep generative models of graphs." arXiv preprint arXiv:1803.03324 (2018).
Molecular applications
3. Molecular generative model

• Determine whether or not to add a node

Li, Yujia, et al.


"Learning deep generative models of graphs." arXiv preprint arXiv:1803.03324 (2018).
Molecular applications
3. Molecular generative model

• If a node is added, determine whether to add edges between the new node and the other nodes.

Li, Yujia, et al.


"Learning deep generative models of graphs." arXiv preprint arXiv:1803.03324 (2018).
Molecular applications
3. Molecular generative model

Li, Yujia, et al.


"Learning deep generative models of graphs." arXiv preprint arXiv:1803.03324 (2018).
Molecular applications
3. Molecular generative model

Graph propagation process : read out all node states and generate a graph feature

$h_G = \sum_{v \in \mathcal{V}} h_v^G \quad$ or $\quad h_G = \sum_{v \in \mathcal{V}} g_v^G \odot h_v^G$, $\quad g_v^G = \sigma\left(g_m\left(h_v\right)\right)$

$h_v' = f_n\left(a_v, h_v\right) \;\; \forall v \in \mathcal{V}$, $\quad a_v = \sum_{(u,v) \in \mathcal{E}} f_e\left(h_u, h_v, x_{u,v}\right) \;\; \forall v \in \mathcal{V}$
Li, Yujia, et al.
"Learning deep generative models of graphs." arXiv preprint arXiv:1803.03324 (2018).
Molecular applications
3. Molecular generative model

Add node : a step to decide whether or not to add a node

$f_{\mathrm{addnode}}(G) = \mathrm{softmax}\left(f_{an}\left(h_G\right)\right)$

If "yes", the new node vectors $h_V^{(T)}$ are carried over to the next step.

Li, Yujia, et al.


"Learning deep generative models of graphs." arXiv preprint arXiv:1803.03324 (2018).
Molecular applications
3. Molecular generative model

Add edge : a step to add an edge to the new node

$f_{\mathrm{addedge}}(G, v) = \sigma\left(f_{ae}\left(h_G, h_v^{(T)}\right)\right)$ : the probability of adding an edge to the newly created node $v$

If "yes" : $\; f_{\mathrm{nodes}}(G, v) = \mathrm{softmax}\left(f_s\left(h_u^{(T)}, h_v^{(T)}\right)\right)$ : score for each node $u$ to which the new edge connects
Li, Yujia, et al.
"Learning deep generative models of graphs." arXiv preprint arXiv:1803.03324 (2018).
Molecular applications
3. Molecular generative model
Objective function

$p(G)$ : marginal likelihood over node permutations $\pi$

$p(G) = \sum_{\pi \in \mathcal{P}(G)} p(G, \pi) = \mathbb{E}_{q(\pi \mid G)}\left[\dfrac{p(G, \pi)}{q(\pi \mid G)}\right]$

Following all possible permutations is intractable
→ use samples from data : $q(\pi \mid G) \approx p_{data}(\pi \mid G)$

$\mathbb{E}_{p_{data}(G, \pi)}\left[\log p(G, \pi)\right] = \mathbb{E}_{p_{data}(G)}\, \mathbb{E}_{p_{data}(\pi \mid G)}\left[\log p(G, \pi)\right]$

Li, Yujia, et al.


"Learning deep generative models of graphs." arXiv preprint arXiv:1803.03324 (2018).
Molecular applications
3. Molecular generative model

Li, Yujia, et al.


"Learning deep generative models of graphs." arXiv preprint arXiv:1803.03324 (2018).
Molecular applications
3. Molecular generative model

Li, Yujia, et al.


"Learning deep generative models of graphs." arXiv preprint arXiv:1803.03324 (2018).
Interacting physical system

• Interacting systems
• Nodes : particles, Edges : (physical) interaction between particle pairs
• Latent code : the underlying interaction graph

Kipf, Thomas, et al.


"Neural relational inference for interacting systems." arXiv preprint arXiv:1802.04687 (2018).
Interacting physical system

Input graph : $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ with vertices $v \in \mathcal{V}$ and edges $e = (v, v') \in \mathcal{E}$

$v \rightarrow e$ : $\; h_{(i,j)}^l = f_e^l\left(h_i^l, h_j^l, x_{(i,j)}\right)$

$e \rightarrow v$ : $\; h_j^{l+1} = f_v^l\left(\sum_{i \in \mathcal{N}_j} h_{(i,j)}^l,\; x_j\right)$

$x_i$ : initial node features, $\quad x_{(i,j)}$ : initial edge features

Kipf, Thomas, et al.


"Neural relational inference for interacting systems." arXiv preprint arXiv:1802.04687 (2018).
Interacting physical system

Kipf, Thomas, et al.


"Neural relational inference for interacting systems." arXiv preprint arXiv:1802.04687 (2018).
Interacting physical system
Encoder

$h_j^1 = f_{emb}\left(x_j\right)$

$h_{(i,j)}^1 = f_e^1\left(h_i^1, h_j^1\right)$, $\quad h_j^2 = f_v^1\left(\sum_{i \neq j} h_{(i,j)}^1\right)$, $\quad h_{(i,j)}^2 = f_e^2\left(h_i^2, h_j^2\right)$

$q_\phi\left(z_{ij} \mid x\right) = \mathrm{softmax}\left(f_{enc,\phi}(x)_{ij, 1:K}\right) = \mathrm{softmax}\left(h_{(i,j)}^2\right)$
Kipf, Thomas, et al.


"Neural relational inference for interacting systems." arXiv preprint arXiv:1802.04687 (2018).
Interacting physical system

Decoder

$h_{(i,j)}^t = \sum_k z_{ij,k}\, f_e^k\left(h_i^t, h_j^t\right)$, $\quad \mathrm{MSG}_j^t = \sum_{i \neq j} h_{(i,j)}^t$

$h_j^{t+1} = \mathrm{GRU}\left(\mathrm{MSG}_j^t, x_j^t, h_j^t\right)$, $\quad \mu_j^{t+1} = x_j^t + f_{out}\left(h_j^{t+1}\right)$

$p\left(x^{t+1} \mid x^t, z\right) = \mathcal{N}\left(\mu^{t+1}, \sigma^2 \mathbf{I}\right)$

Kipf, Thomas, et al.


"Neural relational inference for interacting systems." arXiv preprint arXiv:1802.04687 (2018).
Interacting physical system

$\mathcal{L} = \mathbb{E}_{q_\phi(z \mid x)}\left[\log p_\theta(x \mid z)\right] - \mathrm{KL}\left[q_\phi(z \mid x)\,\Vert\, p(z)\right]$

$p_\theta(x \mid z) = \prod_{t=1}^{T} p_\theta\left(x^{t+1} \mid x^t, \ldots, x^1, z\right) = \prod_{t=1}^{T} p_\theta\left(x^{t+1} \mid x^t, z\right)$, since the dynamics are Markovian.

$p(z) = \prod_{i \neq j} p_\theta\left(z_{ij}\right)$ : the prior is a factorized uniform distribution over edge types

Kipf, Thomas, et al.


"Neural relational inference for interacting systems." arXiv preprint arXiv:1802.04687 (2018).
Interacting physical system

Kipf, Thomas, et al.


"Neural relational inference for interacting systems." arXiv preprint arXiv:1802.04687 (2018).
Interacting physical system

Kipf, Thomas, et al.


"Neural relational inference for interacting systems." arXiv preprint arXiv:1802.04687 (2018).
References

Geometric Deep Learning and Surveys on Graph Neural Networks

• Bronstein, Michael M., et al. "Geometric deep learning: going beyond euclidean data." IEEE Signal Processing
Magazine 34.4 (2017): 18-42.
• [NIPS 2017] Tutorial - Geometric deep learning on graphs and manifolds,
https://nips.cc/Conferences/2017/Schedule?showEvent=8735
• Goyal, Palash, and Emilio Ferrara. "Graph embedding techniques, applications, and performance: A survey."
Knowledge-Based Systems 151 (2018): 78-94, https://github.com/palash1992/GEM
• Battaglia, Peter W., et al. "Relational inductive biases, deep learning, and graph networks." arXiv preprint
arXiv:1806.01261 (2018).
• Awesome Graph Embedding And Representation Learning Papers,
https://github.com/benedekrozemberczki/awesome-graph-embedding

https://github.com/SeongokRyu/Graph-neural-networks
References

Graph Convolutional Network (GCN)

• Defferrard, Michaël, Xavier Bresson, and Pierre Vandergheynst. "Convolutional neural networks on graphs with fast
localized spectral filtering." Advances in Neural Information Processing Systems. 2016.
• Kipf, Thomas N., and Max Welling. "Semi-supervised classification with graph convolutional networks." arXiv preprint
arXiv:1609.02907 (2016).
• van den Berg, Rianne, Thomas N. Kipf, and Max Welling. "Graph Convolutional Matrix Completion." stat 1050 (2017): 7.
• Schlichtkrull, Michael, et al. "Modeling relational data with graph convolutional networks." European Semantic Web
Conference. Springer, Cham, 2018.
• Levie, Ron, et al. "Cayleynets: Graph convolutional neural networks with complex rational spectral filters." arXiv preprint
arXiv:1705.07664 (2017).

https://github.com/SeongokRyu/Graph-neural-networks
References

Attention mechanism in GNN

• Velickovic, Petar, et al. "Graph attention networks." arXiv preprint arXiv:1710.10903 (2017).
• GRAM: Graph-based Attention Model for Healthcare Representation Learning
• Shang, C., Liu, Q., Chen, K. S., Sun, J., Lu, J., Yi, J., & Bi, J. (2018). Edge Attention-based Multi-Relational Graph
Convolutional Networks. arXiv preprint arXiv:1802.04944.
• Lee, John Boaz, et al. "Attention Models in Graphs: A Survey." arXiv preprint arXiv:1807.07984 (2018).

Message Passing Neural Network (MPNN)

• Li, Yujia, et al. "Gated graph sequence neural networks." arXiv preprint arXiv:1511.05493 (2015).
• Gilmer, J., Schoenholz, S. S., Riley, P. F., Vinyals, O., & Dahl, G. E. (2017). Neural message passing for quantum
chemistry. arXiv preprint arXiv:1704.01212.

https://github.com/SeongokRyu/Graph-neural-networks
References

Graph AutoEncoder and Graph Generative Model

• Kipf, Thomas N., and Max Welling. "Variational graph auto-encoders." arXiv preprint arXiv:1611.07308 (2016).
• Simonovsky, Martin, and Nikos Komodakis. "GraphVAE: Towards Generation of Small Graphs Using Variational
Autoencoders." arXiv preprint arXiv:1802.03480 (2018).
• Liu, Qi, et al. "Constrained Graph Variational Autoencoders for Molecule Design." arXiv preprint arXiv:1805.09076
(2018).
• Pan, Shirui, et al. "Adversarially Regularized Graph Autoencoder." arXiv preprint arXiv:1802.04407 (2018).
• Li, Y., Vinyals, O., Dyer, C., Pascanu, R., & Battaglia, P. (2018). Learning deep generative models of graphs. arXiv
preprint arXiv:1803.03324.
• Jin, Wengong, Regina Barzilay, and Tommi Jaakkola. "Junction Tree Variational Autoencoder for Molecular Graph
Generation." arXiv preprint arXiv:1802.04364 (2018).
• Liu, Qi, et al. "Constrained Graph Variational Autoencoders for Molecule Design." arXiv preprint arXiv:1805.09076
(2018).

https://github.com/SeongokRyu/Graph-neural-networks
References

Applications of GNN

• Duvenaud, David K., et al. "Convolutional networks on graphs for learning molecular fingerprints." Advances in
neural information processing systems. 2015.
• Kearnes, Steven, et al. "Molecular graph convolutions: moving beyond fingerprints." Journal of computer-aided
molecular design 30.8 (2016): 595-608.
• Schütt, K. T., Arbabzadah, F., Chmiela, S., Müller, K. R., & Tkatchenko, A. (2017). Quantum-chemical insights from
deep tensor neural networks. Nature communications, 8, 13890.
• Wu, Z., Ramsundar, B., Feinberg, E. N., Gomes, J., Geniesse, C., Pappu, A. S., ... & Pande, V. (2018). MoleculeNet: a
benchmark for molecular machine learning. Chemical Science, 9(2), 513-530.
• Feinberg, Evan N., et al. "Spatial Graph Convolutions for Drug Discovery." arXiv preprint arXiv:1803.04465 (2018).

https://github.com/SeongokRyu/Graph-neural-networks
References

Applications of GNN

• Jin, Wengong, Regina Barzilay, and Tommi Jaakkola. "Junction Tree Variational Autoencoder for Molecular Graph
Generation." arXiv preprint arXiv:1802.04364 (2018).
• Liu, Qi, et al. "Constrained Graph Variational Autoencoders for Molecule Design." arXiv preprint arXiv:1805.09076
(2018).
• De Cao, Nicola, and Thomas Kipf. "MolGAN: An implicit generative model for small molecular graphs." arXiv preprint
arXiv:1805.11973 (2018).
• Selvan, Raghavendra, et al. "Extraction of Airways using Graph Neural Networks." arXiv preprint arXiv:1804.04436
(2018).
• Kipf, Thomas, et al. "Neural relational inference for interacting systems." arXiv preprint arXiv:1802.04687 (2018).

• And many other interesting works!

https://github.com/SeongokRyu/Graph-neural-networks
