A New Model For Learning in Graph Domains
I. INTRODUCTION

In several machine learning applications the data of interest can be suitably represented in the form of sequences, trees, and, more generally, directed or undirected graphs, for instance in chemistry [1], software engineering, and image processing [2]. In those applications, the goal consists of learning from examples a function τ that maps a graph G and one of its nodes n to a vector of reals: τ(G, n) ∈ R^m.

More precisely, we can distinguish two classes of applications according to whether τ(G, n) depends on the node n or not. Those applications will be called node focused and graph focused, respectively. Object localization is an example of a node focused application. An image can be represented by a Region Adjacency Graph (RAG), where the nodes denote the homogeneous regions of the image and the arcs represent their adjacency relationship (Fig. 1). This problem can be solved by a function τ which classifies the nodes of the RAG according to whether the corresponding region belongs to the object or not. For example, the output of τ for Fig. 1 might be 1 for the black nodes, which correspond to the house, and −1 otherwise. On the other hand, image classification is an example of a graph focused application. For instance, τ(G) may classify an image represented by G into different classes, e.g., houses, cars, people, and so on.

Traditional approaches usually cope with graphs by a preprocessing procedure that transforms the graphs into simpler representations (e.g., vectors or sequences of reals), which can then be processed by common machine learning techniques. However, valuable information may be lost during the preprocessing and, as a consequence, the application may suffer from poor performance and generalization.

Recursive neural networks (RNNs) [3], [4] are a neural model that tries to overcome this problem. In fact, RNNs can directly process graphs. The main idea consists of encoding the graphical information into a set of states associated with the graph nodes. The states are dynamically updated following the topological relationships among the nodes. Finally, an output is computed using the encodings stored in the states. However, the RNN model suffers from a number of limitations: RNNs can process only directed acyclic graphs and can be used only on graph focused problems, i.e., τ(G, n) must be independent of n.

In this paper, we present a new neural network model, called graph neural network (GNN), that extends recursive neural networks. GNNs can process most of the practically useful graphs and can be applied to both graph and node focused problems. A learning algorithm for GNNs is also described, along with some experimental results that assess the properties of the model. Finally, it is worth mentioning that, under mild conditions, any function τ on graphs can be approximated in probability by a GNN. Such a result, which, for reasons of space, is not further discussed in this paper, is proved in [5].

The structure of the paper is as follows: Section II presents the GNN model along with its main properties. Section III contains some experimental results. Finally, in Section IV conclusions are drawn.
II. GRAPH NEURAL NETWORKS

In the following, |·| denotes the modulus or the cardinality operator, according to whether it is applied to a real number or to a set, respectively. The one-norm of a vector v is denoted by ‖v‖1, i.e., ‖v‖1 = Σi |vi|. A graph G is a pair (N, E), where N is a set of nodes and E a set of edges. The nodes connected to n by an arc are represented by ne[n]. Each node may have a label, denoted by ln ∈ R^q. Labels usually include features of the object corresponding to the node. For example, in the case of a RAG (Fig. 1), node labels may represent properties of the regions, e.g., area and perimeter.

The considered graphs may be either positional or non-positional. Non-positional graphs are those described so far. In positional graphs, an injective function µn : ne[n] → IN is defined for each node n. Here, IN is the set of natural numbers, and µn assigns a different position to each neighbor u ∈ ne[n]. The position can be used to store useful information, e.g., a sorting of the neighbors according to their importance.
The intuitive idea underlying GNNs is that nodes in a graph represent objects or concepts and edges represent their relationships. Thus, we can attach to each node n a vector xn ∈ R^s, called state, which collects a representation of the object denoted by n. In order to define xn, we observe that related nodes are connected by edges. Thus, xn can be naturally specified using the information contained in the neighborhood of n (see Fig. 2), by means of a parametric transition function fw:

xn = fw(ln, xne[n], lne[n]),   n ∈ N,   (1)

where ln, xne[n], lne[n] are the label of n, and the states and the labels of the nodes in the neighborhood of n, respectively.

Fig. 2. State x1 depends on the neighborhood information: x1 = fw(l1, x2, x3, x5, l2, l3, l5).
For each node n, an output vector on ∈ R^m is also defined, which depends on the state xn and the label ln. The dependence is described by a parametric output function gw:

on = gw(xn, ln),   n ∈ N.   (2)

Let x and l be the vectors constructed by stacking all the states and all the labels, respectively. Then, Equations (1) and (2) can be written as

x = Fw(x, l),
o = Gw(x, l),   (3)

where Fw and Gw are the composition of |N| instances of fw and gw, respectively.

Notice that x is correctly defined only if the solution of system (3) is unique. The key choice adopted in the proposed approach consists of designing fw such that Fw is a contraction mapping¹ with respect to the state x. In fact, the Banach fixed point theorem [6] guarantees that if Fw is a contraction mapping, then Eq. (3) has a solution and the solution is unique.

¹ A function l : R^a → R^a is a contraction mapping w.r.t. a vector norm ‖·‖ if there exists a real µ, 0 ≤ µ < 1, such that ‖l(y1) − l(y2)‖ ≤ µ‖y1 − y2‖ for any y1, y2 ∈ R^a.

Thus, Eqs. (1) and (2) define a method to produce an output on for each node, i.e., they realize a parametric function ϕw(G, n) = on which operates on graphs. The corresponding machine learning problem consists of adapting the parameters w such that ϕw approximates the data in the learning set L = {(Gi, ni, ti) | 1 ≤ i ≤ p}, where each triple (Gi, ni, ti) denotes a graph Gi, one of its nodes ni, and the desired output ti. In practice, the learning problem can be implemented by the minimization of the quadratic error function

ew = Σ_{i=1}^{p} (ti − ϕw(Gi, ni))^2.   (4)
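As an illustration of the notation in Eqs. (1)-(4), the following sketch (Python with NumPy, not the authors' Matlab implementation) builds a toy labeled graph, applies the global transition Fw and output Gw once, and evaluates the quadratic error ew. The concrete fw and gw are placeholder functions introduced only for this example; the models actually used in the paper are given in Section II-C.

```python
# Minimal sketch of Eqs. (1)-(4); fw and gw below are hypothetical placeholders.
import numpy as np

s, m = 2, 1                      # state and output dimensions

# A toy labeled graph: node labels and neighbor lists (undirected).
labels = {1: np.array([0.3]), 2: np.array([0.7]), 3: np.array([0.1])}
ne = {1: [2, 3], 2: [1], 3: [1]}

def fw(l_n, x_neigh, l_neigh):
    # Placeholder local transition function (Eq. 1): any map into R^s works here.
    total = sum(x.sum() + l.sum() for x, l in zip(x_neigh, l_neigh))
    return np.tanh(l_n.sum() + total * np.ones(s))

def gw(x_n, l_n):
    # Placeholder local output function (Eq. 2).
    return np.array([np.tanh(x_n.sum() + l_n.sum())])

def Fw(x, labels, ne):
    # Global transition (Eq. 3): one instance of fw per node, collected per node.
    return {n: fw(labels[n], [x[u] for u in ne[n]], [labels[u] for u in ne[n]])
            for n in labels}

def Gw(x, labels):
    return {n: gw(x[n], labels[n]) for n in labels}

x = {n: np.zeros(s) for n in labels}     # initial states
x = Fw(x, labels, ne)                    # one application of Fw
o = Gw(x, labels)                        # outputs o_n

# Quadratic error (Eq. 4) on a toy learning set {(G, n_i, t_i)}.
targets = {1: 1.0, 3: -1.0}
e_w = sum((t - o[n][0]) ** 2 for n, t in targets.items())
print(e_w)
```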
In GNNs, the transition function fw is implemented as a sum of contributions, one for each neighbor of n:

fw(ln, xne[n], lne[n]) = Σ_{u ∈ ne[n]} hw(ln, xu, lu),   (5)

where hw is a parametric function. The intuitive idea underlying Eq. (5) consists of computing the state xn by summing a set of "contributions", each generated considering only one node in the neighborhood of n. A similar approach was already used with success in recursive neural networks [7], [8].

Moreover, it is worth mentioning that GNNs can also be applied to directed graphs. For this purpose, the input of fw (or hw) must be extended with information about the edge directions, for instance a flag du for each node u ∈ ne[n] such that du = 1 if the edge between u and n is directed toward n, and du = 0 otherwise. Finally, in graph focused applications only one output for each graph is produced. This can be achieved in several ways. For example, a special node s can be selected in each graph and the corresponding output os is returned. Such an approach is also used in recursive neural networks.
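A sketch of the sum-of-contributions transition of Eq. (5) follows (Python/NumPy). Here hw is a hypothetical placeholder standing in for the parametric function (the linear and neural realizations appear in Section II-C), and the optional du flag illustrates the directed-graph extension mentioned above.

```python
# Sketch of Eq. (5): the new state of n is the sum of one contribution per
# neighbor, with an optional direction flag d_u.  hw is a placeholder.
import numpy as np

s = 2  # state dimension

def hw(l_n, x_u, l_u, d_u=0.0):
    # Placeholder contribution of neighbor u to the state of n.
    return np.tanh(l_n.sum() + x_u.sum() + l_u.sum() + d_u) * np.ones(s)

def fw(l_n, x_neigh, l_neigh, d_neigh=None):
    # Eq. (5): sum over the neighborhood of n.
    if d_neigh is None:
        d_neigh = [0.0] * len(x_neigh)
    return sum(hw(l_n, x_u, l_u, d_u)
               for x_u, l_u, d_u in zip(x_neigh, l_neigh, d_neigh))

# Example: node n with two neighbors, the first edge pointing toward n.
l_n = np.array([0.5])
x_neigh = [np.zeros(s), np.ones(s)]
l_neigh = [np.array([0.2]), np.array([0.9])]
print(fw(l_n, x_neigh, l_neigh, d_neigh=[1.0, 0.0]))
```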
In order to implement the model formally defined by Equations (1) and (2), the following items must be provided:
1) a method to solve Eq. (1);
2) a learning algorithm to adapt fw and gw by examples from the train set;
3) an implementation of fw and gw.
These aspects will be considered in the following subsections.

A. Computing the states

The Banach fixed point theorem suggests a simple algorithm to compute the fixed point of Eq. (3). It states that if Fw is a contraction mapping, then the dynamical system

x(t + 1) = Fw(x(t), l),   (6)

where x(t) denotes the t-th iterate of x, converges exponentially fast to the solution of Eq. (3) for any initial state x(0). Thus, xn and on can be obtained by iterating

xn(t + 1) = fw(ln, xne[n](t), lne[n]),
on(t + 1) = gw(xn(t + 1), ln),   n ∈ N.   (7)

Note that the computation described in Eq. (7) can be interpreted as the representation of a neural network, called encoding network, that consists of units which compute fw and gw (see Fig. 3). In order to build the encoding network, each node of the graph is replaced by a unit computing the function fw. Each unit stores the current state xn(t) of the corresponding node n and, when activated, it calculates the state xn(t + 1) using the labels and the states stored in its neighborhood. The simultaneous and repeated activation of the units produces the behavior described by Eq. (7). In the encoding network, the output of node n is produced by another unit, which implements gw.

Fig. 3. A graph and its corresponding encoding network.
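A minimal sketch of the iteration in Eqs. (6)-(7) (Python/NumPy), reusing the placeholder fw and gw from the previous sketches. The tolerance-based stopping rule is an assumption, since the paper does not specify the convergence criterion.

```python
# Sketch of the fixed-point computation of Eqs. (6)-(7).
# fw, gw, labels, ne are assumed to be defined as in the previous sketches.
import numpy as np

def compute_states(labels, ne, fw, gw, s, tol=1e-6, max_iter=1000):
    x = {n: np.zeros(s) for n in labels}          # initial state x(0)
    for _ in range(max_iter):
        x_new = {n: fw(labels[n],
                       [x[u] for u in ne[n]],
                       [labels[u] for u in ne[n]]) for n in labels}
        # Stop when the update is small; if Fw is a contraction,
        # convergence is exponentially fast (Banach fixed point theorem).
        delta = max(np.abs(x_new[n] - x[n]).max() for n in labels)
        x = x_new
        if delta < tol:
            break
    o = {n: gw(x[n], labels[n]) for n in labels}  # outputs, second line of Eq. (7)
    return x, o
```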
B. A learning algorithm

The learning algorithm consists of two phases:
(a) the states xn(t) are iteratively updated, using Eq. (7), until they reach a stable fixed point x(T) = x at time T;
(b) the gradient ∂ew(T)/∂w is computed and the weights w are updated according to a gradient descent strategy.
Thus, while phase (a) moves the system to a stable point, phase (b) adapts the weights to move the outputs towards the desired targets. These two phases are repeated until a given stopping criterion is reached. It can be formally proved that if Fw and Gw in Eq. (3) are differentiable w.r.t. w and x, then the above learning procedure implements a gradient descent strategy on the error function ew [5].

In fact, our algorithm is obtained by combining the backpropagation through structure algorithm, which is adopted for training recursive neural networks [4], and the Almeida–Pineda algorithm [9], [10]. The latter is a particular version of the backpropagation through time algorithm which can be used to train recurrent networks. Our approach applies the Almeida–Pineda algorithm to the encoding network, where all instances of fw and gw are considered to be independent networks. It produces a set of gradients, one for each instance of fw and gw. Those gradients are accumulated to compute ∂ew(T)/∂w.
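The two-phase scheme can be outlined as follows (Python/NumPy). The gradient here is obtained by crude finite differences purely for illustration; the paper's actual algorithm combines backpropagation through structure with the Almeida–Pineda procedure on the encoding network, which is not reproduced in this sketch. compute_states is taken from the previous sketch, and make_fw_gw is a hypothetical factory mapping a parameter vector w to the pair (fw, gw).

```python
# Outline of the two-phase learning loop: phase (a) relaxes to the fixed point,
# phase (b) takes a gradient step.  The finite-difference gradient is a stand-in
# for the Almeida-Pineda / backpropagation-through-structure gradient of the paper.
import numpy as np

def error(w, learning_set, make_fw_gw, s):
    # e_w of Eq. (4): run phase (a) on each graph, then sum the squared errors.
    fw, gw = make_fw_gw(w)            # hypothetical factory: parameters -> (fw, gw)
    e = 0.0
    for labels, ne, node, target in learning_set:
        _, o = compute_states(labels, ne, fw, gw, s)
        e += float((target - o[node][0]) ** 2)
    return e

def train(w, learning_set, make_fw_gw, s, lr=0.01, epochs=100, eps=1e-5):
    for _ in range(epochs):
        grad = np.zeros_like(w)
        base = error(w, learning_set, make_fw_gw, s)
        for j in range(w.size):       # finite-difference gradient (illustration only)
            w_pert = w.copy()
            w_pert[j] += eps
            grad[j] = (error(w_pert, learning_set, make_fw_gw, s) - base) / eps
        w = w - lr * grad             # phase (b): gradient descent step
    return w
```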
C. Implementing the transition and the output functions

In the following, two different GNN models, called linear GNN and neural GNN, are described. In both cases, the output function gw is implemented by a multilayer feedforward neural network and the transition function defined in (5) is used. On the other hand, linear and neural GNNs differ in the implementation of the function hw of Eq. (5) and in the strategy adopted to ensure that Fw is a contraction mapping.

1) Linear GNN: In this model, hw is

hw(ln, xu, lu) = An,u xu + bn,

where the vector bn ∈ R^s and the matrix An,u ∈ R^(s×s) are defined by the outputs of two feedforward neural networks, whose parameters correspond to the parameters of the GNN. More precisely, let φw : R^(2q) → R^(s^2) and ρw : R^q → R^s be the functions implemented by two multilayer feedforward neural networks. Then,

An,u = (µ / (s·|ne[u]|)) · Resize(φw(ln, lu)),
bn = ρw(ln),

where µ ∈ (0, 1) and Resize(·) denotes the operator that allocates the elements of an s^2-dimensional vector into an s × s matrix. Here, it is further assumed that ‖φw(ln, lu)‖1 ≤ s^2 holds, which is straightforwardly verified if the output neurons of the network implementing φw use an appropriately bounded activation function, for instance a hyperbolic tangent.

Notice that, in this case, Fw is a contraction mapping for any set of parameters w. In fact,

Fw(x, l) = Ax + b,   (8)

where b is the vector constructed by stacking all the bn, and A is a block matrix {Ān,u}, with Ān,u = An,u if u is a neighbor of n and Ān,u = 0 otherwise. By simple algebra, it is easily proved that ‖∂Fw/∂x‖1 = ‖A‖1 ≤ µ, which implies that Fw is a contraction mapping.
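A sketch of the linear transition contribution follows (Python/NumPy). Here phi_w and rho_w stand in for the two feedforward networks of the paper (replaced by fixed toy functions with tanh-bounded outputs, so the boundedness assumption holds), which lets the scaling in A_{n,u} and the resulting norm bound be checked numerically.

```python
# Sketch of the linear GNN contribution h_w(l_n, x_u, l_u) = A_{n,u} x_u + b_n.
# phi_w and rho_w are stand-ins for the two feedforward networks of the paper.
import numpy as np

s, q, mu = 2, 1, 0.9
rng = np.random.default_rng(0)
W_phi = rng.normal(size=(s * s, 2 * q))   # toy "network" weights (assumption)
W_rho = rng.normal(size=(s, q))

def phi_w(l_n, l_u):
    # Bounded output (each component in [-1, 1]), hence ||phi_w||_1 <= s^2.
    return np.tanh(W_phi @ np.concatenate([l_n, l_u]))

def rho_w(l_n):
    return np.tanh(W_rho @ l_n)

def A_nu(l_n, l_u, deg_u):
    # A_{n,u} = mu / (s * |ne[u]|) * Resize(phi_w(l_n, l_u))
    return (mu / (s * deg_u)) * phi_w(l_n, l_u).reshape(s, s)

def h_w(l_n, x_u, l_u, deg_u):
    return A_nu(l_n, l_u, deg_u) @ x_u + rho_w(l_n)

# Numerical check of the contraction bound for one block:
l_n, l_u = np.array([0.4]), np.array([0.8])
A = A_nu(l_n, l_u, deg_u=3)
print(np.abs(A).sum(axis=0).max())  # 1-norm of the block, at most mu / |ne[u]|
```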
2) Neural GNN: In this model, hw is implemented by a feedforward neural network. Since three-layer neural networks are universal approximators, this method makes it possible to implement any function hw. However, not all the parameters w can be used, because it must be ensured that the corresponding global transition function Fw is a contraction. In practice, this goal can be achieved by adding a penalty term to the error function:

ew = Σ_{i=1}^{p} (ti − ϕw(Gi, ni))^2 + β L(‖∂Fw/∂x‖1),

where L(y) = (y − µ)^2 if y > µ, and L(y) = 0 otherwise. Moreover, β is a predefined parameter balancing the importance of the penalty term against the error on the patterns, and the parameter µ ∈ (0, 1) defines a desired upper bound on the contraction constant of Fw.
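The penalty term can be sketched as follows (Python). The one-norm of the Jacobian ∂Fw/∂x is passed in as a number, since how it is obtained (analytically, as for the linear model, or estimated for the neural one) is not detailed here; the helper below is only an assumption for illustration.

```python
# Sketch of the penalized error of the neural GNN:
#   e_w = sum_i (t_i - phi_w(G_i, n_i))**2 + beta * L(||dFw/dx||_1),
# with L(y) = (y - mu)**2 for y > mu and 0 otherwise.

def penalty(jac_norm_1, mu=0.9, beta=10.0):
    # L applied to the 1-norm of the Jacobian of Fw w.r.t. x.
    return beta * (jac_norm_1 - mu) ** 2 if jac_norm_1 > mu else 0.0

def penalized_error(residuals, jac_norm_1, mu=0.9, beta=10.0):
    # residuals: list of (t_i - phi_w(G_i, n_i)) values.
    return sum(r ** 2 for r in residuals) + penalty(jac_norm_1, mu, beta)

print(penalized_error([0.1, -0.3], jac_norm_1=1.2))   # norm above mu: penalized
print(penalized_error([0.1, -0.3], jac_norm_1=0.5))   # within the bound: no penalty
```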
III. EXPERIMENTAL RESULTS

The approach has been evaluated on a set of toy problems derived from graph theory and on applications of practical relevance in machine learning. Each problem belongs to one of the following categories:
1) connection-based problems;
2) label-based problems;
3) general problems.
The first category contains problems where τ(G, n) depends only on the graph connectivity and is independent of the labels. On the other hand, in label-based problems τ(G, n) can be computed using only the label ln of node n. Finally, the last category collects examples in which GNNs must use both topological and labeling information.

Both the linear and the neural model were tested. Three-layer (one hidden layer) feedforward networks with sigmoidal activation functions were used to implement the functions involved in the two models, i.e., gw, φw, and ρw in linear GNNs, and gw, hw in neural GNNs (see Section II-C).

Unless otherwise stated, the state dimension s was 2. The presented results were averaged over five different trials. In each trial, the dataset was a collection of random connected graphs with a given density δ. The dataset construction procedure consisted of two steps: i) each pair of nodes is connected with probability δ; ii) the graph is checked to verify whether it is connected and, if it is not, random edges are inserted until the condition is satisfied. The dataset was split into a train, a validation, and a test set. The validation set was used to select the best GNN produced by the learning procedure. In every trial, the training procedure performed 5000 epochs and every 20 epochs it evaluated the current GNN on the validation set. The best GNN was the one that achieved the lowest error on the validation set.
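The two-step dataset construction described above can be sketched as follows (Python). The breadth-first connectivity check and the way missing edges are added are assumptions, since the paper does not specify those details.

```python
# Sketch of the random connected graph construction: i) connect each pair of
# nodes with probability delta; ii) add random edges until the graph is connected.
import random
from collections import deque

def random_connected_graph(n_nodes, delta, rng=random):
    edges = {(i, j) for i in range(n_nodes) for j in range(i + 1, n_nodes)
             if rng.random() < delta}
    while not _is_connected(n_nodes, edges):
        i, j = rng.sample(range(n_nodes), 2)        # assumed repair strategy
        edges.add((min(i, j), max(i, j)))
    return edges

def _is_connected(n_nodes, edges):
    adj = {i: [] for i in range(n_nodes)}
    for i, j in edges:
        adj[i].append(j)
        adj[j].append(i)
    seen, queue = {0}, deque([0])
    while queue:
        for v in adj[queue.popleft()]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == n_nodes

graph = random_connected_graph(20, delta=0.2)
```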
The GNN software was implemented in Matlab 7.0.1². The experiments were run on a Power Mac G5 with a 2 GHz PowerPC processor and 2 GB of RAM. For all the experiments, memory requirements never grew beyond 200 MB.

² Matlab is copyright 1994-2004 by The MathWorks, Inc.
A. Connection-based problems

1) The Clique problem: A clique of size k is a complete subgraph with k nodes³ in a larger graph. The goal of this experiment consisted of detecting all the cliques of size 5 in the input graphs. More precisely, the function τ that should be implemented by the GNN was τ(G, n) = 1 if n belongs to a clique of size 5, and τ(G, n) = −1 otherwise. The dataset contained 1400 random graphs with 20 nodes: 200 graphs in the train set, 200 in the validation set, and 1000 in the test set. A clique of size 5 was forced into every graph of the dataset. Thus, each graph had at least one clique, but it might contain more cliques due to the random dataset construction. The desired target tn = τ(G, n) of each node was generated by a brute force algorithm that looked for cliques in the graphs.

³ A graph is complete if there is an edge between each pair of nodes.
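A brute-force target generator of the kind mentioned above can be sketched as follows (Python); it simply enumerates all 5-node subsets and checks completeness, which is feasible for the 20-node graphs used here. The edge-set representation matches the earlier dataset sketch.

```python
# Sketch of the brute-force labeling: t_n = 1 if n belongs to some clique of
# size 5, and t_n = -1 otherwise.  `edges` is a set of (i, j) pairs with i < j.
from itertools import combinations

def clique_targets(n_nodes, edges, k=5):
    targets = {n: -1 for n in range(n_nodes)}
    for subset in combinations(range(n_nodes), k):
        # The subset is a clique iff every pair of its nodes is connected.
        if all((min(a, b), max(a, b)) in edges for a, b in combinations(subset, 2)):
            for n in subset:
                targets[n] = 1
    return targets

targets = clique_targets(20, graph)   # `graph` from the dataset sketch above
```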
Table I shows the accuracies⁴ achieved on this problem by a set of GNNs obtained by varying the number of hidden neurons of the feedforward networks. For the sake of simplicity, all the feedforward networks involved in a GNN contained the same number of hidden neurons.

⁴ Accuracy is defined as the ratio between the correct results and the total number of patterns. A zero threshold was used to decide whether the output of the GNN for a certain node is positive or negative.
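Per this definition, the accuracy figure can be computed from the GNN outputs with a zero threshold, e.g. (Python/NumPy):

```python
# Accuracy as in footnote 4: an output is counted as correct when its sign
# (zero threshold) matches the +1/-1 target of the node.
import numpy as np

def accuracy(outputs, targets):
    outputs, targets = np.asarray(outputs), np.asarray(targets)
    predictions = np.where(outputs > 0.0, 1, -1)
    return float((predictions == targets).mean())

print(accuracy([0.3, -0.2, 0.7, -0.9], [1, -1, -1, -1]))   # 0.75
```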
TABLE I
RESULTS ON THE CLIQUE PROBLEM

Model    Hidden   Accuracy (Test)   Accuracy (Train)   Time (Test)   Time (Train)
neural    2        83.73%            83.45%             14.2s         36m 21s
neural    5        86.95%            86.60%             20.0s         52m 18s
neural   10        90.74%            90.33%             31.3s         1h 15m 52s
neural   20        90.20%            89.72%             50.3s         1h 34m 03s
neural   30        90.32%            89.82%             1m 10s        2h 11m 53s
linear    2        81.25%            86.90%             2.6s          53m 21s
linear    5        79.87%            86.90%             2.8s          1h 01m 20s
linear   10        82.92%            87.54%             3.1s          48m 54s
linear   20        80.90%            88.51%             3.6s          1h 00m 33s
linear   30        77.79%            84.94%             4.1s          1h 06m 21s

The clique problem is a difficult test for GNNs. In fact, GNNs are based on a local computation framework, where the computing activity is localized on the nodes of the graph (see Eq. (1)). On the other hand, the detection of a clique requires knowledge of the properties of all the nodes involved in the clique. Nevertheless, the results of Table I confirm that GNNs can learn to solve this problem.

Notice that Table I compares the accuracy achieved on the test set with the accuracy on the train set. The results are very close, particularly for the neural model, which shows that the GNN model does not suffer from generalization problems in this experiment. It is also observed that, for the neural GNN, the number of hidden neurons has a clear influence on the results: a larger number of hidden neurons corresponds to a better averaged accuracy. On the other hand, a clear relationship between the number of hidden neurons and the accuracy is not evident for the linear model.

Finally, Table I displays the time spent by the training and the testing procedures. It is worth mentioning that the computational cost of each learning epoch may depend on the particular train dataset. For example, the number of iterations of system (7) needed to reach the fixed point depends on the initial state x(0) (see Section II-C). For this reason, in some cases, even if the neural networks involved in the learning procedure are larger, the computation time may be smaller, e.g., for the linear model with 5 and 10 hidden neurons.

In [11], it is stated that the performance of recursive neural networks may be improved if the labels of the graphs are extended with random vectors.
Intuitively, the reason why this approach works is that the random vectors are a sort of identifiers that allow the RNNs to distinguish among the nodes. In practice, the method may or may not work, since the random vectors also inject noise into the dataset, making the learning more difficult. In order to investigate whether such a result holds also for GNNs, we added random integer labels between 0 and 8 to the graphs of the previous dataset and ran the experiments again. Table II seems to confirm that the result holds also for GNNs, since in most cases GNNs on graphs with random labels outperform GNNs on graphs with no labels.

TABLE II
RESULTS ON THE CLIQUE PROBLEM WITH RANDOM LABELS

TABLE III
RESULTS ON THE NEIGHBORS PROBLEM

Model    Hidden   Test (er < 0.05)   Test (er < 0.1)   Training time
neural    2        73.64%             77.40%            47m 28s
neural    5        89.56%             89.76%            1h 06m 20s
neural   10        90.64%             91.44%            1h 21m 00s
neural   20        99.04%             99.72%            2h 23m 27s
neural   30        88.48%             89.48%            2h 33m 03s
linear    2        72.48%             77.24%            58m 45s
linear    5        89.60%             89.84%            46m 38s
linear   10        99.44%             99.72%            42m 57s
linear   20        98.92%             99.68%            42m 53s
linear   30        99.16%             99.68%            49m 58s
C. General problems

1) The Subgraph Matching problem: The subgraph matching problem consists of identifying the presence of a subgraph S in a larger graph G. Such a problem has a number of applications, including object localization and the detection of active parts in chemical compounds. Machine learning techniques are useful for this problem when the subgraph is not known in advance and is available only from a set of examples, or when the graphs are corrupted by noise.

In our tests, we used 600 connected random graphs, equally divided into the train, the validation, and the test set. A smaller subgraph S, which was randomly generated in each trial, was inserted into every graph of the dataset. The nodes had integer labels in the range [0, 10], and a small normal noise, with zero mean and a standard deviation of 0.25, was added to all the labels. The goal consisted of predicting whether n is a node of the subgraph S, i.e., τ(G, n) = 1 if n belongs to S, and τ(G, n) = −1 otherwise.
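The dataset for this experiment can be sketched as follows (Python), reusing random_connected_graph from the earlier sketch. How the subgraph S is generated and attached to the host graph is not specified in the paper, so the merging strategy below (relabel the nodes of S and connect it to the host with one random bridging edge) is an assumption.

```python
# Sketch of the Subgraph Matching dataset: a random subgraph S is inserted into
# a larger random graph, node labels are integers in [0, 10] plus N(0, 0.25)
# noise, and targets are +1 for the nodes of S, -1 otherwise.
import random
import numpy as np

def make_instance(n_host, n_sub, delta, rng=random):
    sub = random_connected_graph(n_sub, delta, rng)           # subgraph S
    host = random_connected_graph(n_host, delta, rng)         # host graph
    # Insert S: its nodes get ids n_host .. n_host + n_sub - 1.
    edges = set(host) | {(i + n_host, j + n_host) for i, j in sub}
    bridge = (rng.randrange(n_host), n_host + rng.randrange(n_sub))
    edges.add(bridge)                                          # assumed bridging edge
    labels = {n: np.array([np.random.randint(0, 11) + np.random.normal(0.0, 0.25)])
              for n in range(n_host + n_sub)}
    targets = {n: (1 if n >= n_host else -1) for n in range(n_host + n_sub)}
    return edges, labels, targets

edges, labels, targets = make_instance(n_host=12, n_sub=6, delta=0.2)
```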
In all the experiments, the state dimension was s = 5 and all the neural networks involved in the GNNs had 5 hidden neurons. Table VI shows the results for several dimensions of S and G. In order to evaluate the relative importance of the labels and of the connectivity in the localization of the subgraph, a feedforward neural network (FNN) with 20 hidden neurons was also applied to this test. The FNN tries to solve the problem using only the label ln of the node. Table VI shows that the GNNs outperform the FNN, confirming that the GNNs also used the graph topology to find S.

IV. CONCLUSIONS

A new neural model, called graph neural network (GNN), was presented. GNNs extend recursive neural networks, since they can process a larger class of graphs and can be used on node focused problems. Some preliminary experimental results confirmed that the model is very promising. The experimentation of the approach on larger applications is a matter of future research. From a theoretical point of view, it is also interesting to study the case when the input graph is not predefined but changes during the learning procedure.

REFERENCES

[1] T. Schmitt and C. Goller, "Relating chemical structure to activity: An application of the neural folding architecture," in Workshop on Fuzzy-Neuro Systems '98 and Conference on Engineering Applications of Neural Networks, EANN '98, 1998.
[2] E. Francesconi, P. Frasconi, M. Gori, S. Marinai, J. Sheng, G. Soda, and A. Sperduti, "Logo recognition by recursive neural networks," in Lecture Notes in Computer Science — Graphics Recognition, K. Tombre and A. K. Chhabra, Eds. Springer, 1997, GREC'97 Proceedings.
[3] P. Frasconi, M. Gori, and A. Sperduti, "A general framework for adaptive processing of data structures," IEEE Transactions on Neural Networks, vol. 9, no. 5, pp. 768-786, September 1998.
[4] A. Sperduti and A. Starita, "Supervised neural networks for the classification of structures," IEEE Transactions on Neural Networks, vol. 8, pp. 429-459, 1997.
[5] F. Scarselli, A. C. Tsoi, M. Gori, and M. Hagenbuchner, "A new neural network model for graph processing," Department of Information Engineering, University of Siena, Tech. Rep. DII 01/05, 2005.
[6] M. A. Khamsi, An Introduction to Metric Spaces and Fixed Point Theory. John Wiley & Sons, 2001.
[7] M. Gori, M. Maggini, and L. Sarti, "A recursive neural network model for processing directed acyclic graphs with labeled edges," in Proceedings of the International Joint Conference on Neural Networks, Portland (USA), July 2003, pp. 1351-1355.
[8] M. Bianchini, P. Mazzoni, L. Sarti, and F. Scarselli, "Face spotting in color images using recursive neural networks," in Proceedings of the 1st ANNPR Workshop, Florence (Italy), Sept. 2003.
[9] L. Almeida, "A learning rule for asynchronous perceptrons with feedback in a combinatorial environment," in IEEE International Conference on Neural Networks, M. Caudill and C. Butler, Eds., vol. 2. San Diego, CA: IEEE, 1987, pp. 609-618.
[10] F. Pineda, "Generalization of back-propagation to recurrent neural networks," Physical Review Letters, vol. 59, pp. 2229-2232, 1987.
[11] M. Bianchini, M. Gori, and F. Scarselli, "Recursive processing of cyclic graphs," in Proceedings of the IEEE International Conference on Neural Networks, Washington, DC, USA, May 2002, pp. 154-159.