
Neural Subgraph Matching


Rex Ying, Andrew Wang, Jiaxuan You, Chengtao Wen, Arquimedes Canedo, Jure Leskovec
Stanford University and Siemens Corporate Technology

Abstract

Subgraph matching is the problem of determining the presence of a given query


graph in a large target graph. Despite being an NP-complete problem, the subgraph
matching problem is crucial in domains ranging from network science and database
systems to biochemistry and cognitive science. However, existing techniques
based on combinatorial matching and integer programming cannot handle matching
problems with both large target and query graphs. Here we propose NeuroMatch, an
accurate, efficient, and robust neural approach to subgraph matching. NeuroMatch
decomposes query and target graphs into small subgraphs and embeds them using
graph neural networks. Trained to capture geometric constraints corresponding
to subgraph relations, NeuroMatch then efficiently performs subgraph matching
directly in the embedding space. Experiments demonstrate that NeuroMatch is
100x faster than existing combinatorial approaches and 18% more accurate than
existing approximate subgraph matching methods.

1. Introduction
Given a query graph and a large target graph, the problem of subgraph isomorphism matching is to determine
whether the query graph is isomorphic to a subgraph of the target graph. If the graphs include node and edge
features, both the topology and the features must be matched.
Subgraph matching is a crucial problem in many biology, social network and knowledge graph
applications (Gentner, 1983; Raymond et al., 2002; Yang & Sze, 2007; Dai et al., 2019). For example,
in social networks and biomedical network science, researchers investigate important subgraphs by
counting them in a given network (Alon et al., 2008). In knowledge graphs, common substructures
are extracted by querying them in the larger target graph (Gentner, 1983; Plotnick, 1997).
Traditional approaches make use of combinatorial search algorithms (Cordella et al., 2004; Gallagher,
2006; Ullmann, 1976). However, they do not scale to large problem sizes due to the NP-complete
nature of the problem. Existing efforts to scale up subgraph isomorphism (Sun et al., 2012) make use
of expensive pre-processing to store locations of many small 2-4 node components, and decompose
the queries into these components. Although this allows matching to scale to large target graphs,
the size of the query cannot scale to more than a few tens of nodes before decomposing the query
becomes a hard problem by itself.
Here we propose NeuroMatch, an efficient neural approach for subgraph matching. The core of
NeuroMatch is to decompose the target GT as well as the query GQ into many small overlapping
graphs and use a Graph Neural Network (GNN) to embed the individual graphs such that we can then
quickly determine whether one graph is a subgraph of another.
Our approach works in two stages, an embedding stage and a query stage. At the embedding stage,
we decompose the target graph GT into many sub-networks Gu : For every node u ∈ GT we extract
a k-hop sub-network Gu around u and use a GNN to obtain an embedding for u, capturing the
neighborhood structure of u. At the query stage, we compute the embedding of every node q in the query
graph GQ based on q’s neighborhood. We then compare embeddings of all pairs of nodes q and u to
determine whether GQ is a subgraph of GT .
The key insight that makes NeuroMatch work is to define an embedding space where subgraph rela-
tions are preserved. We observe that subgraph relationships induce a partial ordering over subgraphs.
This observation inspires the use of geometric set embeddings such as order embeddings (McFee &
Lanckriet, 2009), which induce a partial ordering on embeddings with geometric shapes. By ensuring
that the partial ordering on embeddings reflects the ordering on subgraphs, we equip our model with


Figure 1: Overview of NeuroMatch. We decompose target graph GT by extracting the k-hop neighborhood
Gu around every node u. We then use a GNN to embed each Gu (left). We refer to u as
the center node of Gu . We train the GNN to reflect the subgraph relationships: If Gv is a subgraph
of Gu , then node v should be embedded to the lower-left of u. For example, since the 2-hop graph
of the violet node is a subgraph of the 2-hop graph of the red node, the embedding of the violet
square is to the lower-left of the red square node. At the query stage, we decompose the query GQ by
picking an anchor node q and embed it. From the embedding itself we can quickly determine that
Query 1 is a subgraph of the neighborhood around red, blue, and green nodes in target graph because
its embedding is to the lower-left of them. Similarly, Query 2 is a subgraph of the purple and red
nodes and is thus positioned to the lower-left of both nodes. Notice NeuroMatch avoids expensive
combinatorial matching of subgraphs.

a powerful set of inductive biases while greatly simplifying the query process. Our work differs from
many previous works (Bai et al., 2019; Li et al., 2019; Xu et al., 2019) that embed graphs into vector
spaces, which do not impose geometric structure in the embedding space. In contrast, order embed-
dings have properties that naturally correspond to many properties of subgraph relationships, such as
transitivity, anti-symmetry and closure under intersection. Enforcing the order embedding constraint both
leads to a well-structured embedding space and also allows us to efficiently navigate it in order to
find subgraphs as well as supergraphs (Fig. 1).
NeuroMatch trains a graph neural network to learn the order embedding, and uses a max-margin
loss to ensure that the subgraph relationships are captured. Furthermore, the embedding stage can
be conducted offline, producing precomputed embeddings for the query stage. The query stage is
extremely efficient due to the geometric constraints imposed at training time, and it only requires
linear time both in the size of the query and the target graphs. Lastly, NeuroMatch can naturally
operate on graphs which include categorical node and edge features, as well as multiple target graphs.
We compare the accuracy and speed of NeuroMatch with state-of-the-art exact and approximate
methods for subgraph matching (Cordella et al., 2004; Bonnici et al., 2013) as well as recent neural
methods for graph matching, which we adapted to the subgraph matching problem. Experiments
show that NeuroMatch runs two orders of magnitude faster than exact combinatorial approaches
and can scale to larger query graphs. Compared to neural graph matching methods, NeuroMatch
achieves an 18% improvement in AUROC for subgraph matching. Furthermore, we demonstrate the
generalization of NeuroMatch, by testing on queries sampled with different sampling strategies, and
transferring the model trained on synthetic datasets to make subgraph predictions on real datasets.
2. NeuroMatch Architecture
2.1. Problem Setup
We first describe the general problem of subgraph matching. Let GT = (VT , ET ) be a large target
graph where we aim to identify the query graph. Let XT be the associated categorical node features
for all nodes in VT.¹ Let GQ = (VQ, EQ) be a query graph with associated node features XQ. The
goal of a subgraph matching algorithm is to identify the set of all subgraphs H = {H | H ⊆ GT} that
are isomorphic to GQ, that is, there exists a bijection f : VH → VQ such that (f(v), f(u)) ∈ EQ iff (v, u) ∈ EH.
Furthermore, we say GQ is a subgraph of GT if H is non-empty. When node and edge features are
present, the subgraph isomorphism further requires that the bijection f has to match these features.

¹ We consider the case of a single target and query graph, but NeuroMatch applies to any number of
target/query graphs. We also assume that the query is connected (otherwise it can be easily split into 2 queries).


Algorithm 1: NeuroMatch Query Stage


Input: Target graph GT , graph embeddings Zu of node u ∈ GT , and query graph GQ .
Output: Subgraph of GT that is isomorphic to GQ .
1: For every node q ∈ GQ , create Gq , and embed its center node q.
2: Compute matching between embeddings Zq and embeddings ZT using subgraph prediction
function f (zq , zu ).
3: Repeat for all q ∈ GQ , u ∈ GT ; make prediction based on the average score of all f (zq , zu ).
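A minimal PyTorch sketch of this query stage, assuming the neighborhood embeddings have already been computed as dense tensors; the function names and the 0.5 cut-off on the averaged scores are illustrative, not part of the paper:

import torch

def query_stage(Z_Q: torch.Tensor, Z_T: torch.Tensor, f) -> bool:
    """Algorithm 1 sketch: decide whether G_Q is predicted to be a subgraph of G_T.

    Z_Q: [|V_Q|, D] embeddings of the query anchor neighborhoods (step 1).
    Z_T: [|V_T|, D] precomputed embeddings of the target neighborhoods.
    f:   subgraph prediction function f(z_q, z_u) -> {0, 1} (Section 2.3).
    """
    # Alignment matrix of f(z_q, z_u) for every query/target node pair (step 2).
    A = f(Z_Q.unsqueeze(1), Z_T.unsqueeze(0)).float()  # shape [|V_Q|, |V_T|]
    # Average all scores and threshold to obtain the final decision (step 3).
    return A.mean().item() > 0.5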

In the literature, subgraph matching commonly refers to two subproblems: node-induced matching
and edge-induced matching. In node-induced matching, the set of possible subgraphs of GT are
restricted to graphs H = (VH , EH ) such that VH ⊆ VT and EH = {(u, v)|u, v ∈ VH , (u, v) ∈ ET }.
Edge-induced matching, in contrast, restricts possible subgraphs by EH ⊆ ET , and contains all nodes
that are incident to edges in EH. Here we consider the more general edge-induced matching for
illustration, although NeuroMatch can be applied to both settings.
In this paper, we investigate the following decision problems of subgraph matching.
Problem 1. Matching query to datasets. Given a target graph GT and a query GQ , predict if GQ
is isomorphic to a subgraph of GT .
We use a neural model to decompose Problem 1 and solve (with a certain accuracy) the following
neighborhood matching subproblem.
Problem 2. Matching neighborhoods. Given a neighborhood Gu around node u and query GQ
anchored at node q, make binary prediction of whether Gq is a subgraph of Gu where node q
corresponds to u.
Here we define an anchor node q ∈ GQ, and predict the existence of a subgraph isomorphism mapping
that also maps q to u. At prediction time, similar to (Bai et al., 2018), we compute the alignment
score that measures how likely GQ anchored at q is a subgraph of Gu, for all q ∈ GQ and u ∈ GT,
and aggregate the scores to make the final prediction for Problem 1.
2.2. Overview of NeuroMatch
NeuroMatch adopts a two-stage process: an embedding stage, where GT is decomposed into many small
overlapping graphs and each graph is embedded, and a query stage, where the query graph is compared
to the target graph directly in the embedding space, so no expensive combinatorial search is required.
Embedding stage. In the embedding stage, NeuroMatch decomposes target graph GT into many
small overlapping neighborhoods Gu and uses a graph neural network to embed them. For every
node u in GT, we extract the k-hop neighborhood of u, Gu (Figure 1). The GNN then maps node u (that
is, the structure of its network neighborhood Gu) into an embedding zu.
Note a subtle but important point: By using a k-layer GNN to embed node u, we are essentially
embedding/capturing the k-hop network neighborhood structure Gu around the center node u. Thus,
embedding u is equivalent to embedding Gu (a k-hop subgraph centered at node u), and by comparing
embeddings of two nodes u and v, we are essentially comparing the structure of subgraphs Gu , Gv .
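As an illustration of the decomposition, a small sketch using networkx; the function names and the gnn encoder argument are illustrative placeholders, not part of the released implementation:

import networkx as nx

def extract_neighborhoods(G_T: nx.Graph, k: int):
    """Decompose the target graph into overlapping k-hop neighborhoods G_u, one per node u."""
    return {u: nx.ego_graph(G_T, u, radius=k) for u in G_T.nodes}

def embed_target(G_T: nx.Graph, k: int, gnn):
    """Embed every node u of G_T. `gnn` stands for any trained k-layer GNN encoder
    that maps a (neighborhood, center node) pair to a D-dimensional embedding z_u."""
    return {u: gnn(G_u, center=u) for u, G_u in extract_neighborhoods(G_T, k).items()}

In practice one can equivalently run a pure message-passing GNN once over GT and read off each node's embedding, since a k-layer message-passing encoder only sees the k-hop neighborhood of each node.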
Query stage (Alg. 1). The goal of the query stage is to determine whether GQ is a subgraph of GT
and identify the mapping of nodes of GQ to nodes of GT . However, rather than directly solving
this problem, we develop a fast routine to determine whether Gq is a subgraph of Gu: We design
a subgraph prediction function f(zq, zu) that predicts whether GQ anchored at q ∈ GQ is a
subgraph of the k-hop neighborhood of node u ∈ GT, which implies that q corresponds to u in the
subgraph isomorphism mapping of Problem 2. We thus formulate the subgraph matching problem as
a node-level task by using f(zq, zu) to predict the set of nodes u that can be matched to node q (that
is, find the set of neighborhoods Gu that are super-graphs of Gq). To determine whether GQ is a subgraph of
GT , we then aggregate the alignment matrix consisting of f (zq , zu ) for all q ∈ GQ and u ∈ GT to
make the binary prediction for the decision problem of subgraph matching.
Practical considerations and design choices. The choice of the number of layers, k, depends on
the size of the query graphs. We assume k is at least the diameter of the query graph, to allow the
information of all nodes to be propagated to the anchor node in the query. In experiments, we observe
that inference via voting can consistently reach peak performance for k = 10, due to the small-world
property of many real-world graphs.

3
Neural Subgraph Matching

NeuroMatch is flexible in terms of the GNN model used for the embedding step. We adopt a variant
of GIN (Xu et al., 2018) incorporating skip layers to encode the query graphs and the neighborhoods,
which shows performance advantages. Although GIN's expressive power is bounded by the WL test,
our GNN additionally uses a feature to distinguish anchor nodes, which results in higher expressive
power for distinguishing d-regular graphs beyond the WL test (see the Limitations section and
Appendix H).
2.3. Subgraph Prediction Function f(zq, zu)
Given the target graph node embeddings zu and the center node q ∈ GQ, the subgraph prediction
function decides whether q's k-hop neighborhood in GQ is a subgraph of the k-hop neighborhood of
u ∈ GT. The key is that the subgraph prediction function makes this decision based only on
the embeddings zq and zu of nodes q and u (Figure 1).
Capturing subgraph relations in the embedding space. We enforce the embedding geometry to
directly capture subgraph relations. This approach has the additional benefit of ensuring that the
subgraph predictions have negligible cost at the query stage, since we can just compare the coordinates
of two node embeddings. In particular, NeuroMatch satisfies the following properties for subgraph
relations (Refer to Appendix A for proofs of the properties):
• Transitivity: If G1 is a subgraph of G2 and G2 is a subgraph of G3 , then G1 is a subgraph of G3 .
• Anti-symmetry: If G1 is a subgraph of G2, then G2 is a subgraph of G1 iff they are isomorphic.
• Intersection set: The intersection of the set of G1 ’s subgraphs and the set of G2 ’s subgraphs
contains all common subgraphs of G1 and G2 .
• Non-trivial intersection: The intersection of any two graphs contains at least the trivial graph.
We use the notion of set embeddings (McFee & Lanckriet, 2009) to capture these inductive biases.
Common examples include order embeddings and box embeddings. In contrast to Euclidean point
embeddings, set embeddings enjoy properties that correspond naturally to the subgraph relationships.
Subgraph prediction function. The idea of order embeddings is illustrated in Figure 1. Order
embeddings ensure that the subgraph relations are properly reflected in the embedding space: if Gq is
a subgraph of Gu , then the embedding zq of node q has to be to the “lower-left” of u’s embedding zu :

z_q[i] \le z_u[i] \;\; \forall\, i = 1, \dots, D \quad \text{iff} \quad G_q \subseteq G_u \qquad (1)
where D is the embedding dimension. We thus train the GNN that produces the embeddings using
the max margin loss:
L(z_q, z_u) = \sum_{(z_q, z_u) \in P} E(z_q, z_u) \;+\; \sum_{(z_q, z_u) \in N} \max\{0, \alpha - E(z_q, z_u)\}, \quad \text{where} \qquad (2)

E(z_q, z_u) = \| \max\{0, z_q - z_u\} \|_2^2 \qquad (3)


Here P denotes the set of positive examples in the minibatch, where the neighborhood of q is a subgraph
of the neighborhood of u, and N denotes the set of negative examples. A violation of the subgraph
constraint happens when in any dimension i, zq [i] > zu [i], and E(zq , zu ) represents its magnitude.
For positive examples P , E(zq , zu ) is minimized when all the elements in the query node embedding
zq are less than the corresponding elements in target node embedding zu . For negative pairs (zq , zu )
the amount of violation E(zq , zu ) should be at least α, in order to have zero loss.
We further use a threshold t on the violation E(zq, zu) to decide whether the query is a
subgraph of the target. The subgraph prediction function f is defined as:

1 iff E(zq , zu ) < t
f (zq , zu ) = (4)
0 otherwise
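A minimal PyTorch sketch of Equations 2-4; the margin alpha and threshold t are hyperparameters, and the function names are illustrative:

import torch

def violation(z_q: torch.Tensor, z_u: torch.Tensor) -> torch.Tensor:
    """E(z_q, z_u) = ||max{0, z_q - z_u}||_2^2  (Equation 3)."""
    return torch.clamp(z_q - z_u, min=0).pow(2).sum(dim=-1)

def order_embedding_loss(pos_pairs, neg_pairs, alpha: float) -> torch.Tensor:
    """Max-margin loss of Equation 2, summed over a minibatch.

    pos_pairs / neg_pairs: tuples (Z_q, Z_u) of [batch, D] embedding tensors."""
    pos_loss = violation(*pos_pairs).sum()                              # drive E -> 0 on positives
    neg_loss = torch.clamp(alpha - violation(*neg_pairs), min=0).sum()  # require E >= alpha on negatives
    return pos_loss + neg_loss

def predict_subgraph(z_q: torch.Tensor, z_u: torch.Tensor, t: float) -> torch.Tensor:
    """Subgraph prediction function f of Equation 4: 1 iff E(z_q, z_u) < t."""
    return (violation(z_q, z_u) < t).long()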

2.4. Matching Nodes via Voting


At query time, our goal is to predict if query node q ∈ GQ and target node u ∈ GT have
subgraph-isomorphic k-hop neighborhoods Gq and Gu (Problem 2). A simple solution is to use the
subgraph prediction function f (zq , zu ) to predict the subgraph relationship between Gq and Gu .
Matching via voting. We further propose a voting method that improves the accuracy of matching a
pair of anchor nodes based on their neighboring nodes. Our insight is that matching a pair of anchor


nodes imposes constraints on the neighborhood structure of the pair. Suppose we want to predict if
node q ∈ GQ and node u ∈ GT match. We have (proof in Appendix C):
Observation 1. Let N^(k)(u) denote the k-hop network neighborhood of node u. Then, if node q ∈ GQ
and node u ∈ GT match, then for all nodes i ∈ N^(k)(q), there exists a node j ∈ N^(l)(u), l ≤ k, such that node i
and node j match.
Based on this observation, we propose a voting-based inference method. Suppose that node q ∈ GQ
matches node u ∈ GT. We check whether all neighbors of node q satisfy Observation 1, i.e., each neighbor
of q has a match to a neighbor of u, as summarized in Algorithm 2 in the Appendix.
2.5. Training NeuroMatch
The training of subgraph matching consists of the following components: (1) Sample a training query
GQ from the target graph GT. (2) Sample a node q and its neighborhood Gq in GQ and find q's corresponding
node u and its Gu ⊆ GT. (3) Generate a negative example w and its Gw ⊆ GT. (4) Compute node
embeddings for q, u, w with the GNN, and the loss in Equation 2 for backpropagation. We now detail
these components of the training process.
Training data. To achieve high generalization performance on unseen queries, we train the network
with randomly generated query graphs. To sample a positive pair, we sample Gu ⊆ GT and
Gq ⊆ Gu. To sample Gu, we first select a node u ∈ GT and perform a random breadth-first
traversal (BFS) of the graph. The sampler traverses each edge in the BFS with a fixed probability. We
then sample Gq by performing the same random BFS traversal on Gu starting at u, and treat u as the
anchor in Gq, which ensures the existence of a subgraph isomorphism mapping that maps q to u.
Given a positive pair (Gq, Gu), we generate 2 types of negative examples. The first type of negative
examples is created by randomly choosing different nodes u and q in GT and performing random
traversals. The second type of negatives is generated by perturbing the query to make it no longer a
subgraph of the target graph, which is a more challenging case for the model to distinguish.
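A sketch of this positive-pair sampling in networkx; the traversal probability p and the size cap are illustrative hyperparameters, not values from the paper:

import random
import networkx as nx

def random_bfs_sample(G: nx.Graph, start, p: float = 0.5, max_nodes: int = 30) -> nx.Graph:
    """Random breadth-first traversal from `start`: each frontier edge is followed
    with a fixed probability p, so the sampled subgraph stays connected."""
    visited, frontier = {start}, [start]
    while frontier and len(visited) < max_nodes:
        next_frontier = []
        for v in frontier:
            for w in G.neighbors(v):
                if w not in visited and random.random() < p:
                    visited.add(w)
                    next_frontier.append(w)
        frontier = next_frontier
    return G.subgraph(visited).copy()

def sample_positive_pair(G_T: nx.Graph):
    """Sample (G_q, G_u, u) with G_q guaranteed to be a subgraph of G_u,
    where u acts as the anchor node of both neighborhoods."""
    u = random.choice(list(G_T.nodes))
    G_u = random_bfs_sample(G_T, u)
    G_q = random_bfs_sample(G_u, u)
    return G_q, G_u, u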
Test data. To demonstrate generalization, we use 3 different sampling strategies to generate test
queries. Aside from the aforementioned random BFS traversal, we further use random walk sampling,
performing a random walk with restart at u, and the degree-weighted sampling strategy used in the
motif mining algorithm MFinder (Cho et al., 2013). Experiments demonstrate that NeuroMatch can
generalize to test queries with different sampling strategies.
Curriculum. We introduce a curriculum training scheme that improves performance. We first train
the model on a small number of easy queries and then train on successively more complex queries
with increased batch size. Initially the model is trained with a single 1-hop query. Each time the
training performance plateaus, the model samples larger queries. Figure 2 shows examples of queries
at each curriculum level. The complexity of queries increases as training proceeds.

Figure 2: Example sampled queries GQ at each level of the curriculum in the MSRC_21 dataset. The
diameter and number of nodes increase as the curriculum level advances.
2.6. Runtime Complexity
The embedding stage uses GNNs to train embeddings to obey the subgraph constraint. Its complexity
is O(K(|ET | + |EQ |)), where K is the number of GNN layers. In the query stage, to solve Problem 1
we need to compute a total of O(|VT||VQ|) scores. This quadratic time complexity allows NeuroMatch
to scale to larger datasets, whereas the complexity of the exact methods grows exponentially with size.
In many use cases, the target graphs are available in advance, but we need to solve for new incoming
unseen queries. Prior to inference time, the embeddings for all nodes in the target graph can be
pre-computed with complexity O(K|ET |). For a new query, its node embeddings can be computed
in O(K|EQ |) time, which is much faster since queries are smaller. With order embedding, we do
not need additional neural network modules at query stage and simply compute the order relations
between query node embeddings and the pre-computed node embeddings in the target graph.
3. Experiments
To investigate the effectiveness of NeuroMatch, we compare its runtime and performance with a
range of existing popular subgraph matching methods. We evaluate performance on synthetic datasets


Dataset                   SYNTHETIC    COX2         DD           MSRC_21      FIRSTMMDB    PPI          WORDNET18

Baselines:
GMNN (Xu et al., 2019)    73.6 ± 1.1   75.9 ± 0.8   80.6 ± 1.5   82.5 ± 1.7   81.5 ± 2.9   72.0 ± 1.9   80.3 ± 2.0
RDGCN (Wu et al., 2019)   79.5 ± 1.2   80.1 ± 0.4   81.3 ± 1.2   81.9 ± 1.9   82.4 ± 3.4   76.8 ± 2.2   79.6 ± 2.5
Ablations:
NO CURRICULUM             82.4 ± 0.6   95.0 ± 1.6   96.7 ± 2.1   89.2 ± 2.0   87.2 ± 6.8   82.6 ± 1.7   81.4 ± 2.2
NM-MLP                    88.7 ± 0.5   95.4 ± 1.6   97.1 ± 0.3   93.5 ± 1.0   92.9 ± 4.3   85.5 ± 1.4   86.3 ± 0.9
NM-NTN                    89.1 ± 1.9   89.3 ± 0.9   96.4 ± 1.4   94.7 ± 3.2   89.6 ± 1.1   85.7 ± 2.4   85.0 ± 1.1
NM-BOX                    84.5 ± 2.1   88.5 ± 1.2   91.4 ± 0.5   90.8 ± 1.4   93.1 ± 1.7   77.4 ± 3.1   82.7 ± 2.5
NEUROMATCH                93.5 ± 1.1   97.2 ± 0.4   97.9 ± 1.3   96.1 ± 0.2   95.5 ± 2.1   89.9 ± 1.9   89.3 ± 2.4
% IMPROVEMENT             4.9          1.9          0.8          1.5          2.6          4.9          3.4

Table 1: Given a neighborhood Gu of u and a query GQ containing q, make a binary prediction of
whether Gq is a subgraph of Gu where node q corresponds to u. We report AUROC (unit: 0.01).
NeuroMatch performs best, with median AUROC 95.5, 20% higher than the neural baselines.
to probe data efficiency and generalization ability, as well as a variety of real-world datasets spanning
many fields to evaluate whether the model can be adapted to real-world graph structures.
3.1. Datasets and Baselines
Synthetic dataset. We use a synthetic dataset including Erdős–Rényi (ER) random graphs (Erdős &
Rényi, 1960) and extended Barabási–Albert graphs (Albert & Barabási, 2000). At test time, we evaluate on
test query graphs that were not seen during training. See Appendix E for dataset details, where we
also show experiments to transfer the learned model to unseen real dataset without fine-tuning.
Real-world datasets. We use a variety of real-world datasets from different domains. We evaluate on
graph benchmarks in chemistry (COX2), biology (ENZYMES, DD, PPI networks), image processing
(MSRC_21), point clouds (FIRSTMMDB), and knowledge graphs (WORDNET18). We do not include
node features for the PPI networks since the goal is to match various protein interaction patterns without
considering the identity of proteins. WORDNET18 contains no node features, but we use its edge
type information in matching. For all other datasets, we require that the matching takes categorical
node features into account. Refer to the Appendix for statistics of all datasets.
Baselines. We first consider popular existing combinatorial approaches. We adopt the most com-
monly used efficient methods: the VF2 algorithm (Cordella et al., 2004) and the RI algorithm (Bonnici et al.,
2013). We further consider popular approximate matching algorithms FastPFP (Lu et al., 2012) and
IsoRankN (Liao et al., 2009), and compare with neural approaches in terms of accuracy and runtime.
Recent developments in GNNs have not been applied to subgraph matching. We therefore adapt two
recent state-of-the-art methods for graph matching, Graph Matching Neural Networks (GMNN) (Xu
et al., 2019) and RDGCN (Wu et al., 2019), by changing their objective from predicting whether two
graphs have a match to predicting the subgraph relationship. Both methods are computationally more
expensive than NeuroMatch due to cross-graph attention between nodes.
Training details. We use the epoch with the best validation result for testing. See Appendix D for
hardware and hyperparameter configurations.
3.2. Results
(1) Matching individual node network neighborhoods (Problem 2). Table 1 summarizes the
AUROC results for predicting the subgraph relation of Problem 2: is node q's k-hop neighborhood Gq a
subgraph of u's neighborhood Gu. This is a subroutine for determining whether a query is present in a large
target graph. The number of pairs Gq , Gu with positive labels is equal to the number of pairs with
negative labels. We observe that NeuroMatch with order embeddings obtains, on average, a 20%
improvement over neural baselines. This benefit is a result of avoiding the loss of information when
pooling node embeddings and a better inductive bias stemming from order embeddings.
(2) Ablation studies. Although learning subgraph matching has not been extensively studied, we
explore alternatives to components of NeuroMatch. We compare with the following variants:
• NO CURRICULUM: Same as NEUROMATCH but with no curriculum training scheme.
• NM-MLP: uses an MLP and cross-entropy loss to replace the order embedding loss.
• NM-NTN: uses a Neural Tensor Network (Socher et al., 2013) and cross-entropy loss to replace the
order embedding loss.
• NM-BOX: uses a box embedding loss (Vilnis et al., 2018) to replace the order embedding loss.


Dataset       COX2         DD           MSRC_21      FIRSTMMDB    ENZYMES      SYNTHETIC    Avg Runtime

ISORANKN      72.1 ± 2.5   61.2 ± 1.3   67.0 ± 2.0   77.0 ± 2.3   50.4 ± 1.4   62.7 ± 3.4   1.45 ± 0.04
FASTPFP       63.2 ± 3.8   72.9 ± 1.1   83.5 ± 1.5   83.0 ± 1.5   76.6 ± 1.9   77.0 ± 2.0   0.56 ± 0.01
NM-MLP        73.8 ± 3.7   87.8 ± 1.5   74.2 ± 1.0   88.9 ± 0.9   87.9 ± 1.0   92.1 ± 0.5   1.29 ± 0.10
NEUROMATCH    89.9 ± 1.1   95.7 ± 0.4   84.5 ± 1.5   91.9 ± 1.0   92.9 ± 1.2   75.2 ± 1.8   0.90 ± 0.09

Table 2: Given a query GQ and a target graph GT from a dataset, make a binary prediction of whether
GQ is a subgraph of GT (the decision problem of subgraph isomorphism), reported in AUROC (unit: 0.01).
As shown in Table 1, box embeddings cannot guarantee an intersection, i.e., common subgraphs, between
two graphs, while the variable sizes of the target graphs make the neural tensor network (NTN) variant hard
to learn. NeuroMatch outperforms all of the variants.
We additionally observe that the learning curriculum is crucial to the performance of learning the
subgraph relationships. The use of the curriculum increases the performance by an average of 6%,
while significantly reducing the performance variance and increasing the convergence speed. This
benefit is due to the compositional nature of the subgraph matching task.
(3) Matching query to target graph (Problem 1). Given a target GT, we randomly sample a query
GQ centered at q. The goal is to answer the decision problem of whether GQ is a subgraph of GT.
Unlike the previous tasks, it requires prediction of subgraph relations between GQ and neighborhoods
Gu for all u ∈ GT. We perform this task by traversing over all nodes in the query graph, and all nodes in
the target graph as anchor nodes, and output an alignment matrix A of dimension |VT|-by-|VQ|, where
Ai,j denotes the matching score f(zi, zj), as illustrated in Algorithm 1. The performance trend
of Table 1 also holds here for Problem 1. We further compare NeuroMatch with high-performing
heuristic methods, FastPFP and IsoRankN, and show an average of 18.4% improvement in AUROC
over all datasets. Appendix D contains additional implementation details.
Additionally, we make the task harder by sampling test queries with a different sampling strategy. At
training time, the query is randomly sampled with the random BFS procedure, whereas at test time
the query is randomly sampled using degree-weighted sampling (see Section 2.5).
We further compute the statistics of query graphs and target graphs (in Appendix E). On average
across all datasets, the size of query is 51% of the size of the target graphs, indicating that the
model is learning the problem of subgraph matching in a data-driven way, rather than learning graph
isomorphism, which previous works focus on.
(4) Generalization. We further conduct experiments to demonstrate the generalization of NeuroMatch.
Firstly, we investigate model generalization to unseen subgraph queries sampled from different
distributions. We consider 3 sampling strategies: random BFS, degree-weighted sampling and random
walk sampling (see Section 2.5). Table 3 shows the performance of NeuroMatch when trained with
examples sampled with one strategy (rows), and tested with examples sampled with another strategy
(columns).

                BFS      MFinder   Random Walks
BFS             98.79    98.58     98.38
MFinder         93.09    96.34     96.07
Random Walks    95.65    97.21     97.53

Table 3: Generalization to new sampling methods for the MSRC dataset. Performance measured in
AUROC (unit: 0.01).
We observe that NeuroMatch can generalize to queries generated with different sampling strategies,
without much performance change. Among strategies considered, random BFS is the most robust
sampling strategy for training.
Secondly, we investigate whether the model is able to generalize to perform matching on pairs of
query and target that are from a variety of datasets, while only training on a synthetic dataset. In
Appendix F, we similarly find that NeuroMatch is robust to test queries sampled from different
real-world datasets.
Order embedding space analysis. Figure 3 shows the t-SNE embedding of the learned order
embedding space. The yellow points correspond to embeddings of larger graphs; the purple points
correspond to embeddings of smaller graphs. Red points are example embeddings for which we also
visualize the corresponding graphs. We observe that the order constraints are well-preserved. We
further conduct an experiment by randomly sampling 2 graphs in the dataset and testing their subgraph
relationship. NeuroMatch achieves 0.61 average precision, compared
to 0.35 with the NM-MLP baseline.
Comparison with exact methods.


Although exact methods always return the correct answer, they take exponential time in the worst case.
We run the exact methods VF2 and RI and record the average runtime, using exactly the same test
queries and targets as in Table 2. If the subgraph matching runs for more than 10 minutes, it is deemed
unsuccessful. We show in Appendix F the runtime comparison, showing a 100x speedup with
NeuroMatch, and the figure of the success rate of the baselines, which drops below 60% when the
query size is more than 30. As the query size grows, the runtime of the exact methods grows
exponentially, whereas the runtime of NeuroMatch grows linearly. Although VF2 and RI are exact
algorithms, NeuroMatch shows the potential of learning to predict subgraph relationships in
applications requiring high-throughput inference.

Figure 3: t-SNE visualization of the order embedding for a subset of subgraphs sampled from the
ENZYMES dataset. As seen by the examples to the right, the order constraints are well-preserved.
Graphs are colored by number of edges.
Additionally, NeuroMatch is also 10 times more efficient than the other baselines such as NM-MLP
and GMNN due to its efficient inference using order embedding properties.
4. Limitations
NeuroMatch provides a novel approach to demonstrate the promising potential of GNNs and geomet-
ric embeddings to make predictions of subgraph relationships. However, future work is needed in
exploring neural approaches to this NP-Complete problem. Previous works (Xu et al., 2018) have
identified expressive power limitations of GNNs in terms of the WL graph isomorphism test. In
NeuroMatch, we alleviate the limitation by distinguishing the anchor node via node features
(illustrated in Appendix H). Since NeuroMatch does not rely on a specific GNN backbone, future
work on more expressive GNNs can be directly applied to NeuroMatch. We hope that NeuroMatch
opens a new direction in investigating subgraph matching as a potential application and benchmark in
graph representation learning.
5. Related Work
Subgraph matching algorithms. Determining if a query is a subgraph of a target graph requires
comparison of their structure and features (Gallagher, 2006). Conventional algorithms (Ullmann,
1976) focus on graph structures only. Other works (Aleman-Meza et al., 2005; Coffman et al., 2004)
also consider categorical node features. Our NeuroMatch model can operate under both settings.
Approximate solutions to the problem have also been proposed (Christmas et al., 1995; Umeyama,
1988). NeuroMatch is related in the sense that it is an approximate algorithm using machine learning.
We further provide detailed comparison with a survey of heuristic methods (Ribeiro et al., 2019).
Neural graph matching. Earlier work (Scarselli et al., 2008) has demonstrated the potential of GNNs
in small-scale subgraph matching, showing the advantage of GNNs over feed-forward neural networks.
Recently, graph neural networks (Kipf & Welling, 2017; Hamilton et al., 2017; Xu et al., 2018) have
been proposed for graph isomorphism (Bai et al., 2019; Li et al., 2019; Guo et al., 2018) and have
achieved state-of-the-art results (Zhang & Lee, 2019; Wang et al., 2019; Xu et al., 2019). However,
these methods cannot be directly employed in subgraph isomorphism since there is no one-to-one
correspondence between nodes in query and target graphs. We demonstrate that our contributions in
using node-based representations and an order embedding space can significantly outperform applications
of graph matching methods in the subgraph isomorphism setting. Additionally, recent works (Bai
et al., 2018; Fey et al., 2020) provide solutions to compute discrete matching correspondences from
the neural prediction of isomorphism mapping and are complementary to our work.
6. Conclusion
In this paper we presented a neural subgraph matching algorithm, NeuroMatch, that uses graph
neural networks and geometric embeddings to learn subgraph relationships. We observe that order
embeddings are a natural fit for modeling subgraph relationships in the embedding space. NeuroMatch
outperforms adaptations of existing graph-isomorphism-related architectures and shows advantages and
potential compared to heuristic algorithms.


References
Réka Albert and Albert-László Barabási. Topology of evolving networks: local events and universality.
Physical review letters, 85(24):5234, 2000.
Boanerges Aleman-Meza, Christian Halaschek-Wiener, Satya Sanket Sahoo, Amit Sheth, and I Budak
Arpinar. Template based semantic similarity for security applications. In International Conference
on Intelligence and Security Informatics. Springer, 2005.
Noga Alon, Phuong Dao, Iman Hajirasouliha, Fereydoun Hormozdiari, and S Cenk Sahinalp.
Biomolecular network motif counting and discovery by color coding. Bioinformatics, 2008.
Yunsheng Bai, Hao Ding, Yizhou Sun, and Wei Wang. Convolutional set matching for graph similarity.
In NeurIPS, 2018.
Yunsheng Bai, Hao Ding, Song Bian, Ting Chen, Yizhou Sun, and Wei Wang. Simgnn: A neural
network approach to fast graph similarity computation. In WSDM. ACM, 2019.
Vincenzo Bonnici, Rosalba Giugno, Alfredo Pulvirenti, Dennis Shasha, and Alfredo Ferro. A
subgraph isomorphism algorithm and its application to biochemical data. BMC bioinformatics,
2013.
Zhengdao Chen, Soledad Villar, Lei Chen, and Joan Bruna. On the equivalence between graph
isomorphism testing and function approximation with gnns. In NeurIPS, 2019.
Young-Rae Cho, Marco Mina, Yanxin Lu, Nayoung Kwon, and Pietro H Guzzi. M-finder: Uncovering
functionally associated proteins from interactome data integrated with go annotations. Proteome
science, 2013.
William J. Christmas, Josef Kittler, and Maria Petrou. Structural matching in computer vision using
probabilistic relaxation. PAMI, 1995.
Thayne Coffman, Seth Greenblatt, and Sherry Marcus. Graph-based technologies for intelligence
analysis. Communications of the ACM, 2004.
Luigi P Cordella, Pasquale Foggia, Carlo Sansone, and Mario Vento. A (sub) graph isomorphism
algorithm for matching large graphs. PAMI, 2004.
Hanjun Dai, Chengtao Li, Connor Coley, Bo Dai, and Le Song. Retrosynthesis prediction with
conditional graph logic network. In NeurIPS, 2019.
Paul Erdős and Alfréd Rényi. On the evolution of random graphs. Publ. Math. Inst. Hung. Acad. Sci,
1960.
Matthias Fey, Jan E. Lenssen, Christopher Morris, Jonathan Masci, and Nils M. Kriege. Deep graph
matching consensus. In ICLR, 2020.
Brian Gallagher. Matching structure and semantics: A survey on graph-based pattern matching. In
AAAI Fall Symposium, 2006.
Dedre Gentner. Structure-mapping: A theoretical framework for analogy. Cognitive science, 1983.
Michelle Guo, Edward Chou, De-An Huang, Shuran Song, Serena Yeung, and Li Fei-Fei. Neural
graph matching networks for fewshot 3d action recognition. In ECCV, 2018.
Will Hamilton, Zhitao Ying, and Jure Leskovec. Inductive representation learning on large graphs. In
NeurIPS, 2017.
Thomas N Kipf and Max Welling. Semi-supervised classification with graph convolutional networks.
In ICLR, 2017.
Yujia Li, Chenjie Gu, Thomas Dullien, Oriol Vinyals, and Pushmeet Kohli. Graph matching networks
for learning the similarity of graph structured objects. In ICML, 2019.
Chung-Shou Liao, Kanghao Lu, Michael Baym, Rohit Singh, and Bonnie Berger. Isorankn: spectral
methods for global alignment of multiple protein networks. Bioinformatics, 2009.


Yao Lu, Kaizhu Huang, and Cheng-Lin Liu. A fast projected fixed-point algorithm for large graph
matching. arXiv preprint arXiv:1207.1114, 2012.
Brian McFee and Gert Lanckriet. Partial order embedding with multiple kernels. In ICML. ACM,
2009.
Christopher Morris, Martin Ritzert, Matthias Fey, William L Hamilton, Jan Eric Lenssen, Gaurav
Rattan, and Martin Grohe. Weisfeiler and leman go neural: Higher-order graph neural networks.
In AAAI, pp. 4602–4609, 2019.
Eric Plotnick. Concept mapping: A graphical system for understanding the relationship between
concepts. ERIC Clearinghouse on Information and Technology Syracuse, NY, 1997.
John W Raymond, Eleanor J Gardiner, and Peter Willett. Heuristics for similarity searching of
chemical graphs using a maximum common edge subgraph algorithm. Journal of chemical
information and computer sciences, 2002.
Pedro Ribeiro, Pedro Paredes, Miguel EP Silva, David Aparicio, and Fernando Silva. A survey on
subgraph counting: concepts, algorithms and applications to network motifs and graphlets. arXiv
preprint arXiv:1910.13011, 2019.
Franco Scarselli, Marco Gori, Ah Chung Tsoi, Markus Hagenbuchner, and Gabriele Monfardini. The
graph neural network model. IEEE Transactions on Neural Networks, 2008.
Richard Socher, Danqi Chen, Christopher D Manning, and Andrew Ng. Reasoning with neural tensor
networks for knowledge base completion. In NeurIPS, 2013.
Zhao Sun, Hongzhi Wang, Haixun Wang, Bin Shao, and Jianzhong Li. Efficient subgraph matching
on billion node graphs. Proceedings of the VLDB Endowment, 2012.
Julian R Ullmann. An algorithm for subgraph isomorphism. Journal of the ACM (JACM), 1976.
Shinji Umeyama. An eigendecomposition approach to weighted graph matching problems. PAMI,
1988.
Luke Vilnis, Xiang Li, Shikhar Murty, and Andrew McCallum. Probabilistic embedding of knowledge
graphs with box lattice measures. ACL, 2018.
Runzhong Wang, Junchi Yan, and Xiaokang Yang. Learning combinatorial embedding networks for
deep graph matching. In ICCV, 2019.
John Winn, Antonio Criminisi, and Thomas Minka. Object categorization by learned universal visual
dictionary. In ICCV. IEEE, 2005.
Yuting Wu, Xiao Liu, Yansong Feng, Zheng Wang, Rui Yan, and Dongyan Zhao. Relation-aware
entity alignment for heterogeneous knowledge graphs. 2019.
Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. How powerful are graph neural
networks? ICLR, 2018.
Kun Xu, Liwei Wang, Mo Yu, Yansong Feng, Yan Song, Zhiguo Wang, and Dong Yu. Cross-lingual
knowledge graph alignment via graph matching neural network. 2019.
Qingwu Yang and Sing-Hoi Sze. Path matching and graph matching in biological networks. Journal
of Computational Biology, 2007.
Zhen Zhang and Wee Sun Lee. Deep graphical feature learning for the feature matching problem. In
ICCV, 2019.


A. Proofs of Subgraph Properties


The paper introduces the following observations that justify the use of order embeddings in subgraph
matching.
Transitivity. Suppose that G1 is a subgraph of G2 with bijection f mapping all nodes from G1 to a
subset of nodes in G2 , and G2 is a subgraph of G3 with bijection g. Let v1 , v2 , v3 be anchor nodes
of G1, G2, G3 respectively. By the definition of anchored subgraph, f(v1) = v2 and g(v2) = v3. Then
the composition g ◦ f is a bijection. Moreover, (g ◦ f)(v1) = g(v2) = v3. Therefore G1 is a
subgraph of G3, which establishes the transitivity property.
This corresponds to the transitivity of order embedding.
Anti-symmetry. Suppose that G1 is a subgraph of G2 with bijection f , and G2 is a subgraph of G1
with bijection g. Let |V1 | and |V2 | be the number of nodes in G1 and G2 respectively. By definition of
subgraph isomorphism, G1 is a subgraph of G2 implies that |V1 | ≤ |V2 |. Similarly, G2 is a subgraph
of G1 implies |V2 | ≤ |V1 |. Hence |V1 | = |V2 |. The mapping between all nodes in G1 and G2 is
bijective. By definition of isomorphism, G1 and G2 are graph-isomorphic.
This corresponds to the anti-symmetry of order embedding.
Intersection. By definition, if G3 is a common subgraph of G1 and G2, then G3 is a subgraph of both G1
and G2. Since a trivial node is a subgraph of any graph, there is always a non-empty intersection set
between two graphs.
Correspondingly, if z3 ⪯ z1 and z3 ⪯ z2, then z3 ⪯ min{z1, z2}. Here min denotes the element-wise
minimum of two embeddings. Note that the order embeddings z1 and z2 are positive, and
therefore min{z1, z2} is another valid order embedding, corresponding to the non-empty intersection
set between two graphs.
Note that this paper assumes the frequent motifs are connected graphs, and thus it also assumes that
all neighborhoods in a given dataset are connected and contain at least 2 nodes (an edge). This is a
reasonable assumption since we can remove isolated nodes from the datasets, as connected motifs of
size k (k > 1) can never contain isolated nodes. In this case, the trivial intersection corresponds to a
graph of 2 nodes and 1 edge.
For all datasets, we randomly sample connected subgraph queries as test sets, with diameter less than
8, a mild assumption since most of the graph datasets have diameter less than 8.
B. Order Embedding Composition
We can show that the order constraints in Equation 1 hold under the composition of multiple message
passing layers of the GNN, assuming simple GNN models such as those in the papers “Simplifying Graph
Convolutional Networks” and “Scalable Inception Graph Networks”.
Suppose that we use a k-layer GNN to encode nodes u and v in the search and query graphs
respectively. If the k-hop neighborhood of u is a subgraph of the k-hop neighborhood of v, then
∀s ∈ Nu, ∃t ∈ Nv such that the (k−1)-hop neighborhood of s must be a subgraph of the (k−1)-hop
neighborhood of t. Neighborhoods of u's neighbors are subgraphs of a subset of the (k−1)-hop
neighborhoods of v's neighbors.
Consequently, we can guarantee the following observation with order embeddings:
Observation 2. Suppose that all GNN embeddings at layer k − 1 satisfy order constraints after
transformation. Then when using sum-based neighborhood aggregation, the GNN embeddings at
layer k also satisfy the order constraints.

After applying linear transformations and non-linearities in the GNN at layer k − 1, if the order
embeddings of all neighbors of node v are no greater than those of the corresponding matched nodes
in the target graph (i.e., they satisfy the order constraint), then when summing the order embeddings
of neighbors to compute embedding of v at layer k, it is guaranteed that node v also satisfies the
order constraint at layer k. This corresponds to the property of composition of subgraphs into larger
subgraphs.
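The argument can be written out in one line (a sketch: ⪯ denotes the element-wise order, and π is an injective matching from the neighbors of the query node v to the neighbors of the matched target node u with z_s ⪯ z_{π(s)}; non-negativity of the embeddings is assumed, as in Appendix A):

\sum_{s \in N(v)} z_s \;\preceq\; \sum_{s \in N(v)} z_{\pi(s)} \;\preceq\; \sum_{t \in N(u)} z_t

The last step holds because the neighbors of u that are not hit by π contribute only non-negative terms, so sum aggregation preserves the order constraint.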
In other GNN architectures, such properties do not necessarily hold, due to the presence of
transformations and non-linearities at each convolution layer. However, this provides another alignment
between the order embedding objective and the subgraph matching task in terms of growing
neighborhoods, and motivates the use of curriculum learning for this task.

Model                                              Accuracy
SAGE (2-layer, 32-dim, dropout=0.2)                77.5
SAGE (6-layer, 32-dim, dropout=0.2)                85.3
SAGE (8-layer, 64-dim, dropout=0.2)                86.3
GCN (6-layer, 64-dim, dropout=0.2)                 69.9
GCN (9-layer, 128-dim, dropout=0.2)                82.3
GIN (4-layer, 32-dim, dropout=0.2)                 81.0
GIN (4-layer, 64-dim, dropout=0)                   87.0
GIN (8-layer, 64-dim, dropout=0)                   88.4
SAGE (4-layer, 64-dim, dropout=0)                  87.6
SAGE (8-layer, 64-dim, dropout=0)                  89.4
SAGE (12-layer, 64-dim, dropout=0)                 90.5
SAGE (8-layer, 64-dim, dropout=0, skip-layer)      91.5

Table 4: The accuracy (unit: 0.01) for matching on the ENZYMES dataset for different model
configurations.
C. Voting Procedure

The voting procedure is used to improve the certainty of matched pairs by considering the presence of
nearby matched pairs in the neighborhoods of the matched pairs. The method is motivated by the
following observation.

Observation 3. Let N^(l) denote the l-hop neighborhood. Then, if node q ∈ GQ and node u ∈ GT
match, then for all nodes i ∈ N^(k)(q), there exists a node j ∈ N^(l)(u), l ≤ k, such that node i and
node j match.

Since the query graph GQ is a subgraph of the target graph GT, all paths in GQ have corresponding
paths in GT. Hence the shortest distance of a node i ∈ N^(k)(q) to q is at most the shortest distance
to u of the corresponding node j ∈ N^(l)(u) in GT, where j is defined by the subgraph isomorphism
mapping. However, the shortest paths are not necessarily of equal lengths, since in GT there might be
additional short-cuts from j to u that do not exist in GQ.

Algorithm 2: NeuroMatch Voting Algorithm
Input: Query node q ∈ GQ, target node u ∈ GT. Threshold t for the violation below which we
predict a positive subgraph relation between the neighborhoods of q and u.
Output: Whether the node pair matches.
Compute embeddings for the neighbors of q and u within K hops
for hop k ≤ K do
    for node i ∈ N^(k)(q) do
        m = min{E(zi, zj) | ∀j ∈ N^(k)(u)}
        If m > t, return False
return True
D. Training Details and Hyperparameters
All models are trained on a single GeForce RTX 2080 GPU, and both the heuristics and neural models
use an Intel Xeon E7-8890 v3 CPU.
Curriculum training. In each epoch, we iterate over all target graphs in the curriculum and randomly
sample one query per target graph. We lower bound the number of iterations per epoch to 64 for
datasets that are too small. For the E-R dataset, where we generate neighborhoods at random, and
the WN dataset which consists of only a single graph, we use a fixed 64 iterations per epoch. On all
datasets except for the E-R dataset, we used 256 target graphs where possible. At training time, we
enforce a 3:1 negative to positive ratio in the training examples, which is necessary since in reality
there is a heavy skew in the dataset towards negative examples. 10% of the negative examples are
hard negatives; among the remaining 90%, half are negative examples drawn from the same target
graph as the query, and half are negative examples drawn from different target graphs.
The model is trained with a learning rate of 1 × 10−3 using the Adam optimizer. The learning rate
is annealed with a cosine annealer with restarts every 100 epochs. The curriculum starts with 1
target graph with a radius of 1; it is updated every time there are 20 consecutive epochs without an


improvement of more than 0.1. The curriculum update increases the radius of the target graphs by 1
up to a maximum of 4, after which it doubles the number of target graphs for every update up to a
maximum of 256. The dataset is regenerated every 50 epochs.
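A sketch of this schedule as a small Python class; the thresholds follow the description above, while the class and method names are illustrative:

class CurriculumScheduler:
    """Curriculum schedule sketch: advance after 20 epochs without an improvement
    of more than 0.1, first growing the target radius (up to 4), then doubling the
    number of target graphs (up to 256)."""

    def __init__(self):
        self.radius = 1            # radius of sampled target neighborhoods
        self.n_targets = 1         # number of target graphs in the curriculum
        self.best = float("-inf")
        self.stale_epochs = 0

    def step(self, val_metric: float):
        """Call once per epoch with the validation metric."""
        if val_metric > self.best + 0.1:
            self.best = val_metric
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
        if self.stale_epochs >= 20:          # plateau: advance the curriculum
            self.stale_epochs = 0
            if self.radius < 4:
                self.radius += 1
            else:
                self.n_targets = min(self.n_targets * 2, 256)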

Predicted + Predicted −
Positive 68.2 8.3
Negative 70.5 1030.9

Table 5: Average confusion Matrix for matching small queries (size ≤ 7) to all node neighborhoods
in the DD dataset.

Hyperparameters. We performed a comprehensive sweep over hyperparameters used in the model.


Table 4 shows the effect of hyperparameters and GNN models on the performance, using the
NeuroMatch framework. We list the design choices we made that are observed to perform well in
both synthetic and real-world datasets:
• Sum aggregation usually works the best, confirming previous theoretical studies (Xu et al.,
2018). Both the GraphSAGE and GIN architectures we implemented use sum neighborhood
aggregation.
• We observe slight improvement in performance when using LeakyReLU instead of ReLU
for non-linearity.
• Dropout does not have a significant impact on performance.
• Adding structural features, such as node degree, clustering coefficient, and average path
length improves the convergence speed.
Matching query to target graph by aggregating scores. In Problem 1 (Table 2), all methods
must make a binary prediction of whether the query is a subgraph of the target graph based on the
alignment matrix A of scores f (zq , zu ) between all pairs of query and target neighborhoods. In
order to aggregate the scores contained in the alignment matrix, we adopt the simple strategy of
taking the mean of all entries in the matrix, which we found to outperform the commonly-used
Hungarian algorithm on our binary decision task. The exception is FastPFP, which provides a
discrete assignment matrix matching each query node to a target node; for this method, we adopt the
following prediction score:

\frac{\| A_{\text{pred}} - A_{\text{query}} \|_1}{|V_{\text{query}}|^2} \;+\; \frac{X_{\text{pred}}^{\top} X_{\text{query}}}{|V_{\text{query}}|}
where Apred is the adjacency matrix of the predicted matched graph, Aquery is the adjacency matrix
of the query graph, |Vquery | is the number of nodes in the query and Xpred and Xquery are the feature
matrices of nodes in the predicted and query graph, respectively. This score measures the degree to
which the predicted and query graph match in terms of topology and node labels, and is based on
the loss function used in the paper, but is adapted to compare matchings across varying query and
target sizes. In general, we found these aggregation strategies to be effective in our setting containing
diverse query and target sizes, but our method is agnostic to such downstream processing of the
alignment matrix. In particular, the Hungarian algorithm or other alignment resolution algorithms
can still be used with the alignment matrix generated by NeuroMatch, especially when an explicit
matching (rather than a binary subgraph prediction) is desired.
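A small sketch of the two aggregation options just described (mean of the alignment matrix for the binary decision, Hungarian resolution when an explicit matching is desired); the function names are illustrative:

import numpy as np
from scipy.optimize import linear_sum_assignment

def aggregate_mean(A: np.ndarray) -> float:
    """Binary-decision score used here: the mean of all alignment-matrix entries."""
    return float(A.mean())

def resolve_matching(A: np.ndarray) -> np.ndarray:
    """Optionally resolve an explicit one-to-one matching from the alignment matrix
    A (rows: target nodes, columns: query nodes, entries: scores f(z_i, z_j))."""
    rows, cols = linear_sum_assignment(-A)   # Hungarian algorithm, maximizing total score
    return np.stack([rows, cols], axis=1)    # matched (target, query) index pairs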
For baseline hyperparameters: for IsoRankN, we set K = 10, the threshold to 1e-4, alpha to 0.9 and
the maximum vector length to 1000000. For FastPFP, we set lambda to 1, alpha to 0.5, and both
thresholds to 1e-4.
E. Subgraph Matching Data Statistics
E.1. Datasets
Biology and chemistry datasets. COX2 contains 467 graphs of chemical molecules with an average
of 41 nodes and 44 edges each. DD contains 1178 graphs with an average of 284 nodes and 716 edges.


Dataset      ENZYMES   COX2   AIDS   PPI    IMDB-BINARY

IN-DOMAIN    92.9      97.2   94.3   89.9   81.8
TRANSFER     78.9      93.9   92.2   81.0   74.2

Table 6: The AUROC (unit: 0.01) for matching on real datasets, where we either train on the synthetic
dataset and test generalization to the real dataset (TRANSFER), or train directly on the dataset that we
test on (IN-DOMAIN).

Dataset                              COX2   DD     MSRC_21   FIRSTMMDB   ENZYMES   SYNTHETIC

Target size (nodes)                  41.6   30.0   79.6      30.0        35.7      30.2
Target size (edges)                  43.8   61.2   204.0     49.9        67.2      119.1
Query size (nodes)                   22.4   17.8   22.6      17.8        17.4      17.5
Query size (edges)                   23.0   34.4   46.3      27.9        29.4      53.4
Query:target size ratio (nodes, %)   53.8   59.3   28.4      59.3        48.7      57.9

Table 7: Statistics of target and query graphs used in evaluation of Problem 1 (Table 2).
It describes protein structure graphs where nodes are amino acids and edges represent positional
proximity. We use node labels for both of the datasets. PPI dataset contains the protein-protein
interaction graphs for human tissues. It has 24 graphs corresponding to different PPI networks of
different human tissues. In total, there are 56944 nodes and 818716 edges. We do not include node
features for PPI networks since the goal is to match various protein interaction patterns without
considering the identity of proteins.
MSRC_21 is a semantic image processing dataset introduced in (Winn et al., 2005), containing 563
graphs, each representing the graphical model of an image. It has an average of 78 nodes and 199
edges.
FIRSTMMDB is a point cloud dataset containing 3D point clouds for various household objects. It
contains 41 graphs with an average of 1377 nodes and 3074 edges each.
Label imbalance. We performed additional experiments to investigate the confusion matrix for the
DD dataset averaged across test queries. Table 5 shows extreme imbalance (subgraphs are rare).
Matching query to target graph. Table 7 shows the statistics of target and query graphs used to
evaluate performance on Problem 1 (Table 2).
F. Generalization and Runtime
F.1. Pretraining on the Synthetic Dataset
To demonstrate the use and generalizability of the synthetic dataset, we conducted the experiment
where the subgraph matching model is trained only on the synthetic dataset, and is then tested on
real-world datasets. Table 6 shows the generalization performance. The first row corresponds to the
model performance when trained and tested on the same dataset. The second row corresponds to
the model performance when trained on the synthetic dataset and tested on queries sampled from
real-world datasets (listed in each column). Although there is a drop in performance when the model
only sees the synthetic dataset, the model is able to generalize to a diverse set of subgraph matching
scenarios in the biology, chemistry and social network domains, even out-performing some baseline
methods that are specifically trained on the real-world datasets.
However, a shortcoming is that since the synthetic dataset does not contain node features, and real
datasets have varying node feature dimensions, the model is only able to consider the subgraph matching
task that does not take features into account. Incorporating features into transfer learning of subgraph
matching remains an open problem.
G. Comparison to Exact and Approximate Heuristics
G.1. Exact Heuristic Methods
Exact methods such as the VF2 and RI algorithms are guaranteed to make the correct prediction of whether
the query is a subgraph of the target. However, even for relatively small queries (of size 20), matching is


Figure 4: Runtime analysis. Success rate of baseline heuristic matching algorithms (VF2 and RI) for
matching in under 20 seconds. NeuroMatch achieves 100% success rate.
costly and can sometimes take an unexpectedly long time, on the order of hours. As such, these algorithms
are not suitable in online or high-throughput scenarios where efficiency is a priority.
To demonstrate the runtime efficiency, we show in Figure 4 the success rate of the exact methods,
which drops below 60% when the query size is increased to more than 30. In comparison, NeuroMatch
always finishes in under 0.1 seconds.
Table 8 shows the runtime comparison between NeuroMatch and the exact baselines considered
(VF2 and RI). NeuroMatch achieves a 100x speedup compared to these exact methods. Moreover,
since in practice it is feasible to pre-train the NeuroMatch model on synthetic datasets, and optionally
fine-tune for a few epochs on real-world datasets, the training time for the model when given a new
dataset is also negligible. However, such an approach has the limitation that the model cannot account
for categorical node features when performing subgraph matching, since the synthetic dataset does
not contain any node features.

Datasets            E-R     MSRC_21   DD
VF2                 25.9    19.7      22.8
RI                  12.8    7.5       11.0
NEUROMATCH-MLP      0.49    0.48      0.44
NEUROMATCH-ORDER    0.04    0.03      0.03

Table 8: Average runtime (in seconds) comparison between heuristic methods and our method with
query size up to 50. NeuroMatch is about 100x faster than alternatives.
G.2. Approximate Heuristic Methods
Additionally, there have been many works focusing on heuristic methods for motif/subgraph counting
(Ribeiro et al., 2019); notable methods include Rand-ESU, MFinder, Motivo, and ORCA. However,
these works primarily focus on fast enumeration of small motifs, typically of size less than 6. In our
case, the sizes of the target and query are much larger (up to hundreds of nodes), and we do not focus on
enumeration of motifs of a certain size.
A related line of work is graph matching, or finding an explicit (sub)graph isomorphism mapping
between query and target nodes. Methods include convex relaxations (FastPFP, PATH) and spectral
approaches (IsoRankN). Such approaches are inherently heuristic-based due to the hardness of
approximation of the subgraph matching problem.
H. GNN Expressive Power
Previous works (Xu et al., 2018; Morris et al., 2019) have identified limitations of a class of GNNs.
More specifically, GNNs face difficulties when asked to distinguish regular graphs. In this work, we
circumvent the problem by distinguishing the anchor node and other nodes in the neighborhood via
one-hot encoding (See Section 3.2). The idea is explored in a concurrent work “Identity-aware Graph
Neural Networks” (ID-GNNs). It uses Figure 5 to demonstrate the expressive power of ID-GNN,
which distinguishes the anchor node from other nodes. For example, while d-regular graphs such as
3-cycle and 4-cycle graphs have the same GNN computational graphs, their ID-GNN computational
graphs are different, due to the identification of anchor nodes via node features. Such a modification
enables better expressive power than standard message-passing GNNs such as GIN.
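A minimal sketch of such an anchor-distinguishing feature; the function name is illustrative:

import torch

def add_anchor_feature(x: torch.Tensor, anchor_idx: int) -> torch.Tensor:
    """Append a binary indicator column that is 1 only for the anchor node of the
    neighborhood, so the GNN can tell the anchor apart from all other nodes.

    x: [num_nodes, num_features] node feature matrix of a single neighborhood."""
    indicator = torch.zeros(x.size(0), 1, dtype=x.dtype)
    indicator[anchor_idx] = 1.0
    return torch.cat([x, indicator], dim=1)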
A future direction is to investigate the performance of recently proposed more expressive GNNs (Chen
et al., 2019) in the context of subgraph mining. The NeuroMatch framework is general: any GNN
can be used as its encoder component, and it could benefit from more expressive GNNs.


Figure 5: An overview of the ID-GNN model. We consider node, edge and graph level tasks, and
assume nodes do not have additional features. Across all examples, the task requires an embedding
that allows for the differentiation of the label A vs. B nodes in their respective graphs. However,
across all tasks, existing GNNs, regardless of depth, will always assign the same embedding to both
classes of nodes, because for all tasks the computational graphs are identical. In contrast, the colored
computation graphs provided by ID-GNNs allow for clear differentiation between the nodes of class A
and class B, as the colored computation graphs are no longer identical across all tasks.
