Neural Subgraph Matching

Abstract
The subgraph matching problem is crucial in domains ranging from network science and database
systems to biochemistry and cognitive science. However, existing techniques
based on combinatorial matching and integer programming cannot handle matching
problems with both large target and query graphs. Here we propose NeuroMatch, an
accurate, efficient, and robust neural approach to subgraph matching. NeuroMatch
decomposes query and target graphs into small subgraphs and embeds them using
graph neural networks. Trained to capture geometric constraints corresponding
to subgraph relations, NeuroMatch then efficiently performs subgraph matching
directly in the embedding space. Experiments demonstrate that NeuroMatch is
100x faster than existing combinatorial approaches and 18% more accurate than
existing approximate subgraph matching methods.
1. Introduction
Given a query graph and a large target graph, the problem of subgraph isomorphism matching is to determine whether the query graph is isomorphic to a subgraph of the target graph. If the graphs include node and edge features, both the topology and the features must be matched.
Subgraph matching is a crucial problem in many biological, social network, and knowledge graph
applications (Gentner, 1983; Raymond et al., 2002; Yang & Sze, 2007; Dai et al., 2019). For example,
in social networks and biomedical network science, researchers investigate important subgraphs by
counting them in a given network (Alon et al., 2008). In knowledge graphs, common substructures
are extracted by querying them in the larger target graph (Gentner, 1983; Plotnick, 1997).
Traditional approaches make use of combinatorial search algorithms (Cordella et al., 2004; Gallagher,
2006; Ullmann, 1976). However, they do not scale to large problem sizes due to the NP-complete
nature of the problem. Existing efforts to scale up subgraph isomorphism (Sun et al., 2012) make use
of expensive pre-processing to store locations of many small 2-4 node components, and decompose
the queries into these components. Although this allows matching to scale to large target graphs,
the size of the query cannot scale to more than a few tens of nodes before decomposing the query
becomes a hard problem by itself.
Here we propose NeuroMatch, an efficient neural approach for subgraph matching. The core of
NeuroMatch is to decompose the target GT as well as the query GQ into many small overlapping
graphs and use a Graph Neural Network (GNN) to embed the individual graphs such that we can then
quickly determine whether one graph is a subgraph of another.
Our approach works in two stages, an embedding stage and a query stage. At the embedding stage,
we decompose the target graph GT into many sub-networks Gu : For every node u ∈ GT we extract
a k-hop sub-network Gu around u and use a GNN to obtain an embedding for u, capturing the
neighborhood structure of u. At the query stage, we compute the embedding of every node q in the query
graph GQ based on q’s neighborhood. We then compare embeddings of all pairs of nodes q and u to
determine whether GQ is a subgraph of GT .
The key insight that makes NeuroMatch work is to define an embedding space where subgraph rela-
tions are preserved. We observe that subgraph relationships induce a partial ordering over subgraphs.
This observation inspires the use of geometric set embeddings such as order embeddings (McFee &
Lanckriet, 2009), which induce a partial ordering on embeddings with geometric shapes. By ensuring
that the partial ordering on embeddings reflects the ordering on subgraphs, we equip our model with
a powerful set of inductive biases while greatly simplifying the query process. Our work differs from
many previous works (Bai et al., 2019; Li et al., 2019; Xu et al., 2019) that embed graphs into vector
spaces but do not impose geometric structure on the embedding space. In contrast, order embed-
dings have properties that naturally correspond to many properties of subgraph relationships, such as
transitivity, anti-symmetry, and closure under intersection. Enforcing the order embedding constraint both
leads to a well-structured embedding space and also allows us to efficiently navigate it in order to
find subgraphs as well as supergraphs (Fig. 1).
NeuroMatch trains a graph neural network to learn the order embedding, and uses a max-margin
loss to ensure that the subgraph relationships are captured. Furthermore, the embedding stage can
be conducted offline, producing precomputed embeddings for the query stage. The query stage is
extremely efficient due to the geometric constraints imposed at training time, and it only requires
linear time both in the size of the query and the target graphs. Lastly, NeuroMatch can naturally
operate on graphs which include categorical node and edge features, as well as multiple target graphs.
We compare the accuracy and speed of NeuroMatch with state-of-the-art exact and approximate
methods for subgraph matching (Cordella et al., 2004; Bonnici et al., 2013) as well as recent neural
methods for graph matching, which we adapted to the subgraph matching problem. Experiments
show that NeuroMatch runs two orders of magnitude faster than exact combinatorial approaches
and can scale to larger query graphs. Compared to neural graph matching methods, NeuroMatch
achieves an 18% improvement in AUROC for subgraph matching. Furthermore, we demonstrate the generalization of NeuroMatch by testing on queries sampled with different sampling strategies, and by transferring a model trained on synthetic datasets to make subgraph predictions on real datasets.
2. NeuroMatch Architecture

2.1. Problem Setup
We first describe the general problem of subgraph matching. Let GT = (VT , ET ) be a large target
graph where we aim to identify the query graph. Let XT be the associated categorical node features
for all nodes in VT.¹ Let GQ = (VQ, EQ) be a query graph with associated node features XQ. The
goal of a subgraph matching algorithm is to identify the set of all subgraphs H = {H|H ⊆ GT } that
are isomorphic to GQ, that is, there exists a bijection f : VH → VQ such that (f(v), f(u)) ∈ EQ iff (v, u) ∈ EH.
Furthermore, we say GQ is a subgraph of GT if H is non-empty. When node and edge features are
present, the subgraph isomorphism further requires that the bijection f has to match these features.
¹ We consider the case of a single target and query graph, but NeuroMatch applies to any number of target/query graphs. We also assume that the query is connected (otherwise it can easily be split into one query per connected component).
In the literature, subgraph matching commonly refers to two subproblems: node-induced matching
and edge-induced matching. In node-induced matching, the set of possible subgraphs of GT are
restricted to graphs H = (VH , EH ) such that VH ⊆ VT and EH = {(u, v)|u, v ∈ VH , (u, v) ∈ ET }.
Edge-induced matching, in contrast, restricts possible subgraphs by EH ⊆ ET , and contains all nodes
that are incident to edges in EH. In what follows we consider the more general edge-induced matching, although NeuroMatch can be applied to both settings.
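For illustration, both matching variants can be checked exactly on small graphs with networkx's VF2-based matcher; this is a minimal sketch (the toy graphs G_T and G_Q are our own examples, not from the paper):

import networkx as nx
from networkx.algorithms import isomorphism

# Toy target: a 4-cycle with a chord; toy query: a 3-node path.
G_T = nx.cycle_graph(4)
G_T.add_edge(0, 2)
G_Q = nx.path_graph(3)

GM = isomorphism.GraphMatcher(G_T, G_Q)
# Node-induced matching: matched nodes must carry all edges that
# exist between them in G_T.
print(GM.subgraph_is_isomorphic())   # True
# Monomorphism: extra edges among matched target nodes are allowed,
# corresponding to the more permissive edge-subset setting above.
print(GM.subgraph_is_monomorphic())  # True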
In this paper, we investigate the following decision problems of subgraph matching.
Problem 1. Matching query to datasets. Given a target graph GT and a query GQ , predict if GQ
is isomorphic to a subgraph of GT .
We use a neural model to decompose Problem 1 and approximately solve the following neighborhood matching subproblem.
Problem 2. Matching neighborhoods. Given a neighborhood Gu around node u and a query GQ anchored at node q, make a binary prediction of whether Gq is a subgraph of Gu, with node q corresponding to node u.
Here we define an anchor node q ∈ GQ and predict the existence of a subgraph isomorphism mapping that also maps q to u. At prediction time, similar to (Bai et al., 2018), we compute an alignment score that measures how likely GQ anchored at q is to be a subgraph of Gu, for all q ∈ GQ and u ∈ GT, and aggregate the scores to make the final prediction for Problem 1.
2.2. Overview of NeuroMatch
NeuroMatch adopts a two-stage process: an embedding stage, in which GT is decomposed into many small overlapping graphs and each graph is embedded, and a query stage, in which the query graph is compared to the target graph directly in the embedding space, so that no expensive combinatorial search is required.
Embedding stage. In the embedding stage, NeuroMatch decomposes target graph GT into many
small overlapping neighborhoods Gu and uses a graph neural network to embed them. For every
node u in GT, we extract the k-hop neighborhood Gu of u (Figure 1). A GNN then maps node u (that is, the structure of its network neighborhood Gu) into an embedding zu.
Note a subtle but important point: by using a k-layer GNN to embed node u, we are essentially
embedding/capturing the k-hop network neighborhood structure Gu around the center node u. Thus,
embedding u is equivalent to embedding Gu (a k-hop subgraph centered at node u), and by comparing
embeddings of two nodes u and v, we are essentially comparing the structure of subgraphs Gu , Gv .
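As a concrete sketch, the decomposition step of the embedding stage can be written with networkx's ego_graph; the encoder call is a placeholder for the trained GNN (the name embed_graph is hypothetical):

import networkx as nx

def decompose_target(G_T, k):
    """Extract the k-hop neighborhood G_u around every node u of G_T,
    keeping u as the anchor node of its neighborhood."""
    return {u: nx.ego_graph(G_T, u, radius=k) for u in G_T.nodes}

# Embedding stage (run offline): z_u captures the structure of G_u.
# neighborhoods = decompose_target(G_T, k=3)
# embeddings = {u: embed_graph(G_u, anchor=u)
#               for u, G_u in neighborhoods.items()}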
Query stage (Alg. 1). The goal of the query stage is to determine whether GQ is a subgraph of GT
and identify the mapping of nodes of GQ to nodes of GT . However, rather than directly solving
this problem, we develop a fast routine to determine whether Gq is a subgraph of Gu: we design a subgraph prediction function f(zq, zu) that predicts whether GQ anchored at q ∈ GQ is a subgraph of the k-hop neighborhood of node u ∈ GT, which implies that q corresponds to u in the subgraph isomorphism mapping of Problem 2. We thus formulate the subgraph matching problem as a node-level task, using f(zq, zu) to predict the set of nodes u that can be matched to node q (that is, to find the graphs Gu that are supergraphs of Gq). To determine whether GQ is a subgraph of GT, we then aggregate the alignment matrix consisting of f(zq, zu) for all q ∈ GQ and u ∈ GT to make the binary prediction for the decision problem of subgraph matching.
Practical considerations and design choices. The choice of the number of GNN layers, k, depends on the size of the query graphs. We assume k is at least the diameter of the query graph, so that information from all query nodes can propagate to the anchor node. In experiments, we observe that inference via voting can consistently reach peak performance for k = 10, due to the small-world property of many real-world graphs.
NeuroMatch is flexible in terms of the GNN model used for the embedding step. We adopt a variant of GIN (Xu et al., 2018) with skip connections to encode the query graphs and the neighborhoods, which shows performance advantages. Although the expressive power of GIN is bounded by the WL test, our GNN additionally uses a feature that distinguishes the anchor node, which gives it higher expressive power, e.g., in distinguishing d-regular graphs beyond the WL test (see the Limitations section and Appendix I).
2.3. Subgraph Prediction Function f(zq, zu)
Given the embeddings zq and zu of a query anchor node q ∈ GQ and a target node u ∈ GT, the subgraph prediction function decides whether the k-hop neighborhood of q in GQ is isomorphic to a subgraph of the k-hop neighborhood of u in GT. The key point is that the subgraph prediction function makes this decision based only on the embeddings zq and zu (Figure 1).
Capturing subgraph relations in the embedding space. We enforce the embedding geometry to
directly capture subgraph relations. This approach has the additional benefit of ensuring that the
subgraph predictions have negligible cost at the query stage, since we can just compare the coordinates
of two node embeddings. In particular, NeuroMatch satisfies the following properties for subgraph
relations (see Appendix A for proofs of these properties):
• Transitivity: If G1 is a subgraph of G2 and G2 is a subgraph of G3 , then G1 is a subgraph of G3 .
• Anti-symmetry: If G1 is a subgraph of G2, then G2 is a subgraph of G1 iff they are isomorphic.
• Intersection set: The intersection of the set of G1 ’s subgraphs and the set of G2 ’s subgraphs
contains all common subgraphs of G1 and G2 .
• Non-trivial intersection: The intersection of any two graphs contains at least the trivial graph.
We use the notion of set embeddings (McFee & Lanckriet, 2009) to capture these inductive biases.
Common examples include order embeddings and box embeddings. In contrast to Euclidean point
embeddings, set embeddings enjoy properties that correspond naturally to the subgraph relationships.
Subgraph prediction function. The idea of order embeddings is illustrated in Figure 1. Order
embeddings ensure that the subgraph relations are properly reflected in the embedding space: if Gq is
a subgraph of Gu , then the embedding zq of node q has to be to the “lower-left” of u’s embedding zu :
$$z_q[i] \le z_u[i] \;\; \forall i \in \{1, \dots, D\} \iff G_q \subseteq G_u \qquad (1)$$
where D is the embedding dimension. We thus train the GNN that produces the embeddings using
the max margin loss:
$$\mathcal{L}(z_q, z_u) = \sum_{(z_q, z_u) \in P} E(z_q, z_u) + \sum_{(z_q, z_u) \in N} \max\{0, \alpha - E(z_q, z_u)\}, \qquad (2)$$

where $P$ and $N$ denote the sets of positive and negative training pairs, $\alpha$ is the margin, and $E(z_q, z_u) = \lVert \max\{0, z_q - z_u\} \rVert_2^2$ measures the magnitude of the order-constraint violation.
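A minimal PyTorch sketch of this objective; we take E(z_q, z_u) = ||max(0, z_q − z_u)||² as the order-violation penalty, consistent with the constraint in Equation 1 (the function names are ours):

import torch

def violation(z_q, z_u):
    # E(z_q, z_u): squared magnitude of the order-constraint violation;
    # zero exactly when z_q <= z_u holds in every coordinate.
    return torch.clamp(z_q - z_u, min=0).pow(2).sum(dim=-1)

def order_embedding_loss(pos_pairs, neg_pairs, alpha=1.0):
    # Positive pairs (G_q is a subgraph of G_u): drive violation to zero.
    pos = sum(violation(zq, zu) for zq, zu in pos_pairs)
    # Negative pairs: demand a violation of at least the margin alpha.
    neg = sum(torch.clamp(alpha - violation(zq, zu), min=0)
              for zq, zu in neg_pairs)
    return pos + neg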
Matching a pair of nodes imposes constraints on the neighborhood structure of the pair. Suppose we want to predict whether node q ∈ GQ and node u ∈ GT match. We have (proof in Appendix C):

Observation 1. Let N^(k)(u) denote the k-hop network neighborhood of node u. Then, if node q ∈ GQ and node u ∈ GT match, for all nodes i ∈ N^(k)(q) there exists a node j ∈ N^(l)(u) with l ≤ k such that node i and node j match.
Based on this observation, we propose a voting-based inference method. Suppose that node q ∈ GQ
matches node u ∈ GT. We check whether all neighbors of node q satisfy Observation 1, i.e., each neighbor of q has a match among the neighbors of u, as summarized in Algorithm 2 in the Appendix.
2.5. Training NeuroMatch
The training of subgraph matching consists of the following components: (1) sample a training query GQ from the target graph GT; (2) sample a node q and its neighborhood Gq in GQ, and find q's corresponding node u and its neighborhood Gu ⊆ GT; (3) generate a negative example w and its neighborhood Gw ⊆ GT; (4) compute node embeddings for q, u, w with the GNN, and the loss in Equation 2 for backpropagation. We now detail each component of this training process.
Training data. To achieve high generalization performance on unseen queries, we train the network with randomly generated query graphs. To sample a positive pair, we sample Gu ⊆ GT and Gq ⊆ Gu. To sample Gu, we first select a node u ∈ GT and perform a random breadth-first traversal (BFS) of the graph, in which each edge is traversed with a fixed probability. We then sample Gq by performing the same random BFS traversal on Gu starting at u, and treat u as the anchor in Gq, which ensures the existence of a subgraph isomorphism mapping that maps q to u.
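A sketch of this random BFS sampler (networkx; the traversal probability p and the size cap are illustrative parameters):

import random
import networkx as nx

def random_bfs_sample(G, start, p=0.5, max_nodes=30):
    """Grow a connected subgraph of G from `start`, traversing each
    frontier edge independently with probability p."""
    visited, frontier = {start}, [start]
    while frontier and len(visited) < max_nodes:
        node = frontier.pop(0)
        for nbr in G.neighbors(node):
            if nbr not in visited and random.random() < p:
                visited.add(nbr)
                frontier.append(nbr)
    return G.subgraph(visited).copy()

# Positive pair: sample G_u around u in G_T, then G_q inside G_u
# starting again at u, so that u anchors both neighborhoods.
# G_u = random_bfs_sample(G_T, u); G_q = random_bfs_sample(G_u, u)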
Given a positive pair (Gq, Gu), we generate two types of negative examples. The first type is created by randomly choosing different nodes u and q in GT and performing random traversals. The second type is generated by perturbing the query so that it is no longer a subgraph of the target graph, a more challenging case for the model to distinguish.
Test data. To demonstrate generalization, we use 3 different sampling strategies to generate test
queries. Aside from the aforementioned random BFS traversal, we also use random walk sampling, performing random walks with restart at u, and the degree-weighted sampling strategy used in the motif mining algorithm MFinder (Cho et al., 2013). Experiments demonstrate that NeuroMatch can
generalize to test queries with different sampling strategies.
Curriculum. We introduce a curriculum training scheme that improves performance. We first train the model on a small number of easy queries and then train on successively more complex queries with increased batch size. Initially the model is trained with a single 1-hop query. Each time the training performance plateaus, the model samples larger queries. Figure 2 shows examples of queries at each curriculum level. The complexity of queries increases as training proceeds.

Figure 2: Example sampled queries GQ at each level of the curriculum in the MSRC_21 dataset. The diameter and number of nodes increase as the curriculum level advances.
2.6. Runtime Complexity
The embedding stage uses GNNs to train embeddings to obey the subgraph constraint. Its complexity
is O(K(|ET | + |EQ |)), where K is the number of GNN layers. In the query stage, to solve Problem 1
we need to compute a total of O(|VT ||VQ |) scores. This quadratic time complexity allows NeuroMatch to scale to large datasets, whereas the complexity of exact methods grows exponentially with size.
In many use cases, the target graphs are available in advance, but we need to solve for new incoming
unseen queries. Prior to inference time, the embeddings for all nodes in the target graph can be
pre-computed with complexity O(K|ET |). For a new query, its node embeddings can be computed
in O(K|EQ |) time, which is much faster since queries are smaller. With order embeddings, we do not need additional neural network modules at the query stage; we simply compute the order relations between the query node embeddings and the pre-computed node embeddings of the target graph.
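Because the order constraint is a coordinatewise comparison, the query stage reduces to array operations over the precomputed embeddings. A numpy sketch (the violation threshold t is an assumption; the paper's Algorithm 1 also aggregates this matrix):

import numpy as np

def alignment_matrix(Z_Q, Z_T, t=1e-4):
    """Z_Q: |V_Q| x D query embeddings; Z_T: |V_T| x D precomputed
    target embeddings. Entry (q, u) is True when the order constraint
    z_q <= z_u holds up to total squared violation t."""
    diff = np.clip(Z_Q[:, None, :] - Z_T[None, :, :], 0.0, None)
    viol = (diff ** 2).sum(axis=-1)        # shape |V_Q| x |V_T|
    return viol < t

# A simple aggregation for Problem 1: every query node needs at least
# one admissible anchor in the target graph.
# is_subgraph = alignment_matrix(Z_Q, Z_T).any(axis=1).all()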
3. Experiments
To investigate the effectiveness of NeuroMatch, we compare its runtime and performance with a
range of existing popular subgraph matching methods. We evaluate performance on synthetic datasets
Table 2: Given a query GQ and a target graph GT from a dataset, make binary prediction for whether
GQ is a subgraph of GT (the decision problem of subgraph isomorphism), in AUROC (unit: 0.01).
As shown in Table 1, box embeddings cannot guarantee an intersection, i.e., common subgraphs, between two graphs, while the variable size of the target graph makes the neural tensor network (NTN) variant hard to learn. NeuroMatch outperforms all the variants.
We additionally observe that the learning curriculum is crucial to learning the subgraph relationships. Using the curriculum increases performance by an average of 6%, while significantly reducing the performance variance and increasing the convergence speed. This
benefit is due to the compositional nature of the subgraph matching task.
3) Matching query to target graph (Problem 1). Given a target GT , we randomly sample a query
GQ centered at q. The goal is to answer the decision problem of whether GQ is a subgraph of GT .
Unlike the previous tasks, it requires prediction of subgraph relations between GQ and neighborhoods
Gu for all u ∈ GT. We perform this task by traversing all nodes in the query graph and all nodes in the target graph as anchor nodes, and output an alignment matrix A of dimension |VT |-by-|VQ |, where Ai,j denotes the matching score f(zi, zj), as illustrated in Algorithm 1. The performance trend of Table 1 also holds for Problem 1. We further compare NeuroMatch with the high-performing heuristic methods FastPFP and IsoRankN, and show an average improvement of 18.4% in AUROC over all datasets. Appendix D contains additional implementation details.
Additionally, we make the task harder by sampling test queries with a different sampling strategy. At
training time, the query is randomly sampled with the random BFS procedure, whereas at test time
the query is randomly sampled using degree-weighted sampling (see Section 2.5).
We further compute the statistics of query graphs and target graphs (in Appendix E). On average
across all datasets, the size of the query is 51% of the size of the target graph, indicating that the model is learning the problem of subgraph matching in a data-driven way, rather than graph isomorphism, which previous works focus on.
4) Generalization. We further conduct experiments to demonstrate the generalization of NeuroMatch.
Firstly, we investigate model generalization to unseen subgraph queries sampled from different distributions. We consider 3 sampling strategies: random BFS, degree-weighted sampling, and random walk sampling (see Section 2.5). Table 3 shows the performance of NeuroMatch when trained with examples sampled with one strategy (rows) and tested with examples sampled with another strategy (columns).

Table 3: Generalization to new sampling methods on the MSRC dataset, measured in AUROC (unit: 0.01). Rows: training sampling strategy; columns: test sampling strategy.

Train \ Test     BFS      MFinder   Random Walks
BFS              98.79    98.58     98.38
MFinder          93.09    96.34     96.07
Random Walks     95.65    97.21     97.53
We observe that NeuroMatch generalizes to queries generated with different sampling strategies without much performance change. Among the strategies considered, random BFS is the most robust sampling strategy for training.
Secondly, we investigate whether the model is able to generalize to perform matching on pairs of
query and target that are from a variety of datasets, while only training on a synthetic dataset. In
Appendix F, we similarly find that NeuroMatch is robust to test queries sampled from different
real-world datasets.
Order embedding space analysis. Figure 3 shows the t-SNE projection of the learned order embedding space. Yellow points correspond to embeddings of larger graphs; purple points correspond to embeddings of smaller graphs. Red points are example embeddings for which we also visualize the corresponding graphs. We observe that the order constraints are well preserved. We further conduct an experiment in which we randomly sample 2 graphs from the dataset and test their subgraph relationship. NeuroMatch achieves 0.61 average precision, compared to 0.35 for the NM-MLP baseline.
Comparison with exact methods.
References
Réka Albert and Albert-László Barabási. Topology of evolving networks: local events and universality.
Physical review letters, 85(24):5234, 2000.
Boanerges Aleman-Meza, Christian Halaschek-Wiener, Satya Sanket Sahoo, Amit Sheth, and I Budak
Arpinar. Template based semantic similarity for security applications. In International Conference
on Intelligence and Security Informatics. Springer, 2005.
Noga Alon, Phuong Dao, Iman Hajirasouliha, Fereydoun Hormozdiari, and S Cenk Sahinalp.
Biomolecular network motif counting and discovery by color coding. Bioinformatics, 2008.
Yunsheng Bai, Hao Ding, Yizhou Sun, and Wei Wang. Convolutional set matching for graph similarity.
In NeurIPS, 2018.
Yunsheng Bai, Hao Ding, Song Bian, Ting Chen, Yizhou Sun, and Wei Wang. Simgnn: A neural
network approach to fast graph similarity computation. In WSDM. ACM, 2019.
Vincenzo Bonnici, Rosalba Giugno, Alfredo Pulvirenti, Dennis Shasha, and Alfredo Ferro. A
subgraph isomorphism algorithm and its application to biochemical data. BMC bioinformatics,
2013.
Zhengdao Chen, Soledad Villar, Lei Chen, and Joan Bruna. On the equivalence between graph
isomorphism testing and function approximation with gnns. In NeurIPS, 2019.
Young-Rae Cho, Marco Mina, Yanxin Lu, Nayoung Kwon, and Pietro H Guzzi. M-finder: Uncovering
functionally associated proteins from interactome data integrated with go annotations. Proteome
science, 2013.
William J. Christmas, Josef Kittler, and Maria Petrou. Structural matching in computer vision using
probabilistic relaxation. PAMI, 1995.
Thayne Coffman, Seth Greenblatt, and Sherry Marcus. Graph-based technologies for intelligence
analysis. Communications of the ACM, 2004.
Luigi P Cordella, Pasquale Foggia, Carlo Sansone, and Mario Vento. A (sub) graph isomorphism
algorithm for matching large graphs. PAMI, 2004.
Hanjun Dai, Chengtao Li, Connor Coley, Bo Dai, and Le Song. Retrosynthesis prediction with
conditional graph logic network. In NeurIPS, 2019.
Paul Erdős and Alfréd Rényi. On the evolution of random graphs. Publ. Math. Inst. Hung. Acad. Sci,
1960.
Matthias Fey, Jan E. Lenssen, Christopher Morris, Jonathan Masci, and Nils M. Kriege. Deep graph
matching consensus. In ICLR, 2020.
Brian Gallagher. Matching structure and semantics: A survey on graph-based pattern matching. In
AAAI Fall Symposium, 2006.
Dedre Gentner. Structure-mapping: A theoretical framework for analogy. Cognitive science, 1983.
Michelle Guo, Edward Chou, De-An Huang, Shuran Song, Serena Yeung, and Li Fei-Fei. Neural
graph matching networks for fewshot 3d action recognition. In ECCV, 2018.
Will Hamilton, Zhitao Ying, and Jure Leskovec. Inductive representation learning on large graphs. In
NeurIPS, 2017.
Thomas N Kipf and Max Welling. Semi-supervised classification with graph convolutional networks.
In ICLR, 2017.
Yujia Li, Chenjie Gu, Thomas Dullien, Oriol Vinyals, and Pushmeet Kohli. Graph matching networks
for learning the similarity of graph structured objects. In ICML, 2019.
Chung-Shou Liao, Kanghao Lu, Michael Baym, Rohit Singh, and Bonnie Berger. Isorankn: spectral
methods for global alignment of multiple protein networks. Bioinformatics, 2009.
Yao Lu, Kaizhu Huang, and Cheng-Lin Liu. A fast projected fixed-point algorithm for large graph
matching. arXiv preprint arXiv:1207.1114, 2012.
Brian McFee and Gert Lanckriet. Partial order embedding with multiple kernels. In ICML. ACM,
2009.
Christopher Morris, Martin Ritzert, Matthias Fey, William L Hamilton, Jan Eric Lenssen, Gaurav
Rattan, and Martin Grohe. Weisfeiler and leman go neural: Higher-order graph neural networks.
In AAAI, 2019.
Eric Plotnick. Concept mapping: A graphical system for understanding the relationship between
concepts. ERIC Clearinghouse on Information and Technology Syracuse, NY, 1997.
John W Raymond, Eleanor J Gardiner, and Peter Willett. Heuristics for similarity searching of
chemical graphs using a maximum common edge subgraph algorithm. Journal of chemical
information and computer sciences, 2002.
Pedro Ribeiro, Pedro Paredes, Miguel EP Silva, David Aparicio, and Fernando Silva. A survey on
subgraph counting: concepts, algorithms and applications to network motifs and graphlets. arXiv
preprint arXiv:1910.13011, 2019.
Franco Scarselli, Marco Gori, Ah Chung Tsoi, Markus Hagenbuchner, and Gabriele Monfardini. The
graph neural network model. IEEE Transactions on Neural Networks, 2008.
Richard Socher, Danqi Chen, Christopher D Manning, and Andrew Ng. Reasoning with neural tensor
networks for knowledge base completion. In NeurIPS, 2013.
Zhao Sun, Hongzhi Wang, Haixun Wang, Bin Shao, and Jianzhong Li. Efficient subgraph matching
on billion node graphs. Proceedings of the VLDB Endowment, 2012.
Julian R Ullmann. An algorithm for subgraph isomorphism. Journal of the ACM (JACM), 1976.
Shinji Umeyama. An eigendecomposition approach to weighted graph matching problems. PAMI,
1988.
Luke Vilnis, Xiang Li, Shikhar Murty, and Andrew McCallum. Probabilistic embedding of knowledge
graphs with box lattice measures. In ACL, 2018.
Runzhong Wang, Junchi Yan, and Xiaokang Yang. Learning combinatorial embedding networks for
deep graph matching. In ICCV, 2019.
John Winn, Antonio Criminisi, and Thomas Minka. Object categorization by learned universal visual
dictionary. In ICCV. IEEE, 2005.
Yuting Wu, Xiao Liu, Yansong Feng, Zheng Wang, Rui Yan, and Dongyan Zhao. Relation-aware
entity alignment for heterogeneous knowledge graphs. In IJCAI, 2019.
Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. How powerful are graph neural
networks? In ICLR, 2018.
Kun Xu, Liwei Wang, Mo Yu, Yansong Feng, Yan Song, Zhiguo Wang, and Dong Yu. Cross-lingual
knowledge graph alignment via graph matching neural network. In ACL, 2019.
Qingwu Yang and Sing-Hoi Sze. Path matching and graph matching in biological networks. Journal
of Computational Biology, 2007.
Zhen Zhang and Wee Sun Lee. Deep graphical feature learning for the feature matching problem. In
ICCV, 2019.
After applying linear transformations and non-linearities in the GNN at layer k − 1, if the order embeddings of all neighbors of node v are no greater than those of the corresponding matched nodes in the target graph (i.e., they satisfy the order constraint), then when summing the order embeddings of the neighbors to compute the embedding of v at layer k, node v is guaranteed to also satisfy the order constraint at layer k. This corresponds to the property of composition of subgraphs into larger subgraphs.
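In symbols, with a sum aggregator the composition property follows from coordinatewise monotonicity (π denotes the matching of v's neighbors to target nodes; this notation is ours):

\[
z_i^{(k-1)} \preceq z_{\pi(i)}^{(k-1)} \ \ \forall i \in N(v)
\;\Longrightarrow\;
\sum_{i \in N(v)} z_i^{(k-1)} \preceq \sum_{i \in N(v)} z_{\pi(i)}^{(k-1)},
\]

where \(\preceq\) is the coordinatewise order of Equation 1.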
In other GNN architectures, such properties do not necessarily hold, due to the presence of transformations and non-linearities at each convolution layer. However, this provides another alignment between the order embedding objective and the subgraph matching task in terms of growing neighborhoods, and motivates the use of curriculum learning for this task.

Table 4: Accuracy (unit: 0.01) for matching on the ENZYMES dataset under different model configurations.

Model                                              Accuracy
SAGE (2-layer, 32-dim, dropout=0.2)                77.5
SAGE (6-layer, 32-dim, dropout=0.2)                85.3
SAGE (8-layer, 64-dim, dropout=0.2)                86.3
GCN (6-layer, 64-dim, dropout=0.2)                 69.9
GCN (9-layer, 128-dim, dropout=0.2)                82.3
GIN (4-layer, 32-dim, dropout=0.2)                 81.0
GIN (4-layer, 64-dim, dropout=0)                   87.0
GIN (8-layer, 64-dim, dropout=0)                   88.4
SAGE (4-layer, 64-dim, dropout=0)                  87.6
SAGE (8-layer, 64-dim, dropout=0)                  89.4
SAGE (12-layer, 64-dim, dropout=0)                 90.5
SAGE (8-layer, 64-dim, dropout=0, skip-layer)      91.5
C. Voting Procedure
The voting procedure improves the certainty of matched pairs by considering the presence of nearby matched pairs in the neighborhoods of the matched pairs. The method is motivated by the following observation.

Observation 3. Let N^(l)(u) denote the l-hop neighborhood of node u. Then, if node q ∈ GQ and node u ∈ GT match, for all nodes i ∈ N^(k)(q) there exists a node j ∈ N^(l)(u) with l ≤ k such that node i and node j match.

Algorithm 2: NeuroMatch Voting Algorithm
Input: Query node q ∈ GQ, target node u ∈ GT; threshold t on the violation, below which we predict a positive subgraph relation between the neighborhoods of q and u.
Output: Whether the node pair matches.
Compute embeddings for the neighbors of q and u within K hops
for hop k ≤ K do
    for node i ∈ N^(k)(q) do
        m = min{E(zi, zj) | j ∈ N^(k)(u)}
        if m > t, return False
return True

Since the query graph GQ is a subgraph of the target graph GT, all paths in GQ have corresponding paths in GT. Hence the shortest distance from node j to u in GT is at most the shortest distance from node i ∈ N^(k)(q) to q in GQ, where j is the node corresponding to i under the subgraph isomorphism mapping. The shortest paths are not necessarily of equal length, since in GT there might be additional short-cuts from j to u that do not exist in GQ.
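A Python sketch of Algorithm 2 (here z maps nodes to their embeddings and E is the order-violation function from Section 2.3; neighborhood sets are obtained from shortest-path lengths):

import networkx as nx

def votes_match(G_Q, q, G_T, u, z, E, K, t):
    """Return False as soon as some k-hop neighbor of q has no
    admissible counterpart within k hops of u (cf. Observation 3)."""
    dist_q = nx.single_source_shortest_path_length(G_Q, q, cutoff=K)
    dist_u = nx.single_source_shortest_path_length(G_T, u, cutoff=K)
    for k in range(1, K + 1):
        hop_q = [i for i, d in dist_q.items() if d == k]
        cand_u = [j for j, d in dist_u.items() if d <= k]  # includes u
        for i in hop_q:
            m = min(E(z[i], z[j]) for j in cand_u)
            if m > t:
                return False
    return True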
D. Training Details and Hyperparameters
All models are trained on a single GeForce RTX 2080 GPU, and both the heuristics and neural models
use an Intel Xeon E7-8890 v3 CPU.
Curriculum training. In each epoch, we iterate over all target graphs in the curriculum and randomly
sample one query per target graph. We lower bound the number of iterations per epoch to 64 for
datasets that are too small. For the E-R dataset, where we generate neighborhoods at random, and
the WN dataset which consists of only a single graph, we use a fixed 64 iterations per epoch. On all
datasets except for the E-R dataset, we used 256 target graphs where possible. At training time, we
enforce a 3:1 negative to positive ratio in the training examples, which is necessary since in reality
there is a heavy skew in the dataset towards negative examples. 10% of the negative examples are
hard negatives; among the remaining 90%, half are negative examples drawn from the same target
graph as the query, and half are negative examples drawn from different target graphs.
The model is trained with a learning rate of 1 × 10−3 using the Adam optimizer. The learning rate
is annealed with a cosine annealer with restarts every 100 epochs. The curriculum starts with 1
target graph with a radius of 1; it is updated every time there are 20 consecutive epochs without an
improvement of more than 0.1. The curriculum update increases the radius of the target graphs by 1
up to a maximum of 4, after which it doubles the number of target graphs for every update up to a
maximum of 256. The dataset is regenerated every 50 epochs.
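A sketch of the curriculum controller implied by this schedule (the 20-epoch plateau window, radius cap of 4, and 256-target cap come from the text; the function itself is illustrative):

def update_curriculum(radius, n_targets, epochs_without_improvement):
    """Advance the curriculum after a 20-epoch plateau: first grow the
    target-graph radius up to 4, then double the number of target
    graphs per update, up to 256."""
    if epochs_without_improvement < 20:
        return radius, n_targets
    if radius < 4:
        return radius + 1, n_targets
    return radius, min(n_targets * 2, 256)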
Table 5: Average confusion matrix for matching small queries (size ≤ 7) to all node neighborhoods in the DD dataset.

            Predicted +   Predicted −
Positive    68.2          8.3
Negative    70.5          1030.9
$$\frac{\lVert A_{\mathrm{pred}} - A_{\mathrm{query}} \rVert_1}{|V_{\mathrm{query}}|^2} + \frac{X_{\mathrm{pred}}^{T} X_{\mathrm{query}}}{|V_{\mathrm{query}}|}$$
where Apred is the adjacency matrix of the predicted matched graph, Aquery is the adjacency matrix
of the query graph, |Vquery| is the number of nodes in the query, and Xpred and Xquery are the feature matrices of the nodes in the predicted and query graphs, respectively. This score measures the degree to which the predicted and query graphs match in terms of topology and node labels; it is based on the loss function used in the paper, adapted to compare matchings across varying query and target sizes. In general, we found these aggregation strategies to be effective in our setting with
diverse query and target sizes, but our method is agnostic to such downstream processing of the
alignment matrix. In particular, the Hungarian algorithm or other alignment resolution algorithms
can still be used with the alignment matrix generated by NeuroMatch, especially when an explicit
matching (rather than a binary subgraph prediction) is desired.
For baseline hyperparameters: for IsoRankN, we set K = 10, the threshold to 1e-4, alpha to 0.9, and the maximum vector length to 1000000. For FastPFP, we set lambda to 1, alpha to 0.5, and both thresholds to 1e-4.
E. Subgraph Matching Data Statistics

E.1. Datasets
Biology and chemistry datasets. COX2 contains 467 graphs of chemical molecules with an average of 41 nodes and 44 edges each. DD contains 1178 graphs with an average of 284 nodes and 716 edges; it describes protein structure graphs where nodes are amino acids and edges represent positional proximity. We use node labels for both of these datasets. The PPI dataset contains the protein-protein interaction graphs for human tissues. It has 24 graphs corresponding to the PPI networks of different human tissues, with 56944 nodes and 818716 edges in total. We do not include node features for the PPI networks since the goal is to match various protein interaction patterns without considering the identity of proteins.

Table 6: AUROC (unit: 0.01) for matching on real datasets, where we either train on the synthetic dataset and test generalization to the real dataset (Transfer), or train directly on the dataset that we test on (In-Domain).

Table 7: Statistics of the target and query graphs used in the evaluation of Problem 1 (Table 2).
MSRC_21 is a semantic image processing dataset introduced in (Winn et al., 2005), containing 563 graphs, each representing the graphical model of an image. It has an average of 78 nodes and 199 edges.
FIRSTMMDB is a point cloud dataset containing 3D point clouds of various household objects. It contains 41 graphs with an average of 1377 nodes and 3074 edges each.
Label imbalance. We performed additional experiments to investigate the confusion matrix for the
DD dataset averaged across test queries. Table 5 shows extreme imbalance (subgraphs are rare).
Matching query to target graph. Table 7 shows the statistics of target and query graphs used to
evaluate performance on Problem 1 (Table 2).
F. Generalization and Runtime

F.1. Pretraining on Synthetic Dataset
To demonstrate the use and generalizability of the synthetic dataset, we conducted an experiment in which the subgraph matching model is trained only on the synthetic dataset and then tested on real-world datasets. Table 6 shows the generalization performance. The first row corresponds to model performance when trained and tested on the same dataset. The second row corresponds to model performance when trained on the synthetic dataset and tested on queries sampled from real-world datasets (listed in each column). Although there is a drop in performance when the model only sees the synthetic dataset, the model is able to generalize to a diverse set of subgraph matching scenarios in the biology, chemistry, and social network domains, even outperforming some baseline methods that are trained directly on the real-world datasets.
However, a shortcoming is that since the synthetic dataset does not contain node features, and real datasets have varying node feature dimensions, the transferred model can only perform subgraph matching that does not take features into account. Incorporating features into transfer learning for subgraph matching remains an open problem.
G. Comparison to Exact and Approximate Heuristics

G.1. Exact Heuristic Methods
Exact algorithms such as VF2 and RI are guaranteed to correctly predict whether the query is a subgraph of the target. However, even for relatively small queries (of size 20), matching is costly and can sometimes take an unexpectedly long time, on the order of hours. As such, these algorithms are not suitable in online or high-throughput scenarios where efficiency is a priority.

Figure 4: Runtime analysis. Success rate of the baseline heuristic matching algorithms (VF2 and RI) for matching in under 20 seconds. NeuroMatch achieves a 100% success rate.
To demonstrate the runtime efficiency, Figure 4 shows the success rate of the exact methods, which drops below 60% when the query size increases beyond 30. In comparison, NeuroMatch always finishes in under 0.1 second.
Table 8 shows the runtime comparison between NeuroMatch and the exact baselines considered (VF2 and RI). NeuroMatch achieves a 100× speedup over these exact methods.