A Survey on Hypergraph Neural Networks: An In-Depth and Step-by-Step Guide
Abstract
Higher-order interactions (HOIs) are ubiquitous in real-world complex systems and applications. Investigation of deep learning for HOIs, thus, has become a valuable agenda for the data mining and machine learning communities. As networks of HOIs are expressed mathematically as hypergraphs, hypergraph neural networks (HNNs) have emerged as a powerful tool for representation learning on hypergraphs. Given the emerging trend, we present the first survey dedicated to HNNs, with an in-depth and step-by-step guide. Broadly, the present survey overviews HNN architectures, training strategies, and applications. First, we break existing HNNs down into four design components: (i) input features, (ii) input structures, (iii) message-passing schemes, and (iv) training strategies. Second, we examine how HNNs address and learn HOIs with each of their components. Third, we overview the recent applications of HNNs in recommendation, bioinformatics and medical science, time series analysis, and computer vision. Lastly, we conclude with a discussion on limitations and future directions.

CCS Concepts
• Computing methodologies → Machine learning.

Keywords
Hypergraph Neural Network, Self-supervised Learning

ACM Reference Format:
Sunwoo Kim, Soo Yong Lee, Yue Gao, Alessia Antelmi, Mirko Polato, and Kijung Shin. 2024. A Survey on Hypergraph Neural Networks: An In-Depth and Step-by-Step Guide. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '24), August 25–29, 2024, Barcelona, Spain. ACM, New York, NY, USA, 11 pages. https://doi.org/10.1145/3637528.3671457

Figure 1: An example hypergraph modeling the co-authorship relationship among five authors across three publications. Each node represents an author, while each hyperedge includes all co-authors of a publication. (a) Co-authors of publications; (b) Hypergraph.

1 Introduction
Higher-order interactions (HOIs) are pervasive in real-world complex systems and applications. These relations describe multi-way or group-wise interactions, occurring in physical systems [8], microbial communities [100], brain functions [30], and social networks [52], to name a few. HOIs reveal structural patterns unobserved in their pairwise counterparts and inform network dynamics. For example, they have been shown to affect or correlate with synchronization in physical systems [7], bacteria invasion inhibition in microbial communities [98], cortical dynamics in brains [161], and contagion in social networks [22].

Hypergraphs mathematically express higher-order networks, or networks of HOIs [11], where nodes and hyperedges respectively represent entities and their HOIs. In contrast to an edge connecting only two nodes in pairwise graphs, a hyperedge can connect any number of nodes, offering hypergraphs advantages in their descriptive power. For instance, as shown in Fig. 1, the co-authorship relations among researchers can be represented as a hypergraph. With their expressiveness and flexibility, hypergraphs have been routinely used to model higher-order networks in various domains [6, 22, 32, 43] and to uncover their structural patterns [24, 61, 62, 71–73].

As hypergraphs are extensively utilized, the demand grew to make predictions on them, such as estimating node properties or identifying missing hyperedges. Hypergraph neural networks (HNNs) have shown strong promise in solving such problems. For example, they have shown state-of-the-art performances in industrial and scientific applications, including missing metabolic reaction prediction [16], brain classification [54], traffic forecast [169], product recommendation [55], and more [42, 93, 145].
Figure 2: Taxonomy on modeling higher-order interactions. The taxonomy organizes HNN designs into encoding components, namely input features (external, structural, and identity information), input structures (reductive and non-reductive transformations), and message passing (target selection, message representation, and aggregate function), as well as training objectives (learning to classify, learning to contrast, and learning to generate). The term neg. sam. denotes negative sampling.
The research on HNNs has been exponentially growing. Simultaneously, further research on deep learning for higher-order networks is an imminent agenda for the data mining and machine learning communities [103]. Therefore, we provide a timely survey on HNNs that addresses the following questions:
• Encoding (Sec. 3). How do HNNs effectively capture HOIs?
• Training (Sec. 4). How to encode HOIs with training objectives, especially when external labels are scarce or absent?
• Application (Sec. 5). What are notable applications of HNNs?
Our scope is largely confined to HNNs for undirected, static, and homogeneous hypergraphs, with node classification or hyperedge prediction as their downstream tasks. The survey aims to provide an in-depth and step-by-step guide, with HNNs' design components (see Fig. 2) and their analysis (see Table 2).
2 Preliminaries
In this section, we present definitions of basic concepts related to hypergraphs and HNNs. See Table 1 for frequently-used symbols.

Table 1: Frequently-used symbols
  G = (V, E) : Hypergraph with node set V and hyperedge set E
  H ∈ {0, 1}^{|V|×|E|} : Incidence matrix
  X ∈ R^{|V|×d}, Y ∈ R^{|E|×d'} : Node features (X) and hyperedge features (Y)
  P^(ℓ) ∈ R^{|V|×k}, Q^(ℓ) ∈ R^{|E|×k'} : ℓ-th layer embeddings of nodes (P^(ℓ)) and hyperedges (Q^(ℓ))
  N_E(v_i) : Incident hyperedges of node v_i
  I_n : n-by-n identity matrix
  I[cond] : Indicator function that returns 1 if cond is True, 0 otherwise
  σ(·) : Non-linear activation function
  M_{i,:} ≔ m_i : i-th row of matrix M
  M_{i,j} ≔ m_{ij} : (i, j)-entry of matrix M

A hypergraph G = (V, E) is defined as a set of nodes V = {v_1, v_2, ..., v_{|V|}} and a set of hyperedges E = {e_1, e_2, ..., e_{|E|}}. Each hyperedge e_j is a non-empty subset of nodes (i.e., ∅ ≠ e_j ⊆ V). Alternatively, E can be represented with an incidence matrix H ∈ {0, 1}^{|V|×|E|}, where H_{i,j} = 1 if v_i ∈ e_j and 0 otherwise. The incident hyperedges of a node v_i, denoted as N_E(v_i), are the set of hyperedges that contain v_i (i.e., N_E(v_i) = {e_k ∈ E : v_i ∈ e_k}). We assume that each node v_i and hyperedge e_j are equipped with (input) node features x_i ∈ R^d and hyperedge features y_j ∈ R^{d'}, respectively.¹ Similarly, we denote the node and hyperedge feature matrices as X ∈ R^{|V|×d} and Y ∈ R^{|E|×d'}, respectively, where the i-th row X_i corresponds to x_i and the j-th row Y_j corresponds to y_j. In Sec. 3.1, we detail approaches to obtain the features.

Hypergraph neural networks (HNNs) are neural functions that transform given nodes, hyperedges, and their features into vector representations (i.e., embeddings). Typically, their input is represented as either (X, E) or (X, Y, E). HNNs first prepare the input hypergraph structure E (Sec. 3.2). Then, HNNs perform message passing between nodes (and/or hyperedges) to update their embeddings (Sec. 3.3). A node (or hyperedge) message roughly refers to its vector representation for other nodes (or hyperedges) to aggregate. The message passing operation is repeated L times, where each iteration corresponds to one HNN layer. Here, we denote the ℓ-th layer embedding matrices of nodes and hyperedges as P^(ℓ) ∈ R^{|V|×k} and Q^(ℓ) ∈ R^{|E|×k'}, respectively. Unless otherwise stated, we assume P^(0) = X and Q^(0) = Y. We use I_n, ∥, ⊙, and σ(·) to denote the n-by-n identity matrix, vector concatenation, elementwise product, and a non-linear activation function, respectively.

¹ Sometimes, (external) node and hyperedge features may not be given. In such cases, one may utilize structural or identity features, as described in Sec. 3.1.
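To make the notation concrete, the following is a minimal sketch (ours, in Python with NumPy; variable names are illustrative assumptions) of how a small hypergraph can be stored as an incidence matrix H and how the incident hyperedges N_E(v_i) can be read off from it.

```python
import numpy as np

# A toy hypergraph with |V| = 5 nodes and |E| = 3 hyperedges,
# e.g., authors (nodes) and publications (hyperedges) as in Fig. 1.
num_nodes = 5
hyperedges = [{0, 1, 2}, {1, 3}, {2, 3, 4}]  # each hyperedge is a node subset

# Incidence matrix H in {0,1}^{|V| x |E|}: H[i, j] = 1 iff node i is in hyperedge j.
H = np.zeros((num_nodes, len(hyperedges)), dtype=int)
for j, e in enumerate(hyperedges):
    for i in e:
        H[i, j] = 1

def incident_hyperedges(H, i):
    """N_E(v_i): indices of hyperedges whose column has a 1 in row i."""
    return np.flatnonzero(H[i])

print(H)
print(incident_hyperedges(H, 2))  # -> [0 2]
```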
3 Encoder Design Guidance
In this section, we provide a step-by-step description of how HNNs encode higher-order interactions (HOIs).

3.1 Step 1: Design features to reflect HOIs
First, HNNs require a careful choice of input node features X ∈ R^{|V|×d} and/or hyperedge features Y ∈ R^{|E|×d'}. Their quality can be vital for a successful application of HNNs [74, 166]. Thus, studies have crafted input features to enhance HNNs in encoding HOIs. Three primary approaches include the use of (i) external features or labels, (ii) structural features, and (iii) identity features.

3.1.1 External features or labels. External features or labels broadly refer to information that is not directly obtained from the hypergraph structure. Using external features allows HNNs to capture information that may not be transparent in the hypergraph structure alone. When available, using external node features X and hyperedge features Y as HNN input is the standard practice. Some examples of node features from widely-used benchmark datasets are bag-of-words vectors [148], TF-IDFs [27], visual object embeddings [34], and noised label vectors [17]. Interestingly, as in label propagation, HyperND [106] constructs input node features X by concatenating external node features with label vectors. Specifically, one-hot-encoded label vectors and zero vectors are concatenated for nodes with known and unknown labels, respectively. Since external hyperedge features are typically missing in the ...
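As a small illustration of this label-concatenation idea (our sketch; array names and the toy data are assumptions, not from the survey), the following builds such an input feature matrix from external features, a label vector, and a mask of labeled nodes.

```python
import numpy as np

def features_with_labels(X: np.ndarray, labels: np.ndarray, is_labeled: np.ndarray,
                         num_classes: int) -> np.ndarray:
    """Concatenate external node features with one-hot label vectors.

    Labeled nodes get their one-hot label; unlabeled nodes get a zero vector,
    following the label-concatenation scheme described above.
    """
    L = np.zeros((X.shape[0], num_classes))
    idx = np.where(is_labeled)[0]
    L[idx, labels[idx]] = 1.0
    return np.concatenate([X, L], axis=1)

# Toy usage: 4 nodes, 3-dim features, 2 classes, nodes 0 and 2 labeled.
X = np.random.rand(4, 3)
labels = np.array([0, -1, 1, -1])          # -1 marks "unknown"
is_labeled = np.array([True, False, True, False])
X_aug = features_with_labels(X, labels, is_labeled, num_classes=2)
print(X_aug.shape)  # (4, 5)
```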
Star expansion. A star-expanded graph of a hypergraph G = (V, E) has two new groups of nodes: the node group, which is the same as the node set V of G, and the hyperedge group, consisting of nodes corresponding to the hyperedges E (refer to Fig. 3(c)). Star expansion captures HOIs by connecting each node (i.e., a node from the node group) with the hyperedges (i.e., nodes from the hyperedge group) it belongs to, resulting in a bipartite graph between the two groups. Star expansion is expressed as τ : (E, X, Y) ↦ A, where each entry of A ∈ R^{(|V|+|E|)×(|V|+|E|)} is defined as

  a_{i,j} = I[v_i ∈ e_{j−|V|}]   if 1 ≤ i ≤ |V| < j ≤ |V| + |E|,
  a_{i,j} = I[v_j ∈ e_{i−|V|}]   if 1 ≤ j ≤ |V| < i ≤ |V| + |E|,          (2)
  a_{i,j} = 0                    otherwise.

Here, we assume WLOG that the corresponding index of v_i ∈ V in A is i, and the corresponding index of e_j ∈ E in A is |V| + j.
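As a concrete sketch (ours, not from the survey), the star-expanded adjacency matrix A of Eq. (2) can be assembled directly from the incidence matrix H as a block matrix with H in the off-diagonal blocks.

```python
import numpy as np

def star_expansion(H: np.ndarray) -> np.ndarray:
    """Build the (|V|+|E|) x (|V|+|E|) adjacency matrix of the star expansion.

    Nodes occupy indices 0..|V|-1 and hyperedges occupy indices |V|..|V|+|E|-1,
    matching the index convention assumed for Eq. (2).
    """
    n, m = H.shape  # n = |V| nodes, m = |E| hyperedges
    A = np.zeros((n + m, n + m), dtype=H.dtype)
    A[:n, n:] = H        # node -> hyperedge incidences
    A[n:, :n] = H.T      # hyperedge -> node incidences
    return A

# Example with the toy incidence matrix from the Preliminaries sketch.
H = np.array([[1, 0, 0],
              [1, 1, 0],
              [1, 0, 1],
              [0, 1, 1],
              [0, 0, 1]])
A = star_expansion(H)
assert A.shape == (8, 8) and np.array_equal(A, A.T)  # bipartite and symmetric
```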
Line expansion. In a line-expanded graph [154] of a hypergraph G = (V, E), each pair of a node and a hyperedge containing it is represented as a distinct node. That is, its node set is {(v_i, e_j) : v_i ∈ e_j, e_j ∈ E}. Edges are established between these nodes to connect each pair of distinct nodes (v_i, e_j) and (v_k, e_l), where i = k or j = l.

Tensor expression. Several recent HNNs represent hypergraphs as tensors [59, 131]. For example, T-HyperGNNs [126] expresses a k-uniform (i.e., |e_j| = k, ∀e_j ∈ E) hypergraph G = (V, E) with a k-order tensor A ∈ R^{|V|^k}. That is, if k = 3, A_{i,j,k} = 1 if ...

... where α_ℓ, β_ℓ ∈ [0, 1] are hyperparameters, Θ^(ℓ) ∈ R^{k×k} is a learnable weight matrix, and P^(0) = MLP(X).

On star-expanded graphs (V → E and E → V). In HNNs based on star expansion, message passing occurs from the node group to the hyperedge group (V → E) and vice versa (E → V) [17, 20, 25, 132, 150], either sequentially or simultaneously.

First, we illustrate sequential message passing using ED-HNN [132]. Its message passing at each ℓ-th layer for each node v_i ∈ V is formalized as follows:

  q_j^(ℓ) = MLP_1( Σ_{v_k ∈ e_j} p_k^(ℓ−1) ),                               (4)
  p̃_i^(ℓ) = MLP_2( [ p_i^(ℓ−1) ∥ Σ_{e_k ∈ N_E(v_i)} q_k^(ℓ) ] ),             (5)
  p_i^(ℓ) = MLP_3( [ p_i^(ℓ−1) ∥ p̃_i^(ℓ) ∥ x_i ] ⊕ |N_E(v_i)| ),             (6)

where x ⊕ c denotes the concatenation of vector x and scalar c. MLP_1, MLP_2, and MLP_3 are MLPs shared across all layers. Note that, in Eq. (4), hyperedge embeddings are updated by aggregating the embeddings of their constituent nodes. Subsequently, in Eq. (5) and Eq. (6), node embeddings are updated by aggregating transformed embeddings of incident hyperedges. Here, the message passing in each direction (Eq. (4) and Eq. (5)) occurs sequentially.
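The following is a minimal PyTorch-style sketch (ours; tensor layouts, hidden sizes, and MLP depths are illustrative assumptions) of one layer of the sequential V → E and E → V message passing of Eq. (4)–(6), using the incidence matrix H to implement the two summations.

```python
import torch
import torch.nn as nn

class SequentialStarLayer(nn.Module):
    """One ED-HNN-style layer following Eq. (4)-(6): V -> E, then E -> V."""
    def __init__(self, dim: int, feat_dim: int):
        super().__init__()
        self.mlp1 = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.mlp2 = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.mlp3 = nn.Sequential(nn.Linear(2 * dim + feat_dim + 1, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, P, X, H):
        # P: (|V|, dim) node embeddings, X: (|V|, feat_dim) raw features,
        # H: (|V|, |E|) incidence matrix as a float tensor.
        Q = self.mlp1(H.T @ P)                              # Eq. (4): sum node embeddings per hyperedge
        agg = H @ Q                                         # sum of incident hyperedge embeddings per node
        P_tilde = self.mlp2(torch.cat([P, agg], dim=-1))    # Eq. (5)
        deg = H.sum(dim=1, keepdim=True)                    # |N_E(v_i)| as a scalar feature
        P_new = self.mlp3(torch.cat([P, P_tilde, X, deg], dim=-1))  # Eq. (6)
        return P_new, Q
```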
node may vary across the hyperedges it belongs to [19, 20]. Several studies [2, 20, 119] have devised hyperedge-dependent node messages, enabling a node to send tailored messages to each hyperedge it belongs to. For example, MultiSetMixer [119] learns a different node message for each incident hyperedge and aggregates them with the following message passing function:

  q_j^(ℓ) = (1/|e_j|) Σ_{v_k ∈ e_j} p_{k,j}^(ℓ−1) + MLP_1^(ℓ)( LN( (1/|e_j|) Σ_{v_k ∈ e_j} p_{k,j}^(ℓ−1) ) ),      (11)
  p_{i,j}^(ℓ) = p_{i,j}^(ℓ−1) + MLP_2^(ℓ)( LN( p_{i,j}^(ℓ−1) + q_j^(ℓ) ) ),                                        (12)

where p_{i,j}^(ℓ) is the ℓ-th layer message of v_i that is dependent on e_j, MLP_1^(ℓ) and MLP_2^(ℓ) are MLPs, and LN is layer normalization [3].

Alternatively, some HNNs update messages based on hyperedge-dependent node features. WHATsNet [20] introduces within-order positional encoding (wope) to adapt node messages for each target. Within each hyperedge, WHATsNet ranks constituent nodes according to their centralities for positional encoding. Formally, let F ∈ R^{|V|×T} be a node centrality matrix, where T and F_{i,t} respectively denote the number of centrality measures (e.g., node degree) and the t-th centrality measure score of node v_i. The order of an element c in a set C is defined as Order(c, C) = Σ_{c' ∈ C} I[c' ≤ c]. Then, the wope of a node v_i at a hyperedge e_j is defined as follows:

  wope(v_i, e_j) = (1/|e_j|) ∥_{t=1}^{T} Order(F_{i,t}, {F_{k,t} : v_k ∈ e_j}).      (13)

Finally, hyperedge-dependent node messages are defined as follows:

  p_{i,j}^(ℓ) = p_i^(ℓ) + wope(v_i, e_j) Ψ^(ℓ),      (14)

where Ψ^(ℓ) ∈ R^{T×k} is a learnable projection matrix.³

  ³ Similarly, each hyperedge e_j's message to each node v_i at the ℓ-th layer is defined as q_{j,i}^(ℓ) = q_j^(ℓ) + wope(v_i, e_j) Ψ^(ℓ). WHATsNet aggregates {q_{k,i}^(ℓ) : e_k ∈ N_E(v_i)} to obtain p_i^(ℓ) via the set attention proposed by Lee et al. [76]. We omit the detailed message passing function since we focus on describing how dependent messages are obtained.
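As an illustration (our sketch; the centrality measures and the toy values are assumptions for the example), the order statistic Order(·) and the wope vector of Eq. (13) can be computed as follows.

```python
import numpy as np

def order(c: float, values) -> int:
    """Order(c, C) = number of elements c' in C with c' <= c."""
    return int(sum(v <= c for v in values))

def wope(F: np.ndarray, i: int, hyperedge) -> np.ndarray:
    """Within-order positional encoding of node i in the given hyperedge (Eq. (13)).

    F is the |V| x T node-centrality matrix; the result is a length-T vector whose
    t-th entry is the (normalized) rank of node i among the hyperedge's members.
    """
    members = list(hyperedge)
    ranks = [order(F[i, t], [F[k, t] for k in members]) for t in range(F.shape[1])]
    return np.array(ranks, dtype=float) / len(members)

# Toy example: T = 2 centrality measures (e.g., degree and core number) for 5 nodes.
F = np.array([[3, 2], [1, 1], [2, 2], [4, 3], [1, 1]], dtype=float)
print(wope(F, i=2, hyperedge={0, 2, 3}))  # node 2's ranks among nodes {0, 2, 3}
```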
3.3.3 How to aggregate messages (aggregation function). The last step is to decide how to aggregate the received messages for each node (and hyperedge).

Fixed pooling. Many HNNs use fixed pooling functions, including summation [49, 132] and averaging [36, 133]. For example, ED-HNN [132] uses summation to aggregate the embeddings of constituent nodes (or incident hyperedges), as described in Eq. (4) and Eq. (5). Clique-expansion-based HNNs without adaptive edge weights also fall into this category [110, 117]. For example, SHNN [117] uses a fixed propagation matrix W (see Eq. (3)) to aggregate node embeddings. Specifically, p_i^(ℓ) = Σ_{v_k ∈ V} W_{i,k} p_k^(ℓ−1).

Learnable pooling. Several recent HNNs enhance their pooling functions through attention mechanisms, allowing for weighting ...

  MH(θ, S) = ∥_{t=1}^{h} ω( θ_t MLP_{t,1}^(ℓ)(S)^T ) MLP_{t,2}^(ℓ)(S),      (15)
  p_i^(ℓ) = LN( p̃_i^(ℓ) + MLP_3(p̃_i^(ℓ)) );  p̃_i^(ℓ) = LN( θ + MH(θ, S^(ℓ,i)) ),

where LN is layer normalization [3], ω(·) is row-wise softmax, θ = ∥_{t=1}^{h} θ_t is a learnable vector, and MLP_{t,1}, MLP_{t,2}, and MLP_3 are MLPs. Note that Eq. (15) is a widely-used multi-head attention operation [121], where θ serves as queries, and S serves as keys and values. This process is target-agnostic since it considers only the global variables θ and the embeddings S of incident hyperedges, without considering the embedding of the target v_i itself.

In target-aware attention approaches, target information is incorporated to compute attention weights. HyGNN [113] is an example, with the following message passing function:

  p_i^(ℓ) = σ( Σ_{e_k ∈ N_E(v_i)} ( Att_V^(ℓ)(q_k^(ℓ−1), p_i^(ℓ−1)) / Σ_{e_s ∈ N_E(v_i)} Att_V^(ℓ)(q_s^(ℓ−1), p_i^(ℓ−1)) ) q_k^(ℓ−1) Θ^(ℓ,1) ),      (16)
  q_j^(ℓ) = σ( Σ_{v_k ∈ e_j} ( Att_E^(ℓ)(p_k^(ℓ), q_j^(ℓ−1)) / Σ_{v_s ∈ e_j} Att_E^(ℓ)(p_s^(ℓ), q_j^(ℓ−1)) ) p_k^(ℓ) Θ^(ℓ,2) ).      (17)

Here, Att_V^(ℓ)(q, p) = σ(q^T ψ_1^(ℓ) × p^T ψ_2^(ℓ)) ∈ R and Att_E^(ℓ)(p, q) = σ(p^T ψ_3^(ℓ) × q^T ψ_4^(ℓ)) ∈ R are attention weight functions, where {ψ_1^(ℓ), ψ_2^(ℓ), ψ_3^(ℓ), ψ_4^(ℓ)} and {Θ^(ℓ,1), Θ^(ℓ,2)} are sets of learnable vectors and matrices, respectively. Note that the attention weight functions consider messages from both sources and targets. Target-aware attention has also been incorporated into clique-expansion-based HNNs, with HCHA [5] as a notable example.
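Below is a small PyTorch-style sketch (ours; dimensions, parameter names, and the choice of activations are illustrative assumptions) of a target-aware node update in the spirit of Eq. (16), where each incident hyperedge's message is weighted by a score that depends on both the hyperedge (source) and the node (target).

```python
import torch
import torch.nn as nn

class TargetAwareNodeUpdate(nn.Module):
    """Aggregate incident hyperedge embeddings into a node embedding, Eq. (16)-style."""
    def __init__(self, dim: int):
        super().__init__()
        self.psi_q = nn.Linear(dim, 1, bias=False)    # plays the role of psi_1
        self.psi_p = nn.Linear(dim, 1, bias=False)    # plays the role of psi_2
        self.theta = nn.Linear(dim, dim, bias=False)  # plays the role of Theta^(l,1)

    def forward(self, p_i: torch.Tensor, Q_inc: torch.Tensor) -> torch.Tensor:
        # p_i: (dim,) target node embedding; Q_inc: (num_incident, dim) incident hyperedge embeddings.
        scores = torch.sigmoid(self.psi_q(Q_inc).squeeze(-1) * self.psi_p(p_i))  # source x target
        weights = scores / scores.sum()                  # normalize over incident hyperedges
        return torch.relu(weights @ self.theta(Q_inc))   # weighted sum, then transform
```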
3.3.4 Comparison with GNNs. GNNs also use neural message passing to aggregate information from other nodes [39, 77, 85]. However, since GNNs typically perform message passing directly between nodes, they are not ideal for learning hyperedge (i.e., HOI) representations or hyperedge-dependent node representations.

4 Objective Design Guidance
In this section, we outline training objectives for HNNs to capture HOIs effectively, particularly when label supervision is weak or absent. Below, we review three branches: (i) learning to classify, (ii) learning to contrast, and (iii) learning to generate.

4.1 Learning to classify
HNNs can learn HOIs by classifying hyperedges [51, 68, 125, 149, 166] as positive or negative. A positive hyperedge is a ground-truth, "true" hyperedge, and a negative hyperedge often refers to a heuristically generated "fake" hyperedge, considered unlikely to exist. By learning to classify them, HNNs may capture the distinguishing patterns of the ground-truth HOIs.
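As an illustration of this objective (our sketch; the scoring function and the sampler are simple placeholders, not a specific method from the survey), a candidate hyperedge can be scored from its members' embeddings and trained against heuristically sampled negatives, here drawing random nodes of the same size (the sized NS described in Sec. 4.1.1 below), with a binary cross-entropy loss.

```python
import torch
import torch.nn.functional as F

def score_hyperedge(P: torch.Tensor, members: list[int], w: torch.Tensor) -> torch.Tensor:
    """Score a candidate hyperedge by pooling its members' embeddings (P: |V| x dim)."""
    return (P[members].mean(dim=0) * w).sum()  # simple mean-pool + linear score

def sized_negative(num_nodes: int, size: int) -> list[int]:
    """Sized negative sampling: a 'fake' hyperedge of `size` random nodes."""
    return torch.randperm(num_nodes)[:size].tolist()

def classification_loss(P, positive_hyperedges, w):
    """Binary cross-entropy over positive hyperedges and same-sized negatives."""
    loss = 0.0
    for e in positive_hyperedges:
        pos = score_hyperedge(P, e, w)
        neg = score_hyperedge(P, sized_negative(P.shape[0], len(e)), w)
        loss = loss + F.binary_cross_entropy_with_logits(pos, torch.tensor(1.0)) \
                    + F.binary_cross_entropy_with_logits(neg, torch.tensor(0.0))
    return loss / (2 * len(positive_hyperedges))
```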
Table 2: Summary of hypergraph neural networks (HNNs), categorized by whether their input structure transformation is reductive, whether their embedding type is edge-dependent, and whether their aggregation is learnable.
  HGNN [34], 2019, AAAI
  HyperGCN [148], 2019, NeurIPS
  HNHN [25], 2020, ICML
  HCHA [5], 2019, Pat. Rec.
  UniGNN [49], 2021, IJCAI
  HO Transformer [60], 2021, NeurIPS
  AllSet [17], 2022, ICLR
  HyperND [106], 2022, ICML
  H-GNN [164], 2022, ICML
  EHNN [59], 2022, ECCV
  LE_GCN [153], 2022, CIKM
  HERALD [164], 2022, ICASSP
  HGNN+ [36], 2022, TPAMI
  ED-HNN [132], 2023, ICLR
  PhenomNN [135], 2023, ICML
  WHATsNet [20], 2023, KDD
  SheafHyperGNN [28], 2023, NeurIPS
  MeanPooling [70], 2023, AAAI
  HENN [44], 2023, LoG
  HyGNN [113], 2023, ICDE
  HGraphormer [110], 2023, arXiv
  MultiSetMixer [119], 2023, arXiv
  HJRL [151], 2024, AAAI
  HDE_ode [150], 2024, ICLR
  HyperGT [90], 2024, ICASSP
  THNN [131], 2024, SDM
  UniG-Encoder [177], 2024, Pat. Rec.
  SHNN [117], 2024, arXiv
  HyperMagNet [9], 2024, arXiv
  CoNHD [170], 2024, arXiv

4.1.1 Heuristic negative sampling. We discuss popular negative sampling (NS) strategies to obtain negative hyperedges [104]:
• Sized NS: each negative hyperedge contains k random nodes.
• Motif NS: each negative hyperedge contains a randomly chosen set of k adjacent nodes.
• Clique NS: each negative hyperedge is generated by replacing a randomly chosen node in a positive hyperedge with another randomly chosen node adjacent to the remaining nodes.
Similarly, many HNNs use rule-based NS for hyperedge classification [51, 68, 125, 149, 166]. Others leverage domain knowledge to design NS strategies [16, 134].

4.1.2 Learnable negative sampling. Notably, Hwang et al. [51] show that training HNNs with the aforementioned NS strategies may cause overfitting to negative hyperedges of specific types. This may be attributable to the vast population of potential negative hyperedges, where the tiny samples may not adequately represent this population. To mitigate the problem, they employ adversarial training of a generator that samples negative hyperedges.

4.1.3 Comparison with GNNs. Link prediction on pairwise graphs is a counterpart of the HOI classification task [165, 176]. However, the space of possible negative edges significantly differs between them. In pairwise graphs, the size of the space is O(|V|^2). However, in hypergraphs, since a hyperedge can contain an arbitrary number of nodes, the size of the space is O(2^{|V|}), which makes finding representative "unlikely" HOIs, or negative hyperedges, more challenging [51]. Consequently, learning the distinguishing patterns of HOIs by classifying the positive and negative hyperedges may be more challenging.

4.2 Learning to contrast
Contrastive learning (CL) aims to maximize agreement between data obtained from different views. Intuitively, views refer to different versions of the same data, original or augmented. Training neural networks with CL has shown strong capacity in capturing the input data characteristics [53]. For HNNs, several CL techniques have been devised to learn HOIs [64, 70, 136]. Here, we describe three steps of CL for HNNs: (i) obtaining views, (ii) encoding, and (iii) computing contrastive loss.

4.2.1 View creation and encoding. First, we obtain views for contrast. This can be achieved by augmenting the input hypergraph, using rule-based [64, 70] or learnable [136] methods.

Rule-based augmentation. This approach stochastically corrupts node features and hyperedges. For nodes, an augmented feature matrix is obtained by either zeroing out certain entries (i.e., feature values) of X [68, 70] or adding Gaussian noise to them [108]. For hyperedges, augmented hyperedges are obtained by excluding some nodes from hyperedges [70] or perturbing hyperedge membership (e.g., changing e_i = {v_1, v_2, v_3} to e_i' = {v_1, v_2, v_4}) [87].

Learnable augmentation. This approach utilizes a neural network to generate views [136]. Specifically, HyperGCL [136] generates synthetic hyperedges E' using an HNN-based VAE [65].

Once an augmentation strategy τ : (X, E) ↦ (X', E') is decided, a hypergraph-view pair (G^(1), G^(2)) can be obtained in two ways:
• G^(1) is the original hypergraph with (X, E), and G^(2) is an augmented hypergraph with (X', E'), where (X', E') = τ(X, E) [136].
• Both G^(1) and G^(2) are augmented by applying τ to (X, E) [70]. They likely differ due to the stochastic nature of τ.

Encoding. Then, the message passing on the two views (sharing the same parameters) results in two pairs of node and hyperedge embeddings denoted by (P', Q') and (P'', Q'') [68, 70].

4.2.2 Contrastive loss. Then, we choose a contrastive loss. Below, we present node-, hyperedge-, and membership-level contrastive losses. Here, τ_v, τ_e, τ_m ∈ R are hyperparameters.

Node level. A node-level contrastive loss is used to (i) maximize the similarity between the same node from two different views and (ii) minimize the similarity for different nodes [64, 68, 70, 94, 136]:

  L^(v)(P', P'') = (−1/|V|) Σ_{v_i ∈ V} log [ exp(sim(p_i', p_i'')/τ_v) / Σ_{v_k ∈ V} exp(sim(p_i', p_k'')/τ_v) ],      (18)

where sim(x, y) is the similarity between x and y (e.g., cosine similarity).

Hyperedge level. A hyperedge-level contrastive loss is implemented in a similar manner [68, 70, 80]:

  L^(e)(Q', Q'') = (−1/|E|) Σ_{e_j ∈ E} log [ exp(sim(q_j', q_j'')/τ_e) / Σ_{e_k ∈ E} exp(sim(q_j', q_k'')/τ_e) ].      (19)
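The node-level loss of Eq. (18) is an InfoNCE-style objective, and the hyperedge-level loss of Eq. (19) has the same form. A minimal PyTorch sketch (ours; assuming the two views' embedding matrices are row-aligned) is:

```python
import torch
import torch.nn.functional as F

def info_nce(Z1: torch.Tensor, Z2: torch.Tensor, tau: float = 0.5) -> torch.Tensor:
    """Contrastive loss of Eq. (18)/(19): row i of Z1 and row i of Z2 are the
    same node (or hyperedge) in two views; all other rows act as negatives."""
    Z1 = F.normalize(Z1, dim=1)
    Z2 = F.normalize(Z2, dim=1)
    logits = Z1 @ Z2.T / tau              # pairwise cosine similarities / temperature
    targets = torch.arange(Z1.shape[0])   # the positive pair is the matching row
    return F.cross_entropy(logits, targets)

# Usage: P1, P2 are node embeddings of the two augmented views from the same HNN.
P1, P2 = torch.randn(100, 64), torch.randn(100, 64)
loss = info_nce(P1, P2)
```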
Membership level. A membership-level contrastive loss is used to make the embeddings of incident node-hyperedge pairs distinguishable from those of non-incident pairs across two views [70]:

  L^(m)(P', Q'') = (−1/K) Σ_{e_j ∈ E} Σ_{v_i ∈ V} 1_{i,j} log [ exp(D(p_i', q_j'')/τ_m) / Σ_{v_k ∈ V} exp(D(p_k', q_j'')/τ_m) ]   (when q_j'' is an anchor)
                   − (1/K) Σ_{e_j ∈ E} Σ_{v_i ∈ V} 1_{i,j} log [ exp(D(q_j'', p_i')/τ_m) / Σ_{e_k ∈ E} exp(D(q_k'', p_i')/τ_m) ]   (when p_i' is an anchor),

where 1_{i,j} = I[v_i ∈ e_j], and D(x, y) ∈ R is a discriminator for assigning a higher value to incident pairs than to non-incident pairs [122].

4.2.3 Comparison with GNNs. GNNs are also commonly trained with contrastive objectives [109, 122, 160]. They typically focus on node-level [122] and/or graph-level contrast [109].

4.3 Learning to generate
HNNs can also be trained by learning to generate hyperedges. Existing HNNs aim to generate either (i) ground-truth hyperedges to capture their characteristics or (ii) latent hyperedges potentially beneficial for designated downstream tasks.

4.3.1 Generating ground-truth HOIs. Training neural networks to generate input data has shown strong efficacy in various domains and downstream tasks [45, 102]. In two recent studies, HNNs are trained to generate ground-truth hyperedges to learn HOIs [26, 63]. HypeBoy by Kim et al. [63] formulates hyperedge generation as a hyperedge filling task, where the objective is to identify the missing node for a given subset of a hyperedge. Overall, HypeBoy involves three steps: (i) hypergraph augmentation, (ii) node and hyperedge-subset encoding, and (iii) loss-function computation.

HypeBoy obtains the augmented node feature matrix X' and augmented input topology E', respectively, by randomly masking some entries of X and by randomly dropping some hyperedges from E. HypeBoy then feeds X' and E' into an HNN to obtain the node embedding matrix P. Subsequently, for each node v_i ∈ e_j and subset q_{ij} = e_j \ {v_i}, HypeBoy obtains the (final) node embedding p̃_i = MLP_1(p_i) and subset embedding q̃_{ij} = MLP_2(Σ_{v_k ∈ q_{ij}} p_k). Lastly, the HNN is trained to make embeddings of the 'true' node-subset pairs similar and of the 'false' node-subset pairs dissimilar. Specifically, it minimizes the following loss:

  L = − Σ_{e_j ∈ E} Σ_{v_i ∈ e_j} log [ exp(sim(p̃_i, q̃_{ij})) / Σ_{v_k ∈ V} exp(sim(p̃_k, q̃_{ij})) ],      (20)

where sim(x, y) is a cosine similarity between x and y.

4.3.2 Generating latent HOIs. HNNs can be trained to generate latent hyperedges, especially when (i) (semi-)supervised downstream tasks and (ii) suboptimal input hypergraph structures are assumed. Typically, the training methods let HNNs generate potential, latent hyperedges, which are used for message passing to improve downstream task performance [12, 79, 162, 167].

For example, HSL [12] adopts a learnable augmenter to replace unhelpful hyperedges with generated ones. HSL prunes hyperedges using a masking matrix M ∈ R^{|V|×|E|}. Each j-th column is m_j = sigmoid((log(z_j / (1 − z_j)) + (ε_0 − ε_1)) / τ), where ε_0 and ε_1, τ ∈ R, and z_k ∈ [0, 1] (∀e_k ∈ E) respectively are random samples from Gumbel(0, 1), a hyperparameter, and a learnable scalar. An unhelpful e_k is expected to have a small z_k so that it is pruned.

After performing pruning by Ĥ = H ⊙ M, HSL modifies Ĥ by adding generated latent hyperedges ΔH. Specifically, ΔH_{i,j} = 1 if (H_{i,j} = 0) ∧ (S_{i,j} ∈ top(S, N)), and 0 otherwise. top(S, N) denotes the set of top-N entries in a learnable score matrix S ∈ R^{|V|×|E|}. Each score in S is S_{i,j} = (1/T) Σ_{t=1}^{T} sim(w_t ⊙ p_i, w_t ⊙ q_j), where {w_t}_{t=1}^{T} and sim respectively are learnable vectors and cosine similarity. To summarize, node and hyperedge similarities learned by an HNN serve to generate latent hyperedges ΔH. Lastly, Ĥ + ΔH is fed into another HNN for a target downstream task (e.g., node classification). All learnable components, including the HNN for augmentation, are trained end-to-end.

Note that the HNNs learning to generate latent hyperedges generally implement additional loss functions to encourage the latent hyperedges to be similar to the original ones [162, 167]. Furthermore, some studies have explored generating latent HOIs when input hypergraph structures were not available [37, 57, 172].

4.3.3 Comparison with GNNs. Various GNNs also aim to generate ground-truth pairwise interactions [66, 116] or latent pairwise interactions [31]. In a pairwise graph, the inner product of two node embeddings is widely used to model the likelihood that an edge joins these nodes [31, 66]. However, modeling the likelihood of a hyperedge, which can join any number of nodes, using an inner product is not straightforward.

5 Application Guidance
HNNs have been adopted in various applications, including recommendation, bioinformatics and medical science, time series analysis, and computer vision. Their central concerns involve hypergraph construction and hypergraph learning task formulation.

5.1 Recommendation
5.1.1 Hypergraph construction. For recommender system applications, many studies utilized hypergraphs consisting of item nodes (being recommended) and user hyperedges (receiving recommendations). For instance, all items that a user interacted with were connected by a hyperedge [128]. When sessions were available, hyperedges connected item nodes by their context window [83, 129, 142]. Some studies leveraged multiple hypergraphs. For instance, Zhang et al. [163] incorporated user- and group-level hypergraphs. Ji et al. [55] constructed a hypergraph with item nodes and a hypergraph with user nodes, where their hyperedges were inferred from heuristic-based algorithms. In contrast, other studies incorporated learnable hypergraph structures [140, 141].

5.1.2 Application tasks. Hypergraph-based modeling allows natural applications of HNNs for recommendation, typically formulated as a hyperedge prediction problem. HNNs have been used for sequential [82, 128], session-based [83, 129, 142], group [56, 163], conversational [168], and point-of-interest [69] recommendation.
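To illustrate the construction described in Sec. 5.1.1 (our sketch; the interaction-log format and names are assumptions), user hyperedges over item nodes can be built directly from a user-item interaction log and stored as an incidence matrix.

```python
import numpy as np

def user_hyperedges(interactions: list[tuple[int, int]], num_items: int) -> np.ndarray:
    """Build an item-node / user-hyperedge incidence matrix from (user, item) pairs.

    Each user becomes one hyperedge connecting all items that the user interacted
    with, as in the user-hyperedge construction discussed above.
    """
    users = sorted({u for u, _ in interactions})
    col = {u: j for j, u in enumerate(users)}
    H = np.zeros((num_items, len(users)), dtype=int)
    for u, i in interactions:
        H[i, col[u]] = 1
    return H

# Toy log: 3 users interacting with 5 items; hyperedge prediction would then ask
# which unobserved (item, user-hyperedge) memberships are likely.
log = [(0, 1), (0, 3), (1, 0), (1, 3), (1, 4), (2, 2)]
H = user_hyperedges(log, num_items=5)
print(H.shape)  # (5, 3)
```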
References et al. 2021. Hypergraph models of biological networks to identify genes critical
[1] Shivam Agarwal, Ramit Sawhney, Megh Thakkar, Preslav Nakov, Jiawei Han, to pathogenic viral response. BMC bioinformatics 22, 1 (2021), 287.
and Tyler Derr. 2022. Think: Temporal hypergraph hyperbolic network. In [33] Yifan Feng, Jiashu Han, Shihui Ying, and Yue Gao. 2024. Hypergraph isomor-
ICDM. phism computation. IEEE Transactions on Pattern Analysis & Machine Intelligence
[2] Ryan Aponte, Ryan A Rossi, Shunan Guo, Jane Hoffswell, Nedim Lipka, Chang 01 (2024), 1–17.
Xiao, Gromit Chan, Eunyee Koh, and Nesreen Ahmed. 2022. A hypergraph [34] Yifan Feng, Haoxuan You, Zizhao Zhang, Rongrong Ji, and Yue Gao. 2019.
neural network framework for learning hyperedge-dependent node embeddings. Hypergraph neural networks. In AAAI.
arXiv preprint arXiv:2212.14077 (2022). [35] Giorgio Gallo, Giustino Longo, Stefano Pallottino, and Sang Nguyen. 1993.
[3] Jimmy Lei Ba, Jamie Ryan Kiros, and Geoffrey E Hinton. 2016. Layer normaliza- Directed hypergraphs and applications. Discrete applied mathematics 42, 2-3
tion. arXiv preprint arXiv:1607.06450 (2016). (1993), 177–201.
[4] Junjie Bai, Biao Gong, Yining Zhao, Fuqiang Lei, Chenggang Yan, and Yue Gao. [36] Yue Gao, Yifan Feng, Shuyi Ji, and Rongrong Ji. 2022. HGNN+: General hy-
2021. Multi-scale representation learning on hypergraph for 3D shape retrieval pergraph neural networks. IEEE Transactions on Pattern Analysis & Machine
and recognition. IEEE Transactions on Image Processing 30 (2021), 5327–5338. Intelligence 45, 3 (2022), 3181–3199.
[5] Song Bai, Feihu Zhang, and Philip HS Torr. 2021. Hypergraph convolution and [37] Yue Gao, Zizhao Zhang, Haojie Lin, Xibin Zhao, Shaoyi Du, and Changqing
hypergraph attention. Pattern Recognition 110 (2021), 107637. Zou. 2020. Hypergraph learning: Methods and practices. IEEE Transactions on
[6] Federico Battiston, Enrico Amico, Alain Barrat, Ginestra Bianconi, Guilherme Pattern Analysis & Machine Intelligence 44, 5 (2020), 2548–2566.
Ferraz de Arruda, Benedetta Franceschiello, Iacopo Iacopini, Sonia Kéfi, Vito [38] Johannes Gasteiger, Aleksandar Bojchevski, and Stephan Günnemann. 2019.
Latora, Yamir Moreno, et al. 2021. The physics of higher-order interactions in Predict then propagate: Graph neural networks meet personalized pagerank. In
complex systems. Nature Physics 17, 10 (2021), 1093–1098. ICLR.
[7] Federico Battiston, Giulia Cencetti, Iacopo Iacopini, Vito Latora, Maxime Lucas, [39] Justin Gilmer, Samuel S Schoenholz, Patrick F Riley, Oriol Vinyals, and George E
Alice Patania, Jean-Gabriel Young, and Giovanni Petri. 2020. Networks beyond Dahl. 2017. Neural message passing for quantum chemistry. In ICML.
pairwise interactions: Structure and dynamics. Physics Reports 874 (2020), 1–92. [40] Liyu Gong and Qiang Cheng. 2019. Exploiting edge features for graph neural
[8] Federico Battiston and Giovanni Petri. 2022. Higher-order systems. Springer. networks. In CVPR.
[9] Tatyana Benko, Martin Buck, Ilya Amburg, Stephen J Young, and Sinan G Aksoy. [41] Aditya Grover and Jure Leskovec. 2016. node2vec: Scalable feature learning for
2024. Hypermagnet: A magnetic laplacian based hypergraph neural network. networks. In KDD.
arXiv preprint arXiv:2402.09676 (2024). [42] Yan Han, Peihao Wang, Souvik Kundu, Ying Ding, and Zhangyang Wang. 2023.
[10] Austin R Benson, Ravi Kumar, and Andrew Tomkins. 2018. Sequences of sets. Vision hgnn: An image is more than a graph of nodes. In ICCV.
In KDD. [43] Xiaoke Hao, Jiawang Li, Mingming Ma, Jing Qin, Daoqiang Zhang, Feng Liu,
[11] Ginestra Bianconi. 2021. Higher-order networks. Cambridge University Press. Alzheimer’s Disease Neuroimaging Initiative, et al. 2024. Hypergraph convolu-
[12] Derun Cai, Moxian Song, Chenxi Sun, Baofeng Zhang, Shenda Hong, and tional network for longitudinal data analysis in Alzheimer’s disease. Computers
Hongyan Li. 2022. Hypergraph structure learning for hypergraph neural net- in Biology and Medicine 168 (2024), 107765.
works. In IJCAI. [44] Mikhail Hayhoe, Hans Matthew Riess, Michael M Zavlanos, Victor Preciado,
[13] Derun Cai, Chenxi Sun, Moxian Song, Baofeng Zhang, Shenda Hong, and and Alejandro Ribeiro. 2023. Transferable Hypergraph Neural Networks via
Hongyan Li. 2022. Hypergraph contrastive learning for electronic health records. Spectral Similarity. In LoG.
In SDM. [45] Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, and Ross Gir-
[14] Hongmin Cai, Zhixuan Zhou, Defu Yang, Guorong Wu, and Jiazhou Chen. 2023. shick. 2022. Masked autoencoders are scalable vision learners. In CVPR.
Discovering Brain Network Dysfunction in Alzheimer’s Disease Using Brain [46] Lun Hu, Menglong Zhang, Pengwei Hu, Jun Zhang, Chao Niu, Xueying Lu,
Hypergraph Neural Network. In MICCAI. Xiangrui Jiang, and Yupeng Ma. 2024. Dual-channel hypergraph convolutional
[15] Lang Chai, Lilan Tu, Xianjia Wang, and Qingqing Su. 2024. Hypergraph model- network for predicting herb–disease associations. Briefings in Bioinformatics 25,
ing and hypergraph multi-view attention neural network for link prediction. 2 (2024), bbae067.
Pattern Recognition 149 (2024), 110292. [47] Buzhen Huang, Jingyi Ju, Zhihao Li, and Yangang Wang. 2023. Reconstructing
[16] Can Chen, Chen Liao, and Yang-Yu Liu. 2023. Teasing out missing reactions groups of people with hypergraph relational reasoning. In ICCV.
in genome-scale metabolic networks through hypergraph learning. Nature [48] Jie Huang, Chuan Chen, Fanghua Ye, Jiajing Wu, Zibin Zheng, and Guohui
Communications 14, 1 (2023), 2375. Ling. 2019. Hyper2vec: Biased random walk for hyper-network embedding. In
[17] Eli Chien, Chao Pan, Jianhao Peng, and Olgica Milenkovic. 2022. You are allset: DASFAA 2019 International Workshops: BDMS, BDQM, and GDMA.
A multiset function framework for hypergraph neural networks. In ICLR. [49] Jing Huang and Jie Yang. 2021. Unignn: a unified framework for graph and
[18] Eli Chien, Jianhao Peng, Pan Li, and Olgica Milenkovic. 2020. Adaptive Universal hypergraph neural networks. In IJCAI.
Generalized PageRank Graph Neural Network. In ICLR. [50] Xingyue Huang, Miguel Romero Orth, Pablo Barceló, Michael M Bronstein, and
[19] Uthsav Chitra and Benjamin Raphael. 2019. Random walks on hypergraphs İsmail İlkan Ceylan. 2024. Link prediction with relational hypergraphs. arXiv
with edge-dependent vertex weights. In ICML. preprint arXiv:2402.04062 (2024).
[20] Minyoung Choe, Sunwoo Kim, Jaemin Yoo, and Kijung Shin. 2023. Classification [51] Hyunjin Hwang, Seungwoo Lee, Chanyoung Park, and Kijung Shin. 2022. Ahp:
of edge-dependent labels of nodes in hypergraphs. In KDD. Learning to negative sample for hyperedge prediction. In SIGIR.
[21] Hejie Cui, Xinyu Fang, Ran Xu, Xuan Kan, Joyce C Ho, and Carl Yang. 2024. [52] Iacopo Iacopini, Giovanni Petri, Andrea Baronchelli, and Alain Barrat. 2022.
Multimodal fusion of ehr in structures and semantics: Integrating clinical records Group interactions modulate critical mass dynamics in social convention. Com-
and notes with hypergraph and llm. arXiv preprint arXiv:2403.08818 (2024). munications Physics 5, 1 (2022), 64.
[22] Guilherme Ferraz de Arruda, Giovanni Petri, and Yamir Moreno. 2020. Social [53] Ashish Jaiswal, Ashwin Ramesh Babu, Mohammad Zaki Zadeh, Debapriya
contagion models on hypergraphs. Physical Review Research 2, 2 (2020), 023032. Banerjee, and Fillia Makedon. 2020. A survey on contrastive self-supervised
[23] Manh Tuan Do and Kijung Shin. 2024. Unsupervised alignmnet of hypergraphs learning. Technologies 9, 1 (2020), 2.
with different scales. In KDD. [54] Junzhong Ji, Yating Ren, and Minglong Lei. 2022. FC–HAT: Hypergraph atten-
[24] Manh Tuan Do, Se-eun Yoon, Bryan Hooi, and Kijung Shin. 2020. Structural tion network for functional brain network classification. Information Sciences
patterns and generative models of real-world hypergraphs. In KDD. 608 (2022), 1301–1316.
[25] Yihe Dong, Will Sawin, and Yoshua Bengio. 2020. Hnhn: Hypergraph networks [55] Shuyi Ji, Yifan Feng, Rongrong Ji, Xibin Zhao, Wanwan Tang, and Yue Gao. 2020.
with hyperedge neurons. In ICML Workshop: Graph Representation Learning and Dual channel hypergraph collaborative filtering. In KDD.
Beyond. [56] Renqi Jia, Xiaofei Zhou, Linhua Dong, and Shirui Pan. 2021. Hypergraph
[26] Boxin Du, Changhe Yuan, Robert Barton, Tal Neiman, and Hanghang Tong. convolutional network for group recommendation. In ICDM.
2022. Self-supervised hypergraph representation learning. In Big Data. [57] Jianwen Jiang, Yuxuan Wei, Yifan Feng, Jingxuan Cao, and Yue Gao. 2019.
[27] Dheeru Dua, Casey Graff, et al. 2017. Uci machine learning repository. (2017). Dynamic hypergraph neural networks.. In IJCAI.
[28] Iulia Duta, Giulia Cassarà, Fabrizio Silvestri, and Pietro Liò. 2023. Sheaf hyper- [58] Nicolas Keriven and Gabriel Peyré. 2019. Universal invariant and equivariant
graph networks. In NeurIPS. graph neural networks. In NeurIPS.
[29] Vijay Prakash Dwivedi, Anh Tuan Luu, Thomas Laurent, Yoshua Bengio, and [59] Jinwoo Kim, Saeyoon Oh, Sungjun Cho, and Seunghoon Hong. 2022. Equivariant
Xavier Bresson. 2022. Graph neural networks with learnable structural and hypergraph neural networks. In ECCV.
positional representations. In ICLR. [60] Jinwoo Kim, Saeyoon Oh, and Seunghoon Hong. 2021. Transformers generalize
[30] Paul Expert and Giovanni Petri. 2022. Higher-order description of brain function. deepsets and can be extended to graphs & hypergraphs. In NeurIPS.
In Higher-Order Systems. Springer, 401–415. [61] Sunwoo Kim, Fanchen Bu, Minyoung Choe, Jaemin Yoo, and Kijung Shin. 2023.
[31] Bahare Fatemi, Layla El Asri, and Seyed Mehran Kazemi. 2021. Slaps: Self- How transitive are real-world group interactions?-Measurement and reproduc-
supervision improves structure learning for graph neural networks. In NeurIPS. tion. In KDD.
[32] Song Feng, Emily Heath, Brett Jefferson, Cliff Joslyn, Henry Kvinge, Hugh D [62] Sunwoo Kim, Minyoung Choe, Jaemin Yoo, and Kijung Shin. 2023. Reciprocity
Mitchell, Brenda Praggastis, Amie J Eisfeld, Amy C Sims, Larissa B Thackray, in directed hypergraphs: measures, findings, and generators. Data Mining and
contrastive learning. Briefings in Bioinformatics 24, 6 (2023), bbad371. convolutional networks on hypergraphs. In NeurIPS.
[119] Lev Telyatnikov, Maria Sofia Bucarelli, Guillermo Bernardez, Olga Zaghen, Si- [149] Naganand Yadati, Vikram Nitin, Madhav Nimishakavi, Prateek Yadav, Anand
mone Scardapane, and Pietro Lio. 2023. Hypergraph neural networks through Louis, and Partha Talukdar. 2020. NHP: Neural hypergraph link prediction. In
the lens of message passing: a common perspective to homophily and architec- CIKM.
ture design. arXiv preprint arXiv:2310.07684 (2023). [150] Jielong Yan, Yifan Feng, Shihui Ying, and Yue Gao. 2024. Hypergraph dynamic
[120] Loc Hoang Tran and Linh Hoang Tran. 2020. Directed hypergraph neural system. In ICLR.
network. arXiv preprint arXiv:2008.03626 (2020). [151] Yuguang Yan, Yuanlin Chen, Shibo Wang, Hanrui Wu, and Ruichu Cai. 2024.
[121] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Hypergraph Joint Representation Learning for Hypervertices and Hyperedges
Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you via Cross Expansion. In AAAI.
need. In NeurIPS. [152] Yichao Yan, Jie Qin, Jiaxin Chen, Li Liu, Fan Zhu, Ying Tai, and Ling Shao. 2020.
[122] Petar Veličković, William Fedus, William L Hamilton, Pietro Liò, Yoshua Bengio, Learning multi-granular hypergraphs for video-based person re-identification.
and R Devon Hjelm. 2019. Deep graph infomax. ICLR. In CVPR.
[123] Clement Vignac, Andreas Loukas, and Pascal Frossard. 2020. Building powerful [153] Chaoqi Yang, Ruijie Wang, Shuochao Yao, and Tarek Abdelzaher. 2022. Hyper-
and equivariant graph neural networks with structural message-passing. In graph learning with line expansion. In CIKM.
NeurIPS. [154] Chaoqi Yang, Ruijie Wang, Shuochao Yao, and Tarek Abdelzaher. 2022. Semi-
[124] Gourav Wadhwa, Abhinav Dhall, Subrahmanyam Murala, and Usman Tariq. supervised hypergraph node classification on hypergraph line expansion. In
2021. Hyperrealistic image inpainting with hypergraphs. In WACV. CIKM.
[125] Changlin Wan, Muhan Zhang, Wei Hao, Sha Cao, Pan Li, and Chi Zhang. 2021. [155] Zhilin Yang, William Cohen, and Ruslan Salakhudinov. 2016. Revisiting semi-
Principled hyperedge prediction with structural spectral features and neural supervised learning with graph embeddings. In ICML.
networks. arXiv preprint arXiv:2106.04292 (2021). [156] Jaehyuk Yi and Jinkyoo Park. 2020. Hypergraph convolutional recurrent neural
[126] Fuli Wang, Karelia Pena-Pena, Wei Qian, and Gonzalo R Arce. 2024. T-hypergnns: network. In KDD.
Hypergraph neural networks via tensor representations. IEEE Transactions on [157] Jaemin Yoo, Hyunsik Jeon, Jinhong Jung, and U Kang. 2022. Accurate node
Neural Networks and Learning Systems (2024). feature estimation with structured variational graph autoencoder. In KDD.
[127] Haorui Wang, Haoteng Yin, Muhan Zhang, and Pan Li. 2022. Equivariant and [158] Se-eun Yoon, Hyungseok Song, Kijung Shin, and Yung Yi. 2020. How much and
Stable Positional Encoding for More Powerful Graph Neural Networks. In ICLR. when do we need higher-order information in hypergraphs? a case study on
[128] Jianling Wang, Kaize Ding, Liangjie Hong, Huan Liu, and James Caverlee. 2020. hyperedge prediction. In WWW.
Next-item recommendation with sequential hypergraphs. In SIGIR. [159] Jiaxuan You, Jonathan M Gomes-Selman, Rex Ying, and Jure Leskovec. 2021.
[129] Jianling Wang, Kaize Ding, Ziwei Zhu, and James Caverlee. 2021. Session-based Identity-aware graph neural networks. In AAAI.
recommendation with hypergraph attention networks. In SDM. [160] Yuning You, Tianlong Chen, Yongduo Sui, Ting Chen, Zhangyang Wang, and
[130] Junqi Wang, Hailong Li, Gang Qu, Kim M Cecil, Jonathan R Dillman, Nehal A Yang Shen. 2020. Graph contrastive learning with augmentations. In NeurIPS.
Parikh, and Lili He. 2023. Dynamic weighted hypergraph convolutional network [161] Shan Yu, Hongdian Yang, Hiroyuki Nakahara, Gustavo S Santos, Danko Nikolić,
for brain functional connectome analysis. Medical Image Analysis 87 (2023), and Dietmar Plenz. 2011. Higher-order interactions characterized in cortical
102828. activity. Journal of neuroscience 31, 48 (2011), 17514–17526.
[131] Maolin Wang, Yaoming Zhen, Yu Pan, Zenglin Xu, Ruocheng Guo, and Xiangyu [162] Jiying Zhang, Yuzhao Chen, Xi Xiao, Runiu Lu, and Shu-Tao Xia. 2022. Learnable
Zhao. 2024. Tensorized hypergraph neural networks. In SDM. hypergraph laplacian for hypergraph learning. In ICASSP.
[132] Peihao Wang, Shenghao Yang, Yunyu Liu, Zhangyang Wang, and Pan Li. 2023. [163] Junwei Zhang, Min Gao, Junliang Yu, Lei Guo, Jundong Li, and Hongzhi Yin. 2021.
Equivariant hypergraph diffusion neural operators. In ICLR. Double-scale self-supervised hypergraph learning for group recommendation.
[133] Shun Wang, Yong Zhang, Xuanqi Lin, Yongli Hu, Qingming Huang, and Baocai In CIKM.
Yin. 2024. Dynamic Hypergraph Structure Learning for Multivariate Time Series [164] Jiying Zhang, Fuyang Li, Xi Xiao, Tingyang Xu, Yu Rong, Junzhou Huang, and
Forecasting. IEEE Transactions on Big Data 01 (2024), 1–13. Yatao Bian. 2022. Hypergraph convolutional networks via equivalency between
[134] Wei Wang, Gaolin Yuan, Shitong Wan, Ziwei Zheng, Dong Liu, Hongjun Zhang, hypergraphs and undirected graphs. ICML Workshop on Topology, Algebra, and
Juntao Li, Yun Zhou, and Xianfang Wang. 2024. A granularity-level information Geometry in Machine Learning.
fusion strategy on hypergraph transformer for predicting synergistic effects of [165] Muhan Zhang and Yixin Chen. 2018. Link prediction based on graph neural
anticancer drugs. Briefings in Bioinformatics 25, 1 (2024), bbad522. networks. In NeurIPS.
[135] Yuxin Wang, Quan Gan, Xipeng Qiu, Xuanjing Huang, and David Wipf. 2023. [166] Ruochi Zhang, Yuesong Zou, and Jian Ma. 2020. Hyper-SAGNN: a self-attention
From hypergraph energy functions to hypergraph neural networks. In ICML. based graph neural network for hypergraphs. In ICLR.
[136] Tianxin Wei, Yuning You, Tianlong Chen, Yang Shen, Jingrui He, and Zhangyang [167] Zizhao Zhang, Yifan Feng, Shihui Ying, and Yue Gao. 2022. Deep hypergraph
Wang. 2022. Augmentations in hypergraph contrastive learning: Fabricated and structure learning. arXiv preprint arXiv:2208.12547 (2022).
generative. In NeurIPS. [168] Sen Zhao, Wei Wei, Xian-Ling Mao, Shuai Zhu, Minghui Yang, Zujie Wen,
[137] Felix Wu, Amauri Souza, Tianyi Zhang, Christopher Fifty, Tao Yu, and Kilian Dangyang Chen, and Feida Zhu. 2023. Multi-view hypergraph contrastive
Weinberger. 2019. Simplifying graph convolutional networks. In ICML. policy learning for conversational recommendation. In SIGIR.
[138] Jialun Wu, Kai He, Rui Mao, Chen Li, and Erik Cambria. 2023. MEGACare: [169] Yusheng Zhao, Xiao Luo, Wei Ju, Chong Chen, Xian-Sheng Hua, and Ming Zhang.
Knowledge-guided multi-view hypergraph predictive framework for healthcare. 2023. Dynamic hypergraph structure learning for traffic flow forecasting. In
Information Fusion 100 (2023), 101939. ICDE.
[139] Jinming Wu, Qi Qi, Jingyu Wang, Haifeng Sun, Zhikang Wu, Zirui Zhuang, [170] Yijia Zheng and Marcel Worring. 2024. Co-Representation Neural Hyper-
and Jianxin Liao. 2023. Not only pairwise relationships: fine-grained relational graph Diffusion for Edge-Dependent Node Classification. arXiv preprint
modeling for multivariate time series forecasting. In IJCAI. arXiv:2405.14286 (2024).
[140] Lianghao Xia, Chao Huang, Yong Xu, Jiashu Zhao, Dawei Yin, and Jimmy Huang. [171] Luo Zhezheng, Mao Jiayuan, Tenenbaum Joshua B., and Kaelbling Leslie, Pack.
2022. Hypergraph contrastive collaborative filtering. In SIGIR. 2023. On the expressiveness and generalization of hypergraph neural networks.
[141] Lianghao Xia, Chao Huang, and Chuxu Zhang. 2022. Self-supervised hypergraph In LoG.
transformer for recommender systems. In KDD. [172] Peng Zhou, Zongqian Wu, Xiangxiang Zeng, Guoqiu Wen, Junbo Ma, and
[142] Xin Xia, Hongzhi Yin, Junliang Yu, Qinyong Wang, Lizhen Cui, and Xiangliang Xiaofeng Zhu. 2023. Totally dynamic hypergraph neural network. In IJCAI.
Zhang. 2021. Self-supervised hypergraph convolutional networks for session- [173] Xue Zhou, Bei Hui, Ilana Zeira, Hao Wu, and Ling Tian. 2023. Dynamic relation
based recommendation. In AAAI. learning for link prediction in knowledge hypergraphs. Applied Intelligence 53,
[143] Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. 2019. How powerful 22 (2023), 26580–26591.
are graph neural networks?. In ICLR. [174] Yuxuan Zhou, Zhi-Qi Cheng, Chao Li, Yanwen Fang, Yifeng Geng, Xuansong
[144] Ran Xu, Mohammed K Ali, Joyce C Ho, and Carl Yang. 2023. Hypergraph Xie, and Margret Keuper. 2022. Hypergraph transformer for skeleton-based
transformers for ehr-based clinical predictions. AMIA Summits on Translational action recognition. arXiv preprint arXiv:2211.09590 (2022).
Science Proceedings 2023 (2023), 582. [175] Fanglin Zhu, Shunyu Chen, Yonghui Xu, Wei He, Fuqiang Yu, Xu Zhang, and
[145] Ran Xu, Yue Yu, Chao Zhang, Mohammed K Ali, Joyce C Ho, and Carl Yang. Lizhen Cui. 2022. Temporal hypergraph for personalized clinical pathway
2022. Counterfactual and factual reasoning over hypergraphs for interpretable recommendation. In BIBM.
clinical predictions on ehr. In ML4H. [176] Zhaocheng Zhu, Zuobai Zhang, Louis-Pascal Xhonneux, and Jian Tang. 2021.
[146] Xixia Xu, Qi Zou, and Xue Lin. 2022. Adaptive hypergraph neural network for Neural bellman-ford networks: A general graph neural network framework for
multi-person pose estimation. In AAAI. link prediction. In NeurIPS.
[147] Naganand Yadati. 2020. Neural message passing for multi-relational ordered [177] Minhao Zou, Zhongxue Gan, Yutong Wang, Junheng Zhang, Dongyan Sui,
and recursive hypergraphs. In NeurIPS. Chun Guan, and Siyang Leng. 2024. Unig-encoder: A universal feature encoder
[148] Naganand Yadati, Madhav Nimishakavi, Prateek Yadav, Vikram Nitin, Anand for graph and hypergraph node classification. Pattern Recognition 147 (2024),
Louis, and Partha Talukdar. 2019. Hypergcn: A new method for training graph 110115.