Abstract—Efficiently processing large-scale graphs for information retrieval tasks presents a formidable hurdle, demanding innovative solutions for enhancing user experiences. This paper introduces a framework that merges attention-based graph summarization with state-of-the-art graph sampling methods tailored explicitly for large-scale graph processing and information retrieval applications, all aimed at enriching user experiences. Our approach distinguishes itself through its adeptness in efficiently handling vast graph datasets, leveraging robust sampling techniques and attention mechanisms to enhance feature extraction. Central to our methodology is the utilization of graph summarization techniques, which focus on distilling pertinent information, thereby enhancing both the accuracy and computational efficiency of information retrieval and recommendation tasks. Through practical demonstrations, notably within academic databases, our framework showcases its effectiveness in real-world scenarios, offering a significant advancement in the realm of personal technology data management and information retrieval systems.

Index Terms—Attention Mechanism, Graph Summarization, Information Retrieval, Variational Graph Autoencoders

This work was supported in part by the Centre for Applied Artificial Intelligence at Macquarie University and in part by Australian Research Council Projects under Grant LP210301259 and Grant DP230100899 (Corresponding authors: Nasrin Shabani; Amin Beheshti).
Nasrin Shabani, Amin Beheshti, Jia Wu, Venus Haghighi, and Jin Foo are with the School of Computing, Macquarie University, NSW 2109, Australia. Emails: {nasrin.shabani@hdr, amin.beheshti, jia.wu, venus.haghighi@hdr, eujin.foo@hdr}.mq.edu.au.
Alireza Jolfaei is with the College of Science and Engineering, Flinders University, SA 5042, Australia. Email: [email protected].
Maryam Khanian Najafabadi is with the School of Computer Science, University of Sydney, NSW 2050, Australia. Email: [email protected].
Manuscript received ..., 2024; revised ..., 2024.

I. INTRODUCTION

In the digital landscape of academia, the quest for efficient Information Retrieval (IR) methods is ever-evolving. IR, the process of accessing and retrieving relevant information from vast collections of data, is a cornerstone of academic research [1]. Traditionally, scholars have relied on tabular data as a means to organize and present information. However, in recent years, there has been a paradigm shift towards utilizing graphs as a more dynamic and intuitive way to represent relationships and connections within datasets. This transition from tables to graphs not only enhances the user experience in digital environments for academics but also opens up new avenues for exploration and analysis [2]. However, the continuous growth of highly interconnected datasets, marked by structures that are both extensive and intricately detailed, presents a significant challenge. Extracting meaningful insights from these datasets demands the development of advanced processing and analytical techniques. One particularly promising approach that has surfaced to enhance user experience in querying and understanding these complex relationships is graph summarization [3].

Graph summarization comprises a collection of algorithms tailored to specific applications, aimed at condensing intricate graphs into more succinct representations [4]. This transformation aims to preserve essential structural patterns, query answers, or specific property distributions, ensuring that users can interact with the data more intuitively [5]. The key goals of graph summarization in the context of user experience encompass minimizing graph data volumes, accelerating graph query evaluation, and improving graph visualization. These objectives contribute to facilitating smoother interactions with the underlying data.

Traditionally, graph summarization has relied on conventional machine learning methods or graph-structured queries, such as degree, adjacency, or eigenvector centrality. These approaches have been utilized in tasks such as node clustering, graph sampling, and subgraph extraction [6]. Node clustering groups similar nodes together based on certain criteria, simplifying the representation of intricate structures [7]. Graph sampling involves selecting a subset of nodes or edges that preserves the essential characteristics of the entire graph [8]. Subgraph extraction, on the other hand, identifies and isolates relevant portions of the graph that capture specific patterns or relationships [9]. While these conventional methods have demonstrated effectiveness to a certain extent, they face challenges such as computational intensity and a significant demand for memory storage.

In response, deep learning, and more specifically Graph Neural Networks (GNNs), have emerged as promising alternatives. GNNs are designed to capture intricate relationships and dependencies within graph-structured data, making them well-suited for tasks like graph summarization [6]. Unlike traditional methods that rely on handcrafted features or query-based approaches, GNNs learn representations directly from the graph structure. One notable category of GNNs is Variational Graph Autoencoders (VGAEs) [10], which fall under the broader umbrella of generative models. VGAEs extend traditional autoencoders to graph-structured data, combining the power of deep learning with generative modeling [11].
VGAEs aim to learn a latent representation of the graph, effectively summarizing its essential features, offering efficiency for large and complex graphs [12]. This integration holds potential in IR, where understanding relationships within graph structures is crucial. Current approaches in graph-based IR often use traditional methods [13] or graph representation models [14]. However, they face several notable gaps, such as scalability, limited semantic understanding, and the inability to effectively capture nuanced relationships. VGAEs address these limitations by providing a sophisticated mechanism for learning latent representations, thereby offering a promising solution to enhance the capabilities of graph-based IR systems.

A. Motivating Scenario

Consider a graduate student embarking on a research project in the field of computer science. Their goal is to explore the latest advancements in Natural Language Processing (NLP) techniques for sentiment analysis. Traditionally, digital tools would navigate through tabular datasets, searching through rows and columns to gather relevant papers. However, as the field of NLP continues to evolve rapidly, the need for a more intuitive and dynamic approach to IR becomes apparent. Graph-based representations offer a compelling alternative, allowing researchers to visualize intricate relationships between papers, authors, keywords, and citation networks. Imagine our graduate student, armed with a powerful graph-based IR system, exploring the interconnected web of NLP research literature. With a few clicks, they can uncover influential papers, identify key authors and research trends, and navigate through citation networks with ease. Yet, as the volume and complexity of academic datasets grow, traditional graph processing techniques struggle to keep pace. Large-scale graphs present a formidable challenge, demanding innovative solutions to enhance user experiences and unlock the full potential of graph-based IR systems. It is within this context that our framework emerges: a fusion of attention-based graph summarization and cutting-edge graph sampling methods, tailored for large-scale graph processing and IR applications.

B. Contributions

Our research is dedicated to the development of a robust approach for the efficient processing of large-scale graphs, with a particular focus on enhancing IR. Building upon our previous work [15], we present a novel extension aimed at further improving the capabilities of our innovative interactive visualization dashboard, GraphSUM. This extension incorporates additional datasets, refines our methodology, introduces a recommendation feature to the IR module, and incorporates users' feedback to enhance the evaluation process. Leveraging advanced techniques, such as attention mechanisms integrated with VGAE, our approach aims to provide a comprehensive solution for knowledge extraction and representation. Our key contributions include:
• Attention Mechanisms with VGAE: Integrating attention mechanisms with VGAE, emphasizing crucial relationships within the graph to ensure that the extracted subgraphs are not only relevant but also enriched with important contextual information.
• Scalability for Large Graphs: Recognizing the challenges posed by large-scale graphs, prioritizing scalability by leveraging advanced graph sampling techniques, and ensuring efficient processing and extraction of subgraphs from extensive datasets.
• Support for IR: Enabling users to pose queries and obtain insightful responses by navigating through the extracted subgraphs, and finally receiving accurate recommendations to proceed further, enhancing the interpretability and utility of the system for end-users.
• Interactive Dashboard: Incorporating an interactive dashboard in GraphSUM, fostering user-friendly exploration of large-scale graphs and enabling dynamic interaction with extracted subgraphs, providing visual, AI-driven natural language, and statistical insights.

The remainder of the paper is structured as follows: Section II provides background and a literature review. Section III formalizes the problem. In Section IV, we detail our approach. Section V presents experimental results, while Section VI outlines a conducted user study. Finally, Section VII summarizes findings and suggests future research avenues.

II. BACKGROUND AND RELATED WORK

A. Graph Summarization

The complex task of summarizing graphs, capturing their essence while reducing their size, has led to diverse approaches. For a more in-depth exploration, we encourage readers to refer to the comprehensive surveys on graph summarization [3], [4], [6].

Traditional approaches in graph summarization rely on conventional machine learning or graph-structured queries (e.g., degree, adjacency, eigenvector centrality), categorized into clustering-based, statistical inference, and optimization-based methods [3]. Clustering-based approaches simplify graphs by grouping similar nodes [16], while statistical inference techniques identify graph patterns and sample subsets of nodes/edges [17]. Optimization-based graph summarization methods, like Kang et al.'s [14], aim to efficiently capture essential graph features while addressing specific goals. Their personalized approach tailors the summary graph to a target node within a specified space budget. However, scalability issues may arise for very large graphs due to increased computational demands. Deep learning-based approaches offer promising solutions to address these challenges. Leveraging neural networks like Recurrent Graph Neural Networks (RecGNNs), Convolutional Graph Neural Networks (ConvGNNs), Graph Autoencoders (GAEs), and Graph Attention Networks (GATs), these methods can capture complex patterns and dependencies within graphs [6]. RecGNNs capture temporal dependencies, ConvGNNs handle local patterns, GAEs learn low-dimensional representations preserving structural information, and GATs aggregate information with attention mechanisms [18]. Moreover, deep generative-based approaches, such as VGAEs, combine variational autoencoders with graph structures, enabling the reconstruction of the input graph while maintaining its inherent structure [19].
In our research, we embrace VGAE for graph summarization, harnessing its capacity to extract meaningful representations and preserve crucial structural properties throughout the summarization process.

B. Graph-based IR

IR techniques deal with the retrieval of relevant information from large collections of data, and involve techniques for searching, retrieving, and ranking data points based on their relevance to a user's query [13]. Traditional IR approaches typically view the knowledge graph as a textual corpus. These methods prioritize efficient information retrieval but often lack a deep understanding of the full syntactic or semantic structure of the query [20]. Consequently, they encounter challenges such as scalability issues and difficulty in handling structured data and uncertainty. On the other hand, neural network-based approaches employ deep learning to map natural language queries to graph information, focusing on handling complex queries and capturing entity relationships. For instance, Shao et al. [21] introduced frameworks for sequence labeling, improving query understanding by assigning semantic labels to words or phrases within search queries. Mao et al. [22] introduced a GNN-based approach for item tagging, treating it as a link prediction task for IR and leveraging query logs to construct a query-item-tag tripartite graph. Meanwhile, Li et al. [23] employed a GAT-based approach with edge weights for question answering over knowledge graphs. Although the aforementioned methods enhance the efficiency and effectiveness of IR over knowledge graphs, challenges still remain in terms of scalability and robustness to noisy or incomplete information.

Recently, researchers have employed diverse graph summarization techniques to boost the precision and efficiency of graph-based IR, constituting another facet of related work in this domain. For example, Li et al. [24] introduced a graph summarization technique, combining RecGNNs and ConvGNNs for improved accuracy by emphasizing relation relevance in embeddings. In another work, Jalota et al. [25] proposed a framework named LAUREN that incorporates established graph summarization methods, accompanied by filtering techniques, and investigates their effects on systems for question answering over graphs. Safavi et al. [5] suggested personalized knowledge graph summarization, seeking to provide a sparse summary of a large knowledge graph tailored to each user's interests. While these endeavours enhance precision, they face scalability issues when dealing with massive graphs.

Our GraphSUM approach diverges from question-centric methods, introducing text-centric queries and generating graph summaries from academic data, including titles and abstracts. Unlike systems with predefined queries [24], GraphSUM allows dynamic exploration, aligning closely with evolving researcher needs for an intuitive exploration of complex knowledge graphs. Furthermore, the integration of advanced techniques such as VGAE as a generative model and advanced graph sampling enhances the impact of GraphSUM. This combination of techniques enhances the overall performance and utility of GraphSUM in addressing the challenges of large-scale and intricate graph structures.

III. PROBLEM DEFINITION

In this section, we first briefly provide preliminary definitions and then formalize the problem studied in this work.

Definition 1 (Graph): A graph, denoted as G = (V, E), comprises a set of nodes (V) representing entities and a set of edges (E) representing relationships within a large-scale graph. In this context, V emphasizes the collection of nodes, which are the fundamental entities in the graph, while E signifies the set of edges, capturing the relationships or connections between these entities.

Definition 2 (Summary Graph): A summary graph, denoted as SG = (V_SG, E_SG), is a compact representation derived from the larger-scale graph G. V_SG and E_SG capture key elements and relationships, providing an efficient abstraction for analysis and exploration.

The overarching challenge in this research is to develop a robust approach for knowledge extraction from large-scale graphs, emphasizing both the enhancement of relevance in subgraph extraction and the efficient processing of extensive graph datasets. Based on this, our primary goal is to design a methodology that addresses the dual challenges of ensuring the relevance of extracted subgraphs to content-based queries, while also managing the complexities of processing large-scale graphs efficiently.

IV. THE PROPOSED MODEL

In this work, we introduce an efficient IR system designed to facilitate interactive exploration and visualization techniques. Our system enables users to quickly comprehend and navigate complex, large-scale graph-structured data while receiving recommendations. The framework of our method, illustrated in Fig. 1, comprises four main components: RW-based graph sampling, graph summarization, IR with recommendations (IR-RS), and interactive visualization.

A. Random Walk-based Subgraph Sampling

Graph sampling is a critical initial step in handling large-scale graph data, involving the selection of a representative subset of nodes and edges to create a smaller graph while preserving essential characteristics [26]. In our approach, this is pivotal for preparing data for subsequent graph summarization, ensuring efficient processing of large-scale graphs. Graph sampling based on random walks (RWs) is particularly significant, employing a technique that involves simulating a walk across nodes. This method captures local neighborhood structures effectively, retaining the original graph's topological characteristics and community structures [27].

In this regard, we adopt an advanced RW-based graph sampling technique, GraphSAINT, proposed by Zeng et al. [28], in which the probability of sampling a node u is

$P(u) \propto \lVert \tilde{A}_{:,u} \rVert^{2}$ (1)

where $\tilde{A}_{:,u}$ denotes the u-th column of the normalized adjacency matrix, emphasizing connections from node u to others. The squared L2 norm of this column guides the RW, favoring nodes with stronger connections.
!""
Title
Authors
Authors P5
Year
#$"
Citation
Venue
P8 Abstract
Title
P1 Authors
Year
P3
Citation
P2
P4
Venue
P7 Reparametrize
Abstract
Title from N(0,1)
ELU
Authors
Year
Citation P5 !"#
P3
Venue
!"! X GAT %
#$#
P1
'
Z
Abstract
GAT
P4
Abstract
P2 Title
Authors P6
&#
Abstract
Title
Authors
Year
Citation
Abstract
Title
P7 Title P3 GAT
Venue Authors
A
Year Authors
Citation Year
Year
Venue Citation
Citation P7
Venue
Venue
P5 !"$
Paper P1
Citation #$$ P4
P3
Abstract
Title Nodes P2
Authors
Year
Citation
Features
Venue
'
m h
y
Su Grap
ar
m
,(') IR-RS
Output Graph
Abstract
,('% )
Title
Authors
Year
Citation
Venue
P9 Node Similarity
P9
P8 Ranking Measure Natural
P8 Abstract
Title
Flagged + Language
P7
((+% )
Authors
Recommendations
Notations
P1 Node Centrality Paper
RW: Random walk SG: Subgraph P3 Ranking Measures Input Graph
X: Node Features Matrix A: Adjacency Matrix +
P7
Z: Latent Variables M: Mapping Similarity Paper
P2
%: Mean of Latent Variables &#: Log Variance of Measure ,(')
ELU: Exponential Linear Latent Variables
Unit Activation Function
Fig. 1: The GraphSUM framework: (1) Graph Sampling, for efficient processing of large-scale graphs (Section IV-A); (2) Graph
Summarization, aimed at extracting key information from complex graphs for better analysis (Section IV-B); (3) IR-RS, using
summarized graph data to respond to user queries accurately (Section IV-C); and (4) Interactive Visualization, for user-friendly
graph exploration (Section IV-D).
GraphSAINT enhances this process by minimizing the variance σ² and bias often introduced during sampling. It employs a normalization factor α_v for each node, calculated as the inverse of the node's sampling probability, given by $\alpha_v = \frac{1}{\pi(v)}$, where π(v) is the sampling probability of node v. This normalization ensures that the sampled subgraphs are as informative as the full graph.

One of the key advantages of this graph sampling technique is its ability to produce high-quality subgraphs that accurately represent the global structure of the original graph. By guiding the random walk process based on the strength of connections and applying normalization to mitigate sampling biases, it effectively balances the trade-off between local neighborhood exploration and global graph structure preservation. By efficiently capturing the essential characteristics of the original graph, this sampling technique enables us to construct informative graph summaries that facilitate downstream tasks such as IR and recommendation, ensuring scalability and effectiveness across increasingly large graphs.
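To make the sampling step concrete, the following is a minimal Python sketch of a GraphSAINT-style random-walk sampler under our reading of Eq. (1) and the normalization above. It is an illustration, not the authors' or the GraphSAINT library's implementation; the function name and walk parameters are hypothetical.

```python
import numpy as np
import scipy.sparse as sp

def rw_subgraph_sample(adj: sp.csr_matrix, num_roots: int, walk_length: int, rng=None):
    """Sample a subgraph via short random walks from roots chosen per Eq. (1).

    adj: binary adjacency matrix (n x n). Root nodes are drawn with probability
    proportional to the squared L2 norm of the corresponding column of the
    symmetrically normalized adjacency; short walks then expand neighborhoods.
    """
    rng = rng or np.random.default_rng(0)
    n = adj.shape[0]

    # Symmetric normalization: A_tilde = D^{-1/2} (A + I) D^{-1/2}
    a_hat = adj + sp.eye(n, format="csr")
    deg = np.asarray(a_hat.sum(axis=1)).ravel()
    d_inv_sqrt = sp.diags(1.0 / np.sqrt(np.maximum(deg, 1e-12)))
    a_tilde = d_inv_sqrt @ a_hat @ d_inv_sqrt

    # Eq. (1): P(u) proportional to the squared L2 norm of the u-th column
    col_sq = np.asarray(a_tilde.multiply(a_tilde).sum(axis=0)).ravel()
    p = col_sq / col_sq.sum()

    roots = rng.choice(n, size=num_roots, p=p)
    visited = set(roots.tolist())
    for root in roots:                      # one short walk per root
        v = int(root)
        for _ in range(walk_length):
            neigh = adj.indices[adj.indptr[v]:adj.indptr[v + 1]]
            if len(neigh) == 0:
                break
            v = int(rng.choice(neigh))
            visited.add(v)

    nodes = np.array(sorted(visited))
    # Normalization alpha_v = 1 / pi(v): pi(v) is approximated here by the
    # root-sampling probability, for debiasing downstream estimates.
    alpha = 1.0 / np.maximum(p[nodes] * num_roots, 1e-12)
    return nodes, adj[nodes][:, nodes], alpha
```

Note that GraphSAINT itself couples the sampler with estimator normalization during GNN training; this sketch only illustrates the sampling and weighting steps described above.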
B. Graph Summarization

The graph summarization module condenses complex graph input data into more manageable and informative representations [6]. Utilizing an attention-based VGAE framework, this module is crucial for capturing the intricate patterns present in graph data, especially in an unsupervised learning context. Its ability to distil complex information into accessible forms makes it a fundamental system component, significantly enhancing its capabilities in question-answering and IR.

1) Attention Mechanism: The module integrates GAT into the VGAE framework, introducing a potent attention mechanism that dynamically prioritizes nodes and edges based on their significance within the graph. The key difference from traditional methods, which treat all nodes and edges equally, lies in the attention weight α_ij assigned to an edge connecting nodes i and j. The key formulation for a GAT is:

$h_i = \sigma\Big(\sum_{j \in N(i)} \alpha_{ij} W h_j\Big)$ (2)

where h_i is the hidden feature vector of node u_i, N(i) is the set of neighbouring nodes of u_i, h_j is the hidden state of neighbouring node u_j, W is a weight matrix, and α_ij is the attention coefficient that measures the importance of node u_j to node u_i. The attention coefficients are computed as:

$\alpha_{ij} = \mathrm{softmax}_j(e_{ij}) = \frac{\exp(e_{ij})}{\sum_{k \in N(i)} \exp(e_{ik})}$ (3)

where e_ij is a scalar energy value computed as follows [29]:

$e_{ij} = \mathrm{LeakyReLU}\big(a^{T} [W h_i \,\Vert\, W h_j]\big)$ (4)

Here a is a learnable parameter vector, and ‖ denotes concatenation. The LeakyReLU function introduces non-linearity into the model and helps prevent vanishing gradients.

In this context, the attention function computes the significance of each edge within the local graph neighborhood. By dynamically assigning importance levels to individual edges, GAT facilitates a customized summarization process, guaranteeing that the resultant summary effectively encapsulates the essential features of the graph.
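The attention computation of Eqs. (2)-(4) can be transcribed directly; the following PyTorch sketch shows a single attention head (in practice, a library layer such as PyTorch Geometric's GATConv would be used). The function name and tensor layout are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def gat_attention(h, edge_index, W, a):
    """Single-head GAT attention, a direct transcription of Eqs. (2)-(4).

    h: (n, d_in) node features; edge_index: (2, m) pairs (i, j) with j in N(i);
    W: (d_in, d_out) weight matrix; a: (2 * d_out,) attention vector.
    """
    i, j = edge_index                         # target i attends over neighbor j
    Wh = h @ W                                # shared linear transform
    # Eq. (4): e_ij = LeakyReLU(a^T [W h_i || W h_j])
    e = F.leaky_relu(torch.cat([Wh[i], Wh[j]], dim=-1) @ a, negative_slope=0.2)
    # Eq. (3): softmax over each node's neighborhood
    e = e - e.max()                           # numerical stability
    num = torch.exp(e)
    den = torch.zeros(h.size(0)).index_add_(0, i, num)
    alpha = num / den[i]
    # Eq. (2): attention-weighted aggregation; ELU plays the role of sigma,
    # matching the activation chosen in this paper
    out = torch.zeros(h.size(0), W.size(1)).index_add_(0, i, alpha.unsqueeze(-1) * Wh[j])
    return F.elu(out), alpha
```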
2) Encoding Graphs into Latent Space: The VGAE component of the module functions by encoding the graph into a latent space, represented as Z. This encoding is achieved through a process that involves learning a compact, lower-dimensional graph representation, expressed as a function f : G → Z, where G represents the original graph data. The goal is to capture the graph's essential features in Z, while reducing the data's complexity.

We incorporate a two-layer GAT-based variational autoencoder. Each layer can be defined as follows [10]:

$Z^{(l+1)} = \sigma\Big(\tilde{D}^{-1/2}\tilde{A}\tilde{D}^{-1/2} \sum_{j \in N(i)} \alpha_{ij} \, W^{(l)} Z_{j}^{(l)}\Big)$ (5)

where Z^(l) represents the latent node representations at layer l of the encoder, $\tilde{A}$ is the adjacency matrix of the graph with added self-loops, $\tilde{D}$ is the diagonal degree matrix of $\tilde{A}$, W^(l) is the weight matrix for the l-th layer, and σ(·) is the activation function applied element-wise. In this study, we use the ELU (Exponential Linear Unit) activation function due to its smoothness. Smoother activation functions help to mitigate issues related to vanishing gradients, which can occur during backpropagation.

The VGAE introduces stochasticity during the training process by sampling the latent representation Z from a Gaussian distribution in the latent space. This stochastic sampling encourages the model to generate diverse graph representations for the same input, which is beneficial for summarization. The model can capture different plausible summarizations, providing a richer representation of the input graph. The VGAE uses a variational inference model to map each node to a point in the latent space, represented by a mean (µ) and log-variance (σ²) using GAT layers. In our case, where relationships are complex and non-linear, using a GAT layer rather than conventional alternatives (e.g., a linear layer or ConvGNNs) is more appropriate, as it explicitly considers the relationships between nodes as follows:

$\mu = \mathrm{ELU}\big(\mathrm{GAT}_{\mu}(\alpha_{\mu} \cdot Z^{(l+1)})\big)$ (6)

$\sigma^{2} = \mathrm{ELU}\big(\mathrm{GAT}_{\sigma}(\alpha_{\sigma} \cdot Z^{(l+1)})\big)$ (7)

where Z^(l+1) is the output of the last layer, GAT_µ and GAT_σ are the GAT layers for the mean and log-variance, α_µ and α_σ are the attention coefficients, and ELU is the activation function.

The latent representation for each node in Z is then sampled from a Gaussian distribution N(µ, σ²), with the parameters µ and σ² being functions of the node's features and its attention weight α. The integration of the attention mechanism results in a latent space that not only captures the essential features of the graph but also emphasizes the features deemed most relevant by the attention mechanism. Additionally, the adaptive nature of attention mechanisms allows VGAE to dynamically allocate attention based on the task's requirements, ensuring robust performance across different graph structures and complexities.
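A minimal sketch of the encoder described by Eqs. (5)-(7), written with PyTorch Geometric's GATConv. This is a hypothetical module illustrating the structure (a shared GAT layer, ELU activations, separate GAT heads for µ and log σ², and reparameterized sampling), not the authors' released code.

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GATConv

class GATVGAEEncoder(torch.nn.Module):
    """Two-layer GAT encoder producing mu and log-variance (Eqs. 5-7)."""

    def __init__(self, in_dim, hid_dim, lat_dim):
        super().__init__()
        self.gat1 = GATConv(in_dim, hid_dim)         # Z^(l+1), Eq. (5)
        self.gat_mu = GATConv(hid_dim, lat_dim)      # Eq. (6)
        self.gat_logvar = GATConv(hid_dim, lat_dim)  # Eq. (7)

    def forward(self, x, edge_index):
        z = F.elu(self.gat1(x, edge_index))
        mu = F.elu(self.gat_mu(z, edge_index))
        logvar = F.elu(self.gat_logvar(z, edge_index))
        # Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I)
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std), mu, logvar
```

In the full model, a decoder, typically the inner product σ(ZZᵀ) standard for VGAEs [10], would reconstruct the adjacency matrix from the sampled Z.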
3) Unsupervised Learning for Complex Pattern Capture: One of the significant advantages of using a generative model is its ability to operate unsupervised. This means the model can learn to identify and summarize graph data patterns without needing labelled training data. It operates by encoding the graph into a latent space Z, optimizing a variational lower bound on the likelihood of the graph data. This is mathematically represented by the Evidence Lower Bound (ELBO) in Eq. 8, which combines the reconstruction likelihood (encouraging the decoded graph to resemble the original) with a regularization term using the Kullback-Leibler (KL) divergence.

$\mathrm{ELBO} = \mathbb{E}_{q(Z|X)}[\log p(X|Z)] - \mathrm{KL}\big[q(Z|X)\,\|\,p(Z)\big]$ (8)

Here, X represents the observed graph data, and Z is the latent space representation. The term q(Z|X) is the approximate posterior distribution over Z given X, and p(Z) is the prior distribution over Z.

This unsupervised approach offers a thorough comprehension of the underlying patterns within the graph, which enhances the system's efficacy in managing intricate graph data.
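For illustration, the negative ELBO of Eq. (8) can be written as a training loss as follows, assuming the standard VGAE inner-product decoder [10]; the negative-edge sampling shown is a common approximation for large graphs and an assumption on our part.

```python
import torch

def vgae_elbo_loss(z, mu, logvar, edge_index, num_nodes):
    """Negative ELBO for a VGAE with an inner-product decoder (Eq. 8).

    A sketch under the standard VGAE assumptions: p(A_ij = 1 | z) =
    sigmoid(z_i . z_j), with a standard-normal prior p(Z).
    """
    # Reconstruction likelihood on observed (positive) edges
    src, dst = edge_index
    pos_logits = (z[src] * z[dst]).sum(dim=-1)
    recon = torch.nn.functional.binary_cross_entropy_with_logits(
        pos_logits, torch.ones_like(pos_logits))

    # A few random node pairs serve as negative edges
    neg_src = torch.randint(0, num_nodes, (src.numel(),))
    neg_dst = torch.randint(0, num_nodes, (src.numel(),))
    neg_logits = (z[neg_src] * z[neg_dst]).sum(dim=-1)
    recon = recon + torch.nn.functional.binary_cross_entropy_with_logits(
        neg_logits, torch.zeros_like(neg_logits))

    # KL[q(Z|X) || N(0, I)], closed form for diagonal Gaussians
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl   # minimizing this maximizes the ELBO
```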
C. IR-RS

In our approach, the IR-RS module allows users to input their queries in a natural language format, making the system accessible and user-friendly. This module interacts directly with the graph summarization output, the simplified and condensed representations of the original, complex graph data. After embedding user queries using a pre-trained BERT model, we employ a simple linear regression mapping technique to project both embeddings into a shared vector space. This linear mapping minimizes the mean squared error between the predicted and actual embeddings, ensuring that the embeddings of queries and graph summarization results are directly comparable. Once mapped into the shared vector space, cosine similarity is computed to measure the similarity between query embeddings and graph summarization embeddings. Concurrently, we extract keywords, compare them with paper node keywords, and rank nodes. This process efficiently retrieves relevant information for user queries, providing accessible and contextually relevant responses. Additionally, the module accepts paper indices/titles, provides recommendations for further reading based on the output of graph summarization, and offers similar papers to the given title/index, enhancing user exploration and discovery.
Let Q represent the embedded query obtained from the pre-trained BERT model, and Z denote the embedded graph summarization result. Each of these embeddings resides in a high-dimensional space, denoted by R^d, where d represents the dimensionality of the embeddings. To map these embeddings into a shared vector space, we introduce a linear transformation matrix W ∈ R^{d×k}, where k is the dimensionality of the shared space. The mapped query Q̂ and mapped graph summarization Ẑ are computed as Q̂ = QW and Ẑ = ZW, respectively. The objective of linear regression is to learn the optimal values for W that minimize the mean squared error between the predicted and actual embeddings, which can be formulated as finding W that minimizes the following loss function:

$\min_{W} \; \lVert QW - \hat{Q} \rVert^{2} + \lVert ZW - \hat{Z} \rVert^{2}$ (9)

Once mapped into the shared vector space, the cosine similarity between Q̂ and Ẑ is computed as:

$\mathrm{CS}(\hat{Q}, \hat{Z}) = \frac{\hat{Q} \cdot \hat{Z}}{\lVert \hat{Q} \rVert \cdot \lVert \hat{Z} \rVert}$ (10)

This metric calculates the cosine of the angle between the two vectors, providing a measure of their similarity regardless of their magnitudes.
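A sketch of the mapping and similarity computation of Eqs. (9) and (10). Since Eq. (9) leaves the regression targets implicit, the training targets below are labeled as assumptions; numpy's least-squares solver plays the role of the linear regression, and the helper names are hypothetical.

```python
import numpy as np

def fit_shared_mapping(Q, Z, targets_q, targets_z):
    """Least-squares fit of the linear map W in Eq. (9).

    Q: (n_q, d) BERT query embeddings; Z: (n_z, d) summary-graph node
    embeddings; targets_*: assumed training targets in the shared k-dim space.
    """
    X = np.vstack([Q, Z])
    Y = np.vstack([targets_q, targets_z])
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)   # minimizes ||XW - Y||^2
    return W                                     # then Q_hat = Q @ W, Z_hat = Z @ W

def cosine_similarity(q_hat, z_hat):
    """Eq. (10): cosine similarity between mapped embeddings."""
    return float(q_hat @ z_hat /
                 (np.linalg.norm(q_hat) * np.linalg.norm(z_hat) + 1e-12))
```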
The assessment of extracted keywords entails measuring the similarity between sets of node keywords and query keywords, quantified by the Keyword Similarity (KS) metric:

$\mathrm{KS}(K_Q, K_i) = \frac{|K_Q \cap K_i|}{\sqrt{|K_Q| \cdot |K_i|}}$ (11)

Here, K_Q represents the set of keywords in the query, K_i denotes the set of keywords associated with a specific node, and KS(K_Q, K_i) computes the similarity between the two sets. This formula calculates the intersection of the two sets and normalizes it by the square root of the product of their sizes, effectively capturing the degree of overlap relative to their individual sizes.

The overall Similarity Score (OS) can be defined as a weighted combination:

$\mathrm{OS}(\hat{Q}, \hat{Z}, K_Q, K_i) = \alpha \times \mathrm{CS}(\hat{Q}, \hat{Z}) + \beta \times \mathrm{KS}(K_Q, K_i)$ (12)

Here, α and β represent the respective weights assigned to the CS and KS measures. The use of different weights allows for flexibility in emphasizing the contribution of each similarity measure to the overall relevance assessment. In our case, with α = 0.5 and β = 0.5, the cosine similarity and keyword similarity measures are weighted equally.

After obtaining similarity scores, a node ranking mechanism is applied, and the top n-ranked nodes are selected as input to form a relevant subgraph. This process ensures that the resulting subgraph is contextually aligned with the input query. This translation from complex graph data to a small subset of the graph with a list of answers is a significant aspect of the module, enhancing the system's interactivity and usability. Moreover, the system incorporates a recommendation feature called RS that leverages a similar approach and calculates the similarity of node embeddings combined with centrality measures to assist users in discovering the most relevant papers related to their input paper. By analyzing the embeddings generated through our graph summarization process, the recommendation technique identifies papers with similar content, domain, or concepts to the user's specified paper of interest.
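The keyword and combined scores of Eqs. (11) and (12) are straightforward to compute; the sketch below also shows the node-ranking step as a comment. Helper names are hypothetical.

```python
def keyword_similarity(query_kws: set, node_kws: set) -> float:
    """Eq. (11): overlap normalized by the geometric mean of set sizes."""
    if not query_kws or not node_kws:
        return 0.0
    return len(query_kws & node_kws) / (len(query_kws) * len(node_kws)) ** 0.5

def overall_score(cs: float, ks: float, alpha: float = 0.5, beta: float = 0.5) -> float:
    """Eq. (12): weighted combination of cosine and keyword similarity."""
    return alpha * cs + beta * ks

# Ranking: score every candidate node, keep the top n for the answer subgraph:
# scores = {v: overall_score(cos_sim[v], keyword_similarity(kq, kws[v])) for v in nodes}
# top_n = sorted(scores, key=scores.get, reverse=True)[:n]
```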
D. Interactive Dashboard

Through an interactive dashboard, GraphSUM facilitates a comprehensive user experience by providing functionalities that include the loading of an academic dataset, execution of a pre-trained graph summarization model, and the input of queries with the ability to specify the desired number of displayed papers. Users can further delve into the system's intricacies through the navigation of display tabs, allowing for in-depth exploration of both input data and the resultant output graph. Incorporating GPT engines enriches the user experience by generating narratives that elucidate the summarised graph's characteristics. Moreover, the interactive graph feature empowers users to conduct detailed examinations, enabling zoom, drag, and node-specific interactions for a nuanced exploration of academic papers within the dataset.

V. EXPERIMENTS

In this section, we discuss the experimental phase of our research, providing a full overview of the important components that contribute to our study. The following subsections outline the aspects of our experimental approach, including the data sources, experimental setup, compared methods, evaluation metrics, ablation study, and sensitivity analysis. Concluding this section is a thorough discussion that brings together our findings and offers a detailed interpretation of the results, aiming to provide a nuanced understanding of our research outcomes.

A. Data

Our experimental validation of GraphSUM employs two real-world datasets obtained from the ACM and DBLP academic repositories [30]. Table I presents an overview of the raw data within these datasets. They consist of a significant corpus of scholarly articles, with the DBLP dataset containing 1,196,476 nodes (representing papers) and 4,860,642 edges (representing citations), and the ACM dataset containing 1,146,695 nodes and 4,725,106 edges. Each paper in these datasets is extensively annotated, including attributes such as title, authors' names, publication year, and venue. To enhance categorization, papers are classified into distinct subject areas such as Computer Networks, Artificial Intelligence, Programming Languages, Databases, Security, and Software Engineering, utilizing metadata provided by the sources. Each publication is represented by a TF/IDF weighted word vector, drawn from a vocabulary of 2000 unique words. The unique words are derived from the papers' titles and published venues.

TABLE I: Datasets Overview: Node and Edge Counts, Feature Dimensions, and Sample Entries

DBLP: #Nodes 1,196,476; #Edges 4,860,642; Feature dim 2000.
  Example entry: Title: 'Stochastic path tracing on consumer graphics cards'; Abstract: 'Dynamics of biological-ecological systems is strongly ...'; Authors: ['Patrick Coquillard', 'Alexandre Muzy', 'Eric Wajnberg']; Year: 2009; Venue: 'arXiv: Quantitative Methods'; #Citations: 0; ID: '3c3901c8-dbc9-4567-9439-2b9937af6faa'; References: ['02a30d20-6a14-47b0-bf0f-c7d836bd4423', '4fc8da95-8e2f-4316-8a03-0c0c37794c58']

ACM: #Nodes 1,647,835; #Edges 4,725,106; Feature dim 2000.
  Example entry: Title: 'Space-Time Trade-Offs for Banded Matrix Problems'; Authors: John E. Savage; Year: 1984; Venue: 'Journal of the ACM (JACM)'; ID: 11; References: [289023, 408637, 600827, 688896, 2135000]
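As an illustration of the feature construction described above (TF/IDF vectors over a 2000-word vocabulary drawn from titles and venues), a scikit-learn sketch follows. The exact preprocessing used for these datasets is not specified here, so the field names and pipeline are assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# One dict per paper; 'title' and 'venue' are illustrative field names.
papers = [
    {"title": "Space-Time Trade-Offs for Banded Matrix Problems",
     "venue": "Journal of the ACM (JACM)"},
    # ...
]
texts = [f'{p["title"]} {p["venue"]}' for p in papers]
vectorizer = TfidfVectorizer(max_features=2000)   # cap the vocabulary at 2000 words
X = vectorizer.fit_transform(texts)               # (num_papers, 2000) sparse node features
```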
B. Experimental setup

The experimental setup for this study involved utilizing computational resources primarily from a local environment. The deep learning-based graph summarization model was trained and tested on a MacBook Pro (Apple M3 Max chip with 40-core GPU and 16-core Neural Engine). This local setup provided the necessary resources for the design and comprehensive testing of the final version of the dashboard. Python 3.11.7 was the main programming language, utilizing the PyTorch Geometric library for deep learning tasks and PyQt5 for the interactive dashboard's design, ensuring a user-friendly experience for researchers.

C. Compared Methods

In order to assess the effectiveness of our methodology, we conducted a thorough analysis by comparing it with other approaches. We closely examine our method in comparison with different graph sampling, summarization, and IR approaches. By carrying out this comparative analysis, we aim to demonstrate the practical usefulness of our chosen approach.

1) Graph Sampling: We evaluate the RW-based mini-batch sampling method in contrast to commonly used baselines in graph-related tasks. Our assessment specifically focuses on:
• Random Node Sampler: Involves the random selection of nodes within a graph and the subsequent return of their induced subgraph.
• Neighbor-based Sampler: Performs neighbor sampling as introduced in [31]. This sampling method allows for mini-batch training on large-scale graphs where full-batch training is not feasible.

2) Graph Summarization: To evaluate the performance of our proposed GAT-based VGAE for graph summarization, we conduct a comparative analysis against:
• Conventional VGAE: Employs convolutional architectures for VGAE, aiming to learn node representations by capturing local graph structures in a low-dimensional latent space [10].
• GAEs: Designed for learning low-dimensional representations of nodes in a graph by reconstructing the adjacency matrix or node features. They typically consist of convolutional-based encoders and decoders [10], [19].
• Embedding-based methods: We also assess GAT-VGAE performance against several state-of-the-art embedding-based approaches for node embedding: GAT [23], which enhances ConvGNNs by introducing attention mechanisms, allowing nodes to attend over their neighbors' representations with varying importance weights; GraphSAGE [31], which performs inductive representation learning by aggregating feature information from a node's local neighborhood; DeepWalk [32], which leverages random walks to generate embeddings by treating them as sequences of nodes, capturing both local and global graph structures; and Node2vec [33], which extends DeepWalk by introducing a biased random walk strategy to explore diverse neighborhoods.

3) IR: We contrast our method with a baseline that involves a simplified keyword-based approach, where the keywords from the user query are directly matched with those associated with the nodes in the graph. In the baseline, node ranking is determined solely by the extent of keyword overlap, without considering the cosine similarity between query embeddings and the graph summarization output.

D. Evaluation Metrics

To assess the performance of our graph summaries, we employ Link Prediction (LP) as an unsupervised learning task, which refers to the task of predicting missing or potential connections (edges) between nodes in a graph. Incorporating LP for graph summarization aids in more effective, interpretable, and scalable graph summarization, capable of capturing and representing the intricate structures and relationships present within the graph data. We randomly split graph edges for training, validation, and testing, sampling positive edges based on specified ratios and generating negative edges through random node pairings without existing connections. We measure the model's performance using evaluation metrics such as Area Under the ROC Curve (AUC-ROC) and Average Precision (AP). AUC-ROC measures the area under the ROC curve, which plots the True Positive Rate (Sensitivity) against the False Positive Rate (1 − Specificity) at various decision thresholds. AP, on the other hand, computes the area under the Precision-Recall curve, which plots Precision against Recall at various decision thresholds.
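A sketch of this LP evaluation protocol, assuming an inner-product edge scorer over the learned embeddings; the negative-sampling details are simplified relative to a production pipeline.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

def evaluate_link_prediction(z, pos_edges, num_nodes, rng=None):
    """Score held-out edges against random negative pairs (AUC-ROC and AP).

    z: (n, k) latent node embeddings from the summarization model;
    pos_edges: (m, 2) array of held-out true edges.
    """
    rng = rng or np.random.default_rng(0)
    m = len(pos_edges)
    # Negative edges: random node pairs (collisions with true edges are rare
    # in sparse graphs and ignored here for brevity)
    neg_edges = rng.integers(0, num_nodes, size=(m, 2))

    def score(edges):
        return np.sum(z[edges[:, 0]] * z[edges[:, 1]], axis=1)

    y_true = np.concatenate([np.ones(m), np.zeros(m)])
    y_score = np.concatenate([score(np.asarray(pos_edges)), score(neg_edges)])
    return roc_auc_score(y_true, y_score), average_precision_score(y_true, y_score)
```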
For assessing IR-RS performance, we consider the following metrics. Additionally, we leverage metadata such as papers' fields of study and keywords to assess the accuracy of the responses.
• Top-k Accuracy (k@hit) refers to the proportion of questions for which the correct answer appears within the top-k ranked candidate answers generated by the model. To compute k@hit, divide the number of questions where the correct answer is within the top-k predictions by the total number of inputs.
• Precision at k (P@k) focuses on the precision within the top-k predictions, providing a measure of how many of the top-ranked answers are correct:

$P@k = \frac{\text{Number of Correct Answers in Top-}k}{k \times \text{Total Number of Queries}}$ (13)

• Mean Reciprocal Rank (MRR) is the average of the Reciprocal Ranks (RR) across multiple queries. RR is a metric that evaluates the ranking of a relevant item in a list; here, rank_i is the position of the first relevant item in the ranked list for query i. MRR provides a summary measure of how well a system, on average, ranks the first relevant item in a set of queries:

$\mathrm{MRR} = \frac{1}{|Q|} \sum_{i=1}^{|Q|} \frac{1}{\mathrm{rank}_i}$ (14)
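These metrics can be computed per query as follows; averaging hit_at_k and precision_at_k over queries yields k@hit and P@k (Eq. 13), and mean_reciprocal_rank implements Eq. (14). Function names are illustrative.

```python
def hit_at_k(ranked_answers, correct, k=1):
    """k@hit for one query: 1 if the correct answer is in the top k."""
    return int(correct in ranked_answers[:k])

def precision_at_k(ranked_answers, relevant, k=5):
    """Eq. (13) for one query: fraction of the top k that are relevant."""
    return sum(a in relevant for a in ranked_answers[:k]) / k

def mean_reciprocal_rank(results):
    """Eq. (14): results is a list of (ranked_answers, relevant_set) pairs."""
    rr = []
    for ranked, relevant in results:
        rank = next((i + 1 for i, a in enumerate(ranked) if a in relevant), None)
        rr.append(1.0 / rank if rank else 0.0)
    return sum(rr) / len(rr)
```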
E. Ablation Study

In this section, we conduct ablation experiments to elucidate the individual contributions of each component in GraphSUM, aiming to identify the model that exhibits optimal performance. Three distinct experimental settings are employed to modify/exclude specific aspects from the original structure. Here is a breakdown of each modification and its purpose:
• GraphSUM-M1: Instead of using RW-based graph sampling, alternative methods such as random node sampling or neighbor-based sampling are employed.
• GraphSUM-M2: Attention-based graph summarization is replaced with alternative techniques, including convolutional-based GAEs, VGAEs, and other embedding methods such as GAT, GraphSAGE, DeepWalk, and Node2vec.
• GraphSUM-M3: The IR-RS module is replaced with a simpler approach based on keyword matching without any summarization.

1) GraphSUM-M1: We conduct a thorough comparative analysis of three graph sampling methods to assess the efficiency and scalability of GraphSUM. The performance metrics, shown in Tables II and III, provide valuable insights into the effectiveness of the diverse sampling methods: RW-based [28], neighbor-based [31], and random node sampling. Notably, the RW-based sampling method outperforms both the neighbor-based and random-node sampling methods in terms of ROC-AUC and AP metrics for LP. Despite the popularity of neighbor-based sampling as a graph sampling method, it lags behind in this analysis. This observation underscores the superior efficacy of the RW-based sampling approach for LP tasks. In contrast to neighbor-based sampling, which focuses on a fixed-size neighborhood around each node, and random node sampling, which involves the entire graph, RW-based sampling strikes a balance by considering both local and global information within the graph.

TABLE II: Performance Metrics of LP and IR Across Different Graph Sampling Techniques for the DBLP dataset.

                 Random node   Neighbor-based   RW-based
       Metrics   sampling      sampling         sampling
LP     ROC-AUC   0.7883        0.9129           0.9538
       AP        0.7782        0.9233           0.9543
IR-RS  1@hit     0.70          0.70             0.70
       P@5       0.50          0.58             0.61
       MRR       0.75          0.76             0.80

TABLE III: Performance Metrics of LP and IR Across Different Graph Sampling Techniques for the ACM dataset.

                 Random node   Neighbor-based   RW-based
       Metrics   sampling      sampling         sampling
LP     ROC-AUC   0.7989        0.8755           0.8987
       AP        0.8162        0.8741           0.9147
IR-RS  1@hit     0.60          0.70             0.80
       P@5       0.53          0.50             0.55
       MRR       0.80          0.81             0.85

Expanding the assessment to include the IR-RS module, we employ the metrics 1@hit, P@5, and MRR across twenty distinct queries in the context of semantic search. The findings suggest that in this context, the RW-based sampling method surpasses the neighbor-based sampling approach and demonstrates comparable performance to the random-node sampling method, which considers the entire graph. This consistency across LP, IR, and recommendation tasks on both datasets underscores the effectiveness of the RW-based sampling method in handling various aspects of the graph-based framework.

2) GraphSUM-M2: This experiment focuses on assessing the impact of different graph summarization and embedding techniques on the model's performance by replacing the original attention-based graph summarization. The replacement methods include GAEs with different encoders, VGAEs, and other embedding methods such as Node2vec and DeepWalk. In the context of LP, the final results, detailed in Figure 2, reveal that GAT-VGAE outperforms the alternative techniques. Expanding the evaluation to the IR-RS module highlights GAT-VGAE's superior performance compared to other methods in this domain. Notably, our findings demonstrate that GAT-VGAE outperforms the alternatives, as illustrated in the accompanying figure. The metrics used for evaluation are similar to the previous experiment. These results underscore GAT-VGAE as the most effective method for both LP and IR tasks with recommendations in this experiment. It is crucial to note that while DeepWalk and Node2vec embeddings capture the structural information of the graph, they fall short in directly encoding the semantic meaning of nodes, particularly in contexts like IR and recommendations, where semantic understanding holds paramount importance. This underscores the substantial impact of the chosen graph summarization technique on the overall model performance, thereby reinforcing the importance of attention-based summarization.
3) GraphSUM-M3: In this experimental iteration, we replace the IR-RS module with a simple keyword-matching approach, eliminating the inclusion of any graph summarization. This experimental scenario is iterated for twenty distinct queries, and the evaluation metrics employed are [1@hit, P@5, MRR]. The outcomes, [0.5, 0.45, 0.66] for DBLP and [0.6, 0.33, 0.66] for ACM, underscore a significant enhancement in overall performance when the output of graph summarization node representations becomes an integral part of the IR process. The results show that integrating graph summarization significantly enhances the accuracy and relevance of answers provided by the IR-RS module. This improvement is attributed to the more comprehensive representation of graph content facilitated by the graph summarization module. The contextual understanding obtained during summarization significantly contributes to a more detailed and informed generation of responses.

As evident from the results, GraphSUM consistently outperforms all ablation methods. For a detailed overview, refer to the summary of the results in Table IV, where the optimal model is compared with the ablation experiments. In instances where multiple models are involved in an experiment, we opt for the second-best model to be compared with GraphSUM. In this table, GraphSUM consistently demonstrates superior performance, particularly when compared to scenarios where the proposed graph summarization method is omitted or replaced with traditional node embedding methods. The model's performance is notably weakened in the absence of the graph summarization. The incorporation of the graph summarization method significantly enhances the model's capabilities. This method not only improves the understanding of underlying semantics but also enables the thoughtful consideration of contextual information.

TABLE IV: Comparison of the ablation study for GraphSUM.

               -M1    -M2    -M3    GraphSUM
DBLP  1@hit    0.7    0.6    0.5    0.7
      P@5      0.58   0.55   0.45   0.61
      MRR      0.76   0.75   0.66   0.80
ACM   1@hit    0.7    0.7    0.6    0.8
      P@5      0.40   0.45   0.33   0.55
      MRR      0.81   0.72   0.66   0.85

G. Discussion

Here, we delve deeper into the implications arising from the findings of our study, assessing both its strengths and limitations, and highlighting their significance within the domain of large-scale graph processing and IR.

The experiments revealed the superiority of the RW-based graph sampling method over alternatives like neighbor-based and random node sampling, emphasizing its efficacy in capturing both local and global information within the graph. This observation aligns with the intuition that a balanced approach considering local and global contexts is crucial for tasks such as LP and semantic search. The experiment evaluating different graph summarization and embedding techniques underscored the significance of attention-based summarization, particularly exemplified by the GAT-VGAE. While techniques like Node2vec and DeepWalk capture structural information, they fall short in encoding semantic meaning, especially crucial for tasks like IR and recommendations. This reaffirms the importance of leveraging attention mechanisms for effective knowledge extraction and representation in large-scale graphs. The inclusion of the graph summarization module significantly enhanced the accuracy and relevance of responses in the IR-RS module. This improvement can be attributed to the comprehensive representation of graph content facilitated by the graph summarization module.
Fig. 4: (A) A snapshot of the interactive visualization tool. GraphSUM efficiently processes large-scale graphs, with a particular emphasis on summarizing and presenting this data to enhance IR; (B) GraphSUM User Study: (above) Usability and user experience evaluation; (below) Correctness of responses for both IR and recommendations.
ing its applicability to diverse domains. Notably, we see potential in leveraging crowd-sourcing for user feedback, which could provide valuable insights into the system's usability and effectiveness. By integrating crowd-sourcing mechanisms, we aim to gather diverse perspectives on GraphSUM's features, visualization methods, and overall user experience.

REFERENCES

[1] L. Pepa, A. Sabatelli, L. Ciabattoni, A. Monteriù, F. Lamberti, and L. Morra, "Stress detection in computer users from keyboard and mouse dynamics," IEEE Transactions on Consumer Electronics, vol. 67, no. 1, pp. 12–19, 2021.
[2] J. Sun, S. Du, Z. Liu, F. Yu, S. Liu, and X. Shen, "Weighted heterogeneous graph-based three-view contrastive learning for knowledge tracing in personalized e-learning systems," IEEE Transactions on Consumer Electronics, vol. 70, no. 1, pp. 2838–2847, 2024.
[3] Y. Liu, T. Safavi, A. Dighe, and D. Koutra, "Graph summarization methods and applications: A survey," ACM Computing Surveys (CSUR), vol. 51, no. 3, pp. 1–34, 2018.
[4] Š. Čebirić, F. Goasdoué, H. Kondylakis, D. Kotzinos, I. Manolescu, G. Troullinou, and M. Zneika, "Summarizing semantic graphs: A survey," The VLDB Journal, vol. 28, no. 3, pp. 295–327, 2019.
[5] T. Safavi, C. Belth, L. Faber, D. Mottin, E. Müller, and D. Koutra, "Personalized knowledge graph summarization: From the cloud to your pocket," in 2019 IEEE International Conference on Data Mining (ICDM). IEEE, 2019, pp. 528–537.
[6] N. Shabani, J. Wu, A. Beheshti, Q. Z. Sheng, J. Foo, V. Haghighi, A. Hanif, and M. Shahabikargar, "A comprehensive survey on graph summarization with graph neural networks," IEEE Transactions on Artificial Intelligence, 2024.
[7] D. Gibson, R. Kumar, and A. Tomkins, "Discovering large dense subgraphs in massive graphs," in Proceedings of the 31st International Conference on Very Large Data Bases, 2005, pp. 721–732.
[8] P. Hu and W. C. Lau, "A survey and taxonomy of graph sampling," arXiv preprint arXiv:1308.5865, 2013.
[9] S. Dumbrava, A. Bonifati, A. N. R. Diaz, and R. Vuillemot, "Approximate querying on property graphs," in Scalable Uncertainty Management: 13th International Conference, SUM 2019, Compiègne, France, December 16–18, 2019, Proceedings 13. Springer, 2019, pp. 250–265.
[10] T. N. Kipf and M. Welling, "Variational graph auto-encoders," arXiv preprint arXiv:1611.07308, 2016.
[11] C. Chen, H. Lu, H. Hong, H. Wang, and S. Wan, "Deep self-supervised graph attention convolution autoencoder for networks clustering," IEEE Transactions on Consumer Electronics, vol. 69, no. 4, pp. 974–983, 2023.
[12] F. Faez, Y. Ommi, M. S. Baghshah, and H. R. Rabiee, "Deep graph generators: A survey," IEEE Access, vol. 9, pp. 106675–106702, 2021.
[13] N. Shabani, A. Beheshti, H. Farhood, M. Bower, M. Garrett, and H. A. Rokny, "iCreate: Mining creative thinking patterns from contextualized educational data," in International Conference on Artificial Intelligence in Education. Springer, 2022, pp. 352–356.
[14] S. Kang, K. Lee, and K. Shin, "Personalized graph summarization: Formulation, scalable algorithms, and applications," in 2022 IEEE 38th International Conference on Data Engineering (ICDE). IEEE, 2022, pp. 2319–2332.
[15] N. Shabani, A. Beheshti, J. Wu, M. K. Najafabadi, J. Foo, and A. Jolfaei, "GraphSUM: Scalable graph summarization for efficient question answering," in Proceedings of the 27th International Conference on Extending Database Technology, EDBT 2024, Paestum, Italy, 25th March - 28th March, 2024. OpenProceedings.org, 2024.
[16] M. P. Boobalan, D. Lopez, and X. Z. Gao, "Graph clustering using k-neighbourhood attribute structural similarity," Applied Soft Computing, vol. 47, pp. 216–223, 2016.
[17] N. T. T. Ho, T. B. Pedersen et al., "Efficient temporal pattern mining in big time series using mutual information," Proceedings of the VLDB Endowment, vol. 15, no. 3, pp. 673–685, 2022.
[18] S. Brody, U. Alon, and E. Yahav, "How attentive are graph attention networks?" arXiv preprint arXiv:2105.14491, 2021.
[19] T. N. Kipf and M. Welling, "Semi-supervised classification with graph convolutional networks," arXiv preprint arXiv:1609.02907, 2016.
[20] S. H. Farhi and D. Boughaci, "Graph based model for information retrieval using a stochastic local search," Pattern Recognition Letters, vol. 105, pp. 234–239, 2018.
[21] Y. Shao, J. C.-W. Lin, G. Srivastava, A. Jolfaei, D. Guo, and Y. Hu, "Self-attention-based conditional random fields latent variables model for sequence labeling," Pattern Recognition Letters, vol. 145, pp. 157–164, 2021. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0167865521000635
[22] K. Mao, X. Xiao, J. Zhu, B. Lu, R. Tang, and X. He, "Item tagging for information retrieval: A tripartite graph neural network based approach," in Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, ser. SIGIR '20. New York, NY, USA: Association for Computing Machinery, 2020, pp. 2327–2336. [Online]. Available: https://doi.org/10.1145/3397271.3401438
[23] J. Li and X. Hou, "Graph attention network with edge weights for question answering over knowledge graph," in Other Conferences, 2022. [Online]. Available: https://api.semanticscholar.org/CorpusID:249724456
[24] S. Li, K. W. Wong, C. C. Fung, and D. Zhu, "Improving question answering over knowledge graphs using graph summarization," in International Conference on Neural Information Processing. Springer, 2021, pp. 489–500.
[25] R. Jalota, D. Vollmers, D. Moussallem, and A.-C. N. Ngomo, "Lauren - knowledge graph summarization for question answering," in 2021 IEEE 15th International Conference on Semantic Computing (ICSC). IEEE, 2021, pp. 221–226.
[26] W. Zhao, T. Guo, X. Yu, and C. Han, "A learnable sampling method for scalable graph neural networks," Neural Networks, vol. 162, pp. 412–424, 2023.
[27] A. Bojchevski, O. Shchur, D. Zügner, and S. Günnemann, "NetGAN: Generating graphs via random walks," in International Conference on Machine Learning. PMLR, 2018, pp. 610–619.
[28] H. Zeng, H. Zhou, A. Srivastava, R. Kannan, and V. Prasanna, "GraphSAINT: Graph sampling based inductive learning method," arXiv preprint arXiv:1907.04931, 2019.
[29] P. Veličković, G. Cucurull, A. Casanova, A. Romero, P. Liò, and Y. Bengio, "Graph attention networks," arXiv preprint arXiv:1710.10903, 2017.
[30] J. Tang, J. Zhang, L. Yao, J. Li, L. Zhang, and Z. Su, "ArnetMiner: Extraction and mining of academic social networks," in Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2008, pp. 990–998.
[31] W. Hamilton, Z. Ying, and J. Leskovec, "Inductive representation learning on large graphs," in Advances in Neural Information Processing Systems. MIT Press, 2017, pp. 1024–1034.
[32] B. Perozzi, R. Al-Rfou, and S. Skiena, "DeepWalk: Online learning of social representations," in Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2014, pp. 701–710.
[33] A. Grover and J. Leskovec, "node2vec: Scalable feature learning for networks," in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 855–864.

Nasrin Shabani received a Master of Research degree in Computer Science with First Class Honours from Macquarie University, Sydney, NSW, Australia. She is currently pursuing a Ph.D. in Computer Science at the same institution. Her research interests lie at the intersection of graph mining, graph summarization, and deep learning. Through her work, she aims to develop novel algorithms and techniques that can extract meaningful insights and patterns from complex graph data structures.

Amin Beheshti is a Full Professor of Data Science, the Director of the Centre for Applied Artificial Intelligence, the head of the Data Science Lab, and the founder of the Big Data Society at Macquarie University, Sydney, Australia. Additionally, he is an Adjunct Professor of Computer Science at UNSW Sydney, Australia. Amin completed his PhD and Postdoc in Computer Science and Engineering at UNSW Sydney, and holds a Master's and Bachelor's degree in Computer Science, both with First Class Honours. As a distinguished researcher in Big-Data/Data/Process Analytics, Amin has been invited to serve as a Keynote Speaker, General-Chair, PC-Chair, Organisation-Chair, and program committee member of top international conferences. He is the lead author of several books in data, social, and process analytics, co-authored with other high-profile researchers. To date, Amin has secured over $21 million in research grants for AI-Enabled, Data-Centric, and Intelligence-Led projects.

Alireza Jolfaei (Senior Member, IEEE) is an Associate Professor of Cybersecurity and Networking in the College of Science and Engineering at Flinders University. He is a Distinguished Speaker of the ACM on the topic of Cybersecurity. His main research interest is in Cyber-Physical Systems Security, where he investigates the hidden interdependencies in industrial communication protocols and aims to provide fundamentally new methods for security-aware modelling, analysis, and design of safety-critical cyber-physical systems in the presence of cyber-adversaries. He has been a chief investigator of several internal and external grants with a total amount exceeding $2.6 million.

Jia Wu (Senior Member, IEEE) is an Associate Professor of Data Science at Macquarie University, Sydney, Australia. He is currently the Research Director of the Centre for Applied Artificial Intelligence and the Director of Higher Degree Research in the School of Computing at Macquarie University. Dr Wu received his Ph.D. degree in computer science from the University of Technology Sydney, Australia. His current research interests include data mining and machine learning. Since 2009, he has published 100+ refereed journal and conference papers in venues including TKDE, TKDD, KDD, ICDM, WWW, and NeurIPS.

Venus Haghighi is currently a Ph.D. student in computer science at the School of Computing, Macquarie University, Sydney, NSW, Australia. The focus of her research is to enhance classic GNN models and explore robust graph learning paradigms to detect and mitigate the camouflage behavior of malicious actors in both static and dynamic networks. Her research interests include graph-based anomaly detection, graph neural networks, graph-based fraud detection, and graph data mining.

Maryam Khanian Najafabadi is a leading figure in Computational Intelligence, Artificial Intelligence, and Natural Language Processing, with extensive experience in both academia and industry. Her numerous publications in top-tier journals, supervision of undergraduate and postgraduate projects, and multiple awards and accolades underscore her expertise and contribution to the field. Leading various projects across sectors, she applies innovative Artificial and Computational Intelligence methods to address real-world challenges.

Jin Foo is currently a second-year Master of Research student at the School of Computing, Macquarie University, Sydney, Australia. His research focuses on anomaly detection with word embeddings, hashing algorithms, and graph networks. He aims to design an efficient unsupervised anomaly detection method that can be used for real-time protection against fraudulent behaviour.