A Review on Graph Neural Network Methods in Financial Applications
Wang, Zhang, Xiao, and Song. Journal of Data Science, April 2022. DOI: 10.6339/22-JDS1047
Abstract
With multiple components and relations, financial data are often presented as graph data, since graphs can represent both the individual features and the complicated relations. Due to the complexity and volatility of the financial market, the graphs constructed on financial data are often heterogeneous or time-varying, which imposes challenges on modeling technology. Among graph modeling technologies, graph neural network (GNN) models are able to handle complex graph structures and achieve great performance, and thus could be used to solve financial tasks. In this work, we provide a comprehensive review of GNN models in recent financial contexts. We first categorize the commonly-used financial graphs and summarize the feature processing step for each node. We then summarize the GNN methodology for each graph type and its applications in each area, and propose some potential research directions.
Keywords deep learning; finance; graph convolutional network; graph representation learning
1 Introduction
As data collection techniques advance, graph data are commonly collected in many areas, including social sciences, transportation systems, chemistry, and physics (Wu et al., 2020). Representing complex relational data, graph data contain both the individual node information and the structural information. Recently, there has been growing interest in developing machine-learning methods to model graph data in various domains. Among them, graph neural network (GNN) methods achieve great performance on various tasks, including node classification, edge prediction, and graph classification (Kipf and Welling, 2017; Zhang and Chen, 2018; Xu et al., 2018). Performing node aggregation and updates, graph neural network models extend the deep learning methodology to graphs and are gaining popularity.
A financial system is a complex system with many components and sophisticated relations, which may be frequently updated. To represent the relational data in the financial domain, graphs are commonly constructed, including the transaction network (Weber et al., 2019), the user-item review graph (Dou et al., 2020), and the stock relation graph (Feng et al., 2019). By converting a financial task into a node classification task, GNN methods are commonly utilized, since they perform well among graph modeling methods (Liu et al., 2019). For instance, GNN could be utilized in a stock prediction task by formulating it as a node classification task, where each node represents a stock and edges represent relations between companies. Figure 1 demonstrates the workflow of a stock prediction task using GNN methods. However, the complex nature
Figure 1: Workflow for a stock movement prediction task using GNN methodology. The graph construction and feature processing steps present stock information as a graph and a feature matrix, which are then used as the input for the GNN model. In the graph, nodes are connected if there exist relationships between stocks, such as supplier, competitor, shareholder, etc. A multi-layer perceptron (MLP) layer is used to output the price prediction result.
of financial systems may result in multiple data sources and complicated graph structures, which imposes challenges on feature processing, graph construction, and graph neural network modeling. Represented as numerical sequences or textual information, financial data need to be processed with caution to preserve the temporal patterns or semantic meanings. Also, the multi-facet nature of financial relations makes it hard to construct a graph that captures the relations. Moreover, the financial-related graph is often heterogeneous or time-varying, which imposes challenges on existing graph neural network models. Furthermore, to reflect some financial patterns (e.g., the device aggregation pattern, see Section 5.4 for details), GNN methods may need to be modified, for example by changing losses or adding additional layers. Since financial systems possess unique characteristics and receive great attention, it is of significant importance to discuss and summarize the GNN methodology developed for financial tasks.
There are several recent reviews on graph neural networks. Among them, Wu et al. (2020) present a comprehensive review of graph neural networks and divide GNNs into four categories: recurrent graph neural networks, convolutional graph neural networks, graph autoencoders, and spatial-temporal graph neural networks. Zhou et al. (2020) provide a taxonomy of GNN models based on graph type, training methods, and propagation steps. There is also literature focusing on specific types of GNNs. Zhang et al. (2019) focus on graph convolutional networks (GCN) and introduce two taxonomies to group existing GCNs. Lee et al. (2019) survey the literature on graph attention models and provide detailed examples of each type of method. However, the aforementioned reviews focus on the general methodology and provide few details on applications, seldom mentioning financial applications. Without covering GNN models developed for financial contexts, the reviewed models may not be applicable to financial tasks due to the complexity of financial data. On the other hand, review papers focusing on the financial domain have not yet covered GNN methodologies in detail. Ozbayoglu et al. (2020) summarize the machine learning and deep learning models in the financial field without mentioning GNN methodologies. Huang et al. (2020) survey financial deep learning models in the finance and banking industry, and GNN models are not covered. Jiang (2021) reviews stock prediction-related machine-learning methodologies and mentions GNN models only briefly. In summary, existing GNN surveys focus on modeling methodology and do not emphasize the financial applications of GNN methods, while surveys on financial applications
do not cover GNN models in detail. To fill this gap, in this survey we provide a systematic and comprehensive review of graph neural network methods in financial applications.
In this paper, we present a thorough survey of graph neural network models with financial applications. We provide a comprehensive review of graph neural networks and summarize the corresponding methods. This survey makes the following contributions.
• We systematically categorize the commonly-used financial graphs based on graph characteristics and provide a thorough list of graphs. Graphs are categorized into five groups: homogeneous graph, directed graph, bipartite graph, multi-relation graph, and dynamic graph. We also present the GNN models according to their graph types, so that this review could serve as a guide for implementing GNNs on real-life datasets.
• We provide a comprehensive list of financial applications to which GNN methods are applied. We categorize the applications into five categories: stock movement prediction, loan default risk prediction, recommender systems for e-commerce, fraud detection, and event prediction.
• We summarize various aspects of information for each application, including features, graphs, GNN models, and available code. A GitHub page is built to document the collected information. This work could be considered a resource for understanding, implementing, and developing GNN models for multiple financial tasks.
• We identify five challenges and discuss the recent progress. We also suggest future direc-
tions for these problems.
The rest of the paper is organized as follows. Section 2 classifies financial graphs into different categories based on their characteristics. Section 3 summarizes the commonly-used feature
processing techniques for each node in the graph. Section 4 presents the GNN methodology used
for each graph type. Section 5 provides a collection of application areas. Section 6 proposes some
challenges that could be future directions of research.
2 Graph Categorization
When preparing the data, how to construct the graph to represent the structural information is essential, and the type of the constructed graph determines the follow-up modeling methodology. In this section, we present a categorization of graphs based on their construction methods and graph types. Table 1 presents a comprehensive list of graphs for financial tasks.
Definition 2.1 (Graph). A graph G is defined by a pair G = (V, E), where V = {v_1, . . . , v_n} is a set of n nodes and E is a set of edges, in which e_{ij} = (v_i, v_j) ∈ E denotes an edge joining node v_i and node v_j.
Definition 2.3 (Undirected graph and Directed graph). An undirected graph is a graph whose edges are undirected. A directed graph is a graph whose edges have orientations; e_{ij} = (v_i, v_j) ∈ E denotes an edge pointing from node v_i to node v_j.
Remark. An undirected graph has a symmetric adjacency matrix, i.e., A_{ij} = A_{ji}.
Definition 2.4 (Bipartite graph). A bipartite graph is a graph whose nodes can be divided into two non-empty and disjoint sets U, W, such that every edge connects a node in U and a node in W.
Definition 2.5 (Homogeneous graph and Heterogeneous graph). In a graph G = (V, E), we can assign a type to each node and edge; in this case, the graph is denoted as G = (V, E, A, R), where each node v_i ∈ V is associated with its type a_i ∈ A, and each edge e_{ij} ∈ E is associated with its type r_{ij} ∈ R. A homogeneous graph is a graph whose nodes are of the same type and whose edges are of the same type. Otherwise, the graph is heterogeneous.
Definition 2.6 (Multi-relation graph). A multi-relation graph is a graph whose edges have different types.
Definition 2.7 (Dynamic graph). A dynamic graph is defined as a sequence of graphs G^{seq} = {G_1, . . . , G_T}, where G_i = (V_i, E_i) for i = 1, . . . , T, and V_i, E_i are the sets of nodes and edges of the i-th graph in the sequence, respectively.
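To make these categories concrete, the following sketch (hypothetical toy data, written in Python with numpy) represents each graph type through adjacency matrices; the variable names and edges are illustrative assumptions only.

```python
import numpy as np

n = 4  # number of nodes in the toy examples

# Undirected homogeneous graph: symmetric adjacency matrix (Definition 2.3).
A = np.zeros((n, n))
A[0, 1] = A[1, 0] = 1.0
A[1, 2] = A[2, 1] = 1.0
assert (A == A.T).all()  # symmetry holds for undirected graphs

# Directed graph: orientation matters, so the matrix need not be symmetric.
A_dir = np.zeros((n, n))
A_dir[0, 1] = 1.0  # edge pointing from node 0 to node 1

# Multi-relation graph (Definition 2.6): one adjacency matrix per edge type.
relations = {
    "supplier": np.zeros((n, n)),
    "shareholder": np.zeros((n, n)),
}
relations["supplier"][0, 2] = 1.0

# Dynamic graph (Definition 2.7): a sequence of adjacency matrices over time.
T = 3
A_seq = [np.zeros((n, n)) for _ in range(T)]
A_seq[0][0, 1] = A_seq[0][1, 0] = 1.0  # this edge exists at t = 0 only
```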
Figure 2: Graph categorization based on graph characteristics. Each color of the circle represents
a node type and each color of the line represents an edge type. Arrows represent directed edges.
A homogeneous graph is a graph with one type of node and one type of edge. A directed graph is
a graph with directed edges. A bipartite graph is a graph with two types of nodes and edges only
exist between nodes of different types. A multi-relation graph has edges with different types. A
dynamic graph is a sequence of graphs.
Figure 3: Feature processing for sequential features and textual information. For sequential
numerical features, recurrent neural network (RNN) based approaches are commonly used to
capture the temporal dependencies. Text features are often processed using natural language processing (NLP) methods, including word embedding, sentence embedding, and language models, to convert the unstructured data into structured representations.
In a dynamic graph, both nodes and edges could appear and disappear, so it is hard to perform graph operations that require fixed matrix dimensions. Thus, capturing the dynamics of the graph is challenging and requires a more sophisticated methodology.
3 Feature Processing
With diverse data sources in the financial field, node features are commonly formatted as sequential numerical features or textual information. These formats impose challenges on the feature processing step, since GNN methods cannot be applied to them directly. In this section, we summarize the commonly-used feature processing techniques and how they address these challenges.
A long short-term memory (LSTM) network (Hochreiter and Schmidhuber, 1997) is commonly used to encode the sequential features, since it alleviates the vanishing gradient problem. Using memory cells and gate units, it has the following expression:
f_t = σ(W_f x_t + U_f h_{t−1} + b_f),
i_t = σ(W_i x_t + U_i h_{t−1} + b_i),
o_t = σ(W_o x_t + U_o h_{t−1} + b_o),
c̃_t = tanh(W_c x_t + U_c h_{t−1} + b_c),
c_t = f_t ∘ c_{t−1} + i_t ∘ c̃_t,
h_t = o_t ∘ tanh(c_t),
where x_t ∈ R^D is the input vector at time t and D is the number of features; f_t, i_t, o_t, c̃_t, c_t, h_t denote the forget gate, input gate, output gate, cell input, cell state, and hidden state vectors, respectively; W_f, W_i, W_o, W_c, U_f, U_i, U_o, U_c are trainable weight matrices and b_f, b_i, b_o, b_c are trainable bias vectors; σ(·) represents the sigmoid activation function; and ∘ denotes the element-wise product. The hidden state of the LSTM on day t is denoted by h_t = LSTM(x_t, h_{t−1}), s − l ≤ t ≤ s.
Since the LSTM updates the hidden state to capture the sequential information, a common approach to encode the historical data is to generate a sequential embedding E^s using the last hidden state of the LSTM, E^s = LSTM(X^s) ∈ R^{n×u}, where u is the dimension of the output feature. The encoded pricing information is then used as input to the graph neural network. For instance, Chen et al. (2018) use the generated sequential embedding E^s as the input feature matrix for a graph convolutional network, and Feng et al. (2019) utilize E^s as the input feature matrix for their proposed temporal graph convolutional layer. Using the last hidden state as an input feature, this type of method captures the information from the past days while having an appropriate format to feed into the GNN model.
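As an illustration of this encoding step, the sketch below uses PyTorch's nn.LSTM to produce the sequential embedding E^s from a random placeholder tensor standing in for the historical features of n stocks; the one-layer GCN propagation at the end (with an identity matrix as a stand-in for the normalized adjacency) is a minimal assumption for illustration, not any specific reviewed model.

```python
import torch
import torch.nn as nn

# n stocks, window length l, D features per day, output dimension u.
n, l, D, u = 30, 20, 5, 16
X_s = torch.randn(n, l, D)           # placeholder historical features X^s

lstm = nn.LSTM(input_size=D, hidden_size=u, batch_first=True)
_, (h_last, _) = lstm(X_s)           # h_last: (num_layers, n, u)
E_s = h_last[-1]                     # last hidden state per stock: (n, u)

# E_s can now serve as the node feature matrix of a GNN layer, e.g. one
# GCN propagation H = sigma(A_hat @ E_s @ W) with a placeholder adjacency.
A_hat = torch.eye(n)                 # stand-in for the normalized adjacency
W = torch.randn(u, u)
H = torch.relu(A_hat @ E_s @ W)
```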
The gated recurrent unit (GRU) (Cho et al., 2014) is another commonly-used recurrent structure, expressed as:
r_t = σ(W_r x_t + U_r h_{t−1} + b_r),
z_t = σ(W_z x_t + U_z h_{t−1} + b_z),
h̃_t = tanh(W_h x_t + U_h (r_t ∘ h_{t−1}) + b_h),
h_t = (1 − z_t) ∘ h_{t−1} + z_t ∘ h̃_t,
where x_t ∈ R^D is the input vector at time t for stock i and D is the number of features; r_t, z_t, h̃_t, h_t denote the reset gate, update gate, candidate activation, and hidden state vectors, respectively; W_r, W_z, W_h, U_r, U_z, U_h are trainable weight matrices and b_r, b_z, b_h are trainable bias vectors; σ(·) represents the sigmoid activation function; and ∘ denotes the element-wise product. The hidden state of the GRU on day t is denoted by h_t = GRU(x_t, h_{t−1}).
Utilizing the GRU to encode the past numerical information, we could obtain a hidden state for each day. Since past days may contribute differently to the current-day representation, an attention mechanism is frequently used to assign different weights. For instance, Sawhney et al. (2020a) use an additive attention mechanism to aggregate the hidden states across time. Cheng et al. (2020) utilize a concatenated attention method to incorporate the different importance of time points. The attention mechanism aggregates the hidden states of past days and assigns different weights across time. To obtain the feature representation at time s, it rewards the influential days when aggregating the hidden states from time s − l to time s, thus taking temporal dependencies into account. The obtained node representation is then used as a feature in the graph neural network model.
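A minimal sketch of this pattern: GRU hidden states for one stock are scored by a small additive-attention layer, normalized across time with a softmax, and summed into a single representation. The scoring layer and shapes are assumptions for illustration, not the exact architectures of Sawhney et al. (2020a) or Cheng et al. (2020).

```python
import torch
import torch.nn as nn

l, D, u = 20, 5, 16                    # window length, features, hidden size
x = torch.randn(1, l, D)               # one stock's sequence (toy data)

gru = nn.GRU(input_size=D, hidden_size=u, batch_first=True)
H, _ = gru(x)                          # hidden states for days s-l+1..s: (1, l, u)

# Additive attention: score each day's hidden state, softmax across time,
# then take the weighted sum as the node representation at time s.
score_layer = nn.Linear(u, 1)
scores = score_layer(torch.tanh(H))    # (1, l, 1)
alpha = torch.softmax(scores, dim=1)   # attention weights across time
node_repr = (alpha * H).sum(dim=1)     # (1, u), fed into the GNN as a feature
```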
The graph attention network (GAT) (Veličković et al., 2018) updates node representations by attending over neighbors:
α^l_{ij} = softmax_j( LReLU( a^T [W^l h^l_i ∥ W^l h^l_j] ) ),
h^{l+1}_i = σ( Σ_{j∈N_i} α^l_{ij} W^l h^l_j ),
where h^l_i is the hidden feature vector for node i at the l-th layer, W^l is a trainable weight matrix, a is a learnable vector, N_i is the neighborhood of node i, α^l_{ij} represents the attention coefficient of node j to i at the l-th layer, σ(·) is the activation function, ∥ denotes vector concatenation, and LReLU denotes the leaky ReLU activation function. It is also worth mentioning that, since GAT learns the weights of the neighboring nodes, the learned attention weights could be interpreted as a relative importance measure, to better understand the model. Similar to GCN, GAT is often used as a benchmark method in the reviewed papers, with about 40% coverage.
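The following is a compact, dense-matrix sketch of a single GAT layer following the formulation above; it is a toy version (self-loops assumed so every node has at least one neighbor), not an efficient or production implementation.

```python
import torch
import torch.nn.functional as F

def gat_layer(H, A, W, a):
    """One GAT layer on a dense adjacency. H: (n, d) node features,
    A: (n, n) binary adjacency with self-loops, W: (d, d') weights,
    a: (2*d',) attention vector."""
    Wh = H @ W                                   # (n, d')
    n = Wh.size(0)
    Wh_i = Wh.unsqueeze(1).expand(n, n, -1)      # row i repeated across j
    Wh_j = Wh.unsqueeze(0).expand(n, n, -1)      # row j repeated across i
    # e_ij = LReLU(a^T [W h_i || W h_j])
    e = F.leaky_relu(torch.cat([Wh_i, Wh_j], dim=-1) @ a)
    e = e.masked_fill(A == 0, float("-inf"))     # attend only to neighbors
    alpha = torch.softmax(e, dim=1)              # attention coefficients alpha_ij
    return torch.relu(alpha @ Wh)                # sigma(sum_j alpha_ij W h_j)

n, d, d_out = 5, 8, 8
H = torch.randn(n, d)
A = (torch.rand(n, n) > 0.5).float()
A.fill_diagonal_(1.0)                            # self-loops
H_next = gat_layer(H, A, torch.randn(d, d_out), torch.randn(2 * d_out))
```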
The gated graph neural network (GGNN) applies a gated recurrent unit (GRU) as the recurrent function and is constructed as follows:
a_t = A^T h_{t−1} + b,
r_t = σ(W_r a_t + U_r h_{t−1}),
z_t = σ(W_z a_t + U_z h_{t−1}),
h̃_t = tanh(W_h a_t + U_h (r_t ∘ h_{t−1})),
h_t = (1 − z_t) ∘ h_{t−1} + z_t ∘ h̃_t,
where h_t is the updated event representation at the t-th step, a_t contains information transferred from both directions' edges, r_t, z_t, h̃_t denote the reset gate, update gate, and candidate activation vectors at the t-th step, respectively, W_r, W_z, W_h, U_r, U_z, U_h are trainable weight matrices, b is a trainable bias vector, σ(·) is the sigmoid function, ∘ denotes the element-wise product, and tanh denotes the hyperbolic tangent function. Incorporating the adjacency matrix A, the GGNN aggregates the structural information in every propagation step. By unrolling the recurrence to a fixed number of steps, the GGNN ensures convergence without constraining the parameters.
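As a sketch, one GGNN propagation step under the update rules above can be written as follows; the weights are random placeholders rather than trained parameters, and the graph is a toy example.

```python
import torch

n, d = 6, 8
A = (torch.rand(n, n) > 0.6).float()     # toy adjacency matrix
h = torch.randn(n, d)                    # node states h_{t-1}
b = torch.zeros(n, d)                    # bias placeholder
Wr, Ur = torch.randn(d, d), torch.randn(d, d)
Wz, Uz = torch.randn(d, d), torch.randn(d, d)
Wh, Uh = torch.randn(d, d), torch.randn(d, d)

def ggnn_step(h):
    a = A.T @ h + b                              # aggregate neighbor states
    r = torch.sigmoid(a @ Wr + h @ Ur)           # reset gate
    z = torch.sigmoid(a @ Wz + h @ Uz)           # update gate
    h_tilde = torch.tanh(a @ Wh + (r * h) @ Uh)  # candidate activation
    return (1 - z) * h + z * h_tilde             # updated node states h_t

for _ in range(4):                               # unroll a fixed number of steps
    h = ggnn_step(h)
```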
where e_{uw} represents the edge representation for the edge linking nodes u and w; h^l_{uw}, h^l_u, h^l_w, h^l_{N(u)}, h^l_{N(w)} are hidden states at the l-th layer; W^l_E, W^l_U, W^l_{N(U)}, W^l_{N(W)}, W^l_W are trainable weight matrices at the l-th layer; AGG_E(·), AGG_U(·), AGG_{UW}(·), AGG_W(·), AGG_{WU}(·) are user-chosen aggregation functions; σ(·) is the activation function; and concat denotes vector concatenation.
There exist many modeling methodologies for bipartite graphs that fit into the above framework. For instance, Zhang et al. (2020) propose a model structure similar to the framework and apply the attention mechanism as the aggregation function, since different items may have different impacts when learning users' representations. Instead of using all neighbors, Li et al. (2019a) utilize a sampling technique when aggregating neighbors' information in each iteration. There is also literature that uses the above framework as a building block and combines it with clustering methodology to learn a hierarchical representation of the graph, since hierarchical representations with various GNN models could achieve satisfactory performance. For instance,
Li et al. (2019b) utilize the node embedding generated from the framework to cluster users into
different communities and make a recommendation based on both community information and
user information. Specifically, the user information is decomposed into two orthogonal spaces
representing community-level information and individualized user preferences. Li et al. (2020c)
treat the framework as a GNN module and stack it in a hierarchical fashion. With the embed-
ding generated from the framework, clustering algorithms are performed to generate a coarsened
graph which is used as an input for the next GNN layer.
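A minimal sketch of one round of two-sided aggregation on a bipartite user-item graph in the spirit of this framework, using mean aggregation and random toy weights (the cited papers use attention or sampling in place of a plain mean):

```python
import torch

n_users, n_items, d = 4, 6, 8
B = (torch.rand(n_users, n_items) > 0.5).float()   # biadjacency matrix
h_u = torch.randn(n_users, d)                      # user hidden states
h_w = torch.randn(n_items, d)                      # item hidden states
W_u, W_w = torch.randn(2 * d, d), torch.randn(2 * d, d)

deg_u = B.sum(1, keepdim=True).clamp(min=1)        # avoid division by zero
deg_w = B.sum(0, keepdim=True).clamp(min=1).T

h_Nu = (B @ h_w) / deg_u                           # mean over item neighbors
h_Nw = (B.T @ h_u) / deg_w                         # mean over user neighbors

# Concatenate the self state with the aggregated neighborhood, then transform.
h_u_next = torch.relu(torch.cat([h_u, h_Nu], dim=1) @ W_u)
h_w_next = torch.relu(torch.cat([h_w, h_Nw], dim=1) @ W_w)
```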
where h^l_{i,r} is the subgraph-specific embedding of node i in subgraph r at the l-th layer, h^l_i is the general embedding of node i at the l-th layer, AGG_r(·) is the aggregation function in subgraph r, AGG(·) is the inter-relation aggregation function, and f(·), g(·) are user-defined functions.
There are multiple methods that can be categorized into the above two-step framework. For instance, in a fraud classification task, Liu et al. (2018) observe that fraudsters tend to congregate in topology and thus use a weighted sum for within-relation aggregation to capture this congregation pattern. They then apply an attention mechanism for inter-relation aggregation to learn the significance of each subgraph as follows:
where X_i is the feature vector for node i, W is a trainable matrix, σ(·) is the activation function, and Attention denotes the attention aggregator.
To incorporate neighbors' information, Dou et al. (2020) use the mean aggregator for within-relation aggregation. To reduce the computational cost while keeping the relational importance information, they apply a pre-calculated parameter p^l_r as the weight in the inter-relation aggregation step. They also test several aggregation functions when aggregating relation-specific embeddings, with the following structure:
h^l_{i,r} = σ(Mean{h^{l−1}_{j,r}}), ∀ j s.t. (i, j) ∈ E_r,
h^l_i = σ(h^{l−1}_i + AGG{h^l_{i,r} · p^l_r}), ∀ r ∈ {1, . . . , R},
h^l_{i,r} = MLP{h^{l−1}_{i,r}}, where h^1_{i,r} = Attention{x_{j,r} : ∀ j s.t. (i, j) ∈ E_r},
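The two-step scheme can be sketched as follows, with mean within-relation aggregation and a fixed weight p_r per relation in the inter-relation step; the weights and graphs are random placeholders, and this is a simplification of the cited models.

```python
import torch

n, d, R = 6, 8, 3
A_r = [(torch.rand(n, n) > 0.6).float() for _ in range(R)]  # one adjacency per relation
h = torch.randn(n, d)                                       # previous-layer states h^{l-1}
p = torch.softmax(torch.rand(R), dim=0)                     # relation weights p_r

# Step 1: within-relation aggregation (mean over neighbors in E_r).
h_rel = []
for r in range(R):
    deg = A_r[r].sum(1, keepdim=True).clamp(min=1)
    h_rel.append(torch.sigmoid(A_r[r] @ h / deg))           # h^l_{i,r}

# Step 2: inter-relation aggregation with relation weights.
h_next = torch.sigmoid(h + sum(p[r] * h_rel[r] for r in range(R)))
```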
In a dynamic graph, companies may leave the network and later appear again, backing up other companies' loans, which makes it difficult to track node embeddings over time. To overcome this limitation, Pareja et al. (2020) propose EvolveGCN, which utilizes a recurrent neural network to evolve the GCN parameters instead of updating the node embeddings. For each time point t, a GCN model is constructed as follows to fit the graph G_t:
H^{l+1}_t = σ( D̃_t^{−1/2} Ã_t D̃_t^{−1/2} H^l_t W^l_t ),
where Ã_t is the adjacency matrix with added self-connections, D̃_t = diag(Σ_j Ã_{t,ij}) is the degree matrix, W^l_t is the trainable weight matrix of the l-th layer, H^l_t is the matrix of activations in the l-th layer, and σ(·) is the activation function.
To update the weight matrix W^l_t, Pareja et al. (2020) propose two methods. The first method considers W^l_t as a hidden state of the dynamics and updates it using a GRU model, as shown in equation (8). The second method treats W^l_t as an output state that is updated using an LSTM model, as shown in equation (9).
Compared to the second method, the first method incorporates the updated node embedding
in the recurrent neural network and it may lead to better performance when node features are
informative.
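A loose sketch of the EvolveGCN idea follows: a GRU cell evolves the GCN weight matrix across time while a fresh GCN propagation is applied to each snapshot. The way the node-embedding summary is formed here is an assumption for illustration, and the normalization of Ã_t is omitted; this is not the exact EvolveGCN update.

```python
import torch
import torch.nn as nn

n, d, T = 6, 8, 4
A_seq = [(torch.rand(n, n) > 0.6).float() + torch.eye(n) for _ in range(T)]
X_seq = [torch.randn(n, d) for _ in range(T)]   # node features per snapshot

W = torch.randn(d, d)                    # initial GCN weights W_0
cell = nn.GRUCell(input_size=d, hidden_size=d)

for t in range(T):
    # Evolve the rows of W with a GRU cell; a summary of the current node
    # embeddings serves as the recurrent input (EvolveGCN-H flavor).
    summary = X_seq[t].mean(dim=0, keepdim=True).expand(d, -1)  # (d, d)
    W = cell(summary, W)                     # treat W's rows as hidden states
    H = torch.relu(A_seq[t] @ X_seq[t] @ W)  # GCN propagation at time t
```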
5 Application
In this section, we detail some financial applications to which GNN methods have been commonly applied. We also summarize the features, graphs, methods, evaluation metrics, and baselines used in each financial application in the supplementary materials.
How to process the sequential pricing data and capture the temporal patterns is also critical. In addition, the financial industry has rich data sources, including financial statements, news, and pricing information, which imposes difficulty on modeling the data.
There are multiple ways to construct the stock relational graph. For instance, believing that correlations of historical prices reflect the inter-stock relations, Li et al. (2020a) construct the graph using the correlation matrix of historical data to predict the movement of the Tokyo stock price index. On the other hand, Matsunaga et al. (2019) borrow information from knowledge bases and construct supplier, customer, partner, and shareholder relational graphs. With multiple ways of graph construction, there does not exist a "best" graph, due to the lack of graph evaluation methods. Future work could be done to design a graph evaluation method that helps researchers better construct a relational graph.
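As a concrete example of similarity-based construction, the sketch below builds an adjacency matrix by thresholding the correlation matrix of synthetic return series; the threshold value 0.3 is an arbitrary illustrative choice, which is exactly the kind of decision discussed in Section 6.1.

```python
import numpy as np

rng = np.random.default_rng(0)
returns = rng.normal(scale=0.01, size=(50, 250))   # synthetic daily returns, 50 stocks

corr = np.corrcoef(returns)                        # (50, 50) correlation matrix
threshold = 0.3                                    # edge-existence threshold (a choice)
A = (np.abs(corr) > threshold).astype(float)       # connect highly correlated stocks
np.fill_diagonal(A, 0.0)                           # drop self-loops in the raw graph
```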
To effectively process the sequential data and incorporate related corporations' information, Chen et al. (2018) propose a joint model using LSTM and GCN to predict stock movement. However, Chen et al. (2018)'s approach assumes that the relations between stocks are static, which may not reflect reality. Instead, Feng et al. (2019) propose a temporal graph convolution layer to capture the stock relations in a time-sensitive manner, so that the strength of a relation can evolve over time. The relations are then updated based on historical pricing sequences, and the proposed method obtains better performance than GCN. Believing that stock description documents also contain information reflecting the changes of companies, Ying et al. (2020) capture the temporal relations with both sequential features and stock document attributes using a time-aware relational attention network.
The aforementioned methods focus on capturing temporal dependencies, while Sawhney et al. (2020a) focus on fusing data from different sources. They propose a multipronged attention network to jointly learn from historical prices, social media, and inter-stock relations. Encoded pricing and textual information are used as node feature inputs to GAT, where the graph information comes from Wiki company-based relations. The attention mechanism is applied to allocate different weights to various data sources, and latent correlations may be learned via the attention layers.
an objective function, so that vertices with similar structures will be closer in the learned feature space. Since the guarantee relations change with time, Cheng et al. (2020b) form a dynamic guarantee network to represent the dynamics. A recurrent graph neural network layer is developed to learn the temporal pattern, and attentional weights are learned for each time point via an attention architecture.
Unlike guarantee loans, where the loan information can be naturally represented in a directed graph, other loan types may not have a clear graph structure, and researchers need to construct the graph based on interactive information. For instance, Xu et al. (2021) construct a user relation graph where users are connected by various relationships, such as social connections, transactions, and device usage. However, the interactive graph may also contain noisy, irrelevant data. Since the massive interactive information may be noisy while the impactful supply-chain information is deficient, Yang et al. (2020) extract supply-chain relations while predicting loan defaults. Forming the interaction data as a graph, they formulate the supply chain mining task as a link prediction task and thus construct a supply chain network, which is then used to predict the default probability with GNN methodology.
With the rise of e-commerce, e-commerce consumer lending services are gaining popularity to enhance consumers' purchasing power. Able to obtain information from multiple facets, an e-commerce platform could have multi-view data and multi-relation networks, which may require sophisticated modeling methodology. For instance, in order to predict the default probability for each consumer with multi-view data, Liang et al. (2021) utilize a hierarchical attention mechanism to encode the features of each view. Exploring multiplex relations, Hu et al. (2019) propose an attributed multiplex graph-based model with relation-specific layers and an attention mechanism to jointly model multiple relations. To simultaneously model the labeled and unlabeled data, Wang et al. (2019) propose a semi-supervised graph neural network approach and obtain interpretable results.
The recommender system is mainly based on the user's past history, including their ratings and reviews of items. However, fake ratings and feedback may be posted by fraudsters seeking financial benefits. To detect fraudulent reviews on e-commerce platforms, Kudo et al. (2020) construct a directed and signed comment graph with a signed graph convolutional network approach. Compared to Kudo et al. (2020)'s approach, which only considers the comment graph, Li et al. (2019a) integrate both the bipartite user-item graph and a comment graph to capture the local and global context of the comments. Noticing that camouflage behaviors of fraudsters may deteriorate the performance of fraud detection mechanisms and have seldom been considered by prior works, Dou et al. (2020) propose a model against both feature and relation camouflage. For each node, only informative neighbors are selected for the next aggregation step, utilizing a similarity measure and a reinforcement learning mechanism. While the above literature focuses on fraud review detection, there are also works that accomplish both fraud review detection and item recommendation. For example, Zhang et al. (2020) propose a GCN-based framework that performs item recommendation and fraud detection in an end-to-end manner, where each task benefits the other.
6 Challenges
6.1 Graph Evaluation Methods
To justify the inclusion of a graph, a commonly-used evaluation method is to compare the outcome of a graph-based machine learning method with that of a graph-free machine learning method (Feng et al., 2019; Li et al., 2020a). However, there is little discussion on comparing different graphs' effects and quality, and the existing literature discussing graph comparison is often inadequate. For example, Liang et al. (2019) visualize the structural patterns in different graphs, concluding that the device-sharing graph is more appropriate based on the observed patterns. Without presenting an evaluation metric for each graph, a graph comparison based on visualization may not be adequate. The problem may be more severe in similarity-based graph construction, since a threshold needs to be set to determine whether an edge exists. Different threshold values may lead to completely different graphs and thus affect the model performance. Thus, justification of the threshold setup during graph construction is of great importance. Some efforts have been made to tackle this problem, such as utilizing a reinforcement learning approach to automatically select the optimal threshold (Dou et al., 2020). More attention may need to be drawn to developing a framework to assess graph quality systematically.
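One simple way to make the threshold choice explicit is a sensitivity check: rebuild the graph under several thresholds and compare a downstream validation score, as in the sketch below. Here train_and_evaluate is a hypothetical placeholder for fitting a GNN and returning a validation metric.

```python
import numpy as np

def train_and_evaluate(adjacency):      # hypothetical placeholder
    return float(adjacency.mean())      # replace with a real validation score

rng = np.random.default_rng(0)
corr = np.corrcoef(rng.normal(size=(50, 250)))   # toy correlation matrix

for threshold in [0.1, 0.2, 0.3, 0.4, 0.5]:
    A = (np.abs(corr) > threshold).astype(float)
    np.fill_diagonal(A, 0.0)
    density = A.mean()                  # how sparse the resulting graph is
    print(f"threshold={threshold:.1f}  density={density:.3f}  "
          f"score={train_and_evaluate(A):.3f}")
```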
6.2 Explainability
Combining both graph structural information and feature information, GNN models are often complicated, and it is challenging to interpret them. Recently, there has been some literature focusing on the explainability of GNN models. GNNExplainer (Ying et al., 2019), for example, is proposed to provide interpretable explanations of trained GNN models such as GCN and GAT. Model explainability in financial tasks is of great importance, since understanding the model could benefit decision-making and reduce economic losses. However, there is little literature studying the explainability of GNN models in financial applications, which often involve heterogeneous and dynamic graphs. Current literature focuses on relatively simpler graphs.
For example, Li et al. (2019c) extend the GNNExplainer to a weighted directed graph and apply
it on a Bitcoin transaction graph. Rao et al. (2020) propose an explainable fraud prediction
system that could operate on heterogeneous graphs consisting of different node and edge types.
More work could be done on the explainability of GNN models with edge-attributed graphs and
dynamic graphs, which are not yet considered.
6.5 Scalability
In the real-world financial scenario, commercial data are often of large scales. For instance,
Yang (2019) utilize data from a popular e-commerce platform and it contains about 483 million
nodes with 231 million edges. How to improve the scalability of GNNs is vital but challenging.
Computing the Laplacian matrix becomes hard with millions of nodes and for a graph of irregular
Euclidean space, optimizing the algorithm is also difficult. Sampling techniques may partially
solve the problem with the cost of losing structural information. Thus, how to maintain the
graph structure and improve the efficiency of GNN algorithms are worth further exploration.
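A minimal sketch of the sampling idea mentioned above: cap each node's neighborhood at a fixed fanout so the aggregation cost per node is bounded, at the price of dropping some edges. The adjacency-list format and fanout value are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_neighbors(adj_list, fanout):
    """adj_list: dict node -> list of neighbor ids; returns a sampled view
    where no node keeps more than `fanout` neighbors."""
    sampled = {}
    for node, nbrs in adj_list.items():
        if len(nbrs) <= fanout:
            sampled[node] = list(nbrs)
        else:
            sampled[node] = list(rng.choice(nbrs, size=fanout, replace=False))
    return sampled

adj_list = {0: [1, 2, 3, 4, 5], 1: [0], 2: [0, 3]}   # toy graph
print(sample_neighbors(adj_list, fanout=2))
```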
Supplementary Material
In the supplementary materials, we present materials that are not covered in the main text.
The supplementary materials contain the summary table for each financial application, figures categorizing major GNN methodologies for each graph type, and acronyms used in the text.
References
Araci D (2019). Finbert: Financial sentiment analysis with pre-trained language models. arXiv preprint: https://arxiv.org/abs/1908.10063.
Cer D, Yang Y, Kong S, Hua N, Limtiaco N, John RS et al. (2018). Universal sentence encoder
for english. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language
Processing: System Demonstrations, 169–174.
Chen Y, Wei Z, Huang X (2018). Incorporating corporation relationship via graph convolutional
neural networks for stock price prediction. In: Proceedings of the 27th ACM International
Conference on Information and Knowledge Management, 1655–1658.
Cheng D, Niu Z, Zhang Y (2020). Contagious chain risk rating for networked-guarantee loans.
In: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery
& Data Mining, 2715–2723.
Cheng D, Tu Y, Ma Z, Niu Z, Zhang L (2019). Risk assessment for networked-guarantee loans
using high-order graph attention representation. In: Proceedings of the Twenty-Eighth Inter-
national Joint Conference on Artificial Intelligence, IJCAI-19, 5822–5828. International Joint
Conferences on Artificial Intelligence Organization.
Cheng D, Wang X, Zhang Y, Zhang L (2020). Risk guarantee prediction in networked-loans.
In: Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence,
IJCAI-20 (C Bessiere, ed.), 4483–4489. International Joint Conferences on Artificial Intelli-
gence Organization. Special Track on AI in FinTech.
Cho K, van Merrienboer B, Gulcehre C, Bougares F, Schwenk H, Bengio Y (2014). Learning
phrase representations using rnn encoder-decoder for statistical machine translation. In: Con-
ference on Empirical Methods in Natural Language Processing (EMNLP 2014).
Devlin J, Chang MW, Lee K, Toutanova K (2019). BERT: Pre-training of deep bidirectional
transformers for language understanding. In: Proceedings of the 2019 Conference of the North
American Chapter of the Association for Computational Linguistics: Human Language Tech-
nologies, Volume 1 (Long and Short Papers), 4171–4186. Association for Computational Lin-
guistics, Minneapolis, Minnesota.
Dou Y, Liu Z, Sun L, Deng Y, Peng H, Yu PS (2020). Enhancing graph neural network-based
fraud detectors against camouflaged fraudsters. In: Proceedings of the 29th ACM International
Conference on Information & Knowledge Management, 315–324.
Feng F, He X, Wang X, Luo C, Liu Y, Chua TS (2019). Temporal relational ranking for stock
prediction. ACM Transactions on Information Systems (TOIS), 37(2): 1–30.
Gori M, Monfardini G, Scarselli F (2005). A new model for learning in graph domains. In: Pro-
ceedings. 2005 IEEE International Joint Conference on Neural Networks, 2005, volume 2,
729–734. IEEE.
Harl M, Weinzierl S, Stierle M, Matzner M (2020). Explainable predictive business process
monitoring using gated graph neural networks. Journal of Decision Systems, 1–16.
Hochreiter S, Schmidhuber J (1997). Long short-term memory. Neural Computation, 9(8): 1735–1780.
Hu B, Zhang Z, Shi C, Zhou J, Li X, Qi Y (2019). Cash-out user detection based on attributed
heterogeneous information network with a hierarchical attention mechanism. In: Proceedings
of the AAAI Conference on Artificial Intelligence, volume 33, 946–953.
Hu B, Zhang Z, Zhou J, Fang J, Jia Q, Fang Y, et al. (2020). Loan default analysis with multiplex
graph learning. In: Proceedings of the 29th ACM International Conference on Information &
Knowledge Management, 2525–2532.
Huang J, Chai J, Cho S (2020). Deep learning in finance and banking: A literature review and
classification. Frontiers of Business Research in China, 14: 1–24.
Jiang J, Chen J, Gu T, Choo KKR, Liu C, Yu M, et al. (2019). Anomaly detection with graph
convolutional networks for insider threat and fraud detection. In: MILCOM 2019-2019 IEEE
Military Communications Conference (MILCOM), 109–114. IEEE.
Jiang W (2021). Applications of deep learning in stock market prediction: Recent progress.
Lv L, Cheng J, Peng N, Fan M, Zhao D, Zhang J (2019). Auto-encoder based graph convolu-
tional networks for online financial anti-fraud. In: 2019 IEEE Conference on Computational
Intelligence for Financial Engineering & Economics (CIFEr), 1–6. IEEE.
Ma X, Sha J, Wang D, Yu Y, Yang Q, Niu X (2018). Study on a prediction of p2p network loan
default based on the machine learning lightgbm and xgboost algorithms according to different
high dimensional data cleaning. Electronic Commerce Research and Applications, 31: 24–39.
Matsunaga D, Suzumura T, Takahashi T (2019). Exploring graph neural networks for stock market predictions with rolling window analysis. arXiv preprint: https://arxiv.org/abs/1909.10660.
Ozbayoglu AM, Gudelek MU, Sezer OB (2020). Deep learning for financial applications: A
survey. Applied Soft Computing, 106384.
Pareja A, Domeniconi G, Chen J, Ma T, Suzumura T, Kanezashi H, et al. (2020). Evolvegcn:
Evolving graph convolutional networks for dynamic graphs. In: Proceedings of the AAAI Con-
ference on Artificial Intelligence, volume 34, 5363–5370.
Pennington J, Socher R, Manning CD (2014). Glove: Global vectors for word representation.
In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing
(EMNLP), 1532–1543.
Rao SX, Zhang S, Han Z, Zhang Z, Min W, Cheng M, Shan Y, Zhao Y, Zhang C (2020). Suspicious massive registration detection via dynamic heterogeneous graph neural networks. arXiv preprint: https://arxiv.org/abs/2012.10831.
Rao SX, Zhang S, Han Z, Zhang Z, Min W, Chen Z, et al. (2020). xfraud: Explainable fraud transaction detection on heterogeneous graphs. arXiv preprint: https://arxiv.org/abs/2011.12193.
Sawhney R, Agarwal S, Wadhwa A, Shah R (2020a). Deep attentive learning for stock move-
ment prediction from social media text and company correlations. In: Proceedings of the 2020
Conference on Empirical Methods in Natural Language Processing (EMNLP), 8415–8426.
Sawhney R, Khanna P, Aggarwal A, Jain T, Mathur P, Shah R (2020b). Voltage: Volatility fore-
casting via text-audio fusion with graph convolution networks for earnings calls. In: Proceed-
ings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP),
8001–8013.
Turiel J, Aste T (2020). Peer-to-peer loan acceptance and default prediction with artificial
intelligence. Royal Society Open Science, 7(6): 191649.
Veličković P, Cucurull G, Casanova A, Romero A, Liò P, Bengio Y (2018). Graph attention
networks. In: International Conference on Learning Representations.
Vrandečić D, Krötzsch M (2014). Wikidata: A free collaborative knowledgebase. Communica-
tions of the ACM, 57(10): 78–85.
Wang D, Lin J, Cui P, Jia Q, Wang Z, Fang Y, et al. (2019). A semi-supervised graph attentive
network for financial fraud detection. In: 2019 IEEE International Conference on Data Mining
(ICDM), 598–607. IEEE Computer Society.
Weber M, Domeniconi G, Chen J, Weidele DKI, Bellei C, Robinson T, et al. (2019). Anti-money laundering in bitcoin: Experimenting with graph convolutional networks for financial forensics. arXiv preprint: https://arxiv.org/abs/1908.02591.
Wu Z, Pan S, Chen F, Long G, Zhang C, Philip SY (2020). A comprehensive survey on graph neural networks. IEEE Transactions on Neural Networks and Learning Systems, 32(1): 4–24.
Xu B, Shen H, Sun B, An R, Cao Q, Cheng X (2021). Towards consumer loan fraud detection: Graph neural networks with role-constrained conditional random field. In: Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, 4537–4545.
Xu K, Hu W, Leskovec J, Jegelka S (2018). How powerful are graph neural networks? arXiv preprint: https://arxiv.org/abs/1810.00826.
Yang H (2019). Aligraph: A comprehensive graph neural network platform. In: Proceedings of
the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining,
3165–3166.
Yang S, Zhang Z, Zhou J, Wang Y, Sun W, Zhong X, et al. (2020). Financial risk analysis for
smes with graph-based supply chain mining. In: Proceedings of the Twenty-Ninth International
Joint Conference on Artificial Intelligence, 4661–4667.
Yang Y, Wei Z, Chen Q, Wu L (2019). Using external knowledge for financial event prediction
based on graph neural networks. In: Proceedings of the 28th ACM International Conference
on Information and Knowledge Management, 2161–2164.
Ying R, Bourgeois D, You J, Zitnik M, Leskovec J (2019). GNN explainer: A tool for post-hoc explanation of graph neural networks. arXiv preprint: https://arxiv.org/abs/1903.03894.
Ying X, Xu C, Gao J, Wang J, Li Z (2020). Time-aware graph relational attention network
for stock recommendation. In: Proceedings of the 29th ACM International Conference on
Information & Knowledge Management, 2281–2284.
Zhang M, Chen Y (2018). Link prediction based on graph neural networks. Advances in Neural
Information Processing Systems, 31.
Zhang S, Tong H, Xu J, Maciejewski R (2019). Graph convolutional networks: a comprehensive
review. Computational Social Networks, 6(1): 1–23.
Zhang S, Yin H, Chen T, Hung QVN, Huang Z, Cui L (2020). Gcn-based user representation
learning for unifying robust recommendation and fraudster detection. In: Proceedings of the
43rd International ACM SIGIR Conference on Research and Development in Information
Retrieval, 689–698.
Zhao T, Deng C, Yu K, Jiang T, Wang D, Jiang M (2021). Gnn-based graph anomaly detection
with graph anomaly loss. In: The Second International Workshop on Deep Learning on Graphs:
Methods and Applications (DLG-KDD’20). Available at: https://deep-learning-graphs.
bitbucket.io/dlg-kdd20/.
Zhou J, Cui G, Hu S, Zhang Z, Yang C, Liu Z, et al. (2020). Graph neural networks: A review
of methods and applications. AI Open, 1: 57–81.
Zhu YN, Luo X, Li YF, Bu B, Zhou K, Zhang W, et al. (2020). Heterogeneous mini-graph
neural network and its application to fraud invitation detection. In: 2020 IEEE International
Conference on Data Mining (ICDM), 891–899. IEEE.