Anomalous Node Detection in Blockchain Networks Based on Graph Neural Networks
Abstract
1. Introduction
- We have designed a model that combines a Graph Attention Network (GAT) with subtree attention. While retaining direct neighbor information, it also learns multi-hop neighbor information to enhance the model’s ability to understand complex relationships. The introduction of subtree attention enables the model to identify potential anomalous nodes.
- We adopted a Bagging ensemble learning framework, dividing the training data into multiple subsets. We then trained a base classifier on each subset separately and combined their predictions to obtain the final prediction.
- Traditional Bagging combines the predictions of its base classifiers through a simple voting mechanism. This method is overly simplistic and fails to fully exploit the base classifiers' predictions. We therefore applied a stacking approach: CatBoost (CAT) serves as the meta-model and is trained on the base classifiers' predictions over the training and validation sets. The trained meta-model then produces the final predictions.
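The Bagging-plus-stacking pipeline outlined in these contributions can be sketched as follows. This is a minimal illustration, not the paper's implementation: the SGAT base models are replaced by logistic regressions, scikit-learn's GradientBoostingClassifier stands in for CatBoost (CAT), and all data and names are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Toy imbalanced data standing in for node representations.
X, y = make_classification(n_samples=1000, weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Bagging: train one base classifier per bootstrap subset.
rng = np.random.default_rng(0)
bases = []
for _ in range(5):
    idx = rng.integers(0, len(X_tr), len(X_tr))  # bootstrap sample
    bases.append(LogisticRegression(max_iter=1000).fit(X_tr[idx], y_tr[idx]))

# Stacking: base-model predictions become features for a boosting meta-model
# (the paper uses CatBoost; GradientBoosting is a stand-in here).
meta_tr = np.column_stack([c.predict_proba(X_tr)[:, 1] for c in bases])
meta_te = np.column_stack([c.predict_proba(X_te)[:, 1] for c in bases])
meta = GradientBoostingClassifier(random_state=0).fit(meta_tr, y_tr)
y_pred = meta.predict(meta_te)
```

Compared with majority voting, the meta-model can learn how much to trust each base classifier, which is the motivation for replacing voting with stacking.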
2. Related Work
2.1. Graph-Based Anomaly Detection
2.1.1. Financial Anomaly Detection
2.1.2. Fraudulent Review Detection
2.2. Class Imbalance Learning
2.2.1. Class Imbalance Classification Based on Resampling
2.2.2. Class Imbalance Classification Based on Cost Sensitivity
2.2.3. Class Imbalance Classification Based on Ensemble Learning
3. Definition and Problem Statement
3.1. Definition
3.2. Problem Statement
4. Methodology
4.1. Overview
4.2. SGAT Model
4.2.1. Multi-Hop Information Aggregation Based on Subtree Attention
4.2.2. Neighbor Information Aggregation Based on GAT
4.2.3. Proposed SGAT
4.3. Ensemble Learning Algorithms
4.3.1. Bootstrap Aggregating
4.3.2. Categorical Boosting
5. Experiments
- RQ1: What is the performance with respect to different training parameters?
- RQ2: Does the SGAT-BC model outperform the state-of-the-art methods for graph-based anomaly detection?
- RQ3: How do the key components contribute to the prediction?
5.1. Experimental Setup
5.1.1. Dataset Description
- The Elliptic Dataset is a well-known dataset extensively used for studying and analyzing the legality of Bitcoin transactions. Co-released by the IBM research team and Elliptic Company, the dataset categorizes nodes into legal, illegal, and unknown, with malicious activities such as extortion, money laundering, and scams classified as illegal transactions. It comprises 203,769 nodes and 234,355 edges, where nodes represent transactions and edges denote the flow of Bitcoin between these transactions. Among these nodes, 4545 are labeled as illegal (2%), 42,019 as legal (21%), and the remaining 157,205 are unlabeled. This results in a node imbalance ratio of 0.03. The dataset features 166 dimensions for each node, with the first 94 dimensions capturing transaction attributes such as time steps, in-degrees, out-degrees, transaction fees, and Bitcoin amounts. The remaining 72 dimensions represent aggregate features, summarizing the graph structure of a node’s direct neighbors. In this study, only the labeled nodes and the edges they form were selected for analysis, while all unlabeled nodes and their associated edges were removed. Due to intellectual property restrictions, the dataset provider has not disclosed detailed descriptions of all features, but a generalized overview of the publicly available features has been provided. To ensure consistency and scalability, numerical features were standardized based on their statistical properties, excluding specific non-numerical components.
- AscendEXHacker and UpbitHack are datasets published on XBlock, included in the EthereumHeist dataset. This dataset spans 2018 to 2022, focusing on representative theft cases on Ethereum and providing a robust foundation for blockchain anomaly detection research. We specifically selected two cases, AscendEXHacker and UpbitHack, as part of our study. For the AscendEXHacker dataset, the transaction graph comprises 6713 nodes (6646 normal nodes and 67 anomalous nodes) and 11,901 edges, with a node imbalance ratio (NodeIR) of 0.01, indicating a highly imbalanced class distribution. Similarly, the UpbitHack dataset contains a significantly larger graph with 568,994 nodes (559,250 normal nodes and 8744 anomalous nodes) and 1,447,348 edges, with a NodeIR of 0.03, reflecting a slight increase in anomaly representation but still posing challenges due to imbalance. The transaction graph is constructed from the raw transaction files using the from and to fields to define sender and receiver relationships. Each node represents an Ethereum address, and edges denote transaction flows. Node labels are provided, with the label heist marking nodes involved in malicious activities, while normal nodes lack this label. Since the original data for both the AscendEXHacker and UpbitHack datasets lack predefined features, we performed feature engineering to generate meaningful attributes for each node. First, we calculated degree-related features, including the in-degree, out-degree, and total degree of each node, to capture the transactional relationships between nodes. Additionally, based on the raw transaction data, we extracted value-based features such as the mean transaction value (mean-value), maximum transaction value (max-value), and minimum transaction value (min-value). In Ethereum, gas represents the computational cost required to execute transactions or smart contracts.
It ensures that users pay for the computational resources consumed, which prevents abuse of network resources. Gas-related features were also derived, including the mean gas price (mean-gasPrice), maximum gas price (max-gasPrice), and minimum gas price (min-gasPrice), as well as the mean gas used (mean-gasUsed), maximum gas used (max-gasUsed), and minimum gas used (min-gasUsed). These features reflect the computational cost and characteristics of the transactions associated with each node. Finally, we calculated the transaction frequency for each node by dividing its total degree by the time interval between the earliest and latest transactions. This comprehensive feature engineering process allowed us to enrich the dataset with node-level attributes, enabling more effective anomaly detection in Ethereum transaction networks.
- The Ethereum transactions dataset is an open-source blockchain transaction dataset available on GitHub. It consists of key attributes, including sender, receiver, amount, timestamp, fromIsPhi, and toIsPhi. A transaction graph is constructed from these data, where nodes represent entities (senders and receivers), and edges represent transactions between them. Labels are determined by the fromIsPhi and toIsPhi attributes, where fromIsPhi indicates that the sender is an anomalous node, and toIsPhi indicates that the receiver is an anomalous node. We first performed data cleaning to extract the nodes and edges for the graph. For feature engineering, we computed node-level attributes such as out-degree, in-degree, average degree, total degree, average sending amount, total sending amount, maximum sending amount, average receiving amount, total receiving amount, maximum receiving amount, transaction time interval ratio, and the total number of neighbors for each node. These engineered features were used to enrich the graph representation and provide comprehensive input for anomaly detection tasks. Finally, numerical features were standardized to ensure uniform scaling for downstream modeling tasks, and isolated or irrelevant nodes were removed from the transaction graph to retain only meaningful components.
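The feature engineering described for the Ethereum datasets (degree counts, value statistics, and transaction frequency as total degree over active time span) can be sketched with pandas. This is an illustrative sketch under assumed column names (from, to, value, timestamp), not the authors' code; the tiny transaction table is fabricated for demonstration.

```python
import pandas as pd

# Hypothetical raw transaction table; column names mirror the fields
# described above but are assumptions, not the dataset schema.
tx = pd.DataFrame({
    "from":      ["a", "a", "b", "c"],
    "to":        ["b", "c", "c", "a"],
    "value":     [1.0, 2.5, 0.5, 3.0],
    "timestamp": [100, 160, 220, 400],
})

# Degree-related features from sender/receiver counts.
out_deg = tx["from"].value_counts().rename("out_degree")
in_deg = tx["to"].value_counts().rename("in_degree")
feats = pd.concat([out_deg, in_deg], axis=1).fillna(0)
feats["total_degree"] = feats["out_degree"] + feats["in_degree"]

# Value-based features over the transactions a node sends.
val = tx.groupby("from")["value"].agg(["mean", "max", "min"]).add_prefix("value_")
feats = feats.join(val).fillna(0)

# Transaction frequency: total degree / time span between a node's
# earliest and latest transactions (guarding against a zero span).
times = pd.concat([
    tx[["from", "timestamp"]].rename(columns={"from": "node"}),
    tx[["to", "timestamp"]].rename(columns={"to": "node"}),
])
span = times.groupby("node")["timestamp"].agg(lambda s: s.max() - s.min())
feats["tx_frequency"] = feats["total_degree"] / span.replace(0, 1)
```

Gas-related statistics (mean/max/min gasPrice and gasUsed) would follow the same groupby-aggregate pattern on the corresponding columns.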
5.1.2. Compared Methods
- GCNs [10]: Graph Convolutional Networks are a popular type of graph neural network that learns representations of nodes by propagating and transforming features across the nodes of a graph. GCNs utilize the adjacency matrix and node feature matrix of the graph to perform hierarchical feature extraction, capturing the local structural information of nodes.
- GATs [32]: Graph Attention Networks introduce an attention mechanism that dynamically determines the importance of different neighbors during the aggregation process of nodes. GATs can adaptively learn the weights between nodes, thus more effectively capturing the structural features of the graph.
- GraphSAGE [39]: Graph Sample and Aggregation (GraphSAGE) is an inductive learning framework that efficiently generates low-dimensional embeddings of nodes from large-scale graphs. GraphSAGE samples a fixed number of neighbors and uses aggregation functions, such as mean or pooling, to update the representation of nodes.
- FdGars [14]: FdGars employs a customized version of Graph Convolutional Networks (GCNs) to detect fraudulent accounts in online app store review systems. Unlike general-purpose GCNs, which primarily focus on hierarchical feature extraction from node features and adjacency matrices, the GCNs in FdGars are specifically optimized for anomaly detection tasks. The task-specific design enables FdGars to capture subtle relational and contextual features in the social graph, enhancing its capability to identify potential fraudulent activities.
- Graphconsis [15]: GraphConsis is a graph neural network framework designed for fraud detection. It addresses issues of inconsistency in graph models applied to fraud detection, such as contextual, feature, and relational inconsistencies. The framework integrates node features with contextual embeddings and designs a consistency score to filter out inconsistent neighbors based on their consistency scores, thereby defining ‘sampled nodes’ as those neighbors that meet a predefined threshold of consistency. This selection process ensures that only relevant and consistent information is used for learning relational attention weights.
- CARE-GNN [16]: CARE-GNN is an enhanced graph neural network approach for detecting disguised fraudsters in graph-structured data. It incorporates a novel attention mechanism and subgraph feature extraction strategy to identify and highlight areas of the graph that may be manipulated or influenced by disguised fraudsters. The method can learn complex fraud patterns and dynamically adjust its network structure to cope with the evolving strategies of fraudsters.
- GAT-COBO [22]: GAT-COBO is a cost-sensitive graph neural network model specifically designed for fraud detection in the telecommunications industry. This model integrates the capabilities of Graph Attention Networks (GATs) with cost-sensitive learning strategies to enhance performance in telecom fraud detection.
- STAGNN [33]: STAGNN is an enhanced graph neural network (GNN) model, specifically engineered for more efficient processing of graph-structured data. This model adaptively utilizes the root subtree structures within graphs to amplify its self-attention mechanisms, thereby enhancing both the performance and interpretability of the neural network across various tasks.
- IForest [40]: Isolation Forest is an ensemble-based algorithm specifically designed for anomaly detection. It isolates anomalies instead of profiling normal data points. By randomly selecting a feature and then randomly selecting a split value between the maximum and minimum values of the selected feature, Isolation Forest recursively partitions the data. Anomalies are expected to have shorter paths in the tree structure, thus isolating them efficiently.
- LOF [41]: Local Outlier Factor is an algorithm that measures the local deviation of a given data point with respect to its neighbors. It is based on a concept of local density, where locality is given by the k-nearest neighbors, whose distance is used to estimate the density. A point is considered as an outlier if the density around this point is significantly different from the density around its neighbors.
- OCSVM [42]: One-Class SVM is a specialized version of the SVM algorithm. It learns a decision function for anomaly detection by identifying the smallest region that encompasses the majority of the data points. Data points that do not fall within this region are considered anomalies.
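The three unsupervised baselines above (IForest, LOF, OCSVM) are all available in scikit-learn and share a common interface that returns +1 for inliers and -1 for outliers. A minimal sketch on synthetic data, purely illustrative and unrelated to the blockchain datasets:

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM

# Toy features: a dense normal cluster plus a few far-away points.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, size=(200, 4)),   # "normal" nodes
               rng.normal(8, 1, size=(5, 4))])    # injected anomalies

# Each detector labels points as +1 (inlier) or -1 (outlier).
iforest = IsolationForest(random_state=0).fit_predict(X)
lof = LocalOutlierFactor(n_neighbors=20).fit_predict(X)
ocsvm = OneClassSVM(nu=0.05).fit(X).predict(X)
# The last five rows (the injected anomalies) should typically
# receive -1 from all three detectors.
```

Because these methods need no labels, they serve as a floor against which the supervised graph models are compared.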
5.1.3. Evaluation Metrics
- Macro Recall: Measures the average proportion of actual anomalies that the model successfully identifies across all classes, treating each class equally regardless of its sample size. In anomaly detection, Macro Recall is particularly crucial as it ensures the model’s capability to detect rare and critical anomalies is not overshadowed by majority class performance.
- Macro F1 Score: The harmonic mean of macro precision and Macro Recall, offering a balanced evaluation of the model’s ability to identify anomalies correctly while maintaining robustness across all classes. This is especially important in anomaly detection, where a model must balance accuracy and coverage to effectively detect anomalies without being dominated by majority class performance.
- Macro AUC: The Macro Area Under the ROC Curve (AUC) provides an aggregate measure of the model’s ability to distinguish between classes across all thresholds, averaged over all classes. Macro AUC is advantageous in anomaly detection as it evaluates the model’s discriminative power in scenarios with significant class imbalance, ensuring fair assessment across both minority and majority classes.
- G-Mean: The geometric mean of the True Positive Rate (TPR) and the True Negative Rate (TNR), assessing the balance between the model’s sensitivity to the minority class and specificity to the majority class. In anomaly detection, G-Mean ensures that the model not only identifies anomalies effectively but also avoids excessive false positives, maintaining a balanced performance between critical minority and majority classes.
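All four metrics can be computed with scikit-learn plus a one-line G-Mean formula. A minimal sketch on a toy imbalanced binary labeling (the labels and scores here are illustrative only, not experimental results):

```python
import numpy as np
from sklearn.metrics import confusion_matrix, f1_score, recall_score, roc_auc_score

# Toy ground truth (six normal, two anomalous) and predicted scores.
y_true = np.array([0, 0, 0, 0, 0, 0, 1, 1])
y_prob = np.array([0.1, 0.2, 0.1, 0.4, 0.3, 0.6, 0.7, 0.4])
y_pred = (y_prob >= 0.5).astype(int)

# Macro metrics average per-class scores, weighting each class equally.
macro_recall = recall_score(y_true, y_pred, average="macro")
macro_f1 = f1_score(y_true, y_pred, average="macro")
macro_auc = roc_auc_score(y_true, y_prob, average="macro")

# G-Mean: geometric mean of TPR (sensitivity) and TNR (specificity).
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
g_mean = np.sqrt((tp / (tp + fn)) * (tn / (tn + fp)))
```

A model that simply predicts the majority class scores 0 on G-Mean (its TPR is zero), which is why G-Mean is informative under heavy class imbalance.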
5.1.4. Experiments Details
5.2. Sensitivity Analysis (RQ1)
5.3. Performance Comparison (RQ2)
5.4. Ablation Study (RQ3)
- Compared with SGAT-BC, SGAT-BC/s performed worse on all datasets across the four evaluation metrics, especially on Recall and G-Mean. On two of the datasets, Recall decreased by 6.63% and 7%, respectively, while G-Mean decreased by 6.86% and 7.27%, respectively. This is because aggregating multi-hop neighbor information via STA allows the model to better distinguish anomalous nodes, improving performance in identifying both anomalous and normal nodes.
- After removing the Bagging ensemble learning framework, SGAT-BC/b maintained its performance only on the F1 metric of the ETD dataset and declined on all other evaluation metrics and datasets. On one of the datasets, AUC, F1, Recall, and G-Mean showed the most significant declines, decreasing by 1.85%, 32.6%, 6.05%, and 15.38%, respectively. This is because blockchain anomaly detection datasets are highly imbalanced: without the Bagging framework, only a single model is trained for evaluation, which biases it toward the majority class and weakens its ability to recognize the minority class. Moreover, training a single model on imbalanced data increases model bias and reduces robustness.
- After removing the CAT stacking module to obtain SGAT-BC/c, we found the most significant decreases in F1, Recall, and G-Mean on one of the datasets, of 4.41%, 3.05%, and 3.08%, respectively. Traditional Bagging handles base-model predictions too crudely, through simple voting, which cannot accurately combine the predictions of all base models. Instead, we use CAT as a meta-model trained on the output predictions of all base models, which better leverages the base models' training results.
6. Conclusions
6.1. Discussion
6.2. Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Xu, X.; Weber, I.; Staples, M. Architecture for Blockchain Applications; Springer: Berlin/Heidelberg, Germany, 2019. [Google Scholar]
- Kosba, A.; Miller, A.; Shi, E.; Wen, Z.; Papamanthou, C. Hawk: The blockchain model of cryptography and privacy-preserving smart contracts. In Proceedings of the 2016 IEEE symposium on security and privacy (SP), San Jose, CA, USA, 22–26 May 2016; pp. 839–858. [Google Scholar]
- Chainalysis. 2024 Crypto Crime Report Introduction. 2024. Available online: https://www.chainalysis.com/blog/2024-crypto-crime-report-introduction/ (accessed on 8 July 2024).
- Saxena, S.; Nagpal, A.; Prashar, T.; Shravan, M.; Al-Hilali, A.A.; Alazzam, M.B. Blockchain for supply chain traceability: Opportunities and challenges. In Proceedings of the 2023 3rd International Conference on Advance Computing and Innovative Technologies in Engineering (ICACITE), Greater Noida, India, 12–13 May 2023; pp. 110–114. [Google Scholar]
- Chen, W.; Zheng, Z. Blockchain data analysis: A review of status, trends and challenges. J. Comput. Res. Dev. 2018, 55, 1853–1870. [Google Scholar]
- Chen, W.; Zheng, Z.; Cui, J.; Ngai, E.; Zheng, P.; Zhou, Y. Detecting ponzi schemes on ethereum: Towards healthier blockchain technology. In Proceedings of the 2018 World Wide Web Conference, Lyon, France, 23–27 April 2018; pp. 1409–1418. [Google Scholar]
- Wang, Y.; Dong, L.; Jiang, X.; Ma, X.; Li, Y.; Zhang, H. KG2Vec: A node2vec-based vectorization model for knowledge graph. PLoS ONE 2021, 16, e0248552. [Google Scholar] [CrossRef] [PubMed]
- Kılıç, B.; Özturan, C.; Sen, A. Analyzing large-scale blockchain transaction graphs for fraudulent activities. In Big Data and Artificial Intelligence in Digital Finance; Springer: Cham, Switzerland, 2022; p. 253. [Google Scholar]
- Motie, S.; Raahemi, B. Financial fraud detection using graph neural networks: A systematic review. Expert Syst. Appl. 2023, 240, 122156. [Google Scholar] [CrossRef]
- Kipf, T.N.; Welling, M. Semi-supervised classification with graph convolutional networks. arXiv 2016, arXiv:1609.02907. [Google Scholar]
- Liu, Z.; Chen, C.; Yang, X.; Zhou, J.; Li, X.; Song, L. Heterogeneous graph neural networks for malicious account detection. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management, Torino, Italy, 22–26 October 2018; pp. 2077–2085. [Google Scholar]
- Wang, D.; Lin, J.; Cui, P.; Jia, Q.; Wang, Z.; Fang, Y.; Yu, Q.; Zhou, J.; Yang, S.; Qi, Y. A semi-supervised graph attentive network for financial fraud detection. In Proceedings of the 2019 IEEE International Conference on Data Mining (ICDM), Beijing, China, 8–11 November 2019; pp. 598–607. [Google Scholar]
- Li, S.; Gou, G.; Liu, C.; Hou, C.; Li, Z.; Xiong, G. TTAGN: Temporal transaction aggregation graph network for ethereum phishing scams detection. In Proceedings of the ACM Web Conference 2022, Lyon, France, 25–29 April 2022; pp. 661–669. [Google Scholar]
- Wang, J.; Wen, R.; Wu, C.; Huang, Y.; Xiong, J. Fdgars: Fraudster detection via graph convolutional networks in online app review system. In Proceedings of the Companion Proceedings of the 2019 World Wide Web conference, San Francisco, CA, USA, 13–17 May 2019; pp. 310–316. [Google Scholar]
- Liu, Z.; Dou, Y.; Yu, P.S.; Deng, Y.; Peng, H. Alleviating the inconsistency problem of applying graph neural network to fraud detection. In Proceedings of the 43rd International ACM SIGIR Conference on research And Development in Information Retrieval, Xi’an, China, 25–30 July 2020; pp. 1569–1572. [Google Scholar]
- Dou, Y.; Liu, Z.; Sun, L.; Deng, Y.; Peng, H.; Yu, P.S. Enhancing graph neural network-based fraud detectors against camouflaged fraudsters. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management, Virtual, 19–23 October 2020; pp. 315–324. [Google Scholar]
- Li, A.; Qin, Z.; Liu, R.; Yang, Y.; Li, D. Spam review detection with graph convolutional networks. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management, Beijing, China, 3–7 November 2019; pp. 2703–2711. [Google Scholar]
- Gupta, S.; Jhunjhunwalla, M.; Bhardwaj, A.; Shukla, D. Data imbalance in landslide susceptibility zonation: Under-sampling for class-imbalance learning. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2020, 42, 51–57. [Google Scholar] [CrossRef]
- Peng, M.; Zhang, Q.; Xing, X.; Gui, T.; Huang, X.; Jiang, Y.G.; Ding, K.; Chen, Z. Trainable undersampling for class-imbalance learning. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; Volume 33, pp. 4707–4714. [Google Scholar]
- Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic minority over-sampling technique. J. Artif. Intell. Res. 2002, 16, 321–357. [Google Scholar] [CrossRef]
- Wang, W.; Wang, S.; Fan, W.; Liu, Z.; Tang, J. Global-and-local aware data generation for the class imbalance problem. In Proceedings of the 2020 SIAM International Conference on Data Mining, SIAM, Cincinnati, OH, USA, 7–9 May 2020; pp. 307–315. [Google Scholar]
- Hu, X.; Chen, H.; Zhang, J.; Chen, H.; Liu, S.; Li, X.; Wang, Y.; Xue, X. GAT-COBO: Cost-Sensitive Graph Neural Network for Telecom Fraud Detection. IEEE Trans. Big Data 2024, 10, 528–542. [Google Scholar] [CrossRef]
- Cui, Y.; Jia, M.; Lin, T.Y.; Song, Y.; Belongie, S. Class-balanced loss based on effective number of samples. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 9268–9277. [Google Scholar]
- Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2980–2988. [Google Scholar]
- Vong, C.M.; Du, J. Accurate and efficient sequential ensemble learning for highly imbalanced multi-class data. Neural Netw. 2020, 128, 268–278. [Google Scholar] [CrossRef] [PubMed]
- Guo, H.; Zhou, J.; Wu, C.A. Ensemble learning via constraint projection and undersampling technique for class-imbalance problem. Soft Comput. 2020, 24, 4711–4727. [Google Scholar] [CrossRef]
- Ren, J.; Wang, Y.; Mao, M.; Cheung, Y.M. Equalization ensemble for large scale highly imbalanced data classification. Knowl.-Based Syst. 2022, 242, 108295. [Google Scholar] [CrossRef]
- Liu, X.Y.; Wu, J.; Zhou, Z.H. Exploratory undersampling for class-imbalance learning. IEEE Trans. Syst. Man Cybern. Part B Cybern. 2008, 39, 539–550. [Google Scholar]
- Guo, H.; Viktor, H.L. Learning from imbalanced data sets with boosting and data generation: The databoost-im approach. ACM Sigkdd Explor. Newsl. 2004, 6, 30–39. [Google Scholar] [CrossRef]
- Galar, M.; Fernandez, A.; Barrenechea, E.; Bustince, H.; Herrera, F. A review on ensembles for the class imbalance problem: Bagging-, boosting-, and hybrid-based approaches. IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. 2011, 42, 463–484. [Google Scholar] [CrossRef]
- He, H.; Garcia, E.A. Learning from imbalanced data. IEEE Trans. Knowl. Data Eng. 2009, 21, 1263–1284. [Google Scholar]
- Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Lio, P.; Bengio, Y. Graph Attention Networks. arXiv 2017, arXiv:1710.10903. [Google Scholar]
- Huang, S.; Song, Y.; Zhou, J.; Lin, Z. Tailoring self-attention for graph via rooted subtrees. Adv. Neural Inf. Process. Syst. 2024, 36, 73559–73581. [Google Scholar]
- LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention is all you need. In Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS’17), Long Beach, CA, USA, 4–9 December 2017; pp. 6000–6010. [Google Scholar]
- Kim, J.; Nguyen, T.; Min, S.; Cho, S.; Lee, M.; Lee, H.; Hong, S. Pure Transformers are Powerful Graph Learners. arXiv 2022, arXiv:2207.02505. [Google Scholar]
- Breiman, L. Bagging predictors. Mach. Learn. 1996, 24, 123–140. [Google Scholar] [CrossRef]
- Prokhorenkova, L.; Gusev, G.; Vorobev, A.; Dorogush, A.V.; Gulin, A. CatBoost: Unbiased boosting with categorical features. Adv. Neural Inf. Process. Syst. 2018, 31, 6639–6649. [Google Scholar]
- Hamilton, W.L.; Ying, R.; Leskovec, J. Inductive representation learning on large graphs. In Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS’17), Long Beach, CA, USA, 4–9 December 2017; pp. 1025–1035. [Google Scholar]
- Liu, F.T.; Ting, K.M.; Zhou, Z.-H. Isolation forest. In Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, Pisa, Italy, 15–19 December 2008; pp. 413–422. [Google Scholar]
- Alghushairy, O.; Alsini, R.; Soule, T.; Ma, X. A review of local outlier factor algorithms for outlier detection in big data streams. Big Data Cogn. Comput. 2020, 5, 1. [Google Scholar] [CrossRef]
- Hejazi, M.; Singh, Y.P. One-class support vector machines approach to anomaly detection. Appl. Artif. Intell. 2013, 27, 351–366. [Google Scholar] [CrossRef]
Dataset | Normal Node | Anomalous Node | NodeIR | Edge |
---|---|---|---|---|
AscendEXHacker | 6646 | 67 | 0.01 | 11,901 |
Ethereum transactions data | 45,782 | 1165 | 0.03 | 103,035 |
Elliptic | 42,019 | 4545 | 0.11 | 101,188 |
UpbitHack | 559,250 | 8744 | 0.03 | 1,447,348 |
Method | ||||||||
---|---|---|---|---|---|---|---|---|
Macro AUC | Macro F1 | Macro Recall | G-Mean | Macro AUC | Macro F1 | Macro Recall | G-Mean | |
GCN | 0.9839 | 0.6313 | 0.6102 | 0.4733 | 0.9093 | 0.6220 | 0.5737 | 0.3844 |
GAT | 0.9567 | 0.7815 | 0.8584 | 0.8479 | 0.9403 | 0.7432 | 0.6947 | 0.6263 |
GraphSAGE | 0.9346 | 0.4976 | 0.5000 | 0.0000 | 0.8287 | 0.4980 | 0.5021 | 0.0655 |
FdGars | 0.4739 | 0.5278 | 0.6434 | 0.5726 | 0.5993 | 0.0807 | 0.4979 | 0.2372 |
Player2vec | 0.1874 | 0.1119 | 0.4083 | 0.2857 | 0.3498 | 0.0735 | 0.5016 | 0.2220 |
GraphConsis | 0.8627 | 0.5340 | 0.5340 | 0.2761 | 0.9598 | 0.5563 | 0.5353 | 0.2698 |
CARE-GNN | 0.8727 | 0.2209 | 0.5973 | 0.4938 | 0.9516 | 0.6693 | 0.7908 | 0.7745 |
GAT-COBO | 0.5533 | 0.5372 | 0.7253 | 0.6994 | 0.9648 | 0.7600 | 0.7539 | 0.7167 |
STAGNN | 0.7038 | 0.7470 | 0.7233 | 0.7470 | 0.9169 | 0.7663 | 0.8241 | 0.7663 |
IForest | 0.0167 | 0.5640 | 0.9394 | 0.9389 | 0.0370 | 0.6503 | 0.8905 | 0.8900 |
LOF | 0.6466 | 0.5024 | 0.6153 | 0.5443 | 0.7345 | 0.5188 | 0.5746 | 0.4710 |
OCSVM | 0.0172 | 0.5648 | 0.9398 | 0.9393 | 0.2827 | 0.4994 | 0.7458 | 0.7454 |
SGAT-BC | 0.9979 | 0.9276 | 0.9981 | 0.9981 | 0.9820 | 0.8722 | 0.9853 | 0.9853 |
Method | ||||||||
---|---|---|---|---|---|---|---|---|
Macro AUC | Macro F1 | Macro Recall | G-Mean | Macro AUC | Macro F1 | Macro Recall | G-Mean | |
GCN | 0.8752 | 0.6506 | 0.6073 | 0.4686 | 0.9648 | 0.7730 | 0.6993 | 0.6319 |
GAT | 0.9218 | 0.7279 | 0.6847 | 0.6171 | 0.9553 | 0.7420 | 0.6691 | 0.5821 |
GraphSAGE | 0.9382 | 0.8632 | 0.8205 | 0.8026 | … | … | … | … |
FdGars | 0.4136 | 0.4434 | 0.5734 | 0.5722 | 0.4201 | 0.1942 | 0.5398 | 0.4122 |
Player2vec | 0.5331 | 0.2062 | 0.5239 | 0.3453 | 0.5189 | 0.1818 | 0.5310 | 0.3939 |
GraphConsis | 0.6977 | 0.4752 | 0.4985 | 0.0468 | … | … | … | … |
CARE-GNN | 0.9113 | 0.6190 | 0.8078 | 0.8036 | 0.9577 | 0.6276 | 0.9117 | 0.9107 |
GAT-COBO | 0.9749 | 0.9081 | 0.8737 | 0.8652 | 0.9725 | 0.7978 | 0.9372 | 0.9368 |
STAGNN | 0.9308 | 0.9301 | 0.9305 | 0.9301 | 0.4839 | 0.5000 | 0.4918 | 0.5000 |
IForest | 0.9013 | 0.4453 | 0.4447 | 0.0140 | 0.0854 | 0.6356 | 0.7854 | 0.7741 |
LOF | 0.4429 | 0.5113 | 0.5114 | 0.3298 | 0.5434 | 0.4770 | 0.4729 | 0.2069 |
OCSVM | 0.8418 | 0.4537 | 0.4532 | 0.1180 | 0.3261 | 0.5934 | 0.7022 | 0.6697 |
SGAT-BC | 0.9994 | 0.9933 | 0.9962 | 0.9962 | 0.9811 | 0.9811 | 0.9811 | 0.9811 |
Method | ||||||||
---|---|---|---|---|---|---|---|---|
Macro AUC | Macro F1 | Macro Recall | G-Mean | Macro AUC | Macro F1 | Macro Recall | G-Mean | |
SGAT-BC | 0.9979 | 0.9276 | 0.9981 | 0.9981 | 0.9820 | 0.8722 | 0.9853 | 0.9853 |
SGAT-BC∖STA | 0.9975 | 0.9051 | 0.9318 | 0.9295 | 0.9806 | 0.8490 | 0.9153 | 0.9126 |
SGAT-BC∖CAT | 0.9982 | 0.8305 | 0.9944 | 0.9943 | 0.9824 | 0.8861 | 0.8972 | 0.8921 |
SGAT-BC∖Bagging | 0.9938 | 0.8125 | 0.9777 | 0.9774 | 0.9797 | 0.8928 | 0.8938 | 0.8904 |
Method | ||||||||
---|---|---|---|---|---|---|---|---|
Macro AUC | Macro F1 | Macro Recall | G-Mean | Macro AUC | Macro F1 | Macro Recall | G-Mean | |
SGAT-BC | 0.9994 | 0.9933 | 0.9962 | 0.9962 | 0.9811 | 0.8707 | 0.9761 | 0.9761 |
SGAT-BC∖STA | 0.9981 | 0.9783 | 0.9940 | 0.9940 | 0.9782 | 0.8557 | 0.9654 | 0.9654 |
SGAT-BC∖CAT | 0.9994 | 0.9639 | 0.9907 | 0.9907 | 0.9808 | 0.8266 | 0.9456 | 0.9453 |
SGAT-BC∖Bagging | 0.9809 | 0.6673 | 0.9357 | 0.8424 | 0.9789 | 0.8213 | 0.9446 | 0.9421 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Chang, Z.; Cai, Y.; Liu, X.F.; Xie, Z.; Liu, Y.; Zhan, Q. Anomalous Node Detection in Blockchain Networks Based on Graph Neural Networks. Sensors 2025, 25, 1. https://doi.org/10.3390/s25010001