CBGRU: A Detection Method of Smart Contract Vulnerability Based on a Hybrid Model
Abstract
1. Introduction
- By combining different word embedding methods and deep learning methods, the accuracy of smart contract vulnerability detection is improved.
- CBGRU applies hybrid networks to smart contract vulnerability detection for the first time. The hybrid network model proposed in this paper can detect several different smart contract vulnerabilities while maintaining good detection performance.
- Extensive experiments demonstrate that the CBGRU model proposed in this study combines the advantages of deep learning and hybrid learning, and that it detects smart contract vulnerabilities better than a single neural network model.
2. Related Work
2.1. Smart Contract Vulnerability Detection
- Smart contracts are becoming structurally more complex in order to implement complex functionality, and the variety of smart contract vulnerabilities keeps growing. Rules that experts define for known vulnerabilities cannot keep up with the rate at which new vulnerabilities appear.
- A crude overlay of several expert-defined logic rules can lead to a high false-alarm rate, and expert rule-based smart contract vulnerability detection tools are not suitable for general-purpose smart contract vulnerability detection.
- Attackers can use techniques that bypass inspection patterns built on rules defined in advance, and expert rule-based smart contract vulnerability detection tools cannot be updated in time.
2.2. Deep Learning Hybrid Models
- Hybrid models have been applied successfully to image recognition, sentiment classification, network intrusion detection, and news classification.
- Hybrid models perform better than a single deep learning model, and they train faster than a single model because each component of the hybrid model is less complex.
- Hybrid networks are more conducive to classification because they combine the advantages of different deep learning models.
2.3. Research Motivation
- Determining the labels of the training data.
- Pre-processing the training data and transforming it into the required form.
- Extracting feature values with deep learning models.
- Classifying the training data with the classifier of the model.
3. Overall Framework
- Pre-processing of the dataset.
- Mapping high-dimensional smart contracts to low-dimensional vectors via word embedding models.
- Extracting feature values with two neural networks and concatenating them.
- Performing classification and deriving results.
3.1. Overall Model Structure
Algorithm 1 Smart contract vulnerability detection process.
Input: $S$: smart contracts that need to be tested
Output: result: the result of detection
1: Step 1. Use the preprocessing function $P$ to preprocess the smart contract $S$ to obtain $S'$
2: $S' = P(S)$
3: Step 2. Embedding $S'$ using Word2Vec to obtain the embedding matrix $E_w$
4: $E_w = \mathrm{Word2Vec}(S')$
5: Step 3. Embedding $S'$ using FastText to obtain the embedding matrix $E_f$
6: $E_f = \mathrm{FastText}(S')$
7: Step 4. CNN performs feature extraction on $E_w$ to obtain features $F_c$
8: $F_c = \mathrm{CNN}(E_w)$
9: Step 5. BiGRU performs feature extraction on $E_f$ to obtain features $F_b$
10: $F_b = \overrightarrow{\mathrm{GRU}}(E_f) + \overleftarrow{\mathrm{GRU}}(E_f)$
11: Step 6. Fusion of extracted feature values
12: $F = \mathrm{Concat}(F_c, F_b)$
13: Step 7. Classification by softmax to obtain results
14: result = Softmax($W F + b$)
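The data flow of Algorithm 1 can be sketched in a few lines of Python. This is only an illustration under loud assumptions: `toy_embedding`, `mean_pool`, and `max_pool` are hypothetical stand-ins for the trained embedding models and for the CNN and BiGRU branches, and the weights are arbitrary. Only the shape of the pipeline (two embeddings, two feature extractors, concatenation, softmax) follows the steps above.

```python
import math
from typing import Callable, List, Sequence

Matrix = List[List[float]]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_embedding(dim: int, offset: float) -> Callable[[Sequence[str]], Matrix]:
    """Hypothetical stand-in for a trained Word2Vec or FastText model."""
    def embed(tokens: Sequence[str]) -> Matrix:
        # One deterministic pseudo-vector of length `dim` per token.
        return [[math.sin(offset + k + sum(map(ord, tok))) for k in range(dim)]
                for tok in tokens]
    return embed


def mean_pool(matrix: Matrix) -> List[float]:
    """Stand-in for the CNN branch: column-wise mean of the embedding matrix."""
    return [sum(col) / len(col) for col in zip(*matrix)]


def max_pool(matrix: Matrix) -> List[float]:
    """Stand-in for the BiGRU branch: column-wise max of the embedding matrix."""
    return [max(col) for col in zip(*matrix)]


def detect(tokens, embed_one, embed_two, branch_one, branch_two, weights, bias):
    """Steps 2 to 7: embed twice, extract twice, concatenate, classify."""
    feats_one = branch_one(embed_one(tokens))
    feats_two = branch_two(embed_two(tokens))
    fused = feats_one + feats_two  # list concatenation = feature fusion
    logits = [sum(w * x for w, x in zip(row, fused)) + b
              for row, b in zip(weights, bias)]
    return softmax(logits)
```

Calling `detect` with a token list, two embedding functions, two branch functions, a weight matrix with one row per class, and a bias vector returns one probability per class. In a real system each stand-in would be replaced by the corresponding trained component.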
3.2. Word Embedding Layer
- Remove the Solidity version pragma, such as “pragma solidity^0.4.4” in the ProofExistence contract in Figure 6.
- Remove comments, non-ASCII values, and blank lines from the contract.
- Represent user-defined function names as FUN plus numbers, and user-defined variable names as VAR plus numbers in smart contracts. User-defined function and variable names have little effect on whether the smart contract contains vulnerabilities, and they add noise that degrades feature extraction.
- Remove all spaces from the smart contract and perform word embedding; after the smart contract is processed, only the keywords of the Solidity language remain.
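The pre-processing steps above can be approximated with regular expressions. The sketch below is a simplification, not the authors' implementation: `KEYWORDS` is a small illustrative subset chosen for this example, the function `preprocess` is a hypothetical name, and every identifier that is not a declared function (including contract names) is treated as a variable. A production pipeline would more likely rely on a Solidity parser.

```python
import re
from typing import Dict, List

# Illustrative subset of Solidity keywords and built-in names that are never renamed.
# Assumption: a real pipeline would use the full keyword list or a Solidity parser.
KEYWORDS = {
    "contract", "function", "returns", "return", "public", "private",
    "internal", "external", "payable", "view", "pure", "if", "else",
    "for", "while", "require", "mapping", "address", "uint", "uint256",
    "bool", "bytes32", "string", "msg", "sender", "value", "now", "true", "false",
}
IDENTIFIER = re.compile(r"[A-Za-z_]\w*")


def preprocess(source: str) -> List[str]:
    # Step 1: drop the compiler version pragma.
    source = re.sub(r"pragma\s+solidity[^;]*;", "", source)
    # Step 2: drop block comments, line comments, then non-ASCII characters.
    source = re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)
    source = re.sub(r"//[^\n]*", "", source)
    source = source.encode("ascii", "ignore").decode("ascii")
    # Tokenizing discards spaces and blank lines (step 4).
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", source)

    # Step 3, first pass: any identifier that follows `function` is a function name.
    functions: Dict[str, str] = {}
    for prev, tok in zip(tokens, tokens[1:]):
        if (prev == "function" and IDENTIFIER.fullmatch(tok)
                and tok not in KEYWORDS and tok not in functions):
            functions[tok] = f"FUN{len(functions) + 1}"

    # Step 3, second pass: rename functions, treat all other identifiers as variables.
    variables: Dict[str, str] = {}
    result: List[str] = []
    for tok in tokens:
        if not IDENTIFIER.fullmatch(tok) or tok in KEYWORDS:
            result.append(tok)
        elif tok in functions:
            result.append(functions[tok])
        else:
            if tok not in variables:
                variables[tok] = f"VAR{len(variables) + 1}"
            result.append(variables[tok])
    return result
```

The two-pass design lets a call to a function that is declared later in the contract still map to the same `FUN` label. The line-comment pattern would also strip `//` inside string literals, which is one reason a parser-based approach may be preferable.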
3.3. CBGRU Hybrid Network Layer
3.4. CBGRU Model Overall Process
Algorithm 2 Training model.
1: Initialize model parameters $\theta$ randomly
2: Set the max number of epochs: $E$
3: Set the original dataset: $D$
4: for $S$ in $D$ do
5: // Use the preprocessing function $P$ to process the smart contract $S$
6: $P(S)$
7: end for
8: for $t$ in 1, 2, 3…, $T$ do
9: Pack the dataset $t$ into mini-batch: $B_t$
10: end for
11: for epoch in 1, 2, 3…, $E$ do
12: // Merge all datasets.
13: $D = B_1 \cup B_2 \cup \cdots \cup B_T$
14: for $B_t$ in $D$ do
15: $F_c = \mathrm{CNN}(B_t)$
16: $F_b = \mathrm{BiGRU}(B_t)$
17: result = Softmax($F_c + F_b$)
18: Loss $L(\theta)$ = Equation (32)
19: Compute gradient: $\nabla L(\theta)$
20: Update model: $\theta = \theta - \eta \nabla L(\theta)$
21: end for
22: end for
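The control flow of Algorithm 2 (random initialization, mini-batches, an epoch loop, and a gradient update per batch) can be illustrated with a deliberately small model. The sketch below trains a logistic regression classifier with plain mini-batch gradient descent. It is an assumption-laden stand-in: the model replaces the CNN and BiGRU branches, binary cross-entropy replaces the loss of Equation (32), which is not reproduced here, and the names `train`, `predict`, and `batches` are hypothetical.

```python
import math
import random
from typing import Iterator, List, Sequence, Tuple

Sample = Tuple[List[float], int]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def batches(data: Sequence[Sample], size: int) -> Iterator[Sequence[Sample]]:
    """Pack the dataset into consecutive mini-batches."""
    for start in range(0, len(data), size):
        yield data[start:start + size]


def predict(theta: List[float], x: List[float]) -> float:
    """Probability of the positive class; theta[-1] is the bias."""
    return sigmoid(sum(w * v for w, v in zip(theta, x)) + theta[-1])


def train(data: Sequence[Sample], dim: int, epochs: int = 200,
          batch_size: int = 2, lr: float = 0.5, seed: int = 0) -> List[float]:
    rng = random.Random(seed)
    # Initialize model parameters randomly (weights followed by one bias).
    theta = [rng.uniform(-0.1, 0.1) for _ in range(dim + 1)]
    for _ in range(epochs):
        for batch in batches(data, batch_size):
            grad = [0.0] * (dim + 1)
            for x, y in batch:
                # Gradient of binary cross-entropy with respect to the logit.
                err = predict(theta, x) - y
                for j, v in enumerate(x):
                    grad[j] += err * v
                grad[-1] += err
            # Update step: theta = theta - lr * mean gradient over the batch.
            theta = [t - lr * g / len(batch) for t, g in zip(theta, grad)]
    return theta
```

On a linearly separable toy dataset, `train` followed by `predict` separates the two classes after a few hundred updates. The same loop structure carries over to a deep learning framework, where the gradient computation and update are delegated to an optimizer.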
4. Experiments and Results
4.1. Dataset
4.2. Experiment
4.2.1. Comparative Experiments
4.2.2. Comparison with Previous Studies
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Vulnerability Name | Count
---|---
Callstack Depth Attack | 1378
Integer Overflow | 1640
Integer Underflow | 1988
Reentrancy | 1719
Timestamp Dependency | 1671
Infinite Loop | 1317

Model Name | Classification Method One | Embedding Method One | Classification Method Two | Embedding Method Two | Embedding Size | Dropout | Epochs | Accuracy (%)
---|---|---|---|---|---|---|---|---
M1 | CNN | Word2Vec | CNN | Word2Vec | 300 | 0.5 | 50 | 80.78
M2 | CNN | Word2Vec | GRU | Word2Vec | 300 | 0.5 | 50 | 80.33
M3 | CNN | Word2Vec | LSTM | Word2Vec | 300 | 0.5 | 50 | 79.67
M4 | CNN | Word2Vec | BiLSTM | Word2Vec | 300 | 0.5 | 50 | 81.14
M5 | CNN | Word2Vec | BiGRU | Word2Vec | 300 | 0.5 | 50 | 82.10
M6 | CNN | Word2Vec | CNN | FastText | 300 | 0.5 | 50 | 79.25
M7 | CNN | Word2Vec | GRU | FastText | 300 | 0.5 | 50 | 80.67
M8 | CNN | Word2Vec | LSTM | FastText | 300 | 0.5 | 50 | 79.45
M9 | CNN | Word2Vec | BiLSTM | FastText | 300 | 0.5 | 50 | 83.55
CBGRU | CNN | Word2Vec | BiGRU | FastText | 300 | 0.5 | 50 | 85.80
M10-A | CNN | Word2Vec | N/A | N/A | 300 | 0.5 | 50 | 75.67
M11 | BiLSTM | Word2Vec | N/A | N/A | 300 | 0.5 | 50 | 74.60
M12 | BiGRU | Word2Vec | N/A | N/A | 300 | 0.5 | 50 | 75.56
M13 | CNN | FastText | N/A | N/A | 300 | 0.5 | 50 | 77.65
M14 | BiLSTM | FastText | N/A | N/A | 300 | 0.5 | 50 | 75.04
M15-B | BiGRU | FastText | N/A | N/A | 300 | 0.5 | 50 | 78.75

Vulnerability Type | Accuracy | Precision | Recall | F1-Score
---|---|---|---|---
Infinite Loop | 93.16% | 89.15% | 98.29% | 93.50%
Reentrancy | 93.30% | 96.30% | 85.95% | 90.92%
Integer Overflow | 86.54% | 87.23% | 85.66% | 86.43%
Callstack Depth Attack | 90.31% | 90.04% | 88.41% | 90.21%
Timestamp Dependency | 93.02% | 89.47% | 97.45% | 93.29%
Integer Underflow | 85.43% | 86.15% | 84.42% | 85.28%
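For readers who want to relate the columns of the last table, the F1-score is conventionally the harmonic mean of precision and recall. The helper below is a minimal sketch of that standard definition; the function name `f1_score` is chosen for this example, and it is assumed (not stated in the table) that the paper uses the same definition.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same unit as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall of the Infinite Loop row, in percent.
infinite_loop_f1 = f1_score(89.15, 98.29)
```

Rounded to two decimals, the Infinite Loop values give 93.50, which agrees with the F1-Score reported in that row.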
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhang, L.; Chen, W.; Wang, W.; Jin, Z.; Zhao, C.; Cai, Z.; Chen, H. CBGRU: A Detection Method of Smart Contract Vulnerability Based on a Hybrid Model. Sensors 2022, 22, 3577. https://doi.org/10.3390/s22093577