Authors:
Nahid Ferdous Aurna (1); Md Hossain (1); Hideya Ochiai (2); Yuzo Taenaka (1); Latifur Khan (3) and Youki Kadobayashi (1)
Affiliations:
(1) Division of Information Science, Nara Institute of Science and Technology, Nara, Japan; (2) Grad. School of Info. Science and Tech., The University of Tokyo, Tokyo, Japan; (3) Computer Science Department, The University of Texas at Dallas, Richardson, U.S.A.
Keyword(s):
Banking Malware, Federated Learning, Ensemble Learning, Data Heterogeneity.
Abstract:
Banking malware remains a persistent and evolving threat as cybercriminals exploit vulnerabilities to steal sensitive user information in the digital banking landscape. Despite numerous efforts, developing an effective and privacy-preserving solution for detecting banking malware is still an open challenge. This paper proposes a privacy-preserving Federated Learning (FL) based banking malware detection system that uses network traffic flows. It addresses challenges such as handling data heterogeneity in the FL scheme while maintaining the robustness of the shared global model. To address data heterogeneity, we consider three distinct heterogeneous datasets, each consisting of benign flows and flows from one prevalent malware family (Zeus, Emotet, or TrickBot). To ensure the model's robustness, we first assess various models and select the Convolutional Neural Network (CNN) as the basis for an ensemble model. FL is then incorporated to maintain data confidentiality and privacy, with the ensemble model serving as the global model. Moreover, to improve the FL scheme, we introduce a conditional update of client models, which addresses data heterogeneity among the federated clients. The evaluation results demonstrate the effectiveness of the proposed model, which achieves high detection rates of 0.9819, 0.9982, and 0.9997 for client 1, client 2, and client 3, respectively. Overall, this study offers a promising solution for detecting banking malware while addressing data privacy and heterogeneity in the FL framework.
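The abstract does not state the aggregation rule or the condition used for the client update, so the following is only a minimal sketch of how a conditional client update could sit on top of federated averaging. It assumes a sample-size-weighted average (FedAvg-style) on the server and assumes that a client adopts the global weights only when they score at least as well as its current weights on local validation data. The names `federated_average`, `conditional_update`, and `make_score`, and the toy scoring function, are hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Sequence

Weights = List[float]


def federated_average(client_weights: Sequence[Weights],
                      client_sizes: Sequence[int]) -> Weights:
    """Sample-size-weighted average of client weight vectors (FedAvg-style)."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def conditional_update(local: Weights, global_: Weights,
                       score: Callable[[Weights], float]) -> Weights:
    """ASSUMED rule: adopt the global weights only if they score at least
    as well as the current local weights on the client's own data."""
    return list(global_) if score(global_) >= score(local) else list(local)


def make_score(optimum: Weights) -> Callable[[Weights], float]:
    """Toy stand-in for local validation accuracy: higher is better,
    with the maximum (0.0) reached at the client's own optimum."""
    return lambda w: -sum((a - b) ** 2 for a, b in zip(w, optimum))


# Three clients with heterogeneous data, modelled here as different optima.
local_weights = [[0.0, 0.0], [1.0, 1.0], [10.0, 10.0]]
client_sizes = [1, 1, 2]
optima = [[4.0, 4.0], [5.0, 5.0], [10.0, 10.0]]

global_weights = federated_average(local_weights, client_sizes)
updated = [
    conditional_update(w, global_weights, make_score(opt))
    for w, opt in zip(local_weights, optima)
]
print(global_weights)
print(updated)
```

In this toy run the first two clients adopt the global weights because those weights score better on their local data, while the third client keeps its own weights because the global model would be worse for it. A real system would replace the toy score with a validation metric computed on each client's traffic flows.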