Feature-Space Bayesian Adversarial Learning Improved Malware Detector Robustness

Doan, Bao Gia; Yang, Shuiqiao; Montague, Paul; De Vel, Olivier; Abraham, Tamas; Camtepe, Seyit; Kanhere, Salil S.; Abbasnejad, Ehsan; Ranasinghe, Damith C.

Abstract: We present a new algorithm to train a robust malware detector. Modern malware detectors rely on machine learning algorithms. The adversarial objective is therefore to devise alterations to the malware code that decrease the chance of detection whilst preserving the functionality and realism of the malware. Adversarial learning is effective in improving robustness, but generating functional and realistic adversarial malware samples is non-trivial because: i) in contrast to tasks capable of using gradient-based feedback, adversarial learning in a domain without a differentiable mapping function from the problem space (malware code inputs) to the feature space is hard; and ii) it is difficult to ensure the adversarial malware is realistic and functional. This presents a challenge for developing adversarial machine learning algorithms that scale to large datasets at a production or commercial scale and so realize robust malware detectors. We propose an alternative: perform adversarial learning in the feature space rather than the problem space. We prove that the projection of perturbed yet valid malware from the problem space into the feature space is always a subset of the adversarial examples generated in the feature space. Hence, by training a network to be robust against feature-space adversarial examples, we inherently achieve robustness against problem-space adversarial examples. We formulate a Bayesian adversarial learning objective that captures the distribution of models for improved robustness. We prove that our learning method bounds the difference between the adversarial risk and the empirical risk, which explains the improved robustness. We show that adversarially trained Bayesian neural networks (BNNs) achieve state-of-the-art robustness. Notably, adversarially trained BNNs are robust against stronger attacks with larger attack budgets by a margin of up to 15% on a recent production-scale malware dataset of more than 20 million samples.
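
The idea of adversarial learning in the feature space can be illustrated with a minimal sketch. It is not the authors' method: it swaps the Bayesian neural network for a single point-estimate logistic-regression detector, and it uses an L-infinity PGD attack as a stand-in for the feature-space adversary. The function names (`pgd_feature_space`, `adversarial_train`), the budget `eps`, and all hyperparameters are assumptions chosen for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pgd_feature_space(x, y, w, b, eps, step, n_steps):
    """Hypothetical L-infinity PGD attack applied directly to feature vectors.

    x: (n, d) feature matrix, y: (n,) labels in {0, 1},
    w, b: weights and bias of a logistic-regression detector (an assumption).
    """
    x_adv = x.copy()
    for _ in range(n_steps):
        p = sigmoid(x_adv @ w + b)
        # Gradient of the binary cross-entropy loss w.r.t. the input features.
        grad = (p - y)[:, None] * w[None, :]
        # Ascend the loss, then project back into the eps-ball around x.
        x_adv = x_adv + step * np.sign(grad)
        x_adv = np.clip(x_adv, x - eps, x + eps)
    return x_adv

def adversarial_train(x, y, eps=0.1, lr=0.5, epochs=200, seed=0):
    """Train on worst-case feature-space perturbations instead of clean features."""
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.01, size=x.shape[1])
    b = 0.0
    for _ in range(epochs):
        # Inner maximisation: craft adversarial features for the current model.
        x_adv = pgd_feature_space(x, y, w, b, eps, step=eps / 4, n_steps=5)
        # Outer minimisation: one gradient step on the adversarial loss.
        p = sigmoid(x_adv @ w + b)
        w = w - lr * (x_adv.T @ (p - y)) / len(y)
        b = b - lr * np.mean(p - y)
    return w, b
```

Because the attack perturbs feature vectors directly, no differentiable mapping from malware code to features is needed. The sketch does not model the distribution over network parameters that the Bayesian objective in the abstract captures.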

Comments:	Accepted to AAAI 2023 conference
Subjects:	Cryptography and Security (cs.CR)
Cite as:	arXiv:2301.12680 [cs.CR]
	(or arXiv:2301.12680v1 [cs.CR] for this version)
	https://doi.org/10.48550/arXiv.2301.12680
