Poster Abstract: Are CNN based Malware Detection Models Robust? Developing Superior Models using Adversarial Attack and Defense
ABSTRACT
The tremendous increase of malicious applications in the android ecosystem has prompted researchers to explore deep learning based malware detection models. However, research in other domains suggests that deep learning models are adversarially vulnerable, and thus we aim to investigate the robustness of deep learning based malware detection models. We first developed two image-based E-CNN malware detection models based on android permission and intent. We then acted as an adversary and designed the ECO-FGSM evasion attack against the above models, which achieved a fooling rate of more than 50% with limited perturbations. The evasion attack converts the maximum number of malware samples into adversarial samples while minimizing the perturbations and maintaining each sample's syntactical, functional, and behavioral integrity. Later, we used adversarial retraining to counter the evasion attack and develop adversarially superior malware detection models, which should be an essential step before any real-world deployment.

CCS CONCEPTS
· Security and privacy → Malware and its mitigation.

KEYWORDS
Adversarial Learning, Malware Analysis and Detection, Machine Learning, Smartphones

ACM Reference Format:
Hemant Rathore, Taeeb Bandwala, Sanjay K. Sahay, Mohit Sewak. 2021. Poster Abstract: Are CNN based Malware Detection Models Robust? Developing Superior Models using Adversarial Attack and Defense. In The 19th ACM Conference on Embedded Networked Sensor Systems (SenSys ’21), November 15–17, 2021, Coimbra, Portugal. ACM, New York, NY, USA, 2 pages. https://doi.org/10.1145/3485730.3492867

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
SenSys ’21, November 15–17, 2021, Coimbra, Portugal
© 2021 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9097-2/21/11.
https://doi.org/10.1145/3485730.3492867

1 INTRODUCTION
The last decade witnessed an enormous increase in the number of active android smartphone users, coupled with limitless growth in the android application space. These devices store a vast amount of general and personal user data that can be exploited by android malware, which has also grown exponentially in the last few years. Current malware detection models are based on signatures, heuristics, etc., and cannot handle the ever-increasing volume and variety of android malware. Thus, the anti-malware community has investigated malware detection models based on deep learning, which have shown encouraging results [3]. However, research in other domains like image recognition, object classification, etc., suggests that deep learning models might be adversarially vulnerable [1]. Hence, malware detection models based on deep learning should be investigated for adversarial robustness before integrating them into real-world solutions. Adversarial threat modeling can be used to find vulnerabilities in classification models, and it is performed based on the adversary's goal, knowledge, and capability.

Figure 1: Framework for Robust Malware Detection Models

2 PROPOSED FRAMEWORK
We propose a modularized framework (Fig. 1) to improve the robustness of any malware detection model. In the first step, android samples (malware and benign) are collected from various verified sources. The dataset used in this work is discussed in detail in our past paper [2]. We then perform static analysis to extract permission and intent features from each android sample in the dataset. These are then fed into the designed classification pipeline, which processes the feature vectors for training and testing the proposed E-CNN malware detection models. The E-CNN model implicitly transforms each processed feature vector into an image that is fed into CNN layers for malware detection. Next, we design the ECO-FGSM adversarial agent to perform evasion attacks against the above detection models. The ECO-FGSM agent takes malware samples and transforms them into adversarial samples that are forcibly misclassified as benign by the E-CNN models. The agent ensures that the perturbations do not break the syntactical, functional, and behavioral integrity of the malware sample. We execute the ECO-FGSM
evasion attack against the above E-CNN models and measure their performance. In the final step, we execute the adversarial defense strategy, namely adversarial retraining, to counter the evasion attack and develop adversarially superior malware detection models. We evaluate the performance of the models before and after the attack using metrics such as accuracy and fooling rate.
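The constrained evasion step can be sketched in code. The following is only a minimal illustration, not the paper's implementation: a toy linear scorer stands in for the E-CNN's gradient, and the ECO-FGSM-style integrity constraint is mimicked by only adding permission/intent features (0 → 1 flips, never removals), so the malware's existing code and behavior remain intact. The weights, feature layout, and perturbation budget are all hypothetical.

```python
import numpy as np

def evasion_attack(x, w, b, budget=5):
    """FGSM-style evasion sketch against a linear surrogate scorer.

    x: binary permission/intent feature vector (1 = feature present)
    w, b: surrogate detector parameters; score > 0 means malware.
    Only 0 -> 1 flips are allowed (features may be added, never removed),
    which keeps the malware sample syntactically and functionally valid.
    """
    x = x.copy().astype(float)
    for _ in range(budget):
        if x @ w + b <= 0:          # already classified benign: stop early
            break
        grad = w                    # d(score)/dx for a linear scorer
        # candidate bits: currently absent features whose addition
        # lowers the malware score (negative gradient component)
        candidates = np.where((x == 0) & (grad < 0))[0]
        if candidates.size == 0:
            break
        best = candidates[np.argmin(grad[candidates])]
        x[best] = 1.0               # add the most score-reducing feature
    return x

# hypothetical detector: features 0-2 lean malicious, 3-4 lean benign
w = np.array([2.0, 1.5, 1.0, -1.5, -2.0])
b = -1.0
malware = np.array([1, 1, 0, 0, 0])   # initial score 2.5 -> malware
adv = evasion_attack(malware, w, b)
print(adv, adv @ w + b <= 0)
```

In this toy run, two benign-leaning features are added and the score drops below the decision threshold while every original feature is preserved; the actual attack would take its gradients from the trained CNN rather than a linear surrogate.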
356
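Adversarial retraining, the defense used in the final step, amounts to folding the adversarial samples, still labeled as malware, back into the training set and retraining the detector. A minimal sketch under the same toy assumptions (a small logistic model standing in for the E-CNN; the data, labels, and learning rate are illustrative):

```python
import numpy as np

def train_logistic(X, y, lr=0.5, epochs=500):
    """Train a tiny logistic-regression detector (label 1 = malware)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted P(malware)
        w -= lr * X.T @ (p - y) / len(y)          # batch gradient step
        b -= lr * np.mean(p - y)
    return w, b

# original training data: rows are permission/intent bit vectors
X = np.array([[1, 1, 0], [1, 0, 0], [0, 0, 1], [0, 1, 1]], dtype=float)
y = np.array([1, 1, 0, 0], dtype=float)           # 1 = malware
w, b = train_logistic(X, y)

# adversarial samples produced by the evasion attack, still malware;
# retraining on them closes the blind spot the attack exploited
X_adv = np.array([[1, 1, 1], [1, 0, 1]], dtype=float)
X_aug = np.vstack([X, X_adv])
y_aug = np.concatenate([y, np.ones(len(X_adv))])
w2, b2 = train_logistic(X_aug, y_aug)

pred = (X_adv @ w2 + b2) > 0                      # hardened model's verdicts
print(pred)
```

After retraining, the augmented model classifies the adversarial vectors as malware again; the paper's pipeline performs the analogous augmentation with the ECO-FGSM outputs and the E-CNN models.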