
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2019.2913439, IEEE Access.

Date of publication xxxx 00, 0000, date of current version January 30, 2019.
Digital Object Identifier 10.1109/ACCESS.2019.2913439

Adversarial Examples for CNN-based Malware Detectors

BING-CAI CHEN 1,2 (Member, IEEE), ZHONG-RU REN 1, CHAO YU 1, IFTIKHAR HUSSAIN 3, AND JIN-TAO LIU 1

1 School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
2 School of Computer Science and Technology, Xinjiang Normal University, Urumqi 830054, China
3 School of Computer Science and Technology, University of Science and Technology of China, Hefei 230000, China

Corresponding authors: ZHONG-RU REN and CHAO YU (e-mail: [email protected])

This work was supported by the Natural Science Foundation of China under Grant 61771089 and Grant 61502072.

ABSTRACT Convolutional Neural Network (CNN) based models have achieved tremendous breakthroughs in many end-to-end applications, such as image identification, text classification, and speech recognition. By replicating these successes in the field of malware detection, several CNN-based malware detectors have achieved encouraging performance without significant feature engineering effort in recent years. Unfortunately, by analyzing their robustness using gradient-based algorithms, several studies have shown that some of these malware detectors are vulnerable to evasion attacks (also known as adversarial examples). However, the existing attack methods can only achieve quite low attack success rates. In this paper, we propose two novel white-box methods and one novel black-box method to attack a recently proposed malware detector. By incorporating a gradient-based algorithm, one of our white-box methods can achieve a success rate of over 99%. Without prior knowledge of the exact structure and internal parameters of the detector, the proposed black-box method can also achieve a success rate of over 70%. In addition, we consider adversarial training as a defensive mechanism in order to resist evasion attacks. While proving the effectiveness of adversarial training, we also analyze its security risk, that is, a large number of adversarial examples can poison the training dataset of the detector. Therefore, we propose a pre-detection mechanism to reject the adversarial examples. Experiments show that this mechanism can effectively improve the safety and efficiency of malware detection.

INDEX TERMS adversarial examples, CNN, malware detection

I. INTRODUCTION

The detection of malicious software (malware) has been playing an increasingly important role in cyber security. In recent years, the number of malware and its variants has shown a trend of explosive growth [1]. For example, the notorious WannaCry ransom virus, which caused huge losses worldwide in May 2017, has generated a number of variants that have been active until now.

Anti-virus products provide some protection against malware. Traditional malware detection methods, such as signature-based and heuristic-based methods, have been widely used by these anti-virus products until now [2]. Typically, signature-based methods are primarily used to identify known malware, so these methods can be easily bypassed by malware writers using anti-anti-virus techniques (such as encryption, packing, obfuscation, and polymorphism) [3]. Heuristic-based methods can detect partially unknown malware, but these methods rely on the rules/patterns constructed by domain experts, which are often error-prone and time consuming. Therefore, traditional malware detection methods cannot adapt to rapidly growing malware and a large number of variants. Machine learning algorithms provide an opportunity to solve this issue by generalizing known malware to new malware in an efficient way [4]. In the last decades, a great number of machine learning based methods have been proposed in the literature [5]–[8]. However, most of these methods require substantial effort and domain expertise in creating and identifying important features, calling for new end-to-end malware detection methods without significant manual feature engineering.

As a widely used machine learning algorithm, the Convolutional Neural Network (CNN) excels in many end-to-end applications, such as image identification [9], text classification [10], and speech recognition [11], because of its excellent feature learning capabilities.
By replicating these successes in the field of malware detection, in recent years, several CNN-based malware models have been proposed, achieving high detection accuracy without domain knowledge [12], [13].

However, it has been shown that most machine learning based models, including state-of-the-art CNN based models, are vulnerable to evasion attacks, also known as adversarial examples (AEs) [14]. AEs are a class of inputs that are derived from legitimate inputs by adding carefully chosen perturbations such that the model can be induced to output erroneous predictions. The majority of existing studies mainly focus on image classification models [15]–[17], where attackers add subtle modifications that are beyond human recognition to the input image pixels so as to deceive the victim models. Similarly, in the domain of malware detection, an AE is a carefully modified binary file that is derived from existing malware but misclassified as benign to avoid detection. Although the use of CNN to detect malware is well known, the emergence of AEs makes the existing CNN-based malware detectors no longer robust. Thus, it is highly urgent to analyze the vulnerability of existing malware detectors and propose effective defensive mechanisms to improve the robustness of these detectors.

In this paper, we extensively investigate the vulnerability of CNN-based malware detectors, specifically the recently proposed detector Malconv [12]. Most attack methods in previous literature [18]–[20] against Malconv appended perturbations to the end of malicious files for evading detection. These perturbations were all initialized by random noises and iteratively modified by gradient-based algorithms. However, ignoring the importance of selecting initial perturbations can lead to low attack success rates with these methods. To address this issue, we propose two novel white-box attack methods that use saliency vectors to select perturbations from benign files. For an explicit illustration, we use a saliency vector to represent the feature (benign or malicious) of each region in an input file. These saliency vectors can be generated by using the Gradient-weighted Class Activation Mapping (Grad-CAM) method [21]. In addition, we also present a black-box attack method for situations when attackers cannot know the exact structure and internal parameters of the victim model. By implementing these attack methods on a dataset of Windows Portable Executable (PE) files, we demonstrate the vulnerability of the CNN-based malware detector to evasion attacks. At last, we also consider two defensive mechanisms, including adversarial training and rejecting AEs, in order to resist evasion attacks.

The major contributions of this paper are as follows:

1) We propose two novel white-box attack methods and a novel black-box attack method against CNN-based malware detectors. When attacking the recently proposed malware detector Malconv, one of our white-box methods, by incorporating the Fast Gradient Sign Method (FGSM) [15], can achieve an attack success rate of over 99%, and our black-box method can also achieve a success rate of over 70% without knowing the detector details.

2) We consider adversarial training [14] as a defensive mechanism in order to resist evasion attacks. While proving the effectiveness of adversarial training, we also analyze its security risk that the training dataset of the detector can be poisoned by a large number of AEs. Therefore, we propose a pre-detection mechanism, which can effectively reject the AEs, to protect the malware detector.

The rest of this paper is organized as follows. Section II presents background and related work. Section III describes the methods of evasion attacks. Experiments for attacks are given in Section IV. Defenses against these attacks are given in Section V. Finally, we conclude the paper in Section VI.

II. RELATED WORK

Malware detection is gradually shifting from traditional rule-based approaches to machine learning based methods [22]. In this section, we first give a short introduction to machine learning based malware detection methods (II-A), and then mainly focus on CNN-based malware detectors (II-B). Subsequently, we discuss the methods of evasion attacks in the field of image classification and malware detection (II-C). Finally, we briefly review several defensive mechanisms that have been proposed so far (II-D).

A. MACHINE LEARNING BASED MALWARE DETECTION

Most machine learning based malware detection methods analyze the input binary files by man-made features. These features, which are extracted by security experts according to the specific format of the software, largely determine the quality of detection. The software formats in different operating systems are quite different. For example, the PE format [23] is the standard format for Windows operating system software. According to the way of feature extraction, these machine learning based methods can be mainly divided into dynamic methods and static methods. Static methods, including n-gram based, PE-header based, and multi-feature based methods, etc. [5], [6], [24], are simple and efficient, extracting features directly from the raw bytes of static binary files. Dynamic methods, such as system-call-based and behavior-based methods [7], [8], have higher detection accuracy but require more complex feature engineering by running binary files. In terms of classification algorithms, Support Vector Machine (SVM), Random Forest, fully-connected Deep Neural Networks (DNN), Recurrent Neural Networks (RNN), and CNN have been widely used in recent years [25]–[29]. Since most machine learning based methods require a lot of effort and domain expertise in manual feature engineering, end-to-end malware detection methods, without domain knowledge, have attracted much more attention in recent years.

B. CNN-BASED MALWARE DETECTORS

Malware examples are generated by real attackers, who usually use encryption and obfuscation techniques for malware to combat static parsing and dynamic analyzing.
In this case, manual feature engineering requires significant effort from experts with sufficient prior knowledge. As a typical end-to-end algorithm, the CNN has excellent feature learning capabilities in many real-life applications, including malware detection. Recent studies by Raff et al. [12] and Krčál et al. [13] have reported effective end-to-end malware detectors using the CNN algorithm. These detectors analyze the raw bytes directly and discriminate malicious and benign files by automatically extracting features. The detector named Malconv proposed by Raff et al. [12], using a shallow CNN-based model, could obtain a detection accuracy of 95.9% when trained on a dataset with two million PE files. Based on this, Krčál et al. then proposed a deeper model which could achieve better performance than Malconv when trained on a larger dataset. However, recent studies [18]–[20] have demonstrated that these detectors are vulnerable to evasion attacks, which calls for a more in-depth analysis of the robustness of these detectors and motivates our work in this paper.

C. EVASION ATTACKS

Evasion attacks use AEs to trick the model classifier into outputting incorrect classification results. An AE x̃ is generated by adding iteratively modified perturbations η to a correctly classified original file x. x̃ is called an effective AE when the model outputs the incorrect classification result for x̃. At present, the majority of existing evasion attack methods, such as L-BFGS [14], Deepfool [30], FGSM [15], and JSMA [16], focus on the field of image classification. L-BFGS and Deepfool reduce the AE generation problem to the process of searching for the optimal perturbation η. Their complex optimization processes result in slow speed and high computational costs. The advantage of FGSM and JSMA is that they generate AEs more easily when the model is very sensitive to input changes. The difference is that FGSM can generate AEs quickly when using larger perturbations, but JSMA is much slower because it requires adversarial saliency maps to be calculated at each iteration. Therefore, FGSM is more widely used when generating AEs with large perturbations. All these previous attack methods are white-box methods that require attackers to understand the exact structure and parameters of the victim model. Without knowing the model details, Papernot et al. later used FGSM and JSMA to implement black-box attacks against DNN-based models by training substitute models [17], [31].

Recently, generating malware AEs has also become a hot topic in the security field. However, different from image AEs, there are more stringent restrictions on the generation of PE format malware AEs. A PE binary file is an executable file that is characterized by an organizational structure, and any changes to its bytes can cause its functionality to be corrupted or even make it impossible to execute. So attacks that modify the raw bytes of PE files must maintain the full syntax structure and functionality of the original files. Anderson et al. [32] proposed a black-box attack method, which uses a Reinforcement Learning (RL) algorithm to generate AEs against a Gradient Boosted Decision Tree (GBDT) based malware detector. Grosse et al. [33] proposed a JSMA-based method against DNN-based malware detection models. Hu et al. [34] generated AEs against DNN-based models by a black-box method based on Generative Adversarial Networks (GAN). Attack methods against the CNN-based model Malconv have also been proposed by Kolosnjaji et al. [18], Kreuk et al. [19], and Suciu et al. [20] more recently. Directly modifying the internal bytes of PE files during the process of generating AEs requires sufficient domain expertise, so most of these methods simplify this process by appending perturbations to the end of PE files. Kolosnjaji et al. [18] proved the vulnerability of Malconv for the first time, but their method requires high computational costs and achieves a low attack success rate. Kreuk et al. [19] proposed an improved loss function to increase the attack success rate of the FGSM-based method. Suciu et al. [20] compared several attack methods on a large-scale dataset and a small-scale dataset. Their experiments show that the FGSM-based attack method hardly works on the small-scale dataset but achieves a higher success rate on the large-scale dataset. Common to these methods is that the perturbations added to AEs are all initialized by random noises and then iteratively modified by gradient-based algorithms. However, ignoring the importance of selecting initial perturbations can lead to low success rates for these methods. In contrast, we use saliency vectors to select perturbations in benign files in order to increase the attack success rate.

D. DEFENSIVE MECHANISMS

Defenses against evasion attacks mainly focus on the field of image classification. The extended defensive distillation method [35] proposed by Papernot et al., which can identify abnormal inputs with large uncertainty, protects the original model by training a distillation model of the same scale. Szegedy et al. [14] proposed to actively construct virtual AEs for adversarial training. Although adversarial training is effective, it cannot resist all AEs and still has limitations. Hosseini et al. [36] proposed to add 'NULL' to the output labels for training and identify AEs by classifying them as 'NULL'. Lu et al. [37] proposed a framework named SafetyNet that contains detectors and classifiers. Their framework uses the detector to discriminate whether the input is an AE, such that AEs will be rejected before entering the classifier. The defensive mechanisms against malware AEs are mainly adversarial training and defensive distillation. Al-Dujaili et al. [38] proposed an online adversarial training framework named SLEIPNIR that treats adversarial training as a saddle-point problem. Grosse et al. [33] evaluated two potential defensive mechanisms, including adversarial training and defensive distillation, for system-call-based malware detectors. They showed that the effect of the defensive distillation method is not as obvious as that of adversarial training. Since few studies focus on defenses for end-to-end malware detectors, in this paper, we study two defensive mechanisms in this situation and analyze their pros and cons in detail.
III. METHOD

We first describe the detailed architecture of Malconv and introduce how to generate saliency vectors for input files by using the Grad-CAM method. We then introduce two white-box attack methods that use these saliency vectors. One is called Benign Features Append (BFA) and the other, called Enhanced-BFA, is an enhanced version of the first one that incorporates the FGSM algorithm. Finally, we briefly describe the Random attack method and introduce a more efficient black-box method that summarizes the successful experiences of Random attacks.

A. ARCHITECTURE

First, we describe the specific network architecture of the Malconv model, as shown in Fig. 1 [12]. An input x (maximum length L) is a sequence of discrete bytes, x = (x1, x2, ..., xi), i < L. Data preprocessing, ensuring that the input vectors provided to the network have a fixed size regardless of the size of the input files, is first performed to generate a fixed-length sequence by padding with 0. Then the preprocessed sequence is mapped to a fixed-size matrix e by an embedding layer W, where e = (e1, e2, ..., ei), i = L. The embedding layer, which allows the meaning of input bytes to depend on the context rather than the byte values, is essentially a lookup table that maps each input byte xi to a D-dimensional vector ei ∈ R^D. Then the matrix e is fed into two 1-dimensional convolution layers whose activation functions are sigmoid and relu respectively. The corresponding outputs (feature maps) of the two convolutional layers are multiplied element-wise. This mechanism, which is called gating, was proposed by Dauphin et al. [39] when dealing with gradient vanishing problems in language models. Afterwards, the result of gating is passed to a global max-pooling layer, such that all feature maps are reduced to a fixed-size vector before entering a fully-connected layer. Finally, the classification result is output by a softmax layer.

[FIGURE 1. Architecture of the CNN-based malware detector Malconv [12]: an input of raw bytes (< 2 MB) is tokenized and zero-padded, mapped through an 8-D embedding (split into two 4-D halves for the two convolution branches), passed through two parallel Conv1D layers (128 feature maps each, one with sigmoid activation) that are gated together element-wise, then global max-pooling, a 128-D fully-connected layer with relu, and a final softmax over {malware, benign}.]
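For concreteness, the following is a minimal Keras sketch of the gated architecture just described, using the layer sizes reported in Fig. 1 and Section IV-B (2 MB maximum input, 8-D embedding, 128 filters of size 500 with stride 500). It is an illustrative reconstruction rather than the authors' code: we feed the full 8-D embedding to both convolution branches, and the optimizer settings are assumed.

```python
# Minimal Keras sketch of the Malconv-style gated CNN described above.
# Layer sizes follow Fig. 1 / Section IV-B; everything else is assumed.
from tensorflow.keras import layers, models

MAX_LEN = 2_000_000    # maximum file size in bytes
VOCAB = 257            # 256 byte values + 1 padding token
EMBED_DIM = 8
FILTERS, KERNEL, STRIDE = 128, 500, 500

inp = layers.Input(shape=(MAX_LEN,), dtype='int32')
emb = layers.Embedding(VOCAB, EMBED_DIM)(inp)            # lookup table W
conv = layers.Conv1D(FILTERS, KERNEL, strides=STRIDE,
                     activation='relu')(emb)             # relu branch
gate = layers.Conv1D(FILTERS, KERNEL, strides=STRIDE,
                     activation='sigmoid')(emb)          # sigmoid branch
gated = layers.Multiply(name='gated')([conv, gate])      # gating [39]
pooled = layers.GlobalMaxPooling1D()(gated)              # one value per map
dense = layers.Dense(128, activation='relu')(pooled)
out = layers.Dense(2, activation='softmax')(dense)       # {malware, benign}

model = models.Model(inp, out)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
```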
B. SALIENCY VECTORS

Next, we describe saliency vectors in detail and introduce how to generate them by using the Grad-CAM method. A saliency vector, which contains the features of a series of data blocks in an input binary file, can roughly show the benign and malicious regions of the file. The position of the elements in the vector corresponds to the regions of the original file, and the value of each element indicates the significance of the features in the corresponding region, namely, a larger value indicates a more significant feature. In the image classification task, Zhou et al. [40] proposed a method using Class Activation Mapping (CAM) to produce a 'visual interpretation' of the decision of a CNN-based model. Raff et al. [12] then improved this approach to visualize the sections where the malicious features are located. However, these two methods, which both require modifying and retraining the original model to obtain the CAM, are difficult to implement for attackers. Selvaraju et al. [21] more recently proposed the Grad-CAM method, which uses the gradients of the target concept flowing into the final convolutional layer to produce the CAM, without modifying and retraining the original model. Inspired by these efforts, we use the Grad-CAM method to generate saliency vectors for subsequent attacks.

Let the convolution filter size of Malconv be s_filter. An input file x with length m can be divided into u data blocks of size s_filter, where u = ⌈m/s_filter⌉. v_x^c ∈ R^u is used to represent the saliency vector of x for the classification target c ∈ {0, 1}, which can be generated as follows. First, the gradient of the kth feature map A^k for the target label c is calculated, recorded as α_k^c = ∂y^c/∂A^k, which captures the importance of the kth feature map for target c. According to the importance of all feature maps, a weighted superposition is performed to obtain the saliency vector v_x^c = {v_1^c, ..., v_u^c}, where v_i^c is given by (1):

    v_i^c = Σ_k α_k^c A_i^k,  if i = arg max(A_{1,...,u}^k);  v_i^c = 0, otherwise.    (1)

If v_i^c is positive, then the feature of the ith data block is malicious. Similarly, a negative value indicates benign, and 0 indicates no significant feature. We used a 250KB benign file and a 250KB malicious file to calculate their saliency vectors respectively. In order to facilitate visualization, these two saliency vectors are converted into 500*500 grayscale images, as shown in Fig. 2. The dark lines indicate benign regions and the bright lines represent malicious regions. The remaining regions with no significant features are represented by gray lines with a pixel value of 128. It can be seen that a saliency vector can effectively display the malicious and benign regions of an input file.
By comparison, it can be observed that the bright (malicious) region of the malicious file is obviously larger than the benign one.

[FIGURE 2. Grayscale images converted from the saliency vectors of (a) a benign file and (b) a malicious file.]
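The following sketch shows how such a saliency vector could be computed with Grad-CAM against the Keras model sketched in Section III-A. It is our illustrative reading of Equation (1), not the authors' code: the intermediate layer name 'gated' and the spatial averaging used to obtain α_k^c are assumptions.

```python
# Sketch: Grad-CAM saliency vector (Eq. (1)) for a Malconv-style model.
# Assumes the gated convolution output is exposed as a layer named 'gated'.
import numpy as np
import tensorflow as tf

def saliency_vector(model, x_bytes, target_class):
    """x_bytes: int tensor of shape (1, MAX_LEN). Returns v of shape (u,)."""
    grad_model = tf.keras.Model(
        model.input, [model.get_layer('gated').output, model.output])
    with tf.GradientTape() as tape:
        feature_maps, preds = grad_model(x_bytes)     # A: shape (1, u, 128)
        score = preds[0, target_class]                # y^c
    grads = tape.gradient(score, feature_maps)        # dy^c / dA
    alpha = tf.reduce_mean(grads, axis=1)[0].numpy()  # alpha_k^c, shape (128,)
    A = feature_maps[0].numpy()                       # (u, 128)
    v = np.zeros(A.shape[0])
    for k in range(A.shape[1]):
        i = int(np.argmax(A[:, k]))                   # block where map k peaks
        v[i] += alpha[k] * A[i, k]                    # Eq. (1) weighted sum
    return v                                          # v[i] < 0 => benign block
```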
C. WHITE-BOX ATTACK METHODS

In the following section, we introduce two white-box attack methods against Malconv using saliency vectors. Both methods require attackers to get the exact structure and all parameters of the victim model. The difference is that the first attack method only needs to debug the victim model once to generate saliency vectors, after which the model can be regarded as a black box for subsequent attacks, while the second attack method, by incorporating the FGSM algorithm, requires continuous debugging of the model to obtain a higher attack success rate.

1) Benign Features Append Method

The BFA attack method can generate effective AEs by appending carefully selected perturbations to the end of malicious files. These perturbations are selected from benign files by using saliency vectors. A file x0 that was predicted to be benign by the model is chosen to generate the saliency vector v. First, the feature maps A are calculated by the Malconv model (line 1). The saliency vector v = {v_1^c, ..., v_u^c} can be obtained using the Grad-CAM method described in Subsection III-B, where v_i^c is given by Equation (1) (line 2). The positions of the negative elements in the vector are used as indexes of the benign feature blocks in x0, denoted as l = {l1, l2, ..., lp}, where p is the number of data blocks with benign features (line 3). A benign block represents a data block (of size s_filter) corresponding to a benign feature index. If the size of a malicious file x is m and the maximum length of the model input is L, then the maximum number of data blocks that can be appended to x is ⌊(L − m)/s_filter⌋ (line 4). Each time, one data block is selected from the p benign blocks as a perturbation and appended to x to generate an AE x̃ (line 7). This process loops up to min(p, ⌊(L − m)/s_filter⌋) times until x̃ is predicted to be benign (lines 6-13). It is important to note that the starting location for appending should be set to padidx = (⌊m/s_filter⌋ + 1) ∗ s_filter. For convenient illustration, we will ignore the padding bytes, which will be initialized to 0, from the end of the file to the starting location for appending. The more benign features are appended to the end of the malicious file x, the greater the probability that x̃ will be identified as a benign file by the victim model. The pseudocode is described in Algorithm 1.

Algorithm 1 The BFA attack method
Input: x0, x, y_benign, s_filter, L, m;
Output: x̃;
1:  A = Model(x0);
2:  v = GradCAM(A);
3:  l = Index(v < 0);
4:  q = min(p, ⌊(L − m)/s_filter⌋);
5:  padidx = (⌊m/s_filter⌋ + 1) ∗ s_filter;
6:  for i = 1 to q do
7:      x̃ = Append(x, padidx, x0[l_i : l_i + s_filter]);
8:      if Model.predict(x̃) == y_benign then
9:          return x̃;
10:     else
11:         padidx = padidx + i ∗ s_filter;
12:     end if
13: end for
14: return False;
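A direct Python transcription of Algorithm 1 might look as follows. It reuses the `saliency_vector` sketch above; the byte-level Append semantics (overwriting the zero padding from padidx onward and advancing one block per failed attempt) are our interpretation of the description.

```python
# Sketch of the BFA attack (Algorithm 1): append benign blocks selected
# from x0 to the end of the malicious file x until the prediction flips.
import numpy as np

BENIGN, S_FILTER, MAX_LEN = 0, 500, 2_000_000

def pad_to_input(x):
    out = np.zeros(MAX_LEN, dtype=np.int32)   # padding bytes are 0
    out[:len(x)] = x
    return out

def bfa_attack(model, x0, x):
    """x0: bytes of a benign-predicted file; x: bytes of the malware."""
    x0p = pad_to_input(x0)
    v = saliency_vector(model, x0p[None, :], BENIGN)
    benign_blocks = np.where(v < 0)[0]        # line 3: l = Index(v < 0)
    m = len(x)
    q = min(len(benign_blocks), (MAX_LEN - m) // S_FILTER)
    padidx = (m // S_FILTER + 1) * S_FILTER   # line 5: append start
    adv = pad_to_input(x)
    for i in range(q):                        # lines 6-13
        off = benign_blocks[i] * S_FILTER     # block offset inside x0
        adv[padidx:padidx + S_FILTER] = x0p[off:off + S_FILTER]
        if model.predict(adv[None, :], verbose=0).argmax() == BENIGN:
            return adv                        # effective AE found
        padidx += S_FILTER                    # keep the block, try the next
    return None
```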
2) Enhanced-BFA Method

The Enhanced-BFA attack method can significantly improve the attack success rate by incorporating the FGSM [15] algorithm into the BFA attack method. The FGSM-based attacks against Malconv proposed by Kreuk et al. [19] and Suciu et al. [20], using random noises to initialize the perturbations, could only achieve quite low attack success rates. By analyzing how these random perturbations flow through the model, we can discover why these methods are inefficient. At first, these random raw-byte perturbations are mapped to random vectors through the embedding layer, then the features of these vectors are extracted by the convolution layer. Afterwards, these features are passed to the global max-pooling layer before entering the fully-connected layer. Since most feature values of these random vectors are not the largest when passed to the global max-pooling layer, these features will be ignored by the model before entering the subsequent fully-connected layer. In this situation, the back-propagation gradients of these vectors will be 0 most of the time, and thus these random vectors cannot be modified effectively by gradient-based algorithms (e.g. FGSM). Moreover, since the raw-byte inputs are discrete and their mapped vectors are continuous, the modified vectors cannot be mapped back to raw bytes by looking up the table W directly. So even if these random vectors are modified successfully, the resulting AEs may still be invalid because the modified vectors cannot always be mapped back to valid raw-byte perturbations. Therefore, using random noise as the initial perturbation makes these methods have low success rates. In contrast, the appended perturbations in our attack method, which are initialized by significant benign blocks, can quickly attract the attention of the model to obtain back-propagation gradients.
On this basis, by incorporating the FGSM algorithm to iteratively modify these perturbations, AEs can be generated more efficiently.

The pseudocode of the Enhanced-BFA attack method is described in Algorithm 2. First, using the BFA attack method introduced in the previous section, a benign block index vector l = {l1, l2, ..., lp} is generated (lines 1-3). According to the index number, a benign block will be selected from the original benign file x0 as the initial perturbation and appended to the malicious file x (line 7). Then the appended file xa is mapped to the matrix ea through the embedding layer W (line 8). Let θ be the parameters of the model, y_benign the target label ('benign' in our case) and J(θ, ea, y_benign) the cost function used by the model; then the optimal perturbation direction can be expressed as g, which is given as follows:

    g = sign(∇_ea J(θ, ea, y_benign))    (2)

where g is a vector whose elements are equal to the sign of the elements of the gradient of the cost function J with respect to ea. The perturbations will be iteratively modified until the value of the cost function is below the threshold β (lines 9-13). In this loop, the Mask operation limits the region modified by the perturbations, and ε is a weight factor which determines the extent of each modification (lines 11-12). The larger the value of ε, the faster the process, but the optimal solution may not be obtained. In our case, only the appended region can be iteratively modified by imperceptibly small perturbations. Afterwards, by using the K-Nearest Neighbor (KNN) [41] algorithm to find the nearest bytes for the vectors in ea, the modified matrix ea is mapped back to a sequence of raw bytes x̃ (line 14). If x̃ is predicted to be benign, then the function will return the successful AE x̃ (lines 15-16); otherwise, the above process will continue to loop until no more data blocks can be appended (lines 6-20).

Algorithm 2 The Enhanced-BFA attack method
Input: x0, x, y_benign, s_filter, L, m, ε, β;
Output: x̃;
1:  A = Model(x0);
2:  v = GradCAM(A);
3:  l = Index(v < 0);
4:  q = min(p, ⌊(L − m)/s_filter⌋);
5:  padidx = (⌊m/s_filter⌋ + 1) ∗ s_filter;
6:  for i = 1 to q do
7:      x̃ = Append(x, padidx, x0[l_i : l_i + s_filter]);
8:      ea = W(xa);
9:      while J(θ, ea, y_benign) >= β do
10:         g = sign(∇_ea J(θ, ea, y_benign));
11:         η_pad = ε · Mask(g);
12:         ea[padidx : padidx + i ∗ s_filter] = Mask(ea) − η_pad;
13:     end while
14:     x̃ = W⁻¹(ea);
15:     if Model.predict(x̃) == y_benign then
16:         return x̃;
17:     else
18:         padidx = padidx + i ∗ s_filter;
19:     end if
20: end for
21: return False;
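The inner loop (lines 9-13) and the byte mapping (line 14) can be sketched as follows. This is a hedged reconstruction: `model_from_embedding`, a view of the network from the embedding output onward, is a hypothetical helper, and the KNN step is reduced to a 1-nearest-neighbor lookup over the rows of the embedding matrix W.

```python
# Sketch of the Enhanced-BFA inner loop (Algorithm 2, lines 9-14): FGSM
# steps in embedding space, restricted to the appended region, then a
# nearest-neighbor mapping from modified vectors back to valid bytes.
import numpy as np
import tensorflow as tf

def fgsm_refine(e_a, pad_slice, model_from_embedding, y_benign=0,
                eps=0.01, beta=0.001, max_steps=100):
    """e_a: (1, MAX_LEN, 8) embedded file; pad_slice: appended positions."""
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()
    e_a = tf.Variable(e_a)
    mask = np.zeros(e_a.shape, dtype=np.float32)
    mask[0, pad_slice, :] = 1.0                    # only the appended region
    for _ in range(max_steps):
        with tf.GradientTape() as tape:
            loss = loss_fn([y_benign], model_from_embedding(e_a))
        if loss < beta:                            # line 9: stop below beta
            break
        g = tf.sign(tape.gradient(loss, e_a))      # line 10: Eq. (2)
        e_a.assign_sub(eps * g * mask)             # lines 11-12: masked step
    return e_a.numpy()

def nearest_bytes(e_rows, W):
    """Line 14: map each modified vector in the appended region back to
    the byte whose embedding (row of W) is nearest in Euclidean distance."""
    d = np.linalg.norm(e_rows[:, None, :] - W[None, :, :], axis=-1)
    return d.argmin(axis=1)                        # one token per position
```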
D. BLACK-BOX ATTACK METHODS

In the following section, we introduce two black-box attack methods for situations when attackers cannot know the exact structure and internal parameters of the victim model. In order to speed up the attack process, it is assumed that the convolution filter size s_filter is known. Apart from this, only the classification result of the model for a given input file can be obtained.
1) Random Method

Our Random method, different from the random method proposed by Suciu et al. [20], appends randomly selected data blocks of a benign file instead of random noises to increase the success rate. First, a benign file x0 of moderate size m is divided into p data blocks as perturbations, where p = ⌈m/s_filter⌉. Then, at each step, a data block is randomly selected from the p data blocks and appended to the malicious file x as a perturbation to generate the AE x̃. Assuming that the maximum number of data blocks that can be appended to x is q, this process will be repeated up to min(p, q) times until the model predicts x̃ as benign.

2) Experience-based Method

The Experience-based attack method selects the perturbations by summarizing the successful trajectories of Random attacks. By calculating the contribution degree of each data block to the successful trajectories, data blocks with high contribution degrees will be used as perturbations for subsequent attacks. First, N Random attacks are performed and n successful trajectories τ = (τ1, τ2, ..., τn) are recorded, where τk = (b_k1, b_k2, ..., b_ki) ∈ R^lk represents the kth successful trajectory, with each b_ki representing the index of a data block in x0 used for the kth successful trajectory and lk indicating the total number of data blocks appended in the kth trajectory. Then the number of times each data block appears in τ is counted and multiplied by the corresponding weights. Thus the contribution degree dj of each block in x0 can be computed as follows:

    d_j = Σ_{k=1}^{n} ( (1/l_k) · count(τ_k, j) + α · I(b_{k,l_k} == j) )    (3)

where j ∈ (1, ..., p), count(τk, j) indicates the number of occurrences of the jth block in τk, 1/lk is a penalty factor, indicating that the more data blocks required for the kth trajectory, the smaller the importance of that trajectory, α indicates the extra weight of the last data block in τk, and I is the indicator function (I ∈ {0, 1}). Since the entire trajectory is successful after appending the last data block, the last data block is given an extra weight α to highlight its contribution.
Then a vector d = (d1, d2, ..., dp) with the contribution degrees of all the data blocks in x0 is obtained. Finally, d is sorted from high to low and its first q data blocks are chosen to be the final perturbations for subsequent attacks. The subsequent attack process is similar to the BFA attack.
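A compact sketch of the contribution-degree computation in Equation (3) is given below; the trajectory format and the tie handling are our assumptions.

```python
# Sketch: contribution degrees (Eq. (3)) from recorded Random-attack
# trajectories. Each trajectory is a list of block indexes into x0, in the
# order they were appended; the last block gets an extra weight alpha.
import numpy as np

def contribution_degrees(trajectories, p, alpha=2.0):
    """trajectories: list of lists of block indexes (successful attacks only).
    p: total number of candidate blocks in x0. Returns d of shape (p,)."""
    d = np.zeros(p)
    for tau in trajectories:
        lk = len(tau)                        # trajectory length
        for j in set(tau):
            d[j] += tau.count(j) / lk        # (1/l_k) * count(tau_k, j)
        d[tau[-1]] += alpha                  # extra weight for the final block
    return d

# Usage: pick the q highest-contribution blocks as the perturbations.
# order = np.argsort(-contribution_degrees(trajs, p))[:q]
```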
IV. EXPERIMENTS

In this section, we describe the experiments carried out for evasion attacks. First, the dataset for training is described. Then the malware detector Malconv is trained and its performance for malware detection is evaluated. Afterwards, the saliency vectors for all files in the dataset are generated to analyze the quantity distribution of benign and malicious blocks. Finally, the trained Malconv is attacked using the methods introduced above and the effects of each attack method are analyzed in detail.

A. DATASET

In our experiments, two groups of PE files are collected to form a dataset. The first group is the malware set containing 5,200 malicious files from three projects: VirusShare [42], DAS [43], and malwarebenchmark [44], where each file is labelled with 1. The second group is the benign set containing 5,150 benign files from the pure versions of the Windows XP (32bit/SP3), Windows 7 Ultimate (64bit/SP1), and Windows 8.1 (64bit) images and from more than 30 software companies, where each file is labelled with 0. All these files are larger than 1KB and less than 2MB, including multiple PE file types (such as exe, dll, etc.). The sources of the files in the dataset are shown in Fig. 3.

[FIGURE 3. Source of files in the dataset: benign (left: Windows XP, Win7, Win8, Others) and malware (right: VirusShare, DAS, Benchmark).]
B. MODEL PERFORMANCE EVALUATION

The Malconv model is reproduced by using the Keras library [45], and the parameters used for training are as follows: the maximum file size is 2,000,000 bytes, the 1-dimensional convolution filter size is 500, and the stride is 500. Other parameters are shown in Fig. 1. The dataset is shuffled and divided into a training set, validation set, and test set by 80%, 10%, and 10% respectively. As in the original literature, the metrics accuracy and AUC are used to evaluate the performance of the model. All experiments are performed on a CUDA-enabled NVIDIA Tesla K80 GPU server. In order to lower the bias, the training and testing processes are repeated 5 times, as shown in Table 1. It can be seen that the average accuracy and AUC are similar to those reported in the original literature (ACC=94.0%, AUC=98.1%) [12].

TABLE 1. Performance (Accuracy and AUC) of Malconv

Experiment | 1     | 2     | 3     | 4     | 5     | mean
ACC        | 93.0% | 93.8% | 94.0% | 90.7% | 90.0% | 92.3%
AUC        | 97.6% | 97.1% | 98.4% | 96.4% | 97.1% | 97.2%
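A minimal sketch of this training and evaluation protocol (80/10/10 split, accuracy and AUC) is shown below; it assumes the `model` sketch from Section III-A and in-memory byte arrays, which a production-scale reproduction would replace with a streaming loader.

```python
# Sketch: 80/10/10 split and ACC/AUC evaluation for the reproduced Malconv.
# Assumes X: (n, MAX_LEN) int byte matrix, y: (n,) labels {0 benign, 1 malware}.
import numpy as np
from sklearn.metrics import roc_auc_score

def split_train_eval(model, X, y, seed=0):
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    n_tr, n_va = int(0.8 * len(y)), int(0.1 * len(y))
    tr, va, te = idx[:n_tr], idx[n_tr:n_tr + n_va], idx[n_tr + n_va:]
    model.fit(X[tr], y[tr], validation_data=(X[va], y[va]),
              epochs=10, batch_size=16)
    probs = model.predict(X[te])[:, 1]            # P(malware)
    acc = ((probs > 0.5).astype(int) == y[te]).mean()
    auc = roc_auc_score(y[te], probs)
    return acc, auc
```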
C. SALIENCY VECTORS GENERATION

Saliency vectors for all files in our dataset are generated to analyze the quantity distribution of benign and malicious blocks in a file. The joint distribution maps are shown in Fig. 4, where the horizontal axis represents the number of benign blocks in a file and the vertical axis represents the number of malicious blocks. According to the Pearson Correlation Coefficient, we can see that there is a strong linear correlation between the number of benign blocks and malicious blocks contained in each file. In addition, Fig. 4 shows that the number of benign blocks in a benign file is mainly 10-30. For convenience of description, 20 is used as the number of benign blocks in a benign file.

[FIGURE 4. Joint distribution maps of the number of benign blocks (horizontal axis) and malicious blocks (vertical axis) per file. Benign files are shown in the left map (pearsonr = 0.85, p = 6.4e-119) and malicious files in the right map (pearsonr = 0.69, p = 6.9e-38).]

D. EVASION ATTACKS

Using the model trained in Section IV-B as the original model, the two white-box methods in Section III-C, the two black-box methods in Section III-D, and the FGSM-based method described by Suciu et al. [20] are used to generate AEs for all files in the malware dataset. The effectiveness of all these attacks is evaluated by using the success rate (SR): the percentage of AEs that can successfully evade detection.

TABLE 2. The success rates of 3 white-box attack methods with different sizes of appended perturbations.

Appending Bytes | FGSM | BFA | Enhanced-BFA
500             | 1%   | 3%  | 74%
2000            | 1%   | 4%  | 81%
5000            | 2%   | 23% | 90%
10000           | 2%   | 60% | 96%
20000           | 2%   | 90% | 99%

Table 2 gives the results of SR using the three white-box attack methods, with different sizes of appended perturbations. The BFA method can achieve a SR of 60% by appending 20 benign blocks (10,000 bytes) taken from only one benign file. If another benign file is taken to get 40 benign blocks (20,000 bytes) for attacking, the SR will increase to 90%. Furthermore, the Enhanced-BFA attacks (ε = 0.01, β = 0.001) can achieve a SR of 74% by appending only one benign block (500 bytes). By appending 40 benign blocks, more than 99% of the AEs can successfully evade detection. In contrast, it can be seen that the SR of the FGSM-based attacks is very low, only about 1~2%, which is similar to the result on the Mini-dataset reported by Suciu et al. [20].
The FGSM-based (also known as FGM) attacks in [20], which could only achieve quite low SR on the Mini-dataset (8,598 files) but achieved 71% SR on the Full-dataset (16.3 million files), are unstable, and the reason why this method is inefficient has been analyzed in Section III-C2.

TABLE 3. The success rates of 2 black-box attack methods with different sizes of appended perturbations.

Appending Bytes | Random | Experience-based
500             | 1%     | 9%
2000            | 2%     | 35%
5000            | 3%     | 53%
10000           | 4%     | 63%
20000           | 7%     | 73%

Table 3 gives the results of SR using the two black-box attack methods. The Random attack method is used as a baseline to evaluate other methods. Although the Random method achieves low SR, its successful attack trajectories can be recorded to implement the Experience-based attacks. We performed 19,150 Random attacks (by appending 40 data blocks) and recorded 1,340 successful trajectories, such that 40 high-contribution-degree blocks could be obtained for subsequent attacks. The most interesting finding is that more than half of these blocks are identical to the benign blocks obtained by the BFA attack method, which means that we can use the Experience-based method instead of the BFA method when the model details are unknown. The results in Table 3 show that the SR of the Experience-based attack (α=2) is similar to that of the Full-dataset FGSM attack reported in [20]. Moreover, if more Random attacks can be performed, more effective benign blocks will be obtained through more successful trajectories, resulting in a better performance of the Experience-based attacks. Most importantly, the Experience-based method does not need to understand the exact structure and parameters of the victim model. Fig. 5 plots the SR curves of the above five attack methods, with different sizes of appended perturbations. It can be seen that the Enhanced-BFA method achieves the highest SR with minimal perturbations. Although the BFA and Experience-based methods start with lower SR, their SR rises rapidly as the perturbation size increases.

[FIGURE 5. The success rates of the 5 attack methods (Enhanced-BFA, BFA, FGSM, Experience-based, Random) with different sizes of appended perturbations (0-20,000 bytes). The Random attack is the baseline, and the FGSM-based attack is proposed in [20].]
In summary, the two white-box methods and the one black-box method we proposed can effectively generate AEs to deceive the CNN-based malware detector Malconv. The Enhanced-BFA attack method is the most efficient when the exact structure and parameters of the victim model can be given beforehand. If the model details are unknown, the Experience-based attack method can also achieve satisfactory results. Although the attacks in this paper are all against Malconv, they can be readily extended to other similar CNN-based malware detectors.

V. DEFENSES

In this section, we investigate two defensive mechanisms against evasion attacks: one is adversarial training, and the other is rejecting AEs. Adversarial training can improve the robustness of the model itself to AEs, while rejecting AEs is a pre-detection mechanism that requires an additional database system to help identify the AEs.

1) Adversarial Training

The most common defensive mechanism against evasion attacks is adversarial training, which was first proposed by Szegedy et al. [14]. Adversarial training involves the following steps: a) train the model F on the original training set D = M ∪ B, where M is the malware set and B is the benign set; b) generate an AE set A against F using the evasion attack methods described in Section III; c) modify the training set to D̃ = D ∪ A and perform iterative training on the basis of F.
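These three steps could be sketched as follows; the retraining hyperparameters and the AE generator interface are our assumptions.

```python
# Sketch of adversarial training (steps a-c): retrain F on D ∪ A, where A
# is a set of AEs generated against F (e.g., with the BFA sketch above).
import numpy as np

def adversarial_training(model, X_train, y_train, generate_ae, n_aes=50):
    """generate_ae(model, x) -> adversarial byte array or None.
    Step a) is assumed done: `model` was already trained on (X_train, y_train)."""
    malware = X_train[y_train == 1]
    aes = []
    for x in malware:                        # step b): build the AE set A
        adv = generate_ae(model, x)
        if adv is not None:
            aes.append(adv)
        if len(aes) >= n_aes:
            break
    X_new = np.concatenate([X_train, np.stack(aes)])
    y_new = np.concatenate([y_train, np.ones(len(aes), dtype=y_train.dtype)])
    model.fit(X_new, y_new, epochs=2, batch_size=16)   # step c): retrain on D̃
    return model
```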
First, two groups of effective AE sets, A_train and A_test, containing 5,168 and 5,152 AEs respectively, are actively generated by using the Enhanced-BFA method. The initial perturbations required to generate these two AE groups are selected from two files, x_train in the training set and x_test in the test set. These two files are both predicted to be benign by F with high probability. Different numbers of AEs from these two AE groups are used to create new training sets, and then the new model F̃ can be retrained based on F. Afterwards, F̃ is used to evaluate the SR of all AEs and to test whether x_train and x_test are correctly classified.
Table 4 reports the evaluation and test results in detail. The first two lines indicate the SR of the two groups of AEs against the retrained model F̃: SR(A_train) represents the SR of the first group of AEs A_train, and SR(A_test) the SR of the second group A_test. The last two lines indicate whether x_train and x_test are correctly classified by F̃.

TABLE 4. Evaluation of adversarial training by adding different numbers of AEs to the training set (✓ = correctly classified, × = misclassified).

Add Number  | 2     | 5     | 10    | 50   | 100  | 200
SR(A_train) | 59.8% | 24.9% | 21.8% | 4.6% | 1.6% | 0.4%
SR(A_test)  | 62.2% | 28.4% | 22.5% | 4.4% | 1.4% | 0.6%
x_train     | ✓     | ✓     | ✓     | ✓    | ✓    | ✓
x_test      | ✓     | ✓     | ✓     | ✓    | ×    | ×
As can be seen from the data in Table 4, when 5 AEs are added to the training set (8,280 files), the SR of the other AEs in both groups reduces to below 29%. If 50 AEs are added, the SR is less than 5% and lower than the detection error rate (7.7%) of the original model; thus the remaining few AEs that can still successfully avoid detection can be ignored. It can also be observed that when 100 AEs are added, although the SR decreased to 1.4%, the file x_test, which is considered to be benign by F, is incorrectly identified as malicious by F̃. Therefore, we urge researchers to pay attention to this problem, that is, the implementation of adversarial training needs to be cautious.
The same group of AEs cannot be used in large numbers for adversarial training. Otherwise, the training dataset will be poisoned. Moreover, it is necessary to prevent attackers from adopting a similar method to implement poisoning attacks [46]. Attackers can use essential operating system PE files to generate a large number of ineffective AEs and send them to the detector. Although these AEs cannot evade detection, they may be added to the training set as regular malware by the defender, resulting in the training dataset being poisoned. If the defender retrains a new detector by using the poisoned training dataset, the essential system files may be misclassified as malware. In this case, the new detector may in turn damage rather than protect the operating system, so there is an urgent need for a defensive mechanism to reject AEs.

2) Rejecting Adversarial Examples

We next propose a pre-detection mechanism to identify AEs, and these identified AEs are rejected before entering the malware detector. Fig. 6 shows the specific architecture of this mechanism. A database table is used to record the characteristics of the malware being detected. Through expertise in the PE format, it is generally agreed that directly modifying the executable section of a PE file is difficult, because a subtle change may lead to unpredictable consequences, such as destroyed functionality or running errors. Therefore, it is assumed that the hash values of the executable section in AEs derived from the same malicious file should be equal. The input x is parsed by the LIEF library [47] to get its executable code section x_code, so the hash value, denoted as Hash(x_code), can be used as the index of the table. The other fields of the table include the hash value of the entire file Hash(x), the number of times the record has been accessed x_num, and the last accessed timestamp t_now. The pre-detection mechanism identifies and records the AEs by querying and updating the database.

[FIGURE 6. The architecture of the pre-detection mechanism for rejecting AEs: an input file is parsed by LIEF, its code section is hashed into the table index, and the stored record (file hash, number of visits, timestamp) is queried and compared to either reject the file as an AE, pass it to Malconv for prediction, or insert a new malware record.]

Algorithm 3 highlights the workflow of our pre-detection mechanism. First, Hash(x_code) of the input file x is used as the index to query the database. If a record exists, it first indicates that the input file is malware; then the values of the record and the file attributes are compared to discriminate whether the input file is an AE. AEs will be dropped, but normal malware can be stored for retraining. No record means that the input file can be accepted and predicted by the malware detector. If the file is predicted to be malware, its characteristics will be inserted into the table as a new record. In order to prevent a large number of records in the table from affecting the query speed, one can set a time threshold T to periodically delete some expired records.
Algorithm 3 The pre-detection mechanism for rejecting AEs
Input: x, N;
Output: malware, benign, reject;
1:  x_code = LIEF(x);
2:  index = Hash(x_code);
3:  x_hash = Hash(x);
4:  t_now = Now();
5:  QueryDB(index);
6:  if record exists then
7:      r_num, r_hash, r_time = GetValue(index);
8:      if (r_num > N) and (r_hash ≠ x_hash) then
9:          UpdateDB(index, r_num + 1, t_now);
10:         return reject;
11:     else
12:         return malware;
13:     end if
14: else
15:     if Model.predict(x) == 'malware' then
16:         InsertDB(index, 1, x_hash, t_now);
17:         return malware;
18:     else
19:         return benign;
20:     end if
21: end if
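A self-contained sketch of Algorithm 3 follows, using an in-memory dict in place of the MySQL table and SHA-256 for both hashes; the LIEF call and the threshold semantics are taken from the description, and everything else is assumed.

```python
# Sketch of the pre-detection mechanism (Algorithm 3). A dict stands in
# for the MySQL table keyed by the hash of the PE code section; a real
# deployment would use lief + pymysql as described in the text.
import hashlib
import time

import lief  # pip install lief

table = {}   # index -> [num_visits, file_hash, last_timestamp]

def code_section_hash(path):
    pe = lief.parse(path)                           # parse the PE file
    code = bytes(pe.get_section('.text').content)   # executable section
    return hashlib.sha256(code).hexdigest()

def pre_detect(path, model_predict, n_threshold=1):
    index = code_section_hash(path)                 # lines 1-2
    x_hash = hashlib.sha256(open(path, 'rb').read()).hexdigest()
    now = time.time()
    record = table.get(index)                       # lines 5-6: query
    if record is not None:
        r_num, r_hash, _ = record
        if r_num > n_threshold and r_hash != x_hash:
            table[index] = [r_num + 1, r_hash, now]
            return 'reject'                         # same code section: an AE
        return 'malware'                            # known malware resubmitted
    label = model_predict(path)                     # lines 15-20: run Malconv
    if label == 'malware':
        table[index] = [1, x_hash, now]             # insert a new record
    return label
```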
The CPU time used for detecting 800 files with Malconv is about 3,800 s, whereas the time for pre-detection (e.g., file parsing and hash operations, database connecting and querying operations) is not more than 10 s. That is to say, if the number of malware samples caught by the pre-detector exceeds 0.26% of the total files being detected, the whole detection system becomes more efficient than the single detector. In summary, the pre-detection mechanism can not only effectively reject AEs, but also greatly improve the efficiency of the malware detection system.
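To spell out where the 0.26% threshold comes from (a back-of-the-envelope reading of the two timings above, not a figure taken from elsewhere in the paper): let T_det = 3,800 s be Malconv's detection time on the 800 files and T_pre = 10 s the pre-detection overhead. If a fraction p of the files hit the table and therefore skip the CNN detector, the combined system is faster than the single detector whenever

    T_pre + (1 - p) * T_det < T_det,   i.e.,   p > T_pre / T_det = 10 / 3800 ≈ 0.26%.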
VI. CONCLUSIONS
In this paper, we extensively investigated the vulnerability of CNN-based malware detectors, focusing on the recently proposed detector Malconv. We proposed two white-box attack methods and one black-box attack method against such detectors. Applied to Malconv, our Enhanced-BFA white-box method achieves an attack success rate of over 99%, and our Experience-based black-box method achieves a success rate of over 70%. These high attack success rates strongly demonstrate the vulnerability of such CNN-based malware detectors. Our Enhanced-BFA method, which combines the Grad-CAM and FGSM algorithms to improve the attack success rate, can be readily extended to other similar adversarial machine learning tasks. In addition, we evaluated adversarial training as a defensive mechanism against evasion attacks. Experiments show that although adversarial training has a certain effect, it carries the risk that the training dataset can be poisoned by a large number of AEs. Finally, we proposed a pre-detection mechanism to reject AEs. Experiments show that this mechanism can not only effectively reject evasion attacks, but also improve the efficiency of malware detection.

There are several directions for future work. We plan to modify the raw bytes of the code section of PE files to generate AEs, which will be more difficult to detect. We also plan to repeat the attack methods of this paper on a larger, production-scale dataset. Moreover, our pre-detection mechanism introduces a database system, which may bring its own security risks (e.g., the vulnerabilities of MySQL [48]). Therefore, we will continue to study model-based defensive mechanisms to resist evasion attacks and ultimately improve the robustness of the detectors themselves.
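As a concrete point of reference for the attack side, the sketch below shows a generic FGSM-style perturbation step in a byte-level detector's embedding space. It is illustrative only: it is not the exact Enhanced-BFA procedure (which additionally uses Grad-CAM to choose where to perturb), and the names embed (the byte-embedding layer) and tail (the rest of the network, mapping embeddings to a malware probability) are our own stand-ins.

import numpy as np
import tensorflow as tf


def fgsm_step(embed, tail, byte_ids, eps=0.5):
    """One FGSM-style step pushing a malware sample toward 'benign'.

    byte_ids: int array of shape (1, L); embed: Keras Embedding layer;
    tail: Keras model taking embeddings and returning P(malware).
    """
    z = embed(tf.convert_to_tensor(byte_ids))   # (1, L, d) byte embeddings
    with tf.GradientTape() as tape:
        tape.watch(z)
        loss = tf.math.log(tail(z) + 1e-12)     # log P(malware)
    g = tape.gradient(loss, z)
    z_adv = z - eps * tf.sign(g)                # step against the malware score
    # Snap each perturbed embedding back to the nearest valid byte value so
    # the result is still a byte sequence. In practice only appended or
    # slack-space positions should be rewritten, to keep the file runnable.
    emb = embed.get_weights()[0]                # (vocab, d) embedding table
    d = np.linalg.norm(z_adv.numpy()[0, :, None, :] - emb[None], axis=-1)
    return d.argmin(axis=-1)                    # (L,) perturbed byte ids

Iterating such a step while restricting the writable positions to appended bytes is the basic idea behind the gradient-based evasion attacks on byte-level detectors studied in the literature [18], [19].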

REFERENCES
[1] Symantec, "The 2018 Internet Security Threat Report (ISTR)," 2018. [Online]. Available: https://www.symantec.com/security-center/threat-report
[2] Y. Ye, T. Li, D. A. Adjeroh, and S. S. Iyengar, "A survey on malware detection using data mining techniques," ACM Comput. Surv., vol. 50, no. 3, pp. 41:1–41:40, 2017. [Online]. Available: https://doi.org/10.1145/3073559
[3] J. Aycock, "Anti-anti-virus techniques," Computer Viruses and Malware, pp. 97–108, 2006.
[4] D. Ucci, L. Aniello, and R. Baldoni, "Survey on the usage of machine learning techniques for malware analysis," CoRR, vol. abs/1710.08189, 2017. [Online]. Available: http://arxiv.org/abs/1710.08189
[5] T. Abou-Assaleh, N. Cercone, V. Keselj, and R. Sweidan, "N-gram-based detection of new malicious code," in International Computer Software & Applications Conference - Workshops & FAST, 2004, pp. 41–42.
[6] J. Saxe and K. Berlin, "Deep neural network based malware detection using two dimensional binary program features," in 10th International Conference on Malicious and Unwanted Software, MALWARE 2015, Fajardo, PR, USA, October 20-22, 2015, 2015, pp. 11–20. [Online]. Available: https://doi.org/10.1109/MALWARE.2015.7413680
[7] K. Rieck, T. Holz, C. Willems, P. Düssel, and P. Laskov, "Learning and classification of malware behavior," in Detection of Intrusions and Malware, and Vulnerability Assessment, 5th International Conference, DIMVA 2008, Paris, France, July 10-11, 2008, Proceedings, 2008, pp. 108–125. [Online]. Available: https://doi.org/10.1007/978-3-540-70542-0_6
[8] G. Yan, N. Brown, and D. Kong, "Exploring discriminatory features for automated malware classification," in Detection of Intrusions and Malware, and Vulnerability Assessment - 10th International Conference, DIMVA 2013, Berlin, Germany, July 18-19, 2013, Proceedings, 2013, pp. 41–61. [Online]. Available: https://doi.org/10.1007/978-3-642-39235-1_3
[9] C. Szegedy, S. Ioffe, V. Vanhoucke, and A. A. Alemi, "Inception-v4, Inception-ResNet and the impact of residual connections on learning," in Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, February 4-9, 2017, San Francisco, California, USA, 2017, pp. 4278–4284. [Online]. Available: http://aaai.org/ocs/index.php/AAAI/AAAI17/paper/view/14806
[10] Y. Kim, "Convolutional neural networks for sentence classification," in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014, October 25-29, 2014, Doha, Qatar, A meeting of SIGDAT, a Special Interest Group of the ACL, 2014, pp. 1746–1751. [Online]. Available: http://aclweb.org/anthology/D/D14/D14-1181.pdf
[11] Y. Zhang, W. Chan, and N. Jaitly, "Very deep convolutional networks for end-to-end speech recognition," in 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2017, New Orleans, LA, USA, March 5-9, 2017, 2017, pp. 4845–4849. [Online]. Available: https://doi.org/10.1109/ICASSP.2017.7953077
[12] E. Raff, J. Barker, J. Sylvester, R. Brandon, B. Catanzaro, and C. K. Nicholas, "Malware detection by eating a whole EXE," in The Workshops of the Thirty-Second AAAI Conference on Artificial Intelligence, New Orleans, Louisiana, USA, February 2-7, 2018, 2018, pp. 268–276. [Online]. Available: https://aaai.org/ocs/index.php/WS/AAAIW18/paper/view/16422
[13] M. Krčál, O. Švec, M. Bálek, and O. Jašek, "Deep convolutional malware classifiers can learn from raw executables and labels only," 2018.
[14] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. J. Goodfellow, and R. Fergus, "Intriguing properties of neural networks," CoRR, vol. abs/1312.6199, 2013. [Online]. Available: http://arxiv.org/abs/1312.6199
[15] I. J. Goodfellow, J. Shlens, and C. Szegedy, "Explaining and harnessing adversarial examples," CoRR, vol. abs/1412.6572, 2014. [Online]. Available: http://arxiv.org/abs/1412.6572
[16] N. Papernot, P. D. McDaniel, S. Jha, M. Fredrikson, Z. B. Celik, and A. Swami, "The limitations of deep learning in adversarial settings," in IEEE European Symposium on Security and Privacy, EuroS&P 2016, Saarbrücken, Germany, March 21-24, 2016, 2016, pp. 372–387. [Online]. Available: https://doi.org/10.1109/EuroSP.2016.36
[17] N. Papernot, P. D. McDaniel, I. J. Goodfellow, S. Jha, Z. B. Celik, and A. Swami, "Practical black-box attacks against machine learning," in Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security, AsiaCCS 2017, Abu Dhabi, United Arab Emirates, April 2-6, 2017, 2017, pp. 506–519. [Online]. Available: https://doi.org/10.1145/3052973.3053009
[18] B. Kolosnjaji, A. Demontis, B. Biggio, D. Maiorca, G. Giacinto, C. Eckert, and F. Roli, "Adversarial malware binaries: Evading deep learning for malware detection in executables," in 26th European Signal Processing Conference, EUSIPCO 2018, Roma, Italy, September 3-7, 2018, 2018, pp. 533–537. [Online]. Available: https://doi.org/10.23919/EUSIPCO.2018.8553214
[19] F. Kreuk, A. Barak, S. Aviv-Reuven, M. Baruch, B. Pinkas, and J. Keshet, "Adversarial examples on discrete sequences for beating whole-binary malware detection," CoRR, vol. abs/1802.04528, 2018. [Online]. Available: http://arxiv.org/abs/1802.04528
[20] O. Suciu, S. E. Coull, and J. Johns, "Exploring adversarial examples in malware detection," CoRR, vol. abs/1810.08280, 2018. [Online]. Available: http://arxiv.org/abs/1810.08280
[21] R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra, "Grad-CAM: Visual explanations from deep networks via gradient-based localization," in IEEE International Conference on Computer Vision, ICCV 2017, Venice, Italy, October 22-29, 2017, 2017, pp. 618–626. [Online]. Available: https://doi.org/10.1109/ICCV.2017.74
[22] M. G. Schultz, E. Eskin, E. Zadok, and S. J. Stolfo, "Data mining methods for detection of new malicious executables," in 2001 IEEE Symposium on Security and Privacy, Oakland, California, USA, May 14-16, 2001, 2001, pp. 38–49. [Online]. Available: https://doi.org/10.1109/SECPRI.2001.924286
[23] M. Pietrek, "Peering inside the PE: A tour of the Win32 portable executable file format," 1994.
[24] E. Raff, J. Sylvester, and C. Nicholas, "Learning the PE header, malware detection with minimal domain knowledge," in Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, AISec@CCS 2017, Dallas, TX, USA, November 3, 2017, 2017, pp. 121–132. [Online]. Available: https://doi.org/10.1145/3128572.3140442
[25] I. Santos, F. Brezo, X. Ugarte-Pedrero, and P. G. Bringas, "Opcode sequences as representation of executables for data-mining-based unknown malware detection," Inf. Sci., vol. 231, pp. 64–82, 2013. [Online]. Available: https://doi.org/10.1016/j.ins.2011.08.020
[26] A. Sharma, S. K. Sahay, and A. Kumar, "Improving the detection accuracy of unknown malware by partitioning the executables in groups," CoRR, vol. abs/1606.06909, 2016. [Online]. Available: http://arxiv.org/abs/1606.06909
[27] B. Kolosnjaji, A. Zarras, G. D. Webster, and C. Eckert, "Deep learning for classification of malware system call sequences," in AI 2016: Advances in Artificial Intelligence - 29th Australasian Joint Conference, Hobart, TAS, Australia, December 5-8, 2016, Proceedings, 2016, pp. 137–149. [Online]. Available: https://doi.org/10.1007/978-3-319-50127-7_11
[28] R. Pascanu, J. W. Stokes, H. Sanossian, M. Marinescu, and A. Thomas, "Malware classification with recurrent networks," in 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2015, South Brisbane, Queensland, Australia, April 19-24, 2015, 2015, pp. 1916–1920. [Online]. Available: https://doi.org/10.1109/ICASSP.2015.7178304
[29] B. Athiwaratkun and J. W. Stokes, "Malware classification with LSTM and GRU language models and a character-level CNN," in 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2017, New Orleans, LA, USA, March 5-9, 2017, 2017, pp. 2482–2486. [Online]. Available: https://doi.org/10.1109/ICASSP.2017.7952603
[30] S. Moosavi-Dezfooli, A. Fawzi, and P. Frossard, "DeepFool: A simple and accurate method to fool deep neural networks," in 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, Las Vegas, NV, USA, June 27-30, 2016, 2016, pp. 2574–2582. [Online]. Available: https://doi.org/10.1109/CVPR.2016.282
[31] N. Papernot, P. D. McDaniel, and I. J. Goodfellow, "Transferability in machine learning: from phenomena to black-box attacks using adversarial samples," CoRR, vol. abs/1605.07277, 2016. [Online]. Available: http://arxiv.org/abs/1605.07277
[32] H. S. Anderson, A. Kharkar, B. Filar, D. Evans, and P. Roth, "Learning to evade static PE machine learning malware models via reinforcement learning," CoRR, vol. abs/1801.08917, 2018. [Online]. Available: http://arxiv.org/abs/1801.08917
[33] K. Grosse, N. Papernot, P. Manoharan, M. Backes, and P. D. McDaniel, "Adversarial examples for malware detection," in Computer Security - ESORICS 2017 - 22nd European Symposium on Research in Computer Security, Oslo, Norway, September 11-15, 2017, Proceedings, Part II, 2017, pp. 62–79. [Online]. Available: https://doi.org/10.1007/978-3-319-66399-9_4
[34] W. Hu and Y. Tan, "Generating adversarial malware examples for black-box attacks based on GAN," CoRR, vol. abs/1702.05983, 2017. [Online]. Available: http://arxiv.org/abs/1702.05983
[35] N. Papernot and P. D. McDaniel, "Extending defensive distillation," CoRR, vol. abs/1705.05264, 2017. [Online]. Available: http://arxiv.org/abs/1705.05264
[36] H. Hosseini, Y. Chen, S. Kannan, B. Zhang, and R. Poovendran, "Blocking transferability of adversarial examples in black-box learning systems," CoRR, vol. abs/1703.04318, 2017. [Online]. Available: http://arxiv.org/abs/1703.04318
[37] J. Lu, T. Issaranon, and D. A. Forsyth, "SafetyNet: Detecting and rejecting adversarial examples robustly," in IEEE International Conference on Computer Vision, ICCV 2017, Venice, Italy, October 22-29, 2017, 2017, pp. 446–454. [Online]. Available: https://doi.org/10.1109/ICCV.2017.56
[38] A. Al-Dujaili, A. Huang, E. Hemberg, and U. O'Reilly, "Adversarial deep learning for robust detection of binary encoded malware," in 2018 IEEE Security and Privacy Workshops, SP Workshops 2018, San Francisco, CA, USA, May 24, 2018, 2018, pp. 76–82. [Online]. Available: https://doi.org/10.1109/SPW.2018.00020
[39] Y. N. Dauphin, A. Fan, M. Auli, and D. Grangier, "Language modeling with gated convolutional networks," in Proceedings of the 34th International Conference on Machine Learning, ICML 2017, Sydney, NSW, Australia, 6-11 August 2017, 2017, pp. 933–941. [Online]. Available: http://proceedings.mlr.press/v70/dauphin17a.html
[40] B. Zhou, A. Khosla, À. Lapedriza, A. Oliva, and A. Torralba, "Learning deep features for discriminative localization," in 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, Las Vegas, NV, USA, June 27-30, 2016, 2016, pp. 2921–2929. [Online]. Available: https://doi.org/10.1109/CVPR.2016.319
[41] T. M. Cover and P. E. Hart, "Nearest neighbor pattern classification," IEEE Trans. Information Theory, vol. 13, no. 1, pp. 21–27, 1967. [Online]. Available: https://doi.org/10.1109/TIT.1967.1053964
[42] VirusShare, "Virusshare.com," 2018, accessed: 2018-05-05. [Online]. Available: https://virusshare.com/
[43] DAS, "DAS MALWERK," 2018, accessed: 2018-05-05. [Online]. Available: https://dasmalwerk.eu/
[44] G. Liang, J. Pang, Z. Shan, R. Yang, and Y. Chen, "Automatic benchmark generation framework for malware detection," Security and Communication Networks, vol. 2018, pp. 4947695:1–4947695:8, 2018. [Online]. Available: https://doi.org/10.1155/2018/4947695
[45] F. Chollet et al., "Keras," 2015. [Online]. Available: https://github.com/keras-team/keras
[46] S. Shen, S. Tople, and P. Saxena, "Auror: defending against poisoning attacks in collaborative deep learning systems," in Proceedings of the 32nd Annual Conference on Computer Security Applications, ACSAC 2016, Los Angeles, CA, USA, December 5-9, 2016, 2016, pp. 508–519. [Online]. Available: http://dl.acm.org/citation.cfm?id=2991125
[47] R. Thomas, "LIEF - Library to Instrument Executable Formats," April 2017. [Online]. Available: https://lief.quarkslab.com/
[48] J. Fonseca, M. Vieira, and H. Madeira, "Evaluation of web security mechanisms using vulnerability and attack injection," IEEE Transactions on Dependable and Secure Computing, vol. 11, no. 5, pp. 440–453, 2014.

VOLUME 4, 2016 11

2169-3536 (c) 2018 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See
http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

You might also like