An Effective End-To-End Android Malware Detection Method
Android has rapidly become the most popular mobile operating system because of its open source, rich hardware selectivity, and millions of applications (Apps). Meanwhile, the open source of Android makes it the main target of malware. Malware detection methods based on manual features are easily bypassed by obfuscation techniques and suffer from low code coverage. Thus, we propose an automated extraction method without any manual expert intervention. Specifically, we convert the vital parts of the Dalvik executable (Dex) into an RGB (Red/Green/Blue) image. Furthermore, we propose a novel convolutional neural network (CNN) variant with diverse receptive fields using max pooling and average pooling simultaneously (MADRF), named MADRF-CNN, which can capture the dependencies between different parts of the image (transferred from the Dex file) by capitalizing on multi-scale context information. To evaluate the effectiveness of the proposed method, we conducted extensive experiments, and our experimental results showed that the Accuracy of our method is 96.9%, which exceeds the state-of-the-art solutions we compared against.

Keywords: Android; Malware detection; Convolutional neural network; Image feature
1. Introduction

With the rapid development of the mobile Internet, smartphones have become an indispensable part of people's lives. According to Statista's smartphone operating system shipment market share report in 2022, Android's market share increased to 83.8% (Statista, 2021). The huge market has also promoted the development of Android malicious software (malware). A special report on Android malware published by the QiAnXin Threat Intelligence Center indicated that about 2.3 million new malware samples were detected on mobile terminals, and about 6,301 new mobile phone malware samples were intercepted every day on average. Among the malicious acts, malicious fee deductions accounted for 34.9%. The others include resource consumption with 24.2%, rogue behavior with 22.8%, privacy theft with 12.3%, deception and fraud with 4.3% and remote control with 1.5% (QiAnXin, 2021).

Android malware detection has attracted considerable attention in recent years, and the existing malware detection works can be roughly divided into two categories: static analysis and dynamic analysis. Static analysis (Arp, Spreitzenbarth, Hübner, Gascon, & Rieck, 2014; Gao, Cheng, & Zhang, 2021; Mahindru & Sangal, 2021; Rathore, Sahay, Nikam, & Sewak, 2021; Wang et al., 2017) relies on the features extracted from Android package (APK) files without running them. In addition to the classic permissions and APIs, some studies have found that Android-related intents, strings and components can also effectively characterize malware. One of the typical works is Drebin proposed by Arp et al. (2014). Besides, Mahindru and Sangal (2021) proposed MLDroid to detect real-world malware by selecting permissions and API calls as raw features. Gao et al. (2021) converted the malware detection problem into a node classification task by mapping Apps and APIs into a large heterogeneous graph. Dynamic detection (Cai, Jiang, Gao, Li, & Yuan, 2021; D'Angelo, Palmieri, Robustelli, & Castiglione, 2021; Enck et al., 2014; Haq, Khan, & Akhunzada, 2021; Hasan, Ladani, & Zamani, 2021; Liu et al., 2015; Millar, McLaughlin, del Rincon, & Miller, 2021; Sihag, Vardhan, Singh, Choudhary, & Son, 2021) refers to monitoring the runtime behaviors of an App. For instance, Surendran, Thomas, and Emmanuel (2020) proposed a graph signal-based malware detection method GSDroid, which captures the runtime system call dependencies as raw features. Enck et al. (2014) proposed a dynamic taint analysis tool TaintDroid to detect malware by monitoring sensitive information flow. Liu et al. (2015) used API hooks to obtain the software runtime
✩ This study was funded by the National Key R&D Program of China (2020YFB1005500), the Leading-edge Technology Program of Jiangsu Natural Science
Foundation (BK20202001) and the National Natural Science Foundation of China (62272204).
∗ Corresponding authors.
E-mail addresses: [email protected] (H. Zhu), [email protected] (H. Wei), [email protected] (L. Wang), [email protected]
(Z. Xu), [email protected] (V.S. Sheng).
https://doi.org/10.1016/j.eswa.2023.119593
Received 27 September 2022; Received in revised form 17 January 2023; Accepted 19 January 2023
Available online 28 January 2023
0957-4174/© 2023 Elsevier Ltd. All rights reserved.
H. Zhu et al. Expert Systems With Applications 218 (2023) 119593
information to detect malware. They applied a real-time system to monitor potential privacy data abuse, system call behavior and so on.

These methods have achieved excellent performance. However, static methods struggle to deal with code obfuscation, and dynamic methods are time-consuming. Another interesting research branch of malware detection, namely directly converting a Dex file to an RGB (Red/Green/Blue) image, has achieved unexpected success (Fang, Gao, Jing, & Zhang, 2020; Hu et al., 2014; Marastoni, Continella, Quarta, Zanero, & Dalla Preda, 2017; Meng et al., 2016; Wang, Zhou, Lu, & Zhang, 2019). The executable code of Android Apps is stored in the Dex files in hexadecimal. Correspondingly, colors in a computer can also be expressed in hexadecimal. Compared with traditional features (such as APIs, permissions and components), image features only require simple conversion processing, which completely removes manual intervention. Meanwhile, they are more effective than traditional features in dealing with code obfuscation and cover almost all the code of an App.

The existing image-based malware detection methods mainly convert the whole Dex or manifest file into an image and feed the image into the neural network. However, after in-depth analysis, we find that not all parts of these files make a positive contribution to malware detection. Moreover, these efforts often ignore the relationship between different sections of the Dex file and different parts of the corresponding converted image. Therefore, we propose a novel Android malware detection framework, named MADRF-CNN, by converting the cropped compact Dex files into an image. The main contributions of this article are as follows:

• We propose an image-based end-to-end Android malware detection method without any manual expert intervention, which improves the anti-obfuscation ability. Moreover, the framework includes a novel method MADRF-CNN to mine the association relationship between different parts of the Dex file.
• We propose a novel preprocessing method for Dex files, which can not only reduce the resource consumption of network training but also effectively improve detection performance by eliminating unimportant redundant sections.
• We collected the latest Apps from Google Play Store, VirusShare and other sources to build a dataset that effectively characterizes the current situation of malware. We evaluated the effectiveness of the proposed method in various aspects and compared it with similar state-of-the-art solutions.

The rest of this paper is arranged as follows. The second section discusses related work. The third section illustrates the overall architecture of MADRF-CNN. The fourth section shows and discusses our experimental results. The fifth section summarizes our work and points out future work.

2. Related works

In recent years, machine learning technologies based on manual features (e.g., permissions, APIs, components, intents) have been widely used in malware detection (Calleja, Martín, Menéndez, Tapiador, & Clark, 2018). As the most commonly used features, permissions play an important role in malware detection (Li & Li, 2020; Wu et al., 2021; Zhu, Wang, Zhong, Li, & Sheng, 2022). For instance, Zhu et al. (2022) proposed a malware detection framework based on hybrid deep learning by using permissions and sensitive APIs to represent an App. Li and Li (2020) extracted multiple kinds of information such as permissions, components, system calls and IP addresses of Apps to construct feature vectors and improved the robustness of the proposed malware detection model based on integrated deep learning from the perspective of adversarial training. Wu et al. (2021) characterized malware with API calls and permissions and introduced an attention mechanism based on Multi-layer Perceptron (MLP) to detect malware and interpret it. Huang, Lu, Ping, Sun, and Ye (2020) extracted the API call sequences by running the App in virtual environments and processed the extracted features for subsequent modules, such as a Temporal Convolution Network (TCN) and an attention layer. To exploit the benefits of various types of features and deep learning methods, Gibert, Mateu, and Planes (2020) proposed a multimodal deep learning based method HYDRA, which uses CNN and a fully connected layer to learn meaningful features from multiple types of features (e.g., API-based features, Byte-based features and Opcode-based features); the learned features are input into softmax after being integrated by the fully connected layer. Alazab, Alazab, Shalaginov, Mesleh, and Awajan (2020) proposed an effective machine learning based malware detection method employing requested permissions and API calls. It is worth mentioning that they divided the APIs into three groups, namely ambiguous, risky and disruptive. Their experiments showed that the combination of disruptive and risky API calls plays an important role in malware detection. Kabakus (2022) proposed an end-to-end Android malware detection framework DroidMalwareDetector based on CNN, which uses intents and API calls alongside the permissions to perform comprehensive malware analysis. Lakshmanarao and Shashi (2022) extracted opcode sequences from Android APK files. The extracted raw features are preprocessed and input to Long Short-term Memory (LSTM) for malware detection.

A series of malware detection methods based on dynamic feature analysis have also been proposed (Liu, Li, Zhao, Su, & Liu, 2021; Wang et al., 2020; Zhang, Qi, & Wang, 2020). For instance, Zhang et al. (2020) proposed a stacked deep network architecture to automatically learn the correlation between API features, which combines the advantages of CNN and Bi-LSTM. They used the Cuckoo sandbox to run the samples and simulate the user's operation, and then extracted the API call sequence to construct the feature vector. Xue, Zhou, Chen, Luo, and Gu (2017) proposed an on-device dynamic analysis tool by tracking information flow and monitoring the device at the system level and instruction level. Wang et al. (2020) monitored kernel-level source data to capture the dynamic behavior of each target process, and then built an anomaly detection model based on a neural embedded network. Alzaylaee, Yerima, and Sezer (2020) proposed a deep learning based malware detection system DL-Droid, which utilizes dynamic analysis (e.g., API calls, Actions/Events) combined with a stateful input generation method. Martín, Rodríguez-Fernández, and Camacho (2018) proposed an Android malware family classification method CANDYMAN by exploiting dynamic traces and Markov chains. Zhou (2021) performed taint analysis and dynamic function tracing to identify private information leaks. Cai, Meng, Ryder, and Yao (2018) presented a novel dynamic malware detection method DroidCat based on Random Forest (RF) by profiling inter-component communication (ICC) intents and method calls.

The existing machine learning-based malware detection works usually extract limited features from Dex (manifest) files or part of App runtime behaviors. These features are usually one-sided and not enough to fully characterize malware. The recent great success of CNN in image recognition provides a new direction. Bakour and Ünver (2021) proposed DeepVisDroid, which converts manifest files and Dex files into grayscale images and feeds them to CNN to detect malware. Similarly, Ghouti and Imam (2020) first converted the executable code of an App into a grayscale image, and then adopted a Support Vector Machine (SVM) for classification. Xiao and Yang (2019) proposed a CNN-based Android malware detection method by converting the whole Dalvik bytecode into an RGB image. Bourebaa and Benmohammed (2020) transformed Dex files into grayscale images and then used CNN for recognition. Yadav, Menon, Ravi, Vishvanathan, and Pham (2022) proposed an image-based Android malware detection method without relying on manual analysis. It mainly explores the performance of the pre-trained EfficientNet-B4 model in malware detection by inputting RGB images converted from Dex files. Sun, Daoudi, Allix, and Bissyandé (2021) exploited the CNN model to detect malware by feeding grayscale images converted from binary code and metadata/configuration
files of Android APKs. Vasan et al. (2020) proposed an image-based malware detection method IMCFN using fine-tuned CNN. Specifically, it converts the raw malware binaries into color images and then the fine-tuned CNN architecture is employed to detect and identify malware families. Zhang, Luktarhan, Ding, and Lu (2021) proposed an effective Android malware detection method based on TCN by inputting bytecode gray images. The gray images are converted from the combination of AndroidManifest.xml and the data section of classes.dex.

Unfortunately, in these works the entire Dex files or most of them are converted to images and input to CNN, which is excessively time-consuming and inefficient. More importantly, most parts of Dex files, such as data and header, can hardly provide cost-effective information for malware analysis. Therefore, we propose a cutting method for Dex files, which can reduce the overhead of subsequent deep network training and improve detection performance by removing redundant sections. Specifically, the compact and high-value RGB images achieved by our cutting method are input to the proposed MADRF-CNN network to learn more efficient features. Finally, an efficient end-to-end malware detection method is proposed in this work.

3. The proposed end-to-end malware detection framework

We propose an end-to-end malware detection framework without relying on manual features, named MADRF-CNN, to efficiently detect Android malware. As shown in Fig. 1, the proposed end-to-end malware detection framework can be divided into three main phases: Dex file cutting, image feature generation and classification. Dex files can be obtained by decompressing the Android APK files. Once the necessary sections of the Dex files are filtered, one pixel can be generated from every three hexadecimal numbers in these sections. Then, these pixels can form an image. Classification is conducted by our proposed network MADRF-CNN. The source code to extract features is available at https://github.com/MADRF-CNN/extractor.

3.1. The cut method of the dex file

An APK file is essentially a zip file which contains the entire project of an App, including resource files, signature, Dex files, manifest files and so on. The Dex files contain all operation instructions and runtime data of an Android App, which can provide a panorama of the App. To obtain the Dex file, we first decompress the APK file of each App, and then retain the files ending with .dex by matching the suffix of files. As shown in Fig. 2, a Dex file can be divided into three portions: header, index and data. The header portion stores basic information such as the sizes and offsets of other sections. The index portion consists of the following parts: the string index, the type index, the proto index, the field index, and the method index. The data portion contains the data section, class definitions and so on. Therefore, beyond this general division, a Dex file can be divided into eight sections according to the offset and size of each section recorded in the header portion. But not

Algorithm 1: Preprocess Dex files
    Data: a Dex file
    Result: a txt file with hexadecimal characters
    Array of Dex ← read current Dex file;
    header ← Dex[0, 112];
    // the six sections are handled by the same operation
    // start_position runs from 56 to 96 in steps of 8
    start_position ← 56;
    for name ← string, type, proto, field, method, class do
        // Reverse is a function that reverses the order of the input array
        // hexToDec is a function that translates a hex number to a dec number
        name_size ← hexToDec(Reverse(header[start_position, start_position + 4]));
        name_offset ← hexToDec(Reverse(header[start_position + 4, start_position + 8]));
        start_position ← start_position + 8;
        // function write puts the needed hex characters into a new txt file
        write(Dex[name_offset, name_offset + name_size]);
    end
One APK file usually contains multiple Dex files with the same structure when the APK stores more than 65,536 methods. Therefore, as shown in Fig. 3, we integrate multiple Dex files by following the method in Fang et al. (2020). In addition, similar to the Dex file, the manifest file is another important file in the APK, which contains significant configuration information of an App. Because the manifest file is small and is also a hexadecimal file before decompilation, it can also be used to generate an image.

The RGB color contains three components, known as red, green and blue. The value of each color component ranges from 0 to 255, which is the same as the value range represented by two hexadecimal characters. As a result, this transformation process takes less effort than extracting text features from manifest or other files. Compared with grayscale images (single channel, which can represent 256 colors), RGB images (three channels, which can represent $2^{24}$

3.3. The classification based on MADRF-CNN algorithm

Due to its strong data fitting ability, CNN performs well as a feature extractor or classifier in various fields (Oquab, Bottou, Laptev, & Sivic, 2014). However, some recent deep learning based methods ignore the global context information as well as the receptive fields of pixels and do not consider the reuse of pixel features during the feature extraction stage (Peng, Yu, Peng, & Lu, 2021; Wang, Guo, & Wang, 2021). Therefore, in this work, we propose a novel variant of CNN with diverse receptive fields using max pooling and average pooling simultaneously, known as MADRF-CNN. Specifically, we first stack two CNN blocks as the feature extractor and attach a MADRF block. The classifier is implemented by three Fully Connected (FC) layers and the Softmax function. In addition, we name the network structures that use only one pooling method (i.e., max pooling or average pooling) MaxDRF and AvgDRF, respectively, for comparison in subsequent experiments. The architectural overview of the proposed method MADRF-CNN is shown in Fig. 4.

Before being fed into the deep network, each image is resized to 200 × 200 and normalized to [0, 1]. The CNN block is the combination of a convolutional layer with ReLU as its activation and a max pooling layer. The forward propagation of the convolutional layer is as follows.

$$z^{l,m}_{u,v} = f\left(\sum_{i=0}^{kw-1} \sum_{j=0}^{kh-1} \sum_{d=0}^{c} x^{l-1,d}_{u \times s + i,\, v \times s + j}\, k^{l,m,d}_{i,j} + b^{l,m}\right) \tag{1}$$

where $z^{l,m}$ represents the $m$th feature map of the output of the $l$th layer, and $x^{l-1,d}$ represents the $d$th channel of the input of the $l$th layer after padding. The channel number of the output is denoted by $c$. The width, height and stride of the convolutional kernel are represented as $kw$, $kh$ and $s$, respectively. Additionally, $k^{l,m,d}$ represents the weights of the $d$th kernel of the $m$th convolutional filter in the $l$th layer, and $b^{l,m}$ represents the bias of the filter. Besides, $f$ is the activation function, which is ReLU in this work.

To further capture the dependencies between different parts of a Dex file (image) by capitalizing on multi-scale context information, we propose a novel CNN block with MADRF. As shown in Fig. 4, the input of MADRF is the feature maps achieved by the CNN blocks. MADRF can freely expand the receptive field to acquire multi-scale context information. To balance cost and efficiency, we retain three different receptive fields. In addition, average pooling and max pooling are employed in MADRF for scale conversion. Then, three convolutional layers are utilized to process the feature maps at those three scales. After the parallel pooling operation, the acquired features are concatenated as the input of the fully connected layer. The procedure is shown as follows.

$$p^{r,m}_{u,v} = \max_{0 \le i \le ww_r - 1,\; 0 \le j \le wh_r - 1} z^{m}_{u \times s_r + i,\, v \times s_r + j} \tag{2}$$

$$p^{r,m}_{u,v} = \operatorname{average}_{0 \le i \le ww_r - 1,\; 0 \le j \le wh_r - 1} z^{m}_{u \times s_r + i,\, v \times s_r + j} \tag{3}$$

$$a^{r} = Conv(p^{r}), \quad r \in \{r_1, r_2, r_3\} \tag{4}$$

where $r$ represents the scale of the width or height of the output feature map of a pool compared with the input one; $ww_r$, $wh_r$ and $s_r$ denote the window width, window height and stride of the pool, which are not fixed but depend on $r$; and $Conv$ denotes the convolutional layer. MADRF greatly enlarges the receptive fields (Luo, Li, Urtasun, & Zemel, 2016) of feature maps. The receptive field of the feature map generated at the original scale is only 3 × 3, while the 60% scale reaches 5 × 5 and the 20% scale even reaches 15 × 15. The receptive field is expanded to capture long-distance dependencies, such as those between String_ids and Methods_ids in the Dex file. This multi-scale information is concatenated and input into FC layers to realize classification. The detailed process is shown below, where $W$ and $b$ are the weights and bias of the FC layers, $f$ is the ReLU function, and $x$ represents the multi-scale features achieved by MADRF.

$$x = concatenate\left\{ flatten(a^{r}_{pool}),\; r \in \{r_1, r_2, r_3\},\; pool \in \{max, average\} \right\} \tag{5}$$

$$FC(x) = f(xW + b) \tag{6}$$

$$\hat{y} = softmax(FC(FC(FC(x)))) \tag{7}$$

where $r_1 = original$, $r_2 = 60\%$, $r_3 = 20\%$. The cross-entropy loss function is used to calculate the classification loss,

$$loss(\hat{y}, y) = -\frac{1}{N} \sum_{i=1}^{N} y_i \log(\hat{y}_i) \tag{8}$$

where $N$ represents the number of classes, and $y$ is the actual label in one-hot representation.

4. Experiment

All the experiments have been conducted using an Intel Core i7-11800H with 8 cores and 16 GB RAM. The k-fold cross-validation with k=5 has been applied in this work. The source code to reproduce the results of our work is available at https://github.com/MADRF-CNN/malware_network.

4.2. Evaluation index

In this work, we regard malware detection as a binary classification problem. We employ five common and representative evaluation indicators: Accuracy, Precision, Recall, F1-Score and Matthews Correlation Coefficient (MCC). It is worth mentioning that MCC is a relatively appropriate evaluation indicator for unbalanced datasets. Because our dataset (2,507 malicious apps and 1,417 benign ones) is unbalanced, we specially introduce the MCC indicator. The metrics are computed from the true positive (TP), true negative (TN), false positive (FP) and false negative (FN) values, where TP represents the number of malicious Apps correctly identified as malware, TN represents the number of benign Apps correctly identified as benign Apps, FP represents the number of benign Apps misclassified as malware, and FN represents the number of malware samples misclassified as benign Apps.

$$\text{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \tag{9}$$

$$\text{Precision} = \frac{TP}{TP + FP} \tag{10}$$

$$\text{Recall} = \frac{TP}{TP + FN} \tag{11}$$

$$\text{F1-Score} = \frac{2 \times (\text{Precision} \times \text{Recall})}{\text{Precision} + \text{Recall}} \tag{12}$$

$$\text{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \tag{13}$$
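The five indicators of Eqs. (9) to (13) can be computed directly from the four confusion-matrix counts. The sketch below is illustrative only: the function name `binary_metrics` is hypothetical, and returning 0.0 when a denominator is zero is a convention chosen for this example, not something the paper specifies.

```python
import math


def binary_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Accuracy, Precision, Recall, F1-Score and MCC from confusion counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    pr_sum = precision + recall
    f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Zero denominator (an empty row or column) is mapped to 0.0 by convention.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "mcc": mcc}
```

For example, with TP = TN = 40 and FP = FN = 10, the first four indicators all equal 0.8 while MCC equals (1600 − 100) / 2500 = 0.6, which shows how MCC penalizes errors more visibly than Accuracy does.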
Table 1
MADRF-CNN Performance Evaluation Results (Bold is the best, and italic is the second).

| Feature types | Accuracy | Precision | Recall | F1-Score | MCC |
|---|---|---|---|---|---|
| Dex features (CNN) | 95.9% | 96.5% | 98.0% | 94.5% | 89.3% |
| Manifest features (CNN) | 78.1% | 83.2% | 86.1% | 73.5% | 48.3% |
| Combination features (CNN) | 95.1% | 96.1% | 97.0% | 93.6% | 87.4% |
| Dex features (MADRF-CNN) | 96.9% | 97.1% | 98.9% | 95.9% | 92.0% |
| Manifest features (MADRF-CNN) | 81.0% | 84.1% | 89.8% | 76.3% | 54.2% |
| Combination features (MADRF-CNN) | 95.5% | 96.8% | 97.2% | 94.1% | 88.4% |
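The MADRF-CNN rows differ from the CNN rows by the MADRF block of Section 3.3, whose pooling and concatenation steps (Eqs. (2), (3) and (5)) can be sketched in plain Python on a single square feature map. This is a simplified illustration, not the authors' network: the per-scale convolution of Eq. (4) is omitted, the function names are hypothetical, and the rule that derives window and stride from the scale `r` is an assumption, since the paper only states that both depend on `r`.

```python
def pool2d(fmap, window, stride, reduce_fn):
    """Slide a square window over a 2-D list and reduce each patch."""
    h, w = len(fmap), len(fmap[0])
    out = []
    for u in range(0, h - window + 1, stride):
        row = []
        for v in range(0, w - window + 1, stride):
            patch = [fmap[u + i][v + j]
                     for i in range(window) for j in range(window)]
            row.append(reduce_fn(patch))
        out.append(row)
    return out


def mean(values):
    return sum(values) / len(values)


def scale_params(size, r):
    """Assumed rule: choose window and stride so the output is about r * size."""
    out = max(1, round(size * r))
    stride = size // out
    window = size - (out - 1) * stride
    return window, stride


def madrf_features(fmap, scales=(1.0, 0.6, 0.2)):
    """Max- and average-pool at each scale, then flatten and concatenate."""
    size = len(fmap)  # a square feature map is assumed
    features = []
    for r in scales:
        window, stride = scale_params(size, r)
        for reduce_fn in (max, mean):
            pooled = pool2d(fmap, window, stride, reduce_fn)
            features.extend(x for row in pooled for x in row)
    return features
```

On a 5 × 5 map the three scales yield 5 × 5, 3 × 3 and 1 × 1 pooled maps, so the concatenated vector has 2 × (25 + 9 + 1) = 70 entries. In the full model this vector would feed the three FC layers of Eqs. (6) and (7).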
Table 3
Comparison results with the existing works using Dex Features (Bold is the best, and italic is the second).

| Method | Accuracy | Precision | Recall | F1-Score | MCC |
|---|---|---|---|---|---|
| Fang et al. (2020) | 82.8% | 75.5% | 71.5% | 73.1% | 60.9% |
| DeepVisDroid (Bakour & Ünver, 2021) | 95.2% | 88.8% | 92.9% | 90.6% | 87.7% |
| Xiao and Yang (2019) | 92.7% | 93.3% | 95.5% | 94.4% | 84.2% |
| Sun et al. (2021) | 90.9% | 92.5% | 94.6% | 93.5% | 78.7% |
| MADRF-CNN (ours) | 96.9% | 97.1% | 98.9% | 95.9% | 92.0% |
as mentioned before, our method generates RGB images that are not dominated by the sections that commonly appear in both malware and benign apps, such as the Dex header and the data section. In addition, our work not only capitalizes on the excellent feature learning ability of CNN, but also proposes a novel deep learning block MADRF to augment the pixel receptive field and enhance the global awareness of context information. Moreover, by combining two pooling methods in the MADRF block, more comprehensive and effective features in the image are learned. Thus, the network performance is further improved.

4.5. Discussion

In this work, an end-to-end malware detection method has been proposed based on RGB image representation and MADRF-CNN. Three types of image-based features have been constructed to verify the effectiveness of MADRF-CNN. We can notice that the image converted from the manifest file is far less effective than the image converted from Dex files. Under the same network structure and parameters, the average Accuracy of the latter reaches 96.9%, while the result of manifest features is only 81.0%. The main reasons we summarize are as follows. Firstly, compared with Dex files, the size and number of manifest files are far smaller. The manifest is generally less than 100 KB, while the Dex file is usually measured in MB, which means that the information stored in the Dex file is more comprehensive and sufficient than that in the manifest file. Moreover, it can be seen from the hexadecimal manifest file that the content of this small file is also filled with numerous zero characters. In other words, the information density of the manifest file is much lower than that of the Dex file. In addition, there is massive repetitive content in the file, such as "android:name" and "uses-permission". Given the insufficient information density, the existence of this redundant content increases the difficulty of feature recognition and learning. However, it is worth mentioning that the important configuration information contained in this file is very suitable for manual feature extraction.

5. Conclusion

With the rapid development of mobile App programming and anti-reverse-engineering techniques, the difficulty of malware detection is exacerbated. To address challenges in existing detection solutions based on manual features, such as code obfuscation and limited coverage, we proposed an effective end-to-end Android malware detection
framework that does not rely on prior knowledge and manual features. Specifically, the proposed method directly learns representative features from compact Dex images based on Convolutional Neural Network. To further capture the dependency or interaction between different parts of each Dex image, we proposed a novel CNN variant MADRF-CNN. We conducted a series of experiments using the dataset collected in the recent two years to verify the effectiveness of the proposed methods, including the image cutting scheme and the MADRF-CNN method. Additionally, we compared MADRF-CNN with the existing state-of-the-art solutions. Experimental results demonstrated that MADRF-CNN obtains better performance in malware detection. The proposed method not only inherits the advantages of malware detection methods based on image features, such as not relying on manual intervention and being able to deal with code obfuscation, but also inherits some inherent disadvantages of such methods, such as the lack of good interpretability. Therefore, we will design more targeted and interpretable solutions based on the inherent structure of Dex files and the visualization technology of CNN to promote the applications of such end-to-end solutions.

CRediT authorship contribution statement

Huijuan Zhu: Conceptualization, Methodology, Writing – original draft. Huahui Wei: Investigation, Software, Validation. Liangmin Wang: Conceptualization, Methodology, Supervision. Zhicheng Xu: Software, Validation. Victor S. Sheng: Conceptualization, Methodology, Writing – review & editing.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

Data will be made available on request.

References

Alazab, M., Alazab, M., Shalaginov, A., Mesleh, A., & Awajan, A. (2020). Intelligent mobile malware detection using permission requests and API calls. Future Generation Computer Systems, 107, 509–521.
Alzaylaee, M. K., Yerima, S. Y., & Sezer, S. (2020). DL-Droid: Deep learning based android malware detection using real devices. Computers & Security, 89, Article 101663.
Arp, D., Spreitzenbarth, M., Hübner, M., Gascon, H., & Rieck, K. (2014). DREBIN: Effective and explainable detection of android malware in your pocket. In Network & distributed system security symposium (pp. 23–26).
Gao, H., Cheng, S., & Zhang, W. (2021). Gdroid: Android malware detection and classification with graph convolutional network. Computers & Security, 106, Article 102264.
Ghouti, L., & Imam, M. (2020). Malware classification using compact image features and multiclass support vector machines. IET Information Security, 14(4), 419–429.
Gibert, D., Mateu, C., & Planes, J. (2020). HYDRA: A multimodal deep learning framework for malware classification. Computers & Security, 95, Article 101873.
GooglePlayStore (2022). Google play store. https://play.google.com/store/apps.
Haq, I. U., Khan, T. A., & Akhunzada, A. (2021). A dynamic robust DL-based model for android malware detection. IEEE Access, 9, 74510–74521.
Hasan, H., Ladani, B. T., & Zamani, B. (2021). MEGDroid: A model-driven event generation framework for dynamic android malware analysis. Information and Software Technology, 135, Article 106569.
Hu, J., Shen, L., & Sun, G. (2018). Squeeze-and-excitation networks. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 7132–7141).
Hu, W., Tao, J., Ma, X., Zhou, W., Zhao, S., & Han, T. (2014). MIGDroid: Detecting APP-repackaging android malware via method invocation graph. In 2014 23rd international conference on computer communication and networks (pp. 1–7).
Huang, J., Lu, C., Ping, G., Sun, L., & Ye, X. (2020). TCN-ATT: A non-recurrent model for sequence-based malware detection. In Pacific-asia conference on knowledge discovery and data mining (pp. 178–190). Springer.
Kabakus, A. T. (2022). DroidMalwareDetector: A novel android malware detection framework based on convolutional neural network. Expert Systems with Applications, 206, Article 117833.
Lakshmanarao, A., & Shashi, M. (2022). Android malware detection with deep learning using RNN from opcode sequences. International Journal of Interactive Mobile Technologies, 16(01), 145–157.
Li, S., Chen, J., Spyridopoulos, T., Andriotis, P., Ludwiniak, R., & Russell, G. (2015). Real-time monitoring of privacy abuses and intrusion detection in android system. In International conference on human aspects of information security, privacy, and trust (pp. 379–390). Springer.
Li, D., & Li, Q. (2020). Adversarial deep ensemble: Evasion attacks and defenses for malware detection. IEEE Transactions on Information Forensics and Security, 15, 3886–3900.
Liu, C., Li, B., Zhao, J., Su, M., & Liu, X. (2021). MG-DVD: A real-time framework for malware variant detection based on dynamic heterogeneous graph learning. arXiv preprint arXiv:2106.12288.
Luo, W., Li, Y., Urtasun, R., & Zemel, R. (2016). Understanding the effective receptive field in deep convolutional neural networks. Advances in Neural Information Processing Systems, 29.
Mahindru, A., & Sangal, L. A. (2021). Mldroid-framework for android malware detection using machine learning techniques. Neural Computing and Applications, 33(10), 5183–5240.
Marastoni, N., Continella, A., Quarta, D., Zanero, S., & Dalla Preda, M. (2017). GroupDroid: Automatically grouping mobile malware by extracting code similarities. In Proceedings of the 7th software security and protection workshop (pp. 1–12).
Martín, A., Rodríguez-Fernández, V., & Camacho, D. (2018). CANDYMAN: Classifying android malware families by modelling dynamic traces with Markov chains. Engineering Applications of Artificial Intelligence, 74, 121–133.
Meng, G., Xue, Y., Xu, Z., Liu, Y., Zhang, J., & Narayanan, A. (2016). Semantic modelling of android malware for effective malware comprehension, detection, and classification. In The 25th international symposium (pp. 306–317).
Millar, S., McLaughlin, N., del Rincon, J. M., & Miller, P. (2021). Multi-view deep learning for zero-day android malware detection. Journal of Information Security and Applications, 58, Article 102718.
Oquab, M., Bottou, L., Laptev, I., & Sivic, J. (2014). Learning and transferring mid-level image representations using convolutional neural networks. In Proceedings of the IEEE computer society conference on computer vision and pattern recognition (pp. 1717–1724).
Bakour, K., & Ünver, H. M. (2021). DeepVisDroid: Android malware detection by Peng, D., Yu, X., Peng, W., & Lu, J. (2021). DGFAU-net: Global feature attention upsam-
hybridizing image-based features with deep learning techniques. Neural Computing pling network for medical image segmentation. Neural Computing and Applications,
and Applications, 33(18), 11499–11516. 33(18), 12023–12037.
Bourebaa, F., & Benmohammed, M. (2020). Android malware detection using convo- QiAnXin (2021). Security situation analysis report of android platform in 2020. https:
lutional deep neural networks. In 2020 international conference on advanced aspects //www.qianxin.com/threat/reportdetail?report_id=125.
of software engineering (pp. 1–7). Rathore, H., Sahay, S. K., Nikam, P., & Sewak, M. (2021). Robust android malware
Cai, M., Jiang, Y., Gao, C., Li, H., & Yuan, W. (2021). Learning features from enhanced detection system against adversarial attacks using Q-learning. Information Systems
function call graphs for android malware detection. Neurocomputing, 423, 301–307. Frontiers, 23(4), 867–882.
Cai, H., Meng, N., Ryder, B., & Yao, D. (2018). Droidcat: Effective android malware Sihag, V., Vardhan, M., Singh, P., Choudhary, G., & Son, S. (2021). De-LADY: Deep
detection and categorization via app-level profiling. IEEE Transactions on Information learning based android malware detection using dynamic features. Journal of
Forensics and Security, 14(6), 1455–1470. Internet Services and Information Security, 11(2), 34–45.
Calleja, A., Martín, A., Menéndez, H. D., Tapiador, J., & Clark, D. (2018). Picking on Statista (2021). Smartphone market share. https://www.statista.com/statistics/
the family: Disrupting android malware triage by forcing misclassification. Expert 1236760/worldwide-smartphone-operating-system-shipment-market-share/.
Systems with Applications, 95, 113–126. Sun, T., Daoudi, N., Allix, K., & Bissyandé, T. F. (2021). Android malware detection:
D’Angelo, G., Palmieri, F., Robustelli, A., & Castiglione, A. (2021). Effective classifica- Looking beyond Dalvik bytecode. In 2021 36th IEEE/ACM international conference
tion of android malware families through dynamic features and neural networks. on automated software engineering workshops (pp. S34–39).
Connection Science, 33(3), 786–801. Surendran, R., Thomas, T., & Emmanuel, S. (2020). GSDroid: Graph signal based
Enck, W., Gilbert, P., Han, S., Tendulkar, V., Chun, B., Cox, L. P., et al. (2014). compact feature representation for android malware detection. Expert Systems with
Taintdroid: An information-flow tracking system for realtime privacy monitoring Applications, 159, Article 113581.
on smartphones. ACM Transactions on Computer Systems, 32(2), 1–29. Vasan, D., Alazab, M., Wassan, S., Naeem, H., Safaei, B., & Zheng, Q. (2020). IMCFN:
Fang, Y., Gao, Y., Jing, F., & Zhang, L. (2020). Android malware familial classification Image-based malware classification using fine-tuned convolutional neural network
based on dex file dection features. IEEE Access, 8, 10614–10627. architecture. Computer Networks, 171, Article 107138.
H. Zhu et al. Expert Systems With Applications 218 (2023) 119593
VirusShare (2022). Virusshare. https://virusshare.com.
VirusTotal (2022). Virustotal. https://www.virustotal.com/gui/home/upload.
Wang, Z., Guo, Y., & Wang, J. (2021). Empower Chinese event detection with improved atrous convolution neural networks. Neural Computing and Applications, 33(11), 5805–5820.
Wang, Q., Hassan, W. U., Li, D., Jee, K., Yu, X., Zou, K., et al. (2020). You are what you do: Hunting stealthy malware via data provenance analysis. In 27th annual network and distributed system security symposium.
Wang, X., Wang, W., He, Y., Liu, J., Han, Z., & Zhang, X. (2017). Characterizing android apps’ behavior for effective detection of malapps at large scale. Future Generation Computer Systems, 75, 30–45.
Wang, S., Zhou, G., Lu, J., & Zhang, F. (2019). A novel malware detection and classification method based on capsule network. In International conference on artificial intelligence and security (pp. 573–584). Springer.
Woo, S., Park, J., Lee, J., & Kweon, I. S. (2018). Cbam: Convolutional block attention module. In Proceedings of the european conference on computer vision (pp. 3–19).
Wu, B., Chen, S., Gao, C., Fan, L., Liu, Y., Wen, W., et al. (2021). Why an android app is classified as malware: Toward malware classification interpretation. ACM Transactions on Software Engineering and Methodology, 30(2), 1–29.
Xiao, X., & Yang, S. (2019). An image-inspired and cnn-based android malware detection approach. In 2019 34th IEEE/ACM international conference on automated software engineering (pp. 1259–1261). IEEE.
Xue, L., Zhou, Y., Chen, T., Luo, X., & Gu, G. (2017). Malton: Towards on-device non-invasive mobile malware analysis for ART. In 26th USENIX security symposium (pp. 289–306).
Yadav, P., Menon, N., Ravi, V., Vishvanathan, S., & Pham, T. D. (2022). EfficientNet convolutional neural networks-based android malware detection. Computers & Security, 115, Article 102622.
Zhang, W., Luktarhan, N., Ding, C., & Lu, B. (2021). Android malware detection using TCN with bytecode image. Symmetry, 13(7), 1107.
Zhang, Z., Qi, P., & Wang, W. (2020). Dynamic malware analysis with feature engineering and feature learning. In Proceedings of the AAAI conference on artificial intelligence (pp. 1210–1217).
Zhou, Y. (2021). An automated pipeline for privacy leak analysis of android applications. In 2021 36th IEEE/ACM international conference on automated software engineering (pp. 1048–1050). IEEE.
Zhu, H., Wang, L., Zhong, S., Li, Y., & Sheng, V. S. (2022). A hybrid deep network framework for android malware detection. IEEE Transactions on Knowledge and Data Engineering, 34(12), 5558–5570.