Didroid: Android Malware Classification and Characterization Using Deep Image Learning


DIDroid: Android Malware Classification and Characterization Using Deep Image Learning


Abir Rahali, Canadian Institute for Cybersecurity (CIC), University of New Brunswick (UNB), Fredericton, New Brunswick, Canada
Arash Habibi Lashkari, Canadian Institute for Cybersecurity (CIC), University of New Brunswick (UNB), Fredericton, New Brunswick, Canada
Gurdip Kaur, Canadian Institute for Cybersecurity (CIC), University of New Brunswick (UNB), Fredericton, New Brunswick, Canada
Laya Taheri, Canadian Institute for Cybersecurity (CIC), University of New Brunswick (UNB), Fredericton, New Brunswick, Canada
Francois Gagnon, Canadian Centre for Cyber Security (CCCS), Ottawa, Canada
Frédéric Massicotte, Canadian Centre for Cyber Security (CCCS), Ottawa, Canada

ABSTRACT
The unrivaled threat of Android malware is the root cause of various security problems on the internet. Although there are remarkable efforts in detection and classification of Android malware based on machine learning techniques, only a small number of attempts have been made to classify and characterize it using deep learning. Detecting Android malware on smartphones is an essential target for the cyber community to get rid of menacing malware samples. This paper proposes an image-based deep neural network method to classify and characterize Android malware samples taken from a huge malware dataset with 12 prominent malware categories and 191 eminent malware families. This work successfully demonstrates the use of deep image learning to classify and characterize Android malware with an accuracy of 93.36% and a log loss of less than 0.20 for the training and testing sets.

CCS CONCEPTS
• Security and privacy → Malware and its mitigation.

KEYWORDS
android malware, malware analysis, malware classification, malware characterization, deep learning, convolutional neural network

ACM Reference Format:
Abir Rahali, Arash Habibi Lashkari, Gurdip Kaur, Laya Taheri, Francois Gagnon, and Frédéric Massicotte. 2020. DIDroid: Android Malware Classification and Characterization Using Deep Image Learning. In 2020 the 10th International Conference on Communication and Network Security (ICCNS 2020), November 27–29, 2020, Tokyo, Japan. ACM, New York, NY, USA, 13 pages. https://doi.org/10.1145/3442520.3442522

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
ICCNS 2020, November 27–29, 2020, Tokyo, Japan
© 2020 Association for Computing Machinery.
ACM ISBN 978-1-4503-8903-7/20/11. . . $15.00
https://doi.org/10.1145/3442520.3442522

1 INTRODUCTION
Android leads the smartphone and operating system market, dominating market share at all levels. It went from controlling a small portion when it first appeared in 2010 to running 86% of smartphones globally in 2019 [5] and owning over 40.39% of the operating system market share worldwide. Android's dominance looks set to continue in the coming years, with its smartphone share expected to increase to 87.1% in 2023 [4]. Simultaneously, the Android malware industry is becoming increasingly disruptive, with almost 12,000 new Android malware instances every day [2]. The current evolutionary process will soon bring malware designed with adaptive and success-based learning to improve the efficacy of attacks. Undeniably, there is a need to develop better tools for malware detection.

Researchers have used different sets of features and analysis techniques (static, dynamic and hybrid) to analyze and mitigate malware samples. It is found that the quality and variety of Android apps used for malware analysis play a pivotal role in creating a labeled and comprehensive dataset for analysis. Some labeled malware datasets used by researchers are mobile malware traffic with 1900 applications [25], CICAndMal2017 [26] and CICInvesAndMal2019 [43], which is the second part of CICAndMal2017 with API calls added as dynamic features.

Numerous Android malware analysis techniques have been used in the past. Deep learning, alternatively called deep neural learning or deep neural network, has emerged as a recent technique for Android malware detection. It is a subset of machine learning that has networks capable of learning without supervision from unstructured or unlabeled data [8]. Based on that, we propose a method for static analysis of Android malware samples by transforming the features extracted from APK files to 2D images to build a deep learning ensemble based on convolutional neural networks to classify Android apps into benign and malicious.

ICCNS 2020, November 27–29, 2020, Tokyo, Japan Rahali and Lashkari, et al.

The main contributions of this paper are: (1) generating a huge labeled dataset containing 400,000 Android samples (200K malware and 200K benign); and (2) proposing and developing DIDroid, a new Android malware detection system based on deep image learning.

The rest of the paper is organized as follows: Section 2 reviews the related work on Android malware detection. Section 3 introduces the background tools used in this paper. Section 4 describes the proposed methodology. Section 5 details the dataset, and experiments are presented in Section 6. The discussion of experimental results and analysis is given in Section 7, followed by the conclusion in Section 8.

2 RELATED WORK
The advent of sophisticated open-source tools to propagate persistent zero-day malware that evades security measures has given way to different types of analysis for malware identification. It is observed from the literature that, among prominent malware analysis techniques, static analysis can reveal errors that do not manifest themselves until a disaster occurs weeks, months or years after release. Recently, more advanced methods for static and dynamic analyses have been proposed that attempt to incorporate the advantages of the traditional methods while improving upon their limitations. This section presents some of the recent work related to Android malware detection and classification.

2.1 ML-based Malware Classification
Android malware, being a major persistent threat on mobile, has received major research focus, as is evident from numerous surveys and review papers with a focus on machine learning.

Zhuo et al. [33] presented a machine learning-based Android malware detection technique by creating a control flow graph and extracting API information from it to build three different types of datasets: Boolean, frequency, and time-series. C4.5, DNN, and LSTM techniques were applied to conduct experiments on 10,010 benign and 10,683 malicious applications to achieve 98.98% detection precision. TLAMD [31] calculated the disturbance size on Android samples by using a genetic algorithm to generate the adversarial samples. This technique used 8 features selected from the manifest file and disassembly code to implement the black-box model. However, the model suffers from high request times. Arslan et al. [12] worked on identifying the spare permissions requested by some applications to perform suspicious activities. They used static and code analysis techniques to achieve 91.95% accuracy.

TFDroid [32] used SVM to detect malware via source, sink and description of Android applications. Applications were clustered into different domains based on the description and were mined to determine outliers. A limited-size dataset was used to perform cross-validation, and only benign applications were used to train the classifier, which resulted in 93.65% accuracy in identifying the malicious application. Suman et al. [45] used a combination of sensitive permissions and API features and applied an ensemble learning model based on a decision tree classifier and a KNN classifier to detect unknown APKs. Zhang et al. [51] formed behavioral semantics by calculating the confidence of association rules between the abstracted API calls to describe an application and then applied several machine learning algorithms (KNN, RF, SVM) for detection. The model outperformed MaMaDroid [34], which was built by using the same characteristics and machine learning algorithms.

Suleiman et al. [50] tested the performance of various machine learning classifiers (NB, J48, SVM, RF, SL) via static features (permissions, intents, API calls, date of appearance) extracted from date-labeled benign and malware datasets. GefDroid [18] performed graph embedding based on familial analysis of Android malware using unsupervised learning. A fine-grained behavioral model is used to construct a set of sub-graphs by abstracting program semantics. AndroDialysis [20] used intents and permissions for malware detection. It compared the effectiveness of intent and permission usage. Then, a Bayesian Network algorithm was developed to achieve a higher detection rate and faster detection.

Wang et al. [47] extracted 11 types of static features from each app, mainly from API calls, permissions, intents, and hardware information, to characterize the behaviors of the app. They employed an ensemble of multiple classifiers, namely SVM, KNN, NB, CART and RF, to detect malware and categorize benign apps. Atici et al. [13] proposed an approach based on control flow graphs and machine learning algorithms (CART, Probabilistic Neural Network (PNN), NB, 1-NN) for static Android malware analysis.

ICCDetector [48] is a permission-based method that classifies Android malware data into five defined malware categories using a detection model after training with a set of benign and malware apps. It employs the trained model for malware detection, which is evaluated with 5,264 malware and 12,026 benign apps using SVM. Garcia et al. [22] proposed a machine learning-based Android malware detection and family identification approach. Its selected features leverage categorized Android API usage, reflection-based features and features from native binaries of apps. Li et al. [28] developed a new adversarial example attack method based on a bi-objective GAN. They used required permissions, actions and API calls as features.

Hybrid analysis is a combination of static and dynamic analysis. Teubert et al. [44] developed Hugin, a machine learning-based app vetting system that uses features derived from dynamic as well as static analysis using a list of features (permissions, hardware components, intent-filters, third-party libraries, system services, remote procedure, dynamic permissions) with more than 14,000 malware.

Dhanya and Kumar [17] proposed a hybrid analysis approach using the 77 best hybrid features after applying feature selection, with permissions as static features and network activities, file system activities, cryptographic activities, and information leakage as dynamic features. The characterization of malware was done with machine learning algorithms such as Naïve Bayes, J48 and Random Forest. Abdullah and Ibrahim [23] developed mad4a to detect malicious applications using both static and dynamic analysis techniques.

Similar to these approaches, our work also uses static features such as permissions, intents, and hardware activities and achieves a high detection rate. However, it differs in the following respects. First, we use a different approach to transform malicious applications into grayscale images for classification. Second, we use a large and balanced dataset of Android applications comprising 200K benign and 200K malicious apps for training our proposed learning model.

DIDroid: Android Malware Classification and Characterization Using Deep Image Learning ICCNS 2020, November 27–29, 2020, Tokyo, Japan

2.2 DL-based Malware Classification
Tao et al. [27] focused on event groups to describe apps' behaviors at the event level and applied a neural network for detection. Josh et al. [35] also applied a deep learning model but only on permissions and hardware features. Abdelmonim et al. [40] proposed an approach that uses five different feature sets (permissions, intents, API calls, invalid certificates, and the presence of APK files in the asset folder). They applied an autoencoder for classification. Xing Ping et al. [42] focused on Dalvik operation codes. These codes were treated as a text sequence in the sequential convolutional neural network used for detection.

Jie et al. [30] used XGBoost for detection by selecting features such as permissions, intents, APIs and smali files, and then optimized it with a genetic algorithm and Particle Swarm Optimization (PSO). AndrEnsemble [37] is a characterization system for Android malware families based on ensembles of sensitive API calls extracted from aggregated call graphs of different families. Yao-Saint et al. [49] converted the APK file to Java code and then calculated the importance of each word to generate pictures. Finally, they used a convolutional neural network for classification.

DroidDivesDeep [19] classified malware via low-level monitorable features (CPU, memory, network, sensors, etc.) using deep neural networks. David and Netanyahu [15] presented DeepSign, a novel method based on deep learning for automatic malware signature generation and classification. Andro_MD [16] employs different CNN models to train on the extracted features (code related patterns, hardware features, filtered intents, request permissions, restricted API calls and used permissions) from a dataset constructed with 21,000 samples collected from third-party markets and 34,570 features of 7 categories. They evaluated the effectiveness and feasibility of these models against standard machine learning algorithms. McLaughlin et al. [36] proposed an Android malware detection system that uses a deep convolutional neural network based on static analysis of the raw opcode sequence from a disassembled program.

NTPDroid [11] used a hybrid feature combination for malware detection that extracts network traffic features and permissions from the applications. Mohaisen et al. [38] created a behavior-based malware classification system that extracts the IP, port, unique destination IP, connections (TCP, UDP, RAW), request type (POST, GET, HEAD), response type, response codes (200s through 500s), size and DNS. It uses these behavioral artifacts generated by malware samples at run-time to characterize malware. It also considers the order in which behavioral events occur. Kelkar et al. [24] proposed a system to identify HTTP-based information exfiltration by malicious Android applications. They focused on the leaked information, the destinations to which information is exfiltrated and their correlations with types of sensitive information. Shanshan et al. [46] proposed an approach for detecting malware based on malicious URLs.

There are multiple other recent studies that focus on network analysis, such as [41], [14], and MalPaCA [39], a network traffic-based methodology to cluster malware according to its attacking capabilities using sequence clustering with four features: packet size, time interval, source port, and destination port. These clusters capture various attacking capabilities, such as port scans and reuse of C&C. Table 1 provides the pros and cons of existing malware identification approaches.

In summary, one of the advantages of deep learning approaches is that they are more accurate and faster than machine learning-based classification techniques when trained with a sufficient amount of data, as is evident from the literature. The work presented in this paper is a static analysis method that builds on previous studies and combines both feature extraction and strong model design with deep learning to explore malware behaviors and identify maliciousness. Further, we are able to study malware behavior at a much larger scale than previously possible, since we have a very large dataset of malware and benign samples to fit the deep learning model and achieve better learning accuracy. In addition, it is uncertain whether the previous works with competitive accuracy would yield similar results when evaluated with such a big dataset.

3 BACKGROUND
This section gives background on deep learning models, in particular convolutional neural networks, to make the rest of the paper easier to follow.

3.1 Deep Neural Networks
Deep learning is gaining much popularity due to its supremacy in terms of accuracy when trained with a huge amount of data. Deep neural networks are a generic name for a large class of machine learning algorithms, including but not limited to perceptron, Hopfield networks, Boltzmann machines, fully connected neural networks, convolutional neural networks, recurrent neural networks, long short-term memory neural networks, autoencoders, deep belief networks, generative adversarial networks and many more. Most of them are trained with an algorithm called backpropagation. A deep neural network contains multiple neurons (nodes) arranged in layers. Nodes from adjacent layers have connections or edges between them. All these connections have associated weights.

3.2 Convolutional Neural Networks
Convolutional neural networks can be considered essentially as neural nets that are not fully connected (each neuron is connected to only a few neurons in the previous layer) and in which neurons share weights. These types of networks have proven successful especially in the fields of computer vision and natural language processing, where they have achieved state-of-the-art results.

Definition 1. The convolution operation is a linear operation, represented by an asterisk, that merges two signals f and g over a point (x, y) such that:

\[ f[x, y] * g[x, y] = \sum_{n=-\infty}^{\infty} \sum_{m=-\infty}^{\infty} f[n, m] \cdot g[x - n, y - m] \]

where n and m range over the integers.

Two-dimensional convolutions are used in image processing to implement image filters, for example to find a specific patch or some feature in an image.

3.3 Convolutional Neural Networks Layers
There are three types of layers in a convolutional neural network: convolutional layer, pooling layer, and fully connected layer. Each of these layers has different parameters that can be optimized and performs a different task on the input data.
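As an illustration of Definition 1, the following is a minimal sketch in plain Python, not code from the paper. It assumes both signals are finite grids that are zero everywhere outside their stored values, so the infinite sums reduce to finite loops. The function name `conv2d_full` and the sample arrays are hypothetical.

```python
def conv2d_full(f, g):
    """Full 2D discrete convolution of two finite grids (lists of rows).

    Assumption: values outside either grid are zero, so the infinite
    sums of Definition 1 only need to visit stored entries.
    """
    f_rows, f_cols = len(f), len(f[0])
    g_rows, g_cols = len(g), len(g[0])
    out = [[0] * (f_cols + g_cols - 1) for _ in range(f_rows + g_rows - 1)]
    for n in range(f_rows):
        for m in range(f_cols):
            for a in range(g_rows):
                for b in range(g_cols):
                    # Term f[n, m] * g[x - n, y - m] with x = n + a, y = m + b.
                    out[n + a][m + b] += f[n][m] * g[a][b]
    return out


# Hypothetical 2x2 "image" and 2x2 filter.
image = [[1, 2],
         [3, 4]]
kernel = [[0, 1],
          [1, 0]]
result = conv2d_full(image, kernel)
```

Swapping the two arguments gives the same grid, which reflects the symmetry of the definition in f and g.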


Table 1: Comparison of Our Approach with DL-based Malware Identification Approaches

Reference | Features | Pros/Cons
Tao et al. [27] | API calls | Scalable, event aware
Josh et al. [35] | Permissions, hardware features | Biased towards complicated models
Abdelmonim et al. [40] | Permissions, intents, API calls, invalid certificates, presence of APK files in asset folder | Capable of identifying malicious apps from benign ones
Xing Ping et al. [42] | Text sequence | High accuracy, less cost
Jie et al. [30] | Permissions, intents, APIs, smali files | Lower false positive rate, increased detection accuracy
AndrEnsemble [37] | Call graph | More resilient against transformation attacks
Yao-Saint et al. [49] | - | Very small dataset, average accuracy
DroidDivesDeep [19] | CPU, memory, network, sensors | Use of low-level device run-time attributes, high accuracy
David and Netanyahu [15] | API calls, registry entries, websites, ports accessed | Generated malware signatures
Andro_MD [16] | Code related patterns, hardware features, filtered intents, request permissions, restricted API calls, used permissions | High accuracy
McLaughlin et al. [36] | Opcodes | Able to execute on GPU
DIDroid | Permissions, intents actions, intents categories, system features, activities, broadcast receivers and providers, metadata | Scalable, large dataset, multi-class characterization, highly accurate, fewer misclassifications

1. Convolutional Layer
The main task of this layer is to detect local conjunctions of features from the previous layer and map their appearance to a feature map. The image is split into perceptrons compressed in feature maps of size $n * m$. This map stores the information on where the feature occurs in the image and how well it corresponds to the filter. There are $x$ filters in each layer. The number of filters applied in one stage is equivalent to the depth of the volume of output feature maps. Each filter detects a particular feature at every location on the input. The output $Y^{(t)}$ of layer $t$ consists of $x^{(t)}$ feature maps of size $n^{(t)} * m^{(t)}$. The $i$-th feature map, denoted $Y_i^{(t)}$, is computed as:

\[ Y_i^{(t)} = B_i^{(t)} + \sum_{j=1}^{x^{(t-1)}} K_{i,j}^{(t)} * Y_j^{(t-1)} \]

where $B_i^{(t)}$ is a bias matrix and $K_{i,j}^{(t)}$ is the filter connecting the $j$-th feature map in layer $(t-1)$ with the $i$-th feature map in layer $t$.

2. Pooling Layer
This layer is responsible for reducing the dimensionality of the feature map in order to decrease processing time. In general, pooling layers are used after multiple stages of other layers to reduce the computational requirements progressively through the network as well as to minimize the likelihood of overfitting.
The pooling layer $t$ has two hyperparameters, the spatial extent of the filter $F^{(t)}$ and the stride $S^{(t)}$. It takes an input volume of size $x^{(t-1)} * n^{(t-1)} * m^{(t-1)}$ and provides an output volume of size $x^{(t)} * n^{(t)} * m^{(t)}$ where:

\[ x^{(t)} = x^{(t-1)} \]
\[ n^{(t)} = (n^{(t-1)} - F^{(t)}) / S^{(t)} + 1 \]
\[ m^{(t)} = (m^{(t-1)} - F^{(t)}) / S^{(t)} + 1 \]

A commonly used pooling algorithm is max pooling, which extracts subregions of the feature map (e.g., 2x2-pixel tiles), keeps their maximum value, and discards all other values.

3. Dense (Fully Connected) Layer
In a dense layer, every node in the layer is connected to every node in the preceding layer. The fully connected layers in a convolutional network are practically a multilayer perceptron that aims to map the $x^{(t-1)} * n^{(t-1)} * m^{(t-1)}$ activation volume from the combination of previous different layers into a class probability distribution. Thus, the output layer of the multilayer perceptron will have $x^{(t-i)}$ outputs, i.e. output neurons, where $i$ denotes the number of layers in the multilayer perceptron. If $t-1$ is a fully connected layer:

\[ y_i^{(t)} = f(z_i^{(t)}) \]

with:

\[ z_i^{(t)} = \sum_{j=1}^{x^{(t-1)}} w_{i,j}^{(t)} \, y_j^{(t-1)} \]

The goal of the complete fully connected structure is to tune the weight parameters $w_{i,j}^{(t)}$ to create a stochastic likelihood representation of each class based on the activation maps generated by the concatenation of convolutional, non-linearity, rectification and pooling layers.

3.4 Functions Used
1. Activation
The activation function is a node that is put at the end of or in between neural networks. It helps to decide whether the neuron fires or not. There are different types of activation functions, but for this project our focus is on the Rectified Linear Unit (ReLU).
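The pooling relations from Section 3.3 and the ReLU activation named in Section 3.4 can be illustrated together with a small sketch. This is plain Python under stated assumptions, not the authors' implementation. The feature map values and the function names `relu`, `pooled_size` and `max_pool2d` are hypothetical, and the division in the size formula is taken as integer division.

```python
def relu(fmap):
    """Element-wise max(0, v) over a 2D feature map."""
    return [[max(0, v) for v in row] for row in fmap]


def pooled_size(n_prev, extent, stride):
    """Output size n(t) = (n(t-1) - F(t)) / S(t) + 1, with integer division."""
    return (n_prev - extent) // stride + 1


def max_pool2d(fmap, extent=2, stride=2):
    """Keep the maximum of each extent x extent tile, moving by `stride`."""
    rows = pooled_size(len(fmap), extent, stride)
    cols = pooled_size(len(fmap[0]), extent, stride)
    return [
        [
            max(
                fmap[r * stride + i][c * stride + j]
                for i in range(extent)
                for j in range(extent)
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Hypothetical 4x4 feature map with some negative responses.
feature_map = [
    [-1, 3, 2, 0],
    [4, -6, 5, 1],
    [7, 2, -9, 8],
    [0, 1, 3, -4],
]
activated = relu(feature_map)
pooled = max_pool2d(activated, extent=2, stride=2)
```

With a 2x2 extent and a stride of 2 the tiles do not overlap, so a 4x4 map shrinks to 2x2 while the number of feature maps is unchanged.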


Definition 2. Rectified Linear Units (ReLU) are a special implementation that combines non-linearity and rectification layers in convolutional neural networks. A rectified linear unit (i.e. threshold at zero) is a piece-wise linear function defined as:

\[ Y_i^{(t)} = \max(0, Y_i^{(t-1)}) \]

2. Dropout
Dropout is a regularization technique for reducing overfitting in neural networks. It has the effect of simulating a large number of networks with very different network structures and, in turn, making nodes in the network generally more robust to the inputs.

3. Flatten
As the name of this step implies, the feature map is flattened into a column so that this data can be fed into an artificial neural network later on.

4. One Hot Encoder
One hot encoding is a process by which categorical variables are converted into a form that can be provided to learning algorithms to do a better job in prediction.

4 PROPOSED METHODOLOGY
Since anatomy analysis of malware needs to take a deep look at the relationships between the features in order to reveal the malicious behavior and identify its patterns, we tried to achieve this by implementing both feature extraction and deep learning. Fig. 1 shows the architecture of our methodology, starting by creating different combinations of features to get different subsets. Then, an extra tree classifier is applied to the features of each subset to select the most important ones. Part two is the deep learning model, where the input is a 2D image created to fit the convolutional layers. In the proposed DIDroid model, our hypothesis is that converting static feature data into images could help in multi-class characterization of Android malware applications. Finally, an average prediction is applied to this ensemble model to obtain the classification.

4.1 Feature Extraction
Accurately choosing the right features to train a learning algorithm is an important consideration because an endless amount of training data, if paired with the wrong set of features, will not produce reliable results. That is why we applied feature extraction to our data in order to boost the performance of the proposed model. The following are the steps implemented to collect, capture, extract and select the features, as shown in Fig. 2.

Step 1 (Feature Collection)
Android applications come in an Android Package Kit (APK). This .apk file is a zip archive of AndroidManifest.xml, classes.dex, resources, and other folders. To extract these features, we initially need to reverse engineer the .apk files using apktool [6]. The AndroidManifest.xml file contains a lot of features that can be used for static analysis. The main extracted features from the AndroidManifest.xml file are:

(1) Activities: An Android activity is one screen of the Android app's user interface
(2) Broadcast receivers and providers
(3) Metadata: It is basically an additional option to store information that can be accessed through the entire project
(4) The permissions requested by the application: They protect the privacy of the user and are needed to access sensitive user data (such as contacts and SMS)
(5) System features (such as camera and internet)

Step 2 (Feature Capturing)
In order to extract these features, regular XML parsers cannot be used, since Android has its own proprietary binary XML format. We wrote a capturing script that extracted all the features as "strings" from the AndroidManifest.xml file. Table 2 shows an example of the captured features.

Table 2: Example of Static Features

Feature | Values
Package Name | "com.fb.iwidget"
Activities | "com.fb.iwidget.OverlayActivity", "org.acra.CrashReportDialog", "com.batch.android.BatchActionActivity", "com.fb.iwidget.MainActivity", "com.fb.iwidget.PreferencesActivity", "com.fb.iwidget.PickerActivity", "com.fb.iwidget.IntroActivity"
Services | "com.batch.android.BatchActionService", "com.fb.iwidget.MainService", "com.fb.iwidget.SnapAccessService"
Receivers/Providers | "com.fb.iwidget.ExpandWidgetProvider", "com.fb.iwidget.ActionReceiver"
Intents Actions | "android.accessibilityservice.AccessibilityService", "android.appwidget.action.APPWIDGET_UPDATE", "android.intent.action.BOOT_COMPLETED", "android.intent.action.CREATE_SHORTCUT", "android.intent.action.MAIN", "android.intent.action.MY_PACKAGE_REPLACED", "android.intent.action.USER_PRESENT", "android.intent.action.VIEW", "com.fb.iwidget.action.SHOULD_REVIVE"
Intents Categories | "android.intent.category.BROWSABLE", "android.intent.category.DEFAULT", "android.intent.category.LAUNCHER"
Permissions | "android.permission.ACCESS_NETWORK_STATE", "android.permission.CALL_PHONE", "android.permission.INTERNET", "android.permission.RECEIVE_BOOT_COMPLETED", "android.permission.SYSTEM_ALERT_WINDOW", "com.android.vending.BILLING", "android.permission.BIND_ACCESSIBILITY_SERVICE"
Meta-Data | "android.accessibilityservice", "android.appwidget.provider"
# of Icons | 331
# of Pictures | 0
# of Audio files | 0
# of Videos | 0
Size of the App | 4.2M

Step 3 (Feature Extraction and Combination)
To create the feature vectors, we created numerical values of the collected features. For permissions, actions, categories, and services, we created unique lists of all possible values and vectors for each type of feature as follows. Let $E$ be a vector containing a set of $X$ Android feature types. For every application $t_i$ in the dataset, we generated a binary sequence $E_i = e_1, e_2, \ldots, e_j$ where

\[ e_j = \begin{cases} 1, & \text{if feature } e_j \text{ exists} \\ 0, & \text{otherwise} \end{cases} \qquad (1) \]
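The binary encoding of Equation (1) can be sketched as follows. This is a minimal illustration in plain Python, not the authors' capturing script. The app names, the tiny feature lists and the helpers `build_vocabulary` and `encode` are hypothetical, and ordering the unique list alphabetically is an assumption.

```python
def build_vocabulary(apps):
    """Unique, sorted list of every feature string seen in any app."""
    return sorted({feat for feats in apps.values() for feat in feats})


def encode(app_features, vocabulary):
    """Binary sequence: 1 if the feature exists in the app, 0 otherwise."""
    present = set(app_features)
    return [1 if feat in present else 0 for feat in vocabulary]


# Hypothetical apps with a few manifest strings each.
apps = {
    "app_a": ["android.permission.INTERNET", "android.permission.CALL_PHONE"],
    "app_b": ["android.permission.INTERNET", "android.intent.action.MAIN"],
}
vocab = build_vocabulary(apps)
vectors = {name: encode(feats, vocab) for name, feats in apps.items()}
```

Every app is encoded against the same vocabulary, so all vectors have the same length and position j always refers to the same feature.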


Figure 1: Methodology Architecture

Figure 2: Feature Extraction

Identified features are stored as a binary sequence of 0 (feature is absent) or 1 (feature is present) in a comma-separated file. For metadata, receivers, providers, and activities, we take the frequency of appearance in the app.

Step 4 (Feature Selection)
We used the Extremely Randomized Trees Classifier (ExtraTreesClassifier) for selecting features. It is a type of ensemble learning technique that aggregates the results of multiple de-correlated decision trees collected in a "forest" to output its classification result [3]. During the construction of the forest, for each feature, a normalized total reduction value called the Gini importance of the feature is computed. The features are arranged in descending order of Gini importance and the best k features are selected. Based on this value, we kept only the important features and removed the rest. All the vectors are reshaped to 2D gray images. Fig. 3 shows the difference between sample images before (first row) and after (second row) the feature selection process.

4.2 Model Layers
We explain the layers of the CNN model in Fig. 4.

1. Input Layer
The input layer of a CNN contains image data represented by a 3D matrix. We have used the format of the feature selection matrix, which is a 1D vector, and reshaped the vectors of all the apps into 2D gray images. The third dimension is an equal number of RGB values automatically computed by the model.

2. Convolutional Layers
We have eight layers in this section, including:
- Convolutional Layer 1: Applies 32 3x3 filters (extracting 3x3-pixel subregions), with ReLU activation function.
- Pooling Layer 1: Performs max pooling with a 2x2 filter and stride of 2 (which specifies that pooled regions do not overlap), with a dropout regularization rate of 0.015 (probability of 0.015 that any given element will be dropped during training).
- Convolutional Layer 2: Applies 64 3x3 filters, with ReLU activation function.
- Pooling Layer 2: Performs max pooling with a 2x2 filter and stride of 2, with a dropout regularization rate of 0.005.
- Convolutional Layer 3: Applies 128 3x3 filters, with ReLU activation function.
- Pooling Layer 3: Performs max pooling with a 2x2 filter and stride of 2, with a dropout regularization rate of 0.005.
- Flatten Layer 1: Reshapes the tensor to a shape equal to the number of elements contained in the tensor, not including the batch dimension (512).
- Dense Layer 1: 256 neurons, with a dropout regularization rate of 0.005.

3. Output Layer
A one-hot encoder is applied to the category names in order to change the format of the data from categorical to numerical values, which means the output layer contains only the label in one-hot encoded form. We have 13 classes (12 malware and 1 benign). Therefore, the output layer is:
- Dense Layer 2: 13 neurons.
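The one-hot encoding of the 13 class labels used by the output layer can be sketched as below. This is an illustrative example in plain Python, not the authors' code. The 12 malware names are the categories the paper reports using in its experiments, "Benign" is the 13th class, and the alphabetical ordering and the helper names `one_hot` and `decode` are assumptions.

```python
# 12 malware categories plus the benign class; the ordering is assumed.
CLASSES = [
    "Adware", "Backdoor", "Benign", "FileInfector", "PUA", "Ransomware",
    "Riskware", "Scareware", "Trojan", "Trojan-Banker", "Trojan-Dropper",
    "Trojan-SMS", "Trojan-Spy",
]


def one_hot(label, classes=CLASSES):
    """Vector with a single 1 at the position of `label`."""
    vec = [0] * len(classes)
    vec[classes.index(label)] = 1
    return vec


def decode(scores, classes=CLASSES):
    """Map a vector of per-class scores back to the highest-scoring label."""
    best = max(range(len(scores)), key=lambda k: scores[k])
    return classes[best]


encoded = one_hot("Ransomware")
roundtrip = decode(encoded)
```

The vector length matches the 13 neurons of the final dense layer, so each output neuron corresponds to one class.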


Figure 3: 2D images of 4 apps before and after feature selection
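A feature vector can be turned into a 2D gray image of the kind shown in Fig. 3 along the following lines. This is a minimal sketch in plain Python and rests on several assumptions, since the paper does not spell out the reshaping: the image is square, missing cells are zero-padded, and values are scaled linearly to the 0 to 255 range. The helper names `vector_to_image` and `to_gray` are hypothetical.

```python
import math


def vector_to_image(vector, pad_value=0):
    """Reshape a 1D vector into the smallest square grid that holds it.

    Assumption: trailing cells that the vector does not fill are padded.
    """
    side = math.isqrt(len(vector))
    if side * side < len(vector):
        side += 1
    padded = list(vector) + [pad_value] * (side * side - len(vector))
    return [padded[r * side:(r + 1) * side] for r in range(side)]


def to_gray(image, max_value=1):
    """Scale values linearly to 0..255 gray levels (assumed convention)."""
    return [[round(255 * v / max_value) for v in row] for row in image]


# Hypothetical 5-element binary feature vector.
features = [1, 0, 1, 1, 0]
image = vector_to_image(features)
gray = to_gray(image)
```

Feature selection shortens the vector, which under these assumptions yields a smaller or sparser image for the same app.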

Figure 4: Flow chart of the proposed Model
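The layer stack summarized in Fig. 4 can be approximated with the following Keras sketch. It is not the authors' implementation. The filter counts of the first two convolutional layers (32 and 64), the dropout after the first pooling layer, the 50x50 input size, and the function name `build_model` are assumptions; the 128-filter third convolution, the 2x2 pooling with stride 2, the 0.005 dropout rate, the 256-neuron dense layer, the 13-neuron softmax output, the Adam optimizer and the loss name come from the paper.

```python
from tensorflow.keras import layers, models

IMG_SIDE = 50      # assumed image side length (not stated in the paper)
NUM_CLASSES = 13   # 12 malware categories + 1 benign class
DROPOUT = 0.005    # dropout rate given in the paper


def build_model(img_side=IMG_SIDE, num_classes=NUM_CLASSES):
    """Build a small CNN resembling the described layer stack."""
    model = models.Sequential([
        layers.Input(shape=(img_side, img_side, 1)),     # one gray channel
        layers.Conv2D(32, (3, 3), activation="relu"),    # assumed filter count
        layers.MaxPooling2D(pool_size=(2, 2), strides=2),
        layers.Dropout(DROPOUT),                         # assumed for layer 1
        layers.Conv2D(64, (3, 3), activation="relu"),    # assumed filter count
        layers.MaxPooling2D(pool_size=(2, 2), strides=2),
        layers.Dropout(DROPOUT),
        layers.Conv2D(128, (3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2), strides=2),
        layers.Dropout(DROPOUT),
        layers.Flatten(),
        layers.Dense(256, activation="relu"),
        layers.Dropout(DROPOUT),
        layers.Dense(num_classes, activation="softmax"),
    ])
    # sparse_categorical_crossentropy expects integer class indices;
    # with one-hot labels, categorical_crossentropy is the matching loss.
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model


if __name__ == "__main__":
    build_model().summary()
```

With the assumed 50x50 input, the flattened size differs from the 512 reported in the paper, which suggests the authors used a different image size or padding.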

5 DATASET
To test and evaluate our proposed methodology, we collaborated with the Canadian Centre for Cyber Security (CCCS) [10] to generate a new dataset, namely CCCS-CIC-AndMal-2020 [21], which includes 400K android apps (200K benign and 200K malware).
• Malware Data:
CCCS provided us with their real-world collected malware samples for analysis. We used VirusTotal [7] to determine the malware family of each sample and label the dataset. Since VirusTotal aggregates many anti-viruses, we labeled every family in the dataset by following a consensus of 70% of the anti-viruses to make the labels reliable. We searched for similar malware samples to group samples with similar characteristics. Finally, we obtained the android malware data distribution shown in Fig. 5. We have 14 malware categories: Adware, Backdoor, FileInfector, No_Category, Potentially Unwanted Apps (PUA), Ransomware, Riskware, Scareware, Trojan, Trojan-Banker, Trojan-Dropper, Trojan-SMS, Trojan-Spy and Zero-Day. We used 12 malware categories in the experimental analyses, excluding No_Category and Zero-Day owing to incomplete data in these categories.

Figure 5: Malware Categories

• Benign Data:
For benign android apps, we used the Androzoo dataset, which currently contains more than eight million unique android apps, and the number is still growing [29]. The Androzoo architecture collects apps from different sources including the official android market, Google Play, Anzhi, AppChina, 1mobile, and the Genome project dataset. A weekly updated list containing all the detailed information about the apps is created. An HTTP API is provided to allow the full download of the unaltered APKs from the Androzoo dataset. The dataset has already been used to conduct research in the field of machine learning-based malware detection.
• Taxonomy:
A comprehensive understanding of the existing android malware attacks supported by a unified terminology is necessarily required


Table 3: Taxonomy
Column groups: Sensitive Data Collection (D1-D5); Media (M1-M3); Hardware (H1-H4); Actions / Activities (A1-A5); Internet Connection (I1-I4); C&C (C); AntiVirus (AV); Storage & Settings (S1-S3)
Columns: Family, # Samples, Year, D1 D2 D3 D4 D5 M1 M2 M3 H1 H2 H3 H4 A1 A2 A3 A4 A5 I1 I2 I3 I4 C AV S1 S2 S3

Category (#Family): Adware (48)
dowgin 2679 2013 X X X X X
adflex 418 2014 X X X X X X X X
admogo 79 2014 X X X X X X
adviator 77 2015 X X X X X X
adwo 188 2014 X X X X
airpush 2242 2011 X X X X X X
appad 92 2016 X X X X X
appsgeyser 60 2014 X X X
baiduprotect 984 2015 X X X X X X X X X X
batmobi 458 2017 X X X X X X
dianjin 45 2016 X X X X X X X X
dianle 19 2014 X X X X
domob 103 2014 X X X X X X X X
ewind 1047 2017 X X X
feiwo 108 2015 X X X X X X X
fictus 349 2015 X X X
ganlet 28 2014 X X X
adend 301 2015 X X X X X X
gmobi 17 2016 X X X X
hiddenad 61 2016 X X X X X X
hummingbad 28 2016 X X X X X X
igexin 82 2014 X X X X X
inmobi 330 2015 X X
inoco 5649 2014 X X X X
kalfere 113 2015 X X X
kuguo 1015 2012 X X X X X X
leadbolt 233 2007 X X X
mobclick 41 2014 X X X X X
mobidash 1033 2015 X X X X X
mobisec 117 2016 X X X X X X
mulad 171 2012 X X X X
oimobi 913 2015 X X X X X
shedun 19036 2017 X X X X
sprovider 227 2015 X X X X X
viser 31 2014 X X X X
wooboo 16 2015 X X X X X
xynyin 44 2015 X X X
zdtad 5694 2015 X X X X
frupi 43 2014 X X X X X
kyhub 28 2016 X
stopsms 26 2014 X X X X
loki 46 2016 X X X X X X X
kyview 127 2013 X X X X X
pandaad 50 2017 X X X X
plague 14 2015 X X X X X
accutrack 7 2014 X
adcolony 17 2016 X
gexin 3 2018 X X

Category (#Family): Backdoor (11)
kapuser 15 2017 X X X X X
kmin 24 2011 X X X X X
fobus 171 2014 X X X
mobby 119 2018 X X X X X X
hiddad 664 2016 X X X
moavt 166 2014 X X X X X X
androrat 129 2013 X X X X X X X
dendroid 48 2015 X X X X X X
levida 51 2015 X X X X X X X
pyls 24 2011 X X X X
droidkungfu 50 2011 X X X

Category (#Family): File_Infector (5)
commplat 77 2014 X X X X X
leech 99 2015 X X X
tachi 45 2015 X X X
gudex 14 2014 X X X X X
aqplay 407 2009 X X X X

Category (#Family): PUA (8)
apptrack 92 2015 X X X X X X X X X
cauly 27 2014 X X
secapk 1004 2015 X X X
umpay 67 2007 X X X X
wiyun 11 2015 X X X X
youmi 529 2015 X X X X
utchi 139 2012 X X X X
scamapp 99 2015 X X X X

Category (#Family): Ransomware (8)
masnu 35 2013 X X X X
congur 252 2017 X X X X X X X X X
fusob 67 2015 X X X X
jisut 820 2014 X X X X X X
koler 79 2014 X X X X X X X
lockscreen 356 2017 X X X
slocker 998 2014 X X X X
smsspy 3319 2012 X X X X X

Category (#Family): Riskware (21)
skymobi 10229 2015 X X X X
anydown 57 2013 X X
badpac 45 2017 X X X X X
deng 58 2016 X X X X X
dnotua 36 2016 X X X X X
jiagu 721 2016 X X
metasploit 28 2012 X X X
mobilepay 1197 2017 X X X X X
remotecode 36 2016 X
revmob 806 2015 X X X X X X X
secneo 27 2016 X X X X X X X X X X
smspay 28512 2014 X X X
smsreg 50073 2014 X X X X X
talkw 49 2014 X X X
tencentprotect 144 2015 X X X X X X
tordow 7 2016 X X X X X X X X
triada 493 2015 X X X X X
wapron 93 2016 X X X X X
nqshield 46 2015 X X X
kingroot 24 2013 X X X X X X
wificrack 15 2015 X X

Category (#Family): Scareware (3)
avpass 126 2013 X X X
mobwin 23 2014 X X X X
fakeapp 1332 2012 X X X X X X X X X

Table 4: Taxonomy (Continued)
Column groups: Sensitive Data Collection (D1-D5); Media (M1-M3); Hardware (H1-H4); Actions / Activities (A1-A5); Internet Connection (I1-I4); C&C (C); AntiVirus (AV); Storage & Settings (S1-S3)
Columns: Family, # Samples, Year, D1 D2 D3 D4 D5 M1 M2 M3 H1 H2 H3 H4 A1 A2 A3 A4 A5 I1 I2 I3 I4 C AV S1 S2 S3

Category (#Family): Trojan (45)
autosms 239 2016 X X X
coinge 16 2014 X X X X
droiddreamlight 15 2011 X X X X X
gluper 680 2018 X X X
hiddenapp 157 2016 X X X X X X
iconosys 33 2012 X X X X X X X X X
lotoor 661 2010 X X X X
mobtes 343 2016 X X
mseg 148 2013 X X X X
qysly 94 2016 X X X X X X
rootnik 474 2015 X X X X X X
syringe 99 2016 X
wkload 143 2013 X X X
zbot 85 2010 X X X X X
hyspu 112 2016 X X X X X
basebridge 63 2012 X X X X X X X X X
boogr 218 2016 X X X
lovetrap 48 2011 X X X X
oveead 30 2016 X X X X
rusms 27 2014 X X X
systemmonitor 61 2014 X X X X
uupay 27 2014 X X X X X X X X X X
wintertiger 24 2013 X X X X X X
typstu 28 2012 X X X X
blouns 652 2017 X X X
autoins 479 2014 X X X X X
cnsms 3413 2014 X
gappusin 766 2012 X X X
gedma 11 2013 X X X
ginmaster 130 2011 X X X X X
hypay 360 2016 X X X X
mytrackp 1054 2013 X X X X X
subspod 11 2017 X X X
walkfree 15 2015 X X X
xinyinhe 59 2014 X X X X X X X X X
drosel 59 2016 X X X X
uapush 11 2013 X X X X X X X X X X X
uten 9 2013 X X X X X X
smsagent 1166 2012 X X X X X
styricka 833 2015 X X X X X
autoinst 12 2019 X
noicondl 33 2018 X X X X
obtes 5 2019 X
droiddream 3 2011 X X
hiddenap 3 2016 X

Category (#Family): Trojan-Banker (11)
asacub 260 2015 X X X X X X X X X X X X X
fakebank 17 2013 X X X X X X X X
faketoken 52 2012 X X X X X X
marcher 87 2015 X X X X X
minimob 56 2013 X X X X X X X X
guerrilla 256 2017 X X X X X
bankbot 4 2014 X X X X X X X X X
gugi 8 2017 X X X X X X
svpeng 68 2014 X X X X X
wroba 9 2016 X X
zitmo 40 2012 X X X X X

Category (#Family): Trojan-Dropper (9)
cnzz 19 2015 X X X X X
locker 1296 2015 X X X X X
rooter 51 2012 X X X X
xiny 31 2016 X X X X X X X X X
boqx 106 2012 X X X X
hqwar 118 2018 X X X
ramnit 84 2014 X X X X X
ztorg 500 2016 X X X X X X
gorpo 16 2014 X X X X X X

Category (#Family): Trojan-SMS (11)
opfake 368 2013 X
hipposms 20 2011 X X
podec 13 2015 X X X X X
feejar 56 2013 X
smsdel 40 2012 X X X X X
plankton 186 2011 X X X X
jsmshider 21 2011 X X X X
smsbot 42 2013 X X X X X
boxer 87 2012 X X X
fakeinst 2148 2011 X
vietsms 13 2013 X X X

Category (#Family): Trojan-Spy (11)
spynote 21 2018 X X X X X X X X X
kasandra 29 2014 X X X
spyagent 48 2006 X X X X X X X X
spyoo 13 2012 X X X X X X X X X X X
tekwon 19 2013 X X X X X
sandr 208 2014 X X X
qqspy 27 2017 X X X X
smforw 1873 2013 X X
smsthief 1058 2013 X X X
smszombie 52 2012 X X X
spydealer 1 2015 X X X X X X X X X X X X
D1: Collect Personal Info (phone number, email address, app accounts) & Browser history
D2: Collect User Contacts
D3: Send / Receive Spam Emails
D4: Steal Banking Info or Get Passwords and Confidential Data
D5: Send / Receive SMS
M1: Make Call / Collect call history
M2: Use Camera / Collect pictures
M3: Record Audio / Use microphone
H1: Slow the Phone & Affect Battery
H2: Collect Phone info (N°, IMEI, ID, status, ...)
H3: Get Location (GPS)
H4: Lock / Block the Phone or Change PIN
A1: Root Access (Ask for Root privileges)
A2: Collect Details / List of Running / Installed Applications
A3: Block / Delete / Use Phone applications or Remotely control the Phone
A4: Start execution after a Delay of time
A5: Start execution after Reboot / Cause Repetitive restart
I1: Steal Network Info (Wifi, IP, DNS, ...) / Connect to the internet
I2: Access / Redirect user to Malicious websites
I3: Install other Malicious apps, code or files
I4: Show Ads / Notifications or Warnings or Send URLs & shortcuts
C: Communicate with C&C server
AV: Uninstall AV, Avoid it somehow, or Detect whether it is installed
S1: Encrypt Data or Use encryption to avoid Detection
S2: Modify / Collect / Access Files & Settings
S3: Use External data / Create Memory guarded regions
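The taxonomy in Tables 3 and 4 maps each family to a category and a set of behaviour codes. One possible in-memory representation is sketched below. The `Family` record, the helper names, and the example family with its behaviour set are hypothetical and are not taken from the tables; only the code names and column groups come from the legend.

```python
from dataclasses import dataclass

# Column groups of Tables 3 and 4, keyed by the letter prefix of each code.
GROUPS = {
    "D": "Sensitive Data Collection",
    "M": "Media",
    "H": "Hardware",
    "A": "Actions / Activities",
    "I": "Internet Connection",
    "C": "C&C",
    "AV": "AntiVirus",
    "S": "Storage & Settings",
}

# All 26 behaviour codes, in table column order.
CODES = (
    [f"D{i}" for i in range(1, 6)] + [f"M{i}" for i in range(1, 4)]
    + [f"H{i}" for i in range(1, 5)] + [f"A{i}" for i in range(1, 6)]
    + [f"I{i}" for i in range(1, 5)] + ["C", "AV"]
    + [f"S{i}" for i in range(1, 4)]
)


@dataclass(frozen=True)
class Family:
    name: str
    category: str
    samples: int
    year: int
    behaviours: frozenset


def group_of(code):
    """Return the column group of a behaviour code, e.g. 'D2' or 'AV'."""
    if code not in CODES:
        raise ValueError(f"unknown behaviour code: {code}")
    return GROUPS[code.rstrip("0123456789")]


def behaviours_by_group(family):
    """Group a family's behaviour codes by column group, in column order."""
    grouped = {}
    for code in CODES:
        if code in family.behaviours:
            grouped.setdefault(group_of(code), []).append(code)
    return grouped


if __name__ == "__main__":
    # Hypothetical family, for illustration only.
    example = Family("examplefam", "Adware", 100, 2015,
                     frozenset({"D1", "I4", "C"}))
    print(behaviours_by_group(example))
```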


for the deployment of reliable defence mechanisms against these attacks. Table 3 and Table 4 present our dataset taxonomy of 191 malware families classified under 12 malware categories. This taxonomy is based on related technical reports and previously published papers.

6 EXPERIMENTS
In this section, we present the experiments designed to execute the model. The proposed system has been implemented in Python using Keras and TensorFlow. The following parameters are configured to perform the experiments:
- Activation (Hidden layers): ReLU
- Activation (Output layer): Softmax
- Loss Function: sparse_categorical_crossentropy
- Optimizer: adam
- Epoch: 50
- Batch Size: 16
Our implementation extracts all features from the dataset as described earlier. Experiments are conducted on an Ubuntu server with 50 CPUs and 500GB of RAM. The major challenge in designing an effective classification model is to eliminate irrelevant, redundant, or noisy features and retain only the highly discriminative features.
We divided the experimental process into two parts. The first part of the experiment is performed to assess the effectiveness of the CNN model itself rather than the importance of each feature type. To begin with, it is imperative to clean the dataset to remove all types of impurities. Our dataset had a few missing values and very few unknown values, which were replaced by zero. All the non-numeric columns (malware hash, family, category, and the binary field specifying whether the sample is benign or malware) are removed from the dataset because only numeric values are required for computation and compilation of the model.
Hyperparameters are tuned to prevent overfitting while training the model. The patience value of the early stopping monitor is identified and set so that training stops once that many epochs have passed without a significant improvement in the result. The dataset was split into training and testing sets in a standard 80:20 ratio after experimenting with a 70:30 split that was less fruitful. The second part of the experiment emphasizes selecting the best features from our feature set and applying those features to run the CNN model again to improve the accuracy of prediction.

7 ANALYSIS AND DISCUSSION
To evaluate the effectiveness of our model, we used common performance metrics for classification problems such as confusion matrix, accuracy, log loss, precision, recall, and F1-score.
• Confusion Matrix:
In the first part of the experiment, the 2465 best features were selected from a total of 9504 features, whereas in the second part we shortlisted these 2465 features to 2237. Fig. 6 shows the confusion matrix generated after executing the model to predict 12 malware categories. Summation of the diagonal values in the confusion matrix shows that 31,901 (93.3%) malware samples are correctly classified into their respective categories whereas only 2,299 (6.7%) samples are misclassified. Further analysis of the confusion matrix reveals that out of 9,026 Adware samples, 8,492 are correctly classified by the model, leaving only 534 samples classified as other malware categories. Moreover, if we take a look at Riskware, which has the largest number of samples in the dataset, 95.91% of the samples are classified correctly. The accuracy of malware categories like Adware, FileInfector, Ransomware, Riskware, SMS and Spyware, which have more samples in the dataset, is more than 91%. However, the model classified 156 Backdoor samples correctly, reducing the accuracy of this category to the lowest for an individual category (59.93% only). The same is true for Dropper, PUA, and Scareware. We can generalize that when the number of testing set instances is comparatively much smaller, the model has not converged properly, leading to decreased accuracy. A possible cause is insufficient training for such malware categories.

Figure 6: Confusion Matrix

• Accuracy:
In the first part of the experiment, the accuracy of all the malware categories is computed and plotted in Fig. 7, which represents model accuracy for each epoch. Computing accuracy helps to find out how many individual malware categories are correctly classified and how many samples in the testing set are labeled properly by the model. In the same way, we computed the training and testing set accuracy while fitting the model. It is clear from the figure that the model was properly trained after 10 epochs, at which point training was stopped to prevent overfitting. The training accuracy is more than 93.36% and the testing accuracy is more than 93%.
To perform the second part of the experiment, we further shortlisted the best features from part one, from 2465 to 2237, to feed to the model. As explained in Step 4 in Section III, selecting the best features yields better results. Our model was able to classify a larger number


Figure 7: Model Accuracy and Loss
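The two quantities behind Fig. 7 and the stopping rule can be made concrete with a small plain-Python sketch: the log loss for integer class labels, and a patience-based early-stopping check. This illustrates the mechanism only and is not the authors' code. It assumes the monitored quantity is the validation loss (the default of the Keras `EarlyStopping` callback); the function names and the loss values in the example are made up.

```python
import math


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-probability assigned to the true class.

    y_true holds integer class indices; y_prob holds one probability
    vector per sample.
    """
    total = 0.0
    for label, probs in zip(y_true, y_prob):
        p = min(max(probs[label], eps), 1 - eps)  # avoid log(0)
        total -= math.log(p)
    return total / len(y_true)


def early_stopping_epoch(val_losses, patience=3):
    """Return the 1-based epoch at which training would stop.

    Training stops once `patience` consecutive epochs bring no improvement
    over the best loss so far; otherwise all epochs run.
    """
    best = math.inf
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, wait = loss, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return len(val_losses)


if __name__ == "__main__":
    print(round(log_loss([0, 1], [[0.9, 0.1], [0.2, 0.8]]), 4))
    # Made-up validation losses: best at epoch 4, then three worse epochs.
    print(early_stopping_epoch([0.90, 0.60, 0.45, 0.40, 0.41, 0.42, 0.43, 0.39]))
```

In the second example the run stops at epoch 7 and never sees the better loss at epoch 8, which is the trade-off a small patience value makes.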

Table 5: Validation Results

Malware Category Precision Recall F1-Score Accuracy #Training Instances #Testing Instances
Adware 0.935 0.929 0.932 92.82 35592 8899
Backdoor 0.721 0.643 0.680 59.93 1169 292
Banker 0.759 0.759 0.759 92.4 686 171
Dropper 0.850 0.686 0.759 63.96 1777 444
FileInfector 0.909 0.789 0.845 70.31 514 128
PUA 0.677 0.682 0.679 69.29 1574 394
Ransomware 0.798 0.944 0.864 91.98 4741 1185
Riskware 0.963 0.967 0.965 96.55 74157 18539
SMS 0.917 0.886 0.901 93.99 2395 599
Scareware 0.836 0.764 0.799 74.32 1185 296
Spyware 0.924 0.835 0.877 91.94 2679 670
Trojan 0.895 0.896 0.896 89.09 10327 2582
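The metrics reported in Table 5 follow from the confusion matrix. The sketch below shows how per-class precision, recall and F1-Score can be derived from a confusion matrix whose rows are true classes and whose columns are predicted classes. The 2x2 matrix is a toy example, not data from the paper, and the function names are our own.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def per_class_metrics(confusion):
    """Return a (precision, recall, f1) tuple for every class.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    n = len(confusion)
    results = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                       # missed samples of class k
        fp = sum(confusion[r][k] for r in range(n)) - tp  # wrongly assigned to class k
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        results.append((precision, recall, f1_score(precision, recall)))
    return results


def overall_accuracy(confusion):
    """Fraction of samples on the diagonal of the confusion matrix."""
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


if __name__ == "__main__":
    toy = [[8, 2],
           [1, 9]]  # toy counts, for illustration only
    print(per_class_metrics(toy))
    print(overall_accuracy(toy))
```

As a consistency check against the reported numbers, `f1_score(0.935, 0.929)` rounds to the 0.932 listed for Adware, and 31,901 correct out of 34,200 samples gives the 93.3% quoted in the text.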

of malware instances correctly, and model accuracy for the training set increased to 94%, whereas the testing accuracy remained the same.
• Log Loss:
Logarithmic loss, or simply log loss, is a classification loss function which quantifies the accuracy of a classifier by penalizing false classifications. Minimizing the log loss is closely related, though not strictly equivalent, to maximizing the accuracy of the classifier. It can be seen from Fig. 7 that the loss for both the training and testing sets decreases as the number of epochs increases. The final value of log loss for the training curve is less than 10%, while for the testing curve it turns out to be under 25%, for both parts of the experiments.
• Precision, Recall, and F1-Score:
Table 5 presents the values of precision, recall, and F1-Score along with the total number of training and testing set instances used for experimentation for all malware categories. Precision and recall values are high for most of the malware categories, and there is a balance between precision and recall, as confirmed by the F1-Score, which is more than 80% on average across all malware categories. High precision values indicate that the instances the model assigns to a category mostly do belong to it. Similarly, high recall values indicate that the model was able to find most of the relevant instances in the huge dataset.
• Problem of overfitting:
It is observed while running the model that the performance of the model first increases with the number of epochs and starts decreasing after a point. In other words, the model starts to overfit the training data. If the threshold at which the model begins to overfit is not identified, the performance will decline and model execution time will increase. This will lead to resource exhaustion in the worst case. Overfitting results in poor generalization beyond the training data and in incorrectly identified malware instances in the confusion matrix. To overcome this problem, an EarlyStopping callback with a patience value of 3 is passed when calling the model.fit() function [1]. This callback monitors training performance and, once triggered, stops training.

8 CONCLUSION AND FUTURE SCOPE
Android malware is one of the most serious threats on the internet and has witnessed an unprecedented upsurge in recent years. It is an open challenge for cybersecurity experts. There are many techniques available to identify and classify android malware based on machine learning, but recently, deep learning has emerged as a prominent classification method for such samples. This paper introduces DIDroid, which implements a malware classification and characterization technique using feature extraction and image-based deep learning. We have obtained promising results for DIDroid with an accuracy of 93.36% and a low model loss. Moreover,


most of the instances of all the 191 families and 12 malware categories are correctly classified by DIDroid, proving its effectiveness. Unlike previous best models, DIDroid is scalable as it can classify a larger dataset and more malware families. It supports multi-class characterization with high accuracy and a lower false negative rate, as proposed in the hypothesis.
However, there are some limitations of the proposed model. Firstly, the current model deployment plan includes one sequential input layer, eight hidden layers, and one output layer. The number of hidden layers can be increased to accommodate a larger dataset, which is expected to improve model performance. Secondly, the dataset contains thousands of features divided into 6 categories. We have used all the features in one go to evaluate the model. We took 9504 features in the first part of the experiment and shortlisted the 2465 best features from them to feed to the model. In the second part of the experiment, we used the 2465 best features selected in part one and further shortlisted the 2237 best features to execute the model. However, permutations of feature categories can be formed to find the best feature set. These two limitations are treated as future work for the research.

AVAILABILITY
The source code for the Android App static analyzer and classifier is publicly available on GitHub [1] and the dataset is publicly available at [9].

ACKNOWLEDGMENTS
We thank the Mitacs Globalink Program for providing the Research Internship (GRI) opportunity and the Harrison McCain Young Scholar Foundation funds from the University of New Brunswick (UNB) for supporting this project. We also thank CCCS for sharing the CCCS-CIC-AndMal-2020 dataset with us.

REFERENCES
[1] 2019. Android App Static analyzer and classifier. Retrieved 01 Oct 2019 from https://github.com/ahlashkari/AndroidAppStaticlyzer
[2] 2019. Cyber attacks on Android devices on the rise. Retrieved 15 Jul 2019 from https://www.gdatasoftware.com/blog/2018/11/31255-cyber-attacks-on-android-devices-on-the-rise
[3] 2019. An extra-trees classifier. Retrieved 30 Jun 2019 from https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.ExtraTreesClassifier.html
[4] 2019. Operating System Market Share Worldwide. Retrieved 12 Jun 2019 from https://gs.statcounter.com/os-market-share
[5] 2019. Smartphone Market Share. Retrieved 25 Oct 2019 from https://www.idc.com/promo/smartphone-market-share/os
[6] 2019. A tool for reverse engineering Android apk files. Retrieved 01 Aug 2019 from https://ibotpeaches.github.io/Apktool/
[7] 2019. VirusTotal Website. Retrieved 10 Jun 2020 from https://www.virustotal.com/gui/home/upload
[8] 2019. Why Deep Learning over Traditional Machine Learning? Retrieved 24 Jul 2019 from https://towardsdatascience.com/why-deep-learning-is-needed-over-traditional-machine-learning-1b6a99177063
[9] 2020. Android Malware 2020. Retrieved 10 Jan 2020 from https://www.unb.ca/cic/datasets/andmal2020.html
[10] 2020. Canadian Centre for Cyber Security. Retrieved 10 Jan 2020 from https://cyber.gc.ca/en/
[11] Anshul Arora and Sateesh K Peddoju. 2018. NTPDroid: A Hybrid Android Malware Detector Using Network Traffic and System Permissions. 17th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/ 12th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE), New York, USA (2018), 808–813. https://doi.org/10.1109/TrustCom/BigDataSE.2018.00115
[12] Recep Sinan Arslan, İbrahim Alper Doğru, and Necaattin Barişçi. 2019. Permission-Based Malware Detection System for Android Using Machine Learning Techniques. International Journal of Software Engineering and Knowledge Engineering 29, 1 (2019), 43–61. https://doi.org/10.1142/S0218194019500037
[13] Mehmet Ali Atici, Seref Sagiroglu, and Ibrahim Alper Dogru. 2016. Android malware analysis approach based on control flow graphs and machine learning algorithms. 4th International Symposium on Digital Forensic and Security, Little Rock, AR (2016), 26–31. https://doi.org/10.1109/ISDFS.2016.7473512
[14] Zhenxiang Chen, Qiben Yan, Hongbo Han, Shanshan Wang, Lizhi Peng, Lin Wang, and Bo Yang. 2018. Machine learning based mobile malware detection using highly imbalanced network traffic. Information Sciences 433-434 (2018), 346–364.
[15] Omid E. David and Nathan S. Netanyahu. 2015. DeepSign: Deep Learning for Automatic Malware Signature Generation and Classification. International Joint Conference on Neural Networks (IJCNN) (2015), 1–8. https://doi.org/10.1109/IJCNN.2015.7280815
[16] Omid E. David and Nathan S. Netanyahu. 2018. Andro_MD: Android Malware Detection based on Convolutional Neural Networks. International Journal of Performability Engineering 14, 3 (2018).
[17] K.A. Dhanya and T. Gireesh Kumar. 2019. Efficient Android Malware Scanner Using Hybrid Analysis. International Journal of Recent Technology and Engineering 7, 5S3 (2019), 168–176.
[18] Ming Fan, Xiapu Luo, Jun Liu, Meng Wang, Chunyin Nong, Qinghua Zheng, and Ting Liu. 2019. Graph Embedding based Familial Analysis of Android Malware using Unsupervised Learning. Proceedings of the 41st International Conference on Software Engineering (2019), 771–782. https://doi.org/10.1109/ICSE.2019.00085
[19] Parvez Faruki, Bharat Buddhadev, Bhavya Shah, Akka Zemmari, Vijay Laxmi, and Manoj Singh Gaur. 2019. DroidDivesDeep: Android Malware Classification via Low Level Monitorable Features with Deep Neural Networks. Security and Privacy, Communications in Computer and Information Science, Springer, Singapore 939 (2019), 125–139.
[20] Ali Feizollah, Nor Badrul Anuar, Rosli Salleh, Guillermo Suarez-Tangil, and Steven Furnell. 2017. AndroDialysis: Analysis of Android Intent Effectiveness in Malware Detection. Computers & Security 65 (2017), 121–134. https://doi.org/10.1109/ICSE.2019.00085
[21] Francois Gagnon and Frederic Massicotte. 2017. Revisiting Static Analysis of Android Malware. Proceedings of the 10th USENIX Conference on Cyber Security Experimentation and Test, Vancouver, BC, Canada (2017).
[22] Joshua Garcia, Mahmoud Hammad, and Sam Malek. 2016. Lightweight, Obfuscation-Resilient Detection and Family Identification of Android Malware. ACM Transactions on Software Engineering Methodology 26, 3 (2016).
[23] Abdullah Talha Kabakus and Ibrahim Alper Dogru. 2018. An in-depth analysis of Android malware using hybrid techniques. Digital Investigation 24 (2018), 25–33.
[24] Soham Kelkar, Timothy Kraus, Daria Morgan, Junjie Zhang, and Rui Dai. 2018. Analyzing HTTP-Based Information Exfiltration of Malicious Android Applications. 17th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/ 12th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE), New York, USA (2018), 1642–1645. https://doi.org/10.1109/TrustCom/BigDataSE.2018.00242
[25] Arash Habibi Lashkari, Andi Fitriah A. Kadir, Hugo Gonzalez, Kenneth Fon Mbah, and Ali A. Ghorbani. 2017. Towards a Network-Based Framework for Android Malware Detection and Characterization. In: Proceeding of the 15th International Conference on Privacy, Security and Trust, PST, Calgary, Canada (2017), 233–23309. https://doi.org/10.1109/PST.2017.00035
[26] Arash Habibi Lashkari, Laya Taheri, Andi Fitriah A. Kadir, and Ali A. Ghorbani. 2018. Toward Developing a Systematic Approach to Generate Benchmark Android Malware Datasets and Classification. In: Proceeding of the IEEE ICCST’18, Montreal, Quebec, Canada (2018), 1–7. https://doi.org/10.1109/CCST.2018.8585560
[27] Tao Lei, Zhan Qin, Zhibo Wang, Qi Li, and Dengpan Ye. 2019. EveDroid: Event-Aware Android Malware Detection Against Model Degrading for IoT Devices. Digital Investigation 6, 4 (2019), 6668–6680. https://doi.org/10.1109/JIOT.2019.2909745
[28] Heng Li, ShiYao Zhou, Wei Yuan, Jiahuan Li, and Henry Leung. 2019. Adversarial-Example Attacks Toward Android Malware Detection System. IEEE Systems Journal 14, 1 (2019), 653–656. https://doi.org/10.1109/JSYST.2019.2906120
[29] Li Li, Jun Gao, Médéric Hurier, Pingfan Kong, Tegawendé F. Bissyandé, Alexandre Bartel, Jacques Klein, and Yves Le Traon. 2017. AndroZoo++: Collecting Millions of Android Apps and Their Metadata for the Research Community. Proceedings of the 13th International Conference on Mining Software Repositories (2017), 468–471. https://arxiv.org/abs/1709.05281
[30] Jie Ling, Xuejing Wang, and Yu Sun. 2019. Research of Android Malware Detection based on ACO Optimized Xgboost Parameters Approach. Proceedings of the 3rd International Conference on Mechatronics Engineering and Information Technology (2019), 364–371. https://doi.org/10.2991/icmeit-19.2019.60
[31] Xiaolei Liu, Xiaojiang Du, Xiaosong Zhang, Qingxin Zhu, and Mohsen Guizani. 2019. Adversarial Samples on Android Malware Detection Systems for IoT Systems. Cornell University (2019). https://arxiv.org/abs/1902.04238


[32] Songhao Lou, Shaoyin Cheng, Jingjing Huang, and Fan Jiang. 2019. TFDroid: Android Malware Detection by Topics and Sensitive Data Flows Using Machine Learning Techniques. IEEE 2nd International Conference on Information and Computer Technologies, Kahului, HI, USA (2019), 30–36. https://doi.org/10.1109/INFOCT.2019.8711179
[33] Zhuo Ma, Haoran Ge, Yang Liu, Meng Zhao, and Jianfeng Ma. 2019. A Combination Method for Android Malware Detection Based on Control Flow Graphs and Machine Learning Algorithms. IEEE Access 7 (2019), 21235–21245. https://doi.org/10.1109/ACCESS.2019.2896003
[34] Enrico Mariconti, Lucky Onwuzurike, Panagiotis Andriotis, Emiliano De Cristofaro, Gordon Ross, and Gianluca Stringhini. 2017. MAMADROID: Detecting Android Malware by Building Markov Chains of Behavioral Models. (2017). https://arxiv.org/abs/1612.04433
[35] Josh McGiff, William G. Hatcher, James Nguyen, Wei Yu, Erik Blasch, and Chao Lu. 2019. Towards Multimodal Learning for Android Malware Detection. International Conference on Computing, Networking and Communications, Honolulu, HI, USA (2019), 432–436. https://doi.org/10.1109/ICCNC.2019.8685502
[36] Niall McLaughlin, Jesus Martinez del Rincon, BooJoong Kang, Suleiman Yerima, Paul Miller, Sakir Sezer, Yeganeh Safaei, Erik Trickel, Ziming Zhao, Adam Doupé, and Gail Joon Ahn. 2017. Deep Android Malware Detection. Proceedings of the Seventh ACM on Conference on Data and Application Security and Privacy (2017), 301–308. https://doi.org/10.1145/3029806.3029823
[37] Omid Mirzaei, Guillermo Suarez-Tangil, and Jose M. de Fuentes. 2019. AndrEnsemble: Leveraging API Ensembles to Characterize Android Malware Families. Proceedings of the 3rd International Conference on Mechatronics Engineering and Information Technology (2019).
[38] Aziz Mohaisen, Omar Alrawi, Jeman Park, Joongheon Kim, DaeHun Nyang, and Manar Mohaisen. 2018. Network-based Analysis and Classification of Malware using Behavioral Artifacts Ordering. EAI Endorsed Transactions on Security and Safety (2018), 1–14. https://doi.org/abs/1901.01185
[39] Azqa Nadeem, Christian Hammerschmidt, Carlos H. Ganan, and Sicco Verwer. 2019. MalPaCA: Malware Packet Sequence Clustering and Analysis. Computer Science (2019).
[40] Abdelmonim Naway and Yuancheng Li. 2019. Android Malware Detection Using Autoencoder. (2019), 1–9. https://arxiv.org/ftp/arxiv/papers.pdf
[41] Michele Scalas, Davide Maiorca, Francesco Mercaldo, Corrado Aaron Visaggio, Fabio Martinelli, and Giorgio Giacinto. 2019. On the Effectiveness of System API-Related Information for Android Ransomware Detection. Computers & Security 86 (2019), 168–182.
[42] XingPing Sun, JiaYuan Peng, HongWei Kang, and Yong Shen. 2019. Android Malware Detection using Sequential Convolutional Neural Networks. Journal of Physics: Conference Series 1168 (2019), 1–8. https://doi.org/10.1088/1742-6596/1168/6/062010
[43] Laya Taheri, Andi Fitriah A. Kadir, and Arash Habibi Lashkari. 2019. Extensible Android Malware Detection and Family Classification Using Network-Flows and API-Calls. 2019 International Carnahan Conference on Security Technology (ICCST), CHENNAI, India (2019), 1–8. https://doi.org/10.1109/CCST.2019.8888430
[44] Dominik Teubert, Johannes Krude, Samuel Schuppen, and Ulrike Meyer. 2017. Hugin: A Scalable Hybrid Android Malware Detection System. SECURWARE 2017: The Eleventh International Conference on Emerging Security Information, Systems and Technologies (2017), 168–176.
[45] Suman R. Tiwari and Ravi U. Shukla. 2018. An Android Malware Detection Technique Based on Optimized Permissions and API. International Conference on Inventive Research in Computing Applications (2018), 2611–2616. https://doi.org/10.1109/ICCONS.2018.8662939
[46] Shanshan Wang, Zhenxiang Chen, Qiben Yan, Ke Ji, Lin Wang, Bo Yang, and Mauro Conti. 2018. Deep and Broad Learning Based Detection of Android Malware via Network Traffic. IEEE/ACM 26th International Symposium on Quality of Service (IWQoS), Banff, AB, Canada (2018), 1–6. https://doi.org/10.1109/IWQoS.2018.8624143
[47] Wei Wang, Yuanyuan Li, Xing Wang, Jiqiang Liu, and Xiangliang Zhang. 2018. Detecting Android malicious apps and categorizing benign apps with ensemble of classifiers. Future Generation Computer Systems 78 (2018), 987–994. https://doi.org/10.1109/ICSE.2019.00085
[48] Ke Xu, Yingjiu Li, and Robert H. Deng. 2016. ICCDetector: ICC-Based Malware Detection on Android. IEEE Transactions on Information Forensics and Security 11, 6 (2016), 1252–1264. https://doi.org/10.1109/TIFS.2016.2523912
[49] Yao-Saint Yen and Hung-Min Sun. 2019. An Android mutation malware detection based on deep learning using visualization of importance from codes. Microelectronics Reliability 93 (2019), 109–114.
[50] Suleiman Y. Yerima and Sarmadullah Khan. 2019. Longitudinal performance analysis of machine learning based Android malware detectors. International Conference on Cyber Security and Protection of Digital Services, Oxford, United Kingdom (2019), 1–8. https://doi.org/10.1109/CyberSecPODS.2019.8885384
[51] Hanqing Zhang, Senlin Luo, Yifei Zhang, and Limin Pan. 2019. An Efficient Android Malware Detection System Based on Method-Level Behavioral Semantic Analysis. IEEE Access 7 (2019), 69246–69256. https://doi.org/10.1109/ACCESS.2019.2919796
