A Review of Deep Learning Models To Detect Malware in Android Applications

This document reviews deep learning models that have been used to detect malware in Android applications. It finds that convolutional neural networks, gated recurrent neural networks, deep neural networks, bidirectional long short-term memory, long short-term memory (LSTM) and cubic-LSTM are the most prominent deep learning models used for this purpose. However, monitoring malware behavior and information flow is challenging due to the evolving nature of malware and human behavior. Training users and sharing updated malware datasets are important for developing effective detection models. There is also a need to detect malware before applications are downloaded to improve Android security.


Cyber Security and Applications 1 (2023) 100014

Contents lists available at ScienceDirect

Cyber Security and Applications

journal homepage: http://www.keaipublishing.com/en/journals/cyber-security-and-applications/

A review of deep learning models to detect malware in Android applications

Elliot Mbunge a,b,∗, Benhildah Muchemwa a, John Batani c, Nobuhle Mbuyisa a

a Department of Computer Science, Faculty of Science and Engineering, University of Eswatini, Private Bag 4 Kwaluseni, Eswatini
b Department of Information Technology, Faculty of Accounting and Informatics, Durban University of Technology, P O Box 1334, Durban 4000, South Africa
c Faculty of Engineering and Technology, Botho University, Maseru 100, Lesotho

Article info

Keywords: Malicious software; Android applications; Detection; Deep learning; Smartphones

Abstract

Android applications are indispensable resources that facilitate communication, health monitoring, planning, data sharing and synchronization, social interaction, business and financial transactions. However, the rapid increase in the smartphone penetration rate has consequently led to an increase in cyberattacks. Smartphone applications use permissions to allow users to utilize different functionalities, making them susceptible to malicious software (malware). Despite the rise in Android applications’ usage and cyberattacks, the use of deep learning (DL) models to detect emerging malware in Android applications is still nascent. Therefore, this review sought to explain DL models that are applied to detect malware in Android applications, explore their performance as well as identify emerging research gaps and present recommendations for future work. This study adopted the preferred reporting items for systematic reviews and meta-analyses (PRISMA) guidelines to guide the review. The study revealed that convolutional neural networks, gated recurrent neural networks, deep neural networks, bidirectional long short-term memory, long short-term memory (LSTM) and cubic-LSTM are the most prominent deep learning-based malicious software detection models in Android applications. The findings show that deep learning models are increasingly becoming an effective technique for malicious software detection in Android applications in real-time. However, monitoring and tracking information flow and malware behavior is a daunting task because of the evolving nature of malware and human behavior. Therefore, training mobile application users and sharing updated malware datasets is paramount in developing detection models. There is also a need to detect malicious software before downloading mobile applications to improve the security of Android smartphones.

∗ Corresponding author.
E-mail address: [email protected] (E. Mbunge).

https://doi.org/10.1016/j.csa.2023.100014
Received 14 April 2022; Received in revised form 13 December 2022; Accepted 11 February 2023
Available online 12 February 2023
2772-9184/© 2023 The Authors. Published by Elsevier B.V. on behalf of KeAi Communications Co., Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

1. Introduction

The increased mobile phone penetration rate facilitates the development and deployment of mobile applications in various domains [1]. Mobile phones, especially smartphones, facilitate communication [2] and health monitoring [3,4], and support teaching and learning, planning, data sharing and synchronization, social interaction, business, and financial transactions. As smartphones become ubiquitous and pervasive, they also become vulnerable to cyberattacks and malicious users. For instance, between 2016 and 2020, approximately 218 billion mobile applications (“apps”) were downloaded, which is an increase of over 50% from 140.7 billion mobile apps downloaded in 2016 [5]. These include Android, Blackberry, iPhone, and Symbian apps, among others. Among these platforms, Android-based mobile applications recorded the highest number of downloads and were installed on over 1.5 billion mobile devices in 2021 [6]. The official Android Play Store presently has at least 2.6 million apps which can be downloaded and installed for different purposes [7]. However, the rise in mobile app usage makes mobile phones vulnerable to cyberattacks and threats such as malware (Trojan horses, viruses, worms, and spyware, among others). Such threats violate data integrity, system or device availability, and data confidentiality [6]. Owing to their popularity, third-party code and openness, Android applications tend to be vulnerable to various types of malware. For instance, at least 3.25 million Android applications were infected with malicious software in 2016 [8].

In response to these attacks, several methods, including the dynamic taint analysis mechanism, have been applied to detect malware. Dynamic taint analysis monitors and tracks information flow leakage at runtime [9]. This approach is error-prone because of the greater possibility of missing some important security flaws and emerging malware variants. Existing approaches detect malware using feature extraction together with static, dynamic and hybrid analysis methods. Static analysis gathers features of the Android application by using packer tools like VMprotect and UPX and analyses the features to detect malware without executing the Android package kit (APK) file [10]. Generally, the APK file contains information such as operating system API calls, opcodes, and network addresses, among others. It is difficult to detect malware that uses dynamic code loading using static analysis [11]. However, some scholars applied dynamic analysis to monitor malicious software behavior by tracking information flow during the execution stage [12]. This method encounters some impediments, such as the high computational overhead required to monitor both the process and information flow [10], as well as the maximum number of processes that can be monitored and tracked in real-time while executing users’ commands. To improve malware detection accuracy, some scholars combined static and dynamic analysis to develop hybrid malware detection approaches [11]. Despite its outstanding accuracy, hybrid analysis is time-consuming and computationally intensive when extracting malware behaviours, monitoring information flow and collecting static features of malware from the APK file.

To alleviate the challenges associated with classical malware detection approaches, several scholars, including [13–15], applied various machine learning algorithms like support vector machines (SVM), Naïve Bayesian networks, random forest (RF), multilayer perceptron, and decision trees to detect malware in Android applications. Several review studies conducted by [16–18] analyzed machine learning-based malware detection models using training, testing and validation data extracted from Android applications. The extracted features include static features, required permissions, sensitive application programming interfaces (APIs), static data flow, suspicious behaviours, and network flows to detect malware [19]. However, machine learning algorithms have shallow structured architectures that can address simple and well-constrained classification and clustering problems [20]. Malware apps have been rapidly increasing, making them difficult to detect using machine learning techniques due to the advanced nature and complexity of the attacks and threats. This calls for innovative, proactive and adaptive solutions [21] to identify and detect new variants of malicious software in Android apps.

1.1. Contribution of the study

Cyberattacks have burgeoned with the rise in human reliance on mobile phones [22]. Accurately detecting emerging malware in Android apps using machine learning models is increasingly becoming difficult due to various factors including (i) limited or outdated datasets [19], (ii) complexities and diversity of malware [23], and (iii) sub-optimal feature extraction [23]. Malware applications have been continuously evolving, making them difficult to detect using classical protection mechanisms due to the sophisticated nature and complexity of attacks and threats. Furthermore, malware applications are heterogeneous, that is, attacks vary depending on the target, the type of service exploited, the spreading source, and the location of the target [24]. Therefore, the accuracy of malware detection models depends on various factors, including proper training of the algorithm, feature extraction, understanding of abstract behavior, and attacking patterns. This requires robust intelligence-based malware detection models such as deep learning with highly structured architectures that can learn and analyze longer sequences of malware patterns and behaviours from various huge datasets [20]. Deep learning models can extract meaningful insights and analyze huge datasets through a higher level of abstraction and semantic knowledge learning. However, the use of deep learning models to detect emerging malware in Android applications is still nascent. Therefore, this comprehensive review contributes to this nascent research area and aims to:

(i) Identify and explain DL-based malicious software detection models in Android applications.
(ii) Analyze the performance of DL models used to detect malware in Android applications.
(iii) Identify dataset sources used to detect malicious software in Android applications.
(iv) Identify emerging research gaps in deep learning-based malware detection models in Android applications.

The rest of this paper is structured as follows: Section 2 discusses the materials and methods adopted in carrying out the review. Section 3 presents deep learning-based malicious software detection models, their respective performance and the sources of the datasets that were used on the models. Section 4 discusses the recommendations for detecting malware in Android applications. Lastly, Section 5 presents the conclusion of the study.

2. Materials and methods

The researchers adopted the PRISMA approach [25] to search and select relevant papers. Various search keywords were used to search for relevant publications from different online public repositories. The researchers searched papers from different reputable and prominent databases and applied inclusion and exclusion criteria to select relevant papers as illustrated in Fig. 1.

2.1. Search strategy

The study applied search keywords such as “deep learning” OR “deep learning techniques” OR “Deep neural networks” AND “malware detection” OR “detecting malware” OR “malicious software” AND “Android malware detection” OR “Android applications” OR “malware detection models in Android applications”. Relevant papers were retrieved from different online repositories such as Web of Science, Google Scholar, Scopus, IEEE Xplore, ScienceDirect and SpringerLink.

2.2. Inclusion and exclusion criteria

All papers/articles written in English or with an English translation were included in this study. Commentaries, letters to the editor and preprints were not considered in this study. The articles’ publication period was restricted to the period between January 2016 and March 2022, inclusive. Articles were excluded if the malware detection model used was not a deep learning model. Duplicate papers were not considered.

2.3. Screening process

The titles, abstracts and content of articles were screened as shown in Fig. 1. Data extracted from the selected articles include references, the deep learning model applied to detect malware, performance metrics, dataset source, and limitations or future work (see Table 1).

Fig. 1 shows the PRISMA steps followed to search literature in various prominent electronic databases and select relevant papers based on the inclusion and exclusion criteria explained above. A total of twenty-five (25) papers were selected, as depicted in Table 1.

Fig. 1. PRISMA flowchart.

3. Results

Table 1 presents a summary of the study findings, showing the identified deep learning models, performance, dataset sources, and limitations/future work for each reviewed article. The identified deep learning models are deep belief networks, GRU, CNN, Bi-LSTM, LSTM, deep neural networks, and Cubic-LSTM.

a) Deep learning models applied to detect malicious software in Android applications

Deep learning (DL) techniques have been successfully used in various domains including fraud detection [21], security [47], object identification and detection [48], and malicious software detection, among others. DL is a subset of machine learning, mostly used in image processing, text classification and speech processing [49]. A deep learning model has numerous hierarchical layers and a deeply structured architecture with several hidden layers of several connected artificial neurons

to process data [50]. Each layer of a deep neural network comprises many artificial neurons, each having its weights and potentially its activation functions that could be different from the ones on the other layers [51,52]. Though the weights can differ, the initial weights may be set randomly, or the same value can be set across all the weights during initialization [53], after which the model will adjust them accordingly based on the error value. The outstanding feature of deep learning is its ability to automatically extract and abstract features, thus eradicating the need for manual and tedious feature extraction and automatically identifying sophisticated and more useful high-order features [41,54]. Many deep learning models can be and have been applied to detect malware in Android applications. Table 1 shows the identified DL models, their performances in detecting malware in Android applications, and the sources of the datasets used.

Fig. 2 shows various DL models applied to detect malware. These models include deep belief networks, Gated Recurrent Units (GRU), CNN, DNN, LSTM, Bidirectional LSTM, Cubic-LSTM, and hybrid models.

i Deep Belief Networks (DBNs)

According to [6], deep belief networks were proposed by Geoffrey Hinton in 2006. A deep belief network is a probability-based generative model composed of many layers of stochastic latent variables [55], with undirected and symmetric links between the top two layers [56]. The lower layers of the model receive directed links from the layers above, with the arrows pointing towards the layer nearest to the data. Deep belief networks are trained in two stages: the pre-training phase and the fine-tuning (optimization) phase. During the pre-training stage, the restricted Boltzmann machines (RBMs) that make up the network are trained layer by layer sequentially, starting from the bottom layer [57]. An RBM is an undirected probabilistic graphical model with one layer of observable variables and one layer of hidden variables [58]. Fine-tuning can be achieved using backpropagation, where the pre-trained Deep Belief Network is optimized in a supervised manner with labelled samples [59]. Unlike traditional neural networks, a DBN offers a layer-by-layer learning approach for network initialization and automatic feature extraction. However, its training stage is time-consuming and resource-intensive. A study conducted by [6] applied deep belief networks to detect malware using datasets from the Android PRAGuard Dataset and VirusShare and achieved 95.79% precision, 97.62% recall and 96.82% accuracy. Also, [33] applied deep belief networks to detect malware using the Contagio Community and Android Malware Genome Project datasets and achieved a precision of 95.77%, recall of 97.84% and accuracy of 96.76%.

ii Gated Recurrent Unit (GRU)

The GRU is an enhanced version of the recurrent neural network (RNN), introduced by [6], that is widely used to solve classification problems. The model addresses the “vanishing gradient” problem that is inherently associated with standard RNNs [60] by using two gates, an update gate and a reset gate [6,61]. The gates can be trained to retain information from far back in the sequence, remove irrelevant information and pass relevant information down a chain of
Table 1
Deep learning models for detecting malicious software in Android applications.

| Reference | Deep learning model | Performance | Dataset | Limitations/Future work |
|---|---|---|---|---|
| [26] | Convolutional neural networks (CNN) | Accuracy 99.82%; F1-score 99.86%; Recall 99.91%; Precision 99.91% | VirusShare | The model requires more training time. |
| [22] | Deep Neural Networks (DNN) | Accuracy 93.4%; F1-score 93.2%; Recall 93.4%; Precision 93.5% | CICInvesAndMal2019 and CICAndMal2017 | The model could not detect malware benign or malicious applications before downloading it. |
| [27] | CNN and LSTM | Accuracy 95.83%; Precision 95.24%; Recall 96.15%; F1-score 95.69% | VirusTotal and Drebin | The model could not detect android malicious software samples based on hybridized image-based features. |
| [28] | CNN | Accuracy 99.56% | VirusShare and Drebin | The dynamic analysis coverage and extraction of dynamic features need to be improved. |
| [29] | CNN | Accuracy 91.27% | VirusShare | The model excluded obfuscated applications that could not extract application programming interface call graphs from Flowdroid. |
| [30] | DeepVisDroid (CNN) | Accuracy 98.96% | Benign | The model could not identify some code camouflage. |
| [31] | Deep neural networks | Accuracy 98.86%; F1-measure 98.65%; Recall 98.47%; Precision 98.84% | Benign dataset, AMD Dataset, AndroZoo Dataset, Drebin Malware Collection | Dataset features were easily detectable. |
| [6] | Deep Belief Network-Gated Recurrent Unit | Precision 95.79%; Recall 97.62%; Accuracy 96.82% | Android PRAGuard Dataset and VirusShare | The dataset used was small. Dataset features were easily detectable. The calculation time for the hybrid model was more than the one for the separate models |
| [32] | CNN | Precision 96.3434%; Recall 96.335%; F1-score 96.333% | VirusShare, Malgenome, Drebin, Contagio Minidump | The model takes more time to calculate better results |
| [33] | Deep Belief Networks | Precision 95.77%; Recall 97.84%; Accuracy 96.76% | Contagio Community, Android Malware Genome Project | The use of a small dataset will not produce better results and does not make better use of the deep learning models to obtain higher accuracy in real-world Android malicious software detection. |
| [34] | CNN | Precision 99%; Recall 95%; F1-score 97%; Accuracy 98% | Android Malware Genome Project, McAfee Labs | Vulnerable to impersonation attacks |
| [35] | CNN | Accuracy 95.46% | ContagioDump, Marvin, Drebin, and VirusShare | To consider the use of other program graphs like program dependence graphs, for model training |
| [36] | Cubic-LSTM | Accuracy 99% | CICAndMal2017 | The authors aim to use different datasets as well as different algorithms of deep learning for malicious software detection in future work. |
| [37] | Deep Neural Networks | Precision 98.09%; Recall 99.56%; Accuracy 98.5% | McAfee Labs | Time-consuming |
| [38] | Convolutional Neural Networks | Accuracy 95.4% | Drebin; Android Malware Dataset | The approach requires further analysis to prove its efficiency. |
| [39] | Long Short-Term Memory | Accuracy 97.74% | VirusShare; MassVet | Sometimes it fails to detect malicious behaviours whose loading and execution occur at runtime. Needs a frequent update with new labelled features to avoid ambiguous predictions |
| [40] | Deep Neural Network | Precision 97.15%; Recall 94.18%; F1-score 95.64% | Drebin | Vulnerable to impersonation attacks |
| [10] | Long Short-Term Memory | Precision 93.7%; Recall 98.8%; F1-score 96.1%; Accuracy 93.9% | Malgenome | Applications were run in the emulator for a short time and refrained from showing any malicious activities |
| [41] | Bidirectional Long Short-Term Memory | Accuracy 97.22%; F1-score 98.21% | Android Malware Dataset | Time-consuming |
| [42] | Deep Belief Network | Precision 98.68%; Recall 98.12%; F1-score 98.40% | Drebin; VirusTotal; Contagio | Inherent limitations of static analysis |
| [20] | Deep Belief Network | Precision 98.3%; Recall 96.6%; F1-score 97.4%; Accuracy 97.4% | Android Malware Genome Project; Drebin; Contagio | Dataset too small with too many features |
| [43] | Deep Belief Network | Precision 98.5%; Recall 99.3%; F1-score 98.72%; Accuracy 98.71% | Android Malware Genome Project; VirusShare | Inherent limitations of static analysis |
| [44] | Gated Recurrent Unit | Precision 96.9%; Recall 99.2%; F1-score 98.0%; Accuracy 98.2% | CICAndMal2017 | Not available |
| [45] | Deep Neural Network | Precision 95.35%; Recall 95.31%; F1-score 95.31%; Accuracy 95.31% | Contagio Mobile malicious software minidump; DroidBench; GitHub- Android malware master; VirusShare; VirusSign | Aim to improve the malware detection accuracy by experimenting with other deep learning methods. |
| [46] | LSTM | Precision 99.3%; Recall 99.2%; F1-score 99.3%; Accuracy 99.3% | Android Malware and Goodware | The model cannot incorporate new behavioural-driven heuristics to make it adapt to new, unseen malicious software threats. |
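The performance column of Table 1 reports accuracy, precision, recall and F1-score. As a minimal sketch of how these metrics relate to one another, assuming a binary confusion matrix with malware as the positive class, they could be computed as follows. The function names and the example counts are hypothetical and are not taken from any of the reviewed studies.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts.

    Malware is treated as the positive class (an assumption of this sketch).
    """
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1_from_precision_recall(precision, recall),
    }


# Hypothetical counts for 200 test apps (not taken from any reviewed study).
metrics = classification_metrics(tp=90, fp=10, tn=95, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")

# Sanity check on a reported row: precision 95.24% and recall 96.15% for [27].
f1_check = round(100 * f1_from_precision_recall(0.9524, 0.9615), 2)
print(f"F1 recomputed for [27]: {f1_check}%")
```

For the precision and recall reported for [27], the recomputed F1-score is 95.69%, which agrees with Table 1. Recomputing F1 in this way may serve as a quick consistency check when comparing the reported results.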

events to make better predictions. The update gate regulates the quantity of information from preceding time steps that has to be forwarded to the future (succeeding steps) [62]. On the other hand, the reset gate decides the amount of the previous information (history) to be forgotten by the network [63]. To improve the RNN’s memory capacity and ease of model training, one can use a GRU. The Gated Recurrent Unit was applied by [44] to detect malware using CICAndMal2017 and recorded a precision of 96.9%, recall of 99.2%, F1-measure of 98% and accuracy of 98.2%. However, GRU has slow convergence and low learning efficiency [64].

Fig. 2. Deep learning-based malware detection models.

iii Convolutional Neural Networks (CNNs)

CNNs consist of three distinguishable layers, namely convolutional, pooling and fully connected, as depicted in Fig. 3. The first layer, convolutional, is generally used to compute different feature maps. The outputs of the convolutional layer are then passed to the pooling layer [65]. The pooling layer is connected between two layers of convolution [66]. It is used to reduce the size of the convolved features while maintaining important features. This also decreases the computational power required to process the data [67]. CNNs can use ReLU, Maxout, tanh and sigmoid activation functions to introduce the nonlinearity required to solve nonlinearly separable problems [65].

The fully connected layer determines the association between features and the target class [68]. Several studies, including [21], [17, 22-25, 34], and [38], successfully applied CNNs to detect malware in Android applications. A study conducted by [26] applied convolutional neural networks to detect malware using the dataset from VirusShare and achieved the best performance results of 99.82% accuracy, F1-score of 99.86%, recall of 99.91% and 99.91% precision. However, the model detects malware effectively when working with huge and updated malware datasets.

Fig. 3. Taxonomy of CNN and application areas [65].

iv Deep Neural Networks (DNNs)

DNNs entail numerous layers used to solve complex nonlinear problems [69]. A DNN consists of many hidden layers connected to the input and output layers [70]. However, deep neural networks generally have more layers than artificial neural networks [71]. The input layer accepts an input vector, while the hidden layers perform some computations and send the output to the output layer. To minimize the error or cost, the weights and biases are iteratively fine-tuned using algorithms such as stochastic gradient descent. The model performs well in error minimization between the network output and the true class because, after performing every forward pass through the network, backpropagation executes a reverse (backward) pass to automatically adjust the model’s parameters. A study by [45] applied deep neural networks to detect malware using datasets from Contagio Mobile malicious software minidump, DroidBench, VirusShare and VirusSign and achieved a precision of 95.35%, recall of 95.31%, F1-score of 95.31% and accuracy of 95.31%. Also, [35] detected malware in the Drebin dataset using deep neural networks and achieved a precision of 97.15%, recall of 94.18%, and F1-score of 95.64%.

v Long Short-Term Memory (LSTM)

LSTM is a subset of RNN that trains and learns long-range temporal dynamics in sequences of arbitrary length [10]. Generally, the LSTM model consists of an input gate, output gate and forget gate [72]. The model remembers long sequences for a long period. The LSTM model has a cell state which ensures that there are no alterations to the information flowing through the units. Each unit has an input gate, an output gate, and a forget gate used by the cells to control the information to be retained or discarded before relaying the long-term and short-term information to the succeeding cell. The gates act as filters that remove irrelevant and unwanted information [10]. The filtered-out information is only that from variables deemed useless. The forget gate uses the sigmoid activation function to decide which information from the previous cell state to keep or forget. This is achieved by computing the product of the inbound long-term memory and a forget vector produced from the present input and the inbound short-term memory. Lastly, there is the output gate, which takes the present input, the preceding short-term memory, and a newly calculated long-term memory to generate new short-term memory and decide which information to forward to the following hidden state. Long Short-Term Memory can only retain previous information since it receives inputs from the preceding neurons only, known as the backward direction, and this results in a poor prediction rate as it is deprived of information about the future [73]. A study conducted by [74] applied LSTM to detect malware in Android and achieved a precision of 91.3%, recall of 96.6%, accuracy of 93.7% and a low false positive rate of 9.3%. Also, [37] applied LSTM to detect malware in the MassVet and VirusShare malware datasets and achieved an accuracy of 97.74%.

vi Bidirectional Long Short-Term Memory (Bi-LSTM)

Bi-LSTM comprises two LSTMs that accept the input in opposite directions (forward and backward). The input flows in two directions (front and back) to preserve the future and past information, unlike in LSTM where the input flows in one direction, either backwards or forward. The network can create a condition for each character in the input text depending on its past and future [73]. Because it preserves both past and future information, Bi-LSTM can improve detection accuracy. For instance, a study by [40] applied Bi-LSTM to detect malware using the Android Malware Dataset and achieved an accuracy of 97.22% and an F1-score of 98.21%.

vii Hybrid Model (Cubic-LSTM and Bi-LSTM)

The hybrid model combines the Cubic-LSTM and Bi-LSTM to detect malicious software in Android applications. The Cubic-LSTM is a network with two states, namely the temporal and spatial states, that are created by two independent convolutions so that diverse types of information can be processed and carried by different operations and states. The information in the temporal state and spatial state is processed separately to reduce the burden of prediction. Cubic-LSTM comprises three branches: the temporal, spatial, and output branches. The branches are built along the three axes in the Cartesian coordinate system [75]. The temporal branch flows along the x-axis, and its convolution aims to acquire and process motions. Since the temporal branch contains the motion information, its responsibility is to produce the temporal state. The spatial branch flows along the z-axis, and its convolution’s responsibility is to capture and analyze moving objects. This is the branch that generates the spatial state, as it carries the spatial layout information regarding moving objects. The output branch produces the final forecasting frames along the y-axis in accordance with the forecasted motions given by the temporal branch and the moving object information supplied by the spatial branch. A two-dimensional network can be formed by stacking several Cubic-LSTM units along the spatial and output branches. The two-dimensional network, if evolved along the x-axis, can further construct a three-dimensional network [36]. Using the hybrid model helps achieve higher accuracy rates and reduced false rates. A hybrid model is also time efficient as compared to a single
Table 2
Malware dataset sources.

| Ref | Dataset | Collection Time | Malware Samples | Source |
|---|---|---|---|---|
| [77] | CICAndMal2017 | 2017 | 365 | https://www.unb.ca/cic/datasets/andmal2017.html |
| [26] | VirusShare | 2018, 2019, 2020 | 4038 | https://virusshare.com/ |
| [78] | Android Malware Genome Project | August 2010-October 2011 | 1260 | http://www.malgenomeproject.org/ |
| [79] | MassVet | 2015 | 127,429 | https://www.usenix.org/system/files/conference/usenixsecurity15/sec15-paper-chen-kai.pdf |
| [13] | Intel Security | Not available | 11,505 | https://steppa.ca/portfolio-view/malware-threat-intel-datasets/ |
| [80] | Android PRAGuard Dataset | 2015 | 2260 | https://pralab.diee.unica.it/en/AndroidPRAGuardDataset |
| [14] | Contagio | December 2011-March 2013 | 1150 | http://contagiodump.blogspot.com/ |
| [76] | Drebin | August 2010-October 2012 | 5560 | https://www.sec.cs.tu-bs.de/~danarp/drebin/ |
| [81] | Android Malware Dataset | 2010–2016 | 24,650 | http://amd.arguslab.org/ |
| [6] | VirusTotal | 2012–2018 | Not available | http://www.virustotal.com |
| [45] | VirusSign | 2011 | 146 | www.virussign.com |
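As a small illustration of how the sample counts in Table 2 might be compared, the sketch below stores the counts as listed in the table and flags the sources that fall below a cut-off. The 1,000-sample threshold and all variable names are assumptions made for this example only; the review does not define what counts as a small dataset.

```python
# Malware sample counts as listed in Table 2
# (VirusTotal is omitted because its count is "Not available").
MALWARE_SAMPLES = {
    "CICAndMal2017": 365,
    "VirusShare": 4038,
    "Android Malware Genome Project": 1260,
    "MassVet": 127429,
    "Intel Security": 11505,
    "Android PRAGuard Dataset": 2260,
    "Contagio": 1150,
    "Drebin": 5560,
    "Android Malware Dataset": 24650,
    "VirusSign": 146,
}

# Hypothetical cut-off for "few malicious samples" (an assumption of this sketch).
SMALL_DATASET_THRESHOLD = 1000

small_sources = sorted(
    name for name, count in MALWARE_SAMPLES.items()
    if count < SMALL_DATASET_THRESHOLD
)
largest_source = max(MALWARE_SAMPLES, key=MALWARE_SAMPLES.get)

print("Sources below the threshold:", small_sources)
print("Largest source:", largest_source)
```

Under this assumed threshold, only CICAndMal2017 and VirusSign would be flagged, while MassVet is by far the largest source listed.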

model. For instance, [34] applied a hybrid model combining Bi-LSTM and Cubic LSTM to detect malware in the CICAndMal2017 dataset and achieved 99% accuracy. However, the hybrid model is computationally intensive and requires more memory.

viii Hybrid Model (Deep Belief Network and Gated Recurrent Unit)

A hybrid model that combines DBN and GRU was used because Android apps have both static and dynamic features. The Deep Belief Network offers better performance and a faster learning rate on the static features of Android applications. Compared with the traditional Recurrent Neural Network model, the Gated Recurrent Unit performs better on long time-ordered operation sequences, with fewer parameters, a quicker training rate and less data needed to obtain good generalization. This makes it ideal for processing the dynamic characteristics of Android apps. For training the models, the static feature vectors were used for the Deep Belief Network and the dynamic feature vectors for the Gated Recurrent Unit, and the output vectors were fed into the fully connected layer. To fine-tune the parameters of the hybrid model (comprising the Deep Belief Network and Gated Recurrent Unit), they used the SoftMax activation function. This activation function squashes the outputs of the neurons into the (0,1) range so that they sum to 1, which makes the classification output probabilistic [6]. The hybrid model improves malware detection accuracy relative to its independent constituent models. Moreover, the DBN-GRU malware detection model’s results are not distorted by repackaging of the software, because neither the extracted features (both dynamic and static) nor the model’s training process is affected by repackaging. A study conducted by [4] developed and applied a hybrid model that combines Deep Belief Network and Gated Recurrent Unit to detect malware using the Android PRAGuard Dataset and VirusShare and achieved a precision of 95.79%, recall of 97.62% and accuracy of 96.82%.

b) Identified malware dataset sources

The study revealed that deep learning-based malware detection models use datasets from various online malware databases, as shown in Table 2. Deep learning thrives on huge datasets; hence, it requires a big dataset that is kept up to date with recent malware to leverage its full potential. Table 2 shows that the most used dataset is the DREBIN dataset [76], with 5560 malware samples collected over about two years from August 2010 to October 2012.

The Android Malware Genome Project dataset [78] has 1260 malicious samples gathered between August 2010 and October 2011. The Contagio dataset consists of 1150 malware samples collected in 2011, and the VirusShare dataset, which is publicly available through the website (https://virusshare.com/), has 4712 samples that were collected in 2018, 2019 and 2020. Android Malware Dataset [81] has 24,650 malicious software samples which were collected between 2010 and 2016. CICAndMal2017 [77] was collected in 2017 and has 365 malicious software samples. There is also the McAfee Labs dataset, which can be accessed through the website, with 11,505 malicious samples. The Android PRAGuard dataset [80] consists of 10,479 malicious samples and was collected in 2015. The Marvin dataset [82] has 10,559 malicious samples collected between June 2012 and May 2014. There is also the MassVet dataset [79], which consists of 127,429 malicious samples; VirusSign, collected in 2011 with 146 malware samples; the VirusTotal dataset, which can be accessed through the website; and lastly DroidBench with 30 samples and GitHub Android malware master with 80 samples, both of which can be accessed on GitHub. The deep learning models in Table 1 achieved more than 90% detection accuracy using the dataset sources shown in Table 2, although some of them have few malicious samples.

c) Identified research gaps emanating from deep learning-based malware detection models

The study identified some challenges and weaknesses associated with the datasets used to detect malware by previous researchers. For instance, some of the datasets used are small, whereas deep learning models require huge datasets to perform well. Some deep learning-based malware detection models used old datasets such as DREBIN, which makes the detection models susceptible to emerging malware. Also, some datasets are not publicly accessible, making it difficult to validate the detection models. Some deep learning models are vulnerable to attacks during the training and testing phases. The models can suffer “data poisoning” attacks during training, in which the training data are manipulated so that the model learns to make errors. During testing, the malware detection models are exposed to adversarial attacks, impersonation attacks, and many others. An adversarial attack can mislead DNNs and consequently cause misclassification [83].

4. Recommendations for detecting malware in android applications

There is a need to increase malware dataset sizes as well as improve the accessibility of malware datasets to the public in order to train, test and validate DL-based malware detection models using various datasets. As Android applications continue to increase in the market, more dynamic and hybrid DL-based malware detection models are required to detect emerging malicious software. Some researchers such as [6] suggested hardening deep learning models against adversarial attacks. To achieve this, they proposed retraining and distillation as ways of confronting adversarial attacks on deep learning models. However, these solutions were initially created to counter “adversarial attacks” in computer vision rather than malicious software, so further studies can investigate the effectiveness of these techniques in detecting malware.
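To make the adversarial attacks and the retraining defence discussed in this section more concrete, the following is a minimal sketch rather than any reviewed study's method. It assumes a toy logistic-regression detector in place of a deep network, and the weights, bias, feature values and perturbation size are all hypothetical. It applies a fast-gradient-sign style perturbation to the features of one malicious sample. Real Android features are often binary (for example, whether a permission is requested), so the continuous perturbation used here is a simplification.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_malicious(weights, bias, features):
    """Probability that an app is malicious under a toy logistic-regression detector."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


def sign(value):
    """Return -1, 0 or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def fgsm_perturb(weights, bias, features, label, epsilon):
    """Shift each feature by epsilon in the direction that increases the loss.

    For logistic regression with cross-entropy loss, the gradient of the loss
    with respect to the input features is (p - label) * weights.
    """
    p = predict_malicious(weights, bias, features)
    gradient = [(p - label) * w for w in weights]
    return [x + epsilon * sign(g) for x, g in zip(features, gradient)]


# Hypothetical detector parameters and one malicious sample (label 1).
WEIGHTS = [2.0, -1.0, 1.5]
BIAS = -0.5
sample = [1.0, 0.0, 1.0]

clean_score = predict_malicious(WEIGHTS, BIAS, sample)
adversarial = fgsm_perturb(WEIGHTS, BIAS, sample, label=1, epsilon=0.8)
adv_score = predict_malicious(WEIGHTS, BIAS, adversarial)

print(f"clean score: {clean_score:.3f}")
print(f"adversarial score: {adv_score:.3f}")
```

With these illustrative numbers the detector's score for the sample drops below 0.5, so a malicious app would be classified as benign. Retraining in this setting would mean adding such perturbed samples, with their true label, back into the training set. Whether that carries over to real Android malware features is the open question raised above.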


Conclusion

The unmatched increase of malware in Android applications requires efficient solutions to prevent it. The study revealed that classical techniques based on static, dynamic and hybrid analyses continue to be susceptible to emerging malicious software. Therefore, as malware features continue increasing, deep learning models can detect malware with high accuracy. The study further revealed that deep learning models such as DBNs, GRU, CNNs, DNNs, LSTM, Bi-LSTM, CubicLSTM, and hybrid models can effectively detect malicious software in Android applications. The study also revealed that Convolutional Neural Networks have been prominently used and achieved generally high accuracy compared to other malware detection models. However, there is a need to frequently update malware datasets so that deep learning-based malware detection models can train, learn and detect emerging malware and new cyberattack trends and techniques [21]. The findings of this study show that deep learning can be an effective technique for detecting malicious software in Android applications. Future work can focus on applying deep learning models to detect malicious software before Android applications are downloaded, to improve the security of Android smartphone devices.

Declaration of Competing Interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

[1] GSMA, 2016 mobile industry impact report: Sustainable Development Goals, GSMA (2016).
[2] J. Batani, S. Musungwini, T.G. Rebanowako, An assessment of the use of mobile phones as sources of agricultural information by tobacco smallholder farmers in Zimbabwe, J. Syst. Integr. 2019 (2019) 1–22, doi:10.20470/jsi.v10i3.375.
[3] J. Batani, M.S. Maharaj, Towards data-driven models for diverging emerging technologies for maternal, neonatal and child health services in Sub-Saharan Africa: a systematic review, Glob. Heal. J. (2022), doi:10.1016/j.glohj.2022.11.003.
[4] J. Batani, M.S. Maharaj, Towards data-driven pediatrics in Zimbabwe, 2022 Int. Conf. Artif. Intell. Big Data, Comput. Data Commun. Syst., IEEE (2022) 1–7, doi:10.1109/icABCD54961.2022.9855907.
[5] L. Ceci, Annual number of mobile app downloads worldwide 2020 | Statista, Statista (2021).
[6] T. Lu, Y. Du, L. Ouyang, Q. Chen, X. Wang, Android malware detection based on a hybrid deep learning model, Secur. Commun. Netw. 2020 (2020), doi:10.1155/2020/8863617.
[7] A. Mahindru, A.L. Sangal, FSDroid: a feature selection technique to detect malware from android using machine learning techniques, Multim. Tools Appl. 80 (2021) 13271–13323, doi:10.1007/S11042-020-10367-W.
[8] K. Liu, S. Xu, G. Xu, M. Zhang, D. Sun, H. Liu, A review of android malware detection approaches based on machine learning, IEEE Access 8 (2020) 124579–124607, doi:10.1109/ACCESS.2020.3006143.
[9] V.G. Shankar, G. Somani, M.S. Gaur, V. Laxmi, M. Conti, AndroTaint: an efficient android malware detection framework using dynamic taint analysis, ISEA Asia Secur. Priv. Conf. 2017 (ISEASP 2017), 2017, doi:10.1109/ISEASP.2017.7976989.
[10] R. Vinayakumar, K.P. Soman, P. Poornachandran, S. Sachin Kumar, Detecting android malware using long short-term memory (LSTM), J. Intell. Fuzzy Syst. 34 (2018) 1277–1288, doi:10.3233/JIFS-169424.
[11] A. Alotaibi, Identifying malicious software using deep residual long-short term memory, IEEE Access 7 (2019) 163128–163137, doi:10.1109/ACCESS.2019.2951751.
[12] A. Qamar, A. Karim, V. Chang, Mobile malware attacks: review, taxonomy & future directions, Fut. Gener. Comput. Syst. 97 (2019) 887–909, doi:10.1016/J.FUTURE.2019.03.007.
[13] M.E. Zadeh Nojoo Kambar, A. Esmaeilzadeh, Y. Kim, K. Taghva, A survey on mobile malware detection methods using machine learning, (2022) 0215–0221, doi:10.1109/CCWC54503.2022.9720753.
[14] Q. Wu, X. Zhu, B. Liu, A survey of android malware static detection technology based on machine learning, Mob. Inf. Syst. 2021 (2021), doi:10.1155/2021/8896013.
[15] N. Idika, A.P. Mathur, A survey of malware detection techniques, (2007).
[16] J. Senanayake, H. Kalutarage, M.O. Al-Kadri, Android mobile malware detection using machine learning: a systematic review, Electron. 10 (2021) 1606, doi:10.3390/ELECTRONICS10131606.
[17] V. Kouliaridis, G. Kambourakis, A comprehensive survey on machine learning techniques for android malware detection, Inf. 12 (2021) 185, doi:10.3390/INFO12050185.
[18] M. Taleby, Q. Li, M. Rabbani, A. Raza, A survey on smartphones security: software vulnerabilities, malware, and attacks, Int. J. Adv. Comput. Sci. Appl. 8 (2017), doi:10.14569/IJACSA.2017.081005.
[19] A. Gaurav, B.B. Gupta, P.K. Panigrahi, A comprehensive survey on machine learning approaches for malware detection in IoT-based enterprise information system, (2022), doi:10.1080/17517575.2021.2023764.
[20] X. Su, W. Shi, X. Qu, Y. Zheng, X. Liu, DroidDeep: using Deep Belief Network to characterize and detect android malware, Soft Comput. 24 (2020) 6017–6030, doi:10.1007/S00500-019-04589-W.
[21] J. Batani, An adaptive and real-time fraud detection algorithm in online transactions, Int. J. Comput. Sci. Bus. Informatics 17 (2017) 1–12.
[22] S.I. Imtiaz, S. ur Rehman, A.R. Javed, Z. Jalil, X. Liu, W.S. Alnumay, DeepAMD: detection and identification of Android malware using high-efficient Deep Artificial Neural Network, Futur. Gener. Comput. Syst. 115 (2021) 844–856, doi:10.1016/J.FUTURE.2020.10.008.
[23] A. Tirkey, R.K. Mohapatra, L. Kumar, Sniffing android malware using deep learning, (2022) 489–505, doi:10.1007/978-981-19-0019-8_37.
[24] B. Urooj, M.A. Shah, C. Maple, M.K. Abbasi, S. Riasat, Malware detection: a framework for reverse engineered android applications through machine learning algorithms, IEEE Access (2022), doi:10.1109/ACCESS.2022.3149053.
[25] M.J. Page, J.E. McKenzie, P.M. Bossuyt, I. Boutron, T.C. Hoffmann, C.D. Mulrow, et al., The PRISMA 2020 statement: an updated guideline for reporting systematic reviews, BMJ 372 (2021), doi:10.1136/BMJ.N71.
[26] W. Wang, M. Zhao, J. Wang, Effective android malware detection with a hybrid model based on deep autoencoder and convolutional neural network, J. Ambient Intell. Humaniz. Comput. 10 (2018) 3035–3043, doi:10.1007/S12652-018-0803-6.
[27] N. Lu, D. Li, W. Shi, P. Vijayakumar, F. Piccialli, V. Chang, An efficient combined deep neural network based malware detection framework in 5G environment, Comput. Netw. 189 (2021) 107932, doi:10.1016/J.COMNET.2021.107932.
[28] Z. Wang, G. Li, Z. Zhuo, X. Ren, Y. Lin, J. Gu, A deep learning method for android application classification using semantic features, Secur. Commun. Netw. 2022 (2022), doi:10.1155/2022/1289175.
[29] J. Kim, Y. Ban, E. Ko, H. Cho, J.H. Yi, MAPAS: a practical deep learning-based android malware detection system, Int. J. Inf. Secur. (2022) 1–14, doi:10.1007/S10207-022-00579-6.
[30] K. Bakour, H.M. Ünver, DeepVisDroid: android malware detection by hybridizing image-based features with deep learning techniques, Neural Comput. Appl. 33 (2021) 11499–11516, doi:10.1007/S00521-021-05816-Y.
[31] A. Pektaş, T. Acarman, Deep learning for effective Android malware detection using API call graph embeddings, Soft Comput. 24 (2020) 1027–1043, doi:10.1007/S00500-019-03940-5.
[32] E.B. Karbab, M. Debbabi, A. Derhab, D. Mouheb, Android malware detection using deep learning on API method sequences, (2017).
[33] Z. Yuan, Y. Lu, Y. Xue, DroidDetector: Android malware characterization and detection using deep learning, vol. 21, 2016.
[34] N. McLaughlin, J.M. Del Rincon, B.J. Kang, S. Yerima, P. Miller, S. Sezer, et al., Deep android malware detection, CODASPY 2017 - Proc. 7th ACM Conf. Data Appl. Secur. Priv. (2017) 301–308, doi:10.1145/3029806.3029823.
[35] Z. Xu, K. Ren, S. Qin, F. Craciun, CDGDroid: Android malware detection based on deep learning using CFG and DFG, (2018) 11232.
[36] M. Ahmad, D. Javeed, M. Shoaib, N. Younas, A. Zaman, An efficient approach of deep learning for android malware detection, United Int. J. Res. Technol. 02 (2021) 15–20.
[37] M.K. Alzaylaee, S.Y. Yerima, S. Sezer, DL-Droid: Deep learning based android malware detection using real devices, Comput. Secur. 89 (2020) 101663, doi:10.1016/J.COSE.2019.101663.
[38] C. Hasegawa, H. Iyatomi, One-dimensional convolutional neural networks for Android malware detection, in: Proc - 2018 IEEE 14th Int Colloq Signal Process Its Appl CSPA 2018, 2018, pp. 99–102, doi:10.1109/CSPA.2018.8368693.
[39] K. Xu, Y. Li, R.H. Deng, K. Chen, DeepRefiner: multi-layer android malware detection system applying deep neural networks, (2018), doi:10.1109/EuroSP.2018.00040.
[40] D. Li, Z. Wang, Y. Xue, Fine-grained android malware detection based on deep learning, 2018 IEEE Conf Commun Netw Secur CNS 2018, 2018, doi:10.1109/CNS.2018.8433204.
[41] Z. Ma, H. Ge, Z. Wang, Y. Liu, X. Liu, Droidetec: Android malware detection and malicious code localization through deep learning, (2020).
[42] T. Chen, Q. Mao, M. Lv, H. Cheng, Y. Li, Droidvecdeep: Android malware detection based on word2vec and deep belief network, KSII Trans. Internet Inf. Syst. 13 (2019) 2180–2197, doi:10.3837/TIIS.2019.04.025.
[43] X. Qin, F. Zeng, Y. Zhang, MSndroid: The android malware detector based on multi-class features and deep belief network, ACM Int Conf Proceeding Ser, 2019, doi:10.1145/3321408.3321606.
[44] O.N. Elayan, A.M. Mustafa, Android malware detection using deep learning, Procedia Comput. Sci. 184 (2021) 847–852, doi:10.1016/J.PROCS.2021.03.106.
[45] A. Naway, Y. Li, A review on the use of deep learning in android malware detection, Cryptography Secur. (2018), doi:10.48550/arXiv.1812.10360.
[46] E. Amer, S. El-Sappagh, Robust deep learning early alarm prediction model based on the behavioural smell for android malware, Comput. Secur. 116 (2022) 102670, doi:10.1016/J.COSE.2022.102670.
[47] T.M. Sanyanga, M.S. Chinzvende, T.D. Kavu, J. Batani, Searching objects in a video footage, Int. J. ICT Res. Africa Middle East 8 (2019) 18–31, doi:10.4018/ijictrame.2019070102.


[48] T. Ahmad, Y. Ma, M. Yahya, B. Ahmad, S. Nazir, A. Haq, Object detection through modified YOLO neural network, Sci. Program. 2020 (2020).
[49] E. Mbunge, J. Batani, R. Mafumbate, C. Gurajena, S. Fashoto, T. Rugube, et al., Predicting student dropout in massive open online courses using deep learning models - a systematic review, Cybern. Perspect. Syst. CSOC 2022. Lect. Notes Netw. Syst., Cham: Springer (2022) 212–231, doi:10.1007/978-3-031-09073-8_20.
[50] C. Janiesch, P. Zschech, K. Heinrich, Machine learning and deep learning, Electron. Mark. 31 (2021) 685–695, doi:10.1007/s12525-021-00475-2.
[51] J. Batani, E. Mbunge, B. Muchemwa, G. Gaobotse, C. Gurajena, S. Fashoto, et al., A review of deep learning models for detecting cyberbullying on social media networks, in: Lect. Notes Netw. Syst., Cham: Springer, 2022, pp. 528–550, doi:10.1007/978-3-031-09073-8_46.
[52] E. Mbunge, R.C. Millham, M.N. Sibiya, S. Takavarasha, Application of machine learning models to predict malaria using malaria cases and environmental risk factors, 2022 Conf Inf Commun Technol Soc ICTAS 2022 - Proc 2022, 2022, doi:10.1109/ICTAS53252.
[53] E. Mbunge, S.G. Fashoto, H. Bimha, Prediction of box-office success: a review of trends and machine learning computational models, Int. J. Bus. Intell. Data Min. 20 (2022) 192–207, doi:10.1504/IJBIDM.2022.120825.
[54] A. Vial, D. Stirling, M. Field, M. Ros, C. Ritz, M. Carolan, et al., The role of deep learning and radiomic feature extraction in cancer-specific predictive modelling: a review, Transl. Cancer Res. 7 (2018) 803–816, doi:10.21037/tcr.2018.05.02.
[55] M.A. Keyvanrad, M.M. Homayounpour, A brief survey on deep belief networks and introducing a new object oriented toolbox (DeeBNet), (2014), doi:10.48550/arxiv.1408.3264.
[56] G.E. Hinton, Deep belief networks, Scholarpedia 4 (2009) 5947, doi:10.4249/SCHOLARPEDIA.5947.
[57] A.R. Mohamed, G. Hinton, G. Penn, Understanding how deep belief networks perform acoustic modelling, ICASSP, IEEE Int. Conf. Acoust. Speech Signal Process - Proc. (2012) 4273–4276, doi:10.1109/ICASSP.2012.6288863.
[58] Y. Hua, J. Guo, H. Zhao, Deep Belief Networks and deep learning, in: Proceeding of the 2015 International Conference on Intell Computer Internet Things, ICIT 2015, 2015, pp. 1–4, doi:10.1109/ICAIOT.2015.7111524.
[59] K. Zhang, S. Shi, S. Liu, J. Wan, L. Ren, Research on DBN-based evaluation of distribution network reliability, E3SWC 242 (2021) 03004, doi:10.1051/E3SCONF/202124203004.
[60] H. Zhou, X. Yang, H. Pan, W. Guo, An android malware detection approach based on SIMGRU, IEEE Access 8 (2020) 148404–148410, doi:10.1109/ACCESS.2020.3007571.
[61] I. Chingombe, T. Dzinamarira, D. Cuadros, M.P. Mapingure, E. Mbunge, S. Chaputsira, et al., Predicting HIV status among men who have sex with men in Bulawayo & Harare, Zimbabwe using bio-behavioural data, recurrent neural networks, and machine learning techniques, Trop. Med. Infect. Dis. 7 (2022) 231, doi:10.3390/TROPICALMED7090231.
[62] S. ur Rehman, M. Khaliq, S.I. Imtiaz, A. Rasool, M. Shafiq, A.R. Javed, et al., DIDDOS: an approach for detection and identification of Distributed Denial of Service (DDoS) cyberattacks using Gated Recurrent Units (GRU), Futur. Gener. Comput. Syst. 118 (2021) 453–466, doi:10.1016/J.FUTURE.2021.01.022.
[63] S. Kostadinov, Understanding GRU networks, Towar. Data Sci. (2017).
[64] X. Wang, J. Xu, W. Shi, J. Liu, OGRU: an optimized gated recurrent unit neural network, J. Phys. Conf. Ser. 1325 (2019), doi:10.1088/1742-6596/1325/1/012089.
[65] J. Gu, Z. Wang, J. Kuen, L. Ma, A. Shahroudy, B. Shuai, et al., Recent advances in convolutional neural networks, Pattern Recognit. 77 (2018) 354–377, doi:10.1016/J.PATCOG.2017.10.013.
[66] E. Mbunge, S. Fashoto, R. Mafumbate, S. Nxumalo, Diverging hybrid and deep learning models into predicting students’ performance in smart learning environments – a review, Lect. Notes Inst. Comput. Sci. Soc. Telecommun. Eng. LNICST 405 (2022) 182–202, doi:10.1007/978-3-030-93314-2_12.
[67] Z. Wang, Q. Liu, Y. Chi, Review of android malware detection based on deep learning, IEEE Access 8 (2020) 181102–181126, doi:10.1109/ACCESS.2020.3028370.
[68] L. Alzubaidi, J. Zhang, A.J. Humaidi, A. Al-Dujaili, Y. Duan, O. Al-Shamma, et al., Review of deep learning: concepts, CNN architectures, challenges, applications, future directions, J. Big Data 8 (2021), doi:10.1186/S40537-021-00444-8.
[69] G. Montavon, W. Samek, K.R. Müller, Methods for interpreting and understanding deep neural networks, Digit. Signal Process. 73 (2018) 1–15, doi:10.1016/J.DSP.2017.10.011.
[70] W. Samek, G. Montavon, S. Lapuschkin, C.J. Anders, K.R. Müller, Explaining deep neural networks and beyond: a review of methods and applications, Proc. IEEE 109 (2021) 247–278, doi:10.1109/JPROC.2021.3060483.
[71] E. Mbunge, B. Muchemwa, Deep learning and machine learning techniques for analyzing travelers' online reviews: a review, (2022) 20–39, doi:10.4018/978-1-7998-8306-7.CH002.
[72] P. Shanmugam, B. Venkateswarulu, R. Dharmadurai, T. Ranganathan, M. Indiran, M. Nanjappan, Electro search optimization based long short-term memory network for mobile malware detection, Concurr. Comput. Pract. Exp. 34 (2022) e7044, doi:10.1002/CPE.7044.
[73] I.U. Haq, T.A. Khan, A. Akhunzada, A dynamic robust DL-based model for android malware detection, IEEE Access 9 (2021) 74510–74521, doi:10.1109/ACCESS.2021.3079370.
[74] X. Xiao, S. Zhang, F. Mercaldo, G. Hu, A.K. Sangaiah, Android malware detection based on system call sequences and LSTM, Multimed. Tools Appl. 78 (2019) 3979–3999, doi:10.1007/S11042-017-5104-0.
[75] H. Fan, L. Zhu, Y. Yang, Cubic LSTMs for video prediction, in: 33rd AAAI Conference on Artificial Intelligence AAAI 2019, 31st Innovation Appl Artificial Intelligence Conference IAAI 2019, 9th AAAI Symp Educ Adv Artif Intell EAAI 2019, 2019, pp. 8263–8270, doi:10.1609/aaai.v33i01.33018263.
[76] D. Arp, M. Spreitzenbarth, M. Hubner, H. Gascon, K. Rieck, Drebin: effective and explainable detection of android malware in your pocket, NDSS (2014).
[77] A.H. Lashkari, A.F.A. Kadir, L. Taheri, A.A. Ghorbani, Toward developing a systematic approach to generate benchmark android malware datasets and classification, Proc. - Int. Carnahan Conf. Secur. Technol. 2018-Octob (2018), doi:10.1109/CCST.2018.8585560.
[78] Y. Zhou, X. Jiang, Dissecting Android malware: characterization and evolution, Proc. - IEEE Symp. Secur. Priv. (2012) 95–109, doi:10.1109/SP.2012.16.
[79] K. Chen, P. Wang, Y. Lee, X. Wang, N. Zhang, H. Huang, et al., Finding unknown malice in 10 seconds: mass vetting for new threats at the Google-Play scale, (2015).
[80] D. Maiorca, D. Ariu, I. Corona, M. Aresu, G. Giacinto, Stealth attacks: an extended insight into the obfuscation effects on Android malware, Comput. Secur. 51 (2015) 16–31, doi:10.1016/J.COSE.2015.02.007.
[81] F. Wei, Y. Li, S. Roy, X. Ou, W. Zhou, Deep ground truth analysis of current android malware, Lect. Notes Comput. Sci. (Including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics) 10327 (2017) 252–276, doi:10.1007/978-3-319-60876-1_12.
[82] M. Lindorfer, M. Neugschwandtner, C. Platzer, MARVIN: Efficient and comprehensive mobile app classification through static and dynamic analysis, Proc. - Int. Comput. Softw. Appl. Conf. 2 (2015) 422–433, doi:10.1109/COMPSAC.2015.103.
[83] F. Specht, J. Otto, Hardening deep neural networks in condition monitoring systems against adversarial example attacks, (2021) 103–111, doi:10.1007/978-3-662-62746-4_11.

4 pages
Comparative Analysis of OSI and TCPIP Models in Network Communication
No ratings yet
Comparative Analysis of OSI and TCPIP Models in Network Communication
8 pages
Ansible Commands
No ratings yet
Ansible Commands
9 pages
Computational Physics
No ratings yet
Computational Physics
36 pages
Victron VM 3P75CT Energy Meter PDF en
No ratings yet
Victron VM 3P75CT Energy Meter PDF en
16 pages
Programmable DC Power Supply
No ratings yet
Programmable DC Power Supply
4 pages
GPC GIS Corporate Brochure
No ratings yet
GPC GIS Corporate Brochure
2 pages
HW2 CT1503
No ratings yet
HW2 CT1503
4 pages
Apprentice Edit Details PDF
No ratings yet
Apprentice Edit Details PDF
2 pages
Hart Communication Report
No ratings yet
Hart Communication Report
25 pages
CNS-Key MGMT
No ratings yet
CNS-Key MGMT
35 pages
Programmierhandbuch alle
100% (1)
Programmierhandbuch alle
873 pages
Gate Da
No ratings yet
Gate Da
221 pages
0 Dumpacore 3rd Com.samsung.android.app.Contacts
No ratings yet
0 Dumpacore 3rd Com.samsung.android.app.Contacts
2,772 pages
Caltta DMR Trunking Technology White Paper
No ratings yet
Caltta DMR Trunking Technology White Paper
28 pages

Fig. 1. PRISMA flowchart.

Reference Deep learning model Performance Dataset Limitations/Future work

[39] Long Short-Term Accuracy: 97.74% • VirusShare Sometimes it fails

Fig. 2. Deep learning-based malware detec-

Fig. 3. Taxonomy of CNN and application areas [65].

Ref Dataset Collection Time Malware Samples Source

[77] CICAndMal2017 2017 365 https://www.unb.ca/cic/datasets/andmal2017.html
