Pervasive and Mobile Computing 97 (2024) 101859

Contents lists available at ScienceDirect

Pervasive and Mobile Computing


journal homepage: www.elsevier.com/locate/pmc

Hybrid machine learning model for malware analysis in Android apps
Saba Bashir a, Farwa Maqbool b, Farhan Hassan Khan c, *, Asif Sohail Abid c
a Department of Software Engineering, Federal Urdu University of Arts, Sciences and Technology, Islamabad, Pakistan
b Department of Computer Science and Software Engineering, Islamic International University, Islamabad, Pakistan
c Knowledge & Data Science Research Center (KDRC), Department of Computer and Software Engineering, College of E&ME, National University of Sciences & Technology (NUST), Islamabad, Pakistan

ARTICLE INFO

Keywords: Android; Malware detection; Machine learning; Ensemble learning; Classification

ABSTRACT

Android smartphones have been widely adopted across the globe. Because they can access private and confidential information, these devices are targeted by malware authors. The dramatic escalation of attacks creates a need for a robust system that detects malicious actions in Android applications. Malware analysis consists of static and dynamic analysis. This research work proposes a hybrid machine learning model based on static and dynamic analysis which offers efficient classification and detection of Android malware. The proposed malware classification technique can process any Android application, extract its features, and predict whether the application under analysis is malware or benign. The proposed malware detection model can characterize diverse malware types from the Android platform with a high positive rate. The proposed approach detects malicious applications in reduced execution time while also improving the security of Android as compared to existing approaches. State-of-the-art machine learning algorithms such as Support Vector Machine, k-Nearest Neighbor, Naïve Bayes, and different ensembles are applied to benign and malicious applications to assess the performance of all classifiers on permissions, API calls, and intents for identifying malware. The proposed technique is evaluated on the Drebin, MalGenome, and Kaggle datasets, and the outcomes indicate that this system improves runtime detection of malware with high speed and accuracy. A best accuracy of 100% is achieved on a benchmark dataset when compared with state-of-the-art techniques. Furthermore, the proposed approach outperforms state-of-the-art techniques in terms of computational time, true positive rate, false positive rate, accuracy, precision, recall, and F-measure.

1. Introduction

Malware, also termed malicious software, disrupts a system's operation by gaining access to its sensitive data and private information, resulting in damage or loss [1]. Nowadays, most operating systems in use are mobile operating systems that run Android applications. Device manufacturers mostly pre-install applications on smartphone operating systems. The operating system running on a smartphone provides different features and functionalities to users and also supports interaction with end users. Multiple services such as Short Messaging Service (SMS), email, route finding, social networking, calls, and online payments are provided by mobile operating systems [2].

* Corresponding author.
E-mail address: [email protected] (F.H. Khan).

https://doi.org/10.1016/j.pmcj.2023.101859
Received 30 October 2021; Received in revised form 15 September 2023; Accepted 2 November 2023
Available online 8 November 2023
1574-1192/© 2023 Elsevier B.V. All rights reserved.


The use of Android smartphones has increased rapidly over time, and they have therefore been increasingly targeted by hackers and cyber criminals, who use malicious applications to attack mobile operating systems. These applications can access private and sensitive information on mobile devices by making premium calls, accessing the location, sending spam advertisements, and sending SMS messages, all without the user's knowledge [3]. A number of studies have been conducted on malware applications, and they find that the most critical issue is privacy violation, in which the malware accesses the personal information of the mobile phone user [4]. Furthermore, many malware applications are capable of multiple harmful attacks, such as DDoS attacks and botnets, which allow the attacker to take complete control of a device along with its connections [5]. Another critical issue is phishing fraud and identity theft, in which personal and financial data stored on smartphones for purposes such as tax payment and financial assistance is stolen [6].
Installing mobile applications from third parties is not trustworthy, as such applications can easily attack the Android operating system using different malware [7]. Research conducted by McAfee security [8] stated that the amount of mobile malware has increased very rapidly, crossing the 6 billion mark in the first quarter of 2018. Such a high rate is critical for mobile users and needs to be controlled. There is a need for effective and efficient techniques that can restrict the explosion of malware applications. Android platforms provide a number of security mechanisms, such as the Android permission control mechanism, which controls the access of malicious applications to some extent. However, some applications explicitly declare permissions in order to access sensitive data, for example by obtaining the user's consent and then reading personal contact information. Security mechanisms need to be improved: developers must be aware of the latest threats and of the permissions their applications request, and users should know whether or not to share sensitive information. Such permission control would limit the circulation and spread of malicious applications [9].
Machine learning techniques can automatically learn application behavior patterns when they are used together with program analysis techniques, and mainstream malicious applications can be identified effectively in this way. Program analysis techniques comprise two types of approaches, i.e., static and dynamic. Static analysis techniques examine the internal structure of a program, whereas dynamic analysis is performed during the execution of a program [9,10]. Machine learning techniques include classification, which categorizes data into multiple classes; for example, mobile applications can be categorized as malicious or non-malicious. A number of classification algorithms are used for this purpose, such as Support Vector Machine (SVM), Random Forest (RF), Naïve Bayes (NB), and K-Nearest Neighbor (K-NN). Similarly, combinations of these algorithms are used in ensembles such as Majority Voting, Bagging, Boosting, and Stacking [11].
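As a rough illustration of the simplest ensemble named above, the following sketch combines the labels predicted by several base classifiers through majority voting. The function name, the label strings, and the tie-breaking rule are assumptions made for this example; they are not taken from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by most base classifiers.

    Ties are broken in favour of "malware" (an assumption of this sketch,
    chosen so that uncertain cases are flagged rather than passed).
    """
    counts = Counter(predictions)
    top = max(counts.values())
    winners = [label for label, n in counts.items() if n == top]
    return "malware" if "malware" in winners else winners[0]


# Hypothetical outputs of three base classifiers (e.g. SVM, K-NN, NB) for one app
votes = ["malware", "benign", "malware"]
print(majority_vote(votes))
```

Bagging, boosting, and stacking differ in how the base classifiers are trained and combined; majority voting is shown only because it needs no training step to demonstrate.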

1.1. Research motivation

A recent report revealed that about 2.6 million Android applications, 22% of them of poor quality, are available on the market [10]. The popularity of Android applications has led to an increase in Android malware. Most of this malware is distributed through markets run by third parties, yet paradoxically it cannot be guaranteed that all applications registered in the Google Android market are threat free either. Malware types include Banking Trojans, Phishing, Bots, Spyware, SMS Fraud, Root Exploits, Fake Installers, and Premium Dialers. After installation, Trojan apps download malicious code, which means that these apps cannot be detected by Google's technology when they are registered in the Google Android market.
Outdated smartphones and operating systems that lack current patches give hackers a chance to install malware on a device. Other causes include the unavailability of recent updates for a device, or updates that are available but not installed by customers. In addition, some cheap devices come with preinstalled malware that is hidden from owners and cannot be disabled. In this way, cyber criminals can easily gain access to smartphones and all the private information they hold. Existing machine learning techniques for malware classification are either static or dynamic and cannot completely eliminate malware from devices, whereas the proposed research is a hybrid model that combines static and dynamic analysis. It uses static features and API calls, which overcome each other's limitations, to detect and classify malware efficiently and more accurately. The proposed technique is thus intended to detect all types of malware.

1.2. Research contributions

The following contributions are presented in this research:

• A novel malware classification model is proposed that can process an application, extract its features, and predict whether the application under analysis is malware or benign. The proposed malware detection model can characterize diverse malware types from the Android platform with a high positive rate.
• Multiple machine learning classifiers, such as Support Vector Machine, k-Nearest Neighbor, Naïve Bayes, and ensembles, are evaluated using API calls, manifest permissions, intent filters, and command signatures. The model employs various monitored attributes such as permission, intent filter, provider, process name, and constant strings mined from the Android application.
• The proposed model utilizes feature selection and an ensemble-based approach to improve malware classification performance, thereby resolving the existing problems of homogeneous single-classifier approaches.
• The proposed ensemble learning method for malware classification has high accuracy and reduced computation time.

The rest of the paper is organized as follows: Section 2 reviews related techniques. Section 3 presents a brief description of the proposed malware detection technique. Section 4 conducts both experimental and analytical evaluation of malware detection and discusses its influence on Android performance. Finally, Section 5 presents a summary of the proposed framework and discusses future work.

2. Literature review

This section presents a critical review of different state-of-the-art approaches for malware detection, described by their main idea, advantages, limitations, methods, and evaluations. Broadly, these techniques fall into two categories: static and dynamic analysis. Approaches falling under each category are described below.

2.1. Static analysis techniques

Milosevic et al. [11] introduced a model for app permission classification which utilizes machine learning techniques for malware detection. The proposed model is lightweight and inexpensive, so it can be installed on any mobile device. The API calls are analyzed using the MODROID dataset, which is connected to the server. After examination, a signature is released that is used for threat detection and is transferred to users. The results indicate that the permission-based technique identified 89% of malicious applications, whereas source code analysis identified 95%. An SVM classifier, which maps the input to a high-dimensional feature space, is used for classification and achieves 95.1% accuracy. Source code classification takes more time, more than 10 seconds per application, and is therefore expensive in terms of execution time. Google Play Store apps can be scanned using this model.
Efficient data analysis and classification can be performed using static dataflow analysis. Wu et al. [12] presented a static analysis technique which accurately extracts API features that are correlated with dataflow. The K-NN algorithm is used to increase accuracy. Furthermore, API features are used to give high weight to malicious patterns, which helps to reduce the time overhead of static analysis. The data sample was collected from the Chinese Android market. The proposed model detects malware applications with 97.66% accuracy, and the time for static privacy leakage analysis is reduced by almost 40%.
Kozik et al. [13] presented a scalable distributed data processing framework based on Apache Spark. The proposed system accesses NetFlow data, which is stored in the Hadoop Distributed File System (HDFS). MLlib in Apache Spark cannot multiply larger matrices because of the rise in memory consumption. This issue is resolved by an algorithmic classifier that runs on the machine learning model. The MapReduce model distributes the Extreme Learning Machine (ELM) classifier computations to expose malware activity in NetFlow data. The technique uses the CTU dataset, which comprises cyber-attacks and botnet types. The results indicate that the ELM classifier based on NetFlow analysis shows efficient and consistent performance in detecting malware applications.
Moonsamy et al. [14] introduced a system based on permission patterns which classifies applications as malicious or clean. To evaluate the permission pattern mining algorithm, 1227 legitimate applications were gathered, drawn from a benchmark dataset and a third-party marketplace. Indirect inherent permissions are not supported by the proposed technique, although it can efficiently analyze permission patterns.
Calleja et al. [15] addressed the detection of potential malware and introduced a triage procedure which generated mislabeled malware samples. The proposed model is based on the IagoDroid technique, which is capable of faster searching for evasive variants. Generation 4 achieved 100% evasive results. The analysis indicates that the evasion rate lies between 90% and 99%, depending on the modifications introduced. IagoDroid has the potential to identify original malware families. However, a limitation of this technique is its reliance on human intervention to approve the needed changes as authentic modifications.
Yousefi-Azar et al. [16] also proposed a model, the Malytic approach, to differentiate between malware and benign families. Android applications are categorized using static features collected from binary files. The proposed framework works in three steps: feature abstraction, classification, and similarity measurement. A neural network with two hidden layers and one output layer is then used for classification. The performance of the proposed scheme is further improved using an ELM output layer. The Malytic approach generates better results when compared with DexShare and PEShare. An imbalanced dataset is used for experimentation and analysis. The results indicate that Malytic is robust and resilient in handling zero-day malware samples. The proposed model achieves 97.21% accuracy and a 99.45% F-score. The model can be applied efficiently to large-scale datasets with high speed and promising results. However, it is limited by its heavy memory use, since it stores all input samples and output layer weights. It is not considered an efficient approach for binary files, which are also targeted by malicious applications.

2.2. Dynamic analysis techniques

Rehman et al. [17] also presented a malware detection method using an SVM classifier which performs dynamic analysis to detect different types of malware. In addition, before an application is installed, the strings of its APK file are compared against known malware strings. If the APK file strings match the malware strings, the application is considered malicious; otherwise it is considered legitimate. Hence, the proposed scheme completes the string comparison before the APK file is installed on an Android phone. The MODROID dataset, which consists of Android applications, is used to determine whether an application is legitimate. The highest accuracy, 85.5%, is achieved by SVM, which works efficiently with the binaries of Android apps.
Kumara et al. [18] proposed an advanced Virtual Machine Monitor (VMM) based Automated Internal and External (A-IntExt) system. Virtual Machine Introspection (VMI) periodically inspects the state of the Virtual Machine (VM) using Memory Forensics Analysis (MFA) and a machine learning model. The A-IntExt system applies an Intelligent Cross View Analyzer (ICVA) to notice and distinguish hidden and dead data on VMI, and detects malware indicators through the Time Interval Threshold (TIT) technique. The data extracted by MFA is further analyzed by the machine learning model to detect malicious executables. The A-IntExt system is evaluated on 3750 real malware samples and 4500 benign applications that run on a live VM. With 10-fold cross validation, the A-IntExt system attains a False Positive Rate (FPR) of 0.004 and 99.55% accuracy. The proposed system is considered robust and can be used in many practical real-life applications. On the other hand, A-IntExt has a few shortcomings: during periodic introspection, it does not consider malware explosion in the kernel method.
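Figures such as the false positive rate and accuracy quoted for the reviewed systems are derived from confusion-matrix counts. The sketch below shows the standard formulas; the function name and the example counts are hypothetical and are not results from any cited study.

```python
def detection_metrics(tp, fp, tn, fn):
    """Derive standard detection metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (both classes are present and at
    least one sample is predicted positive).
    """
    tpr = tp / (tp + fn)                        # true positive rate (recall)
    fpr = fp / (fp + tn)                        # false positive rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    f_measure = 2 * precision * tpr / (precision + tpr)
    return {"tpr": tpr, "fpr": fpr, "accuracy": accuracy,
            "precision": precision, "f_measure": f_measure}


# Hypothetical counts for illustration only
metrics = detection_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In k-fold cross validation these counts would typically be accumulated, or the metrics averaged, over the k held-out folds.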
Narayanan et al. [19] proposed an online learning framework named Context-aware, Adaptive and Scalable ANDRoid mAlware detector (CASANDRA). The proposed scheme repeatedly takes labeled data and updates the model to predict new samples. CASANDRA relies on four design goals. The first is accuracy, which depends on how expressive the Program Representation Graph (PRG) is. Second, efficiency is attained by a scalable graph and the Confidence Weighted (CW) algorithm. Third is adaptiveness: CASANDRA adapts to malware population drift by using an online classifier. The fourth and last goal is explainability: CWLK and a linear algorithm are used together in CASANDRA to permit PRG feature depiction, so that the detector does not act like a black-box solution. DREBIN provides a benchmark dataset that consists of 5560 malware samples from 179 families. A dataset of 87257 apps is collected In The Wild (ITW) via the VirusTotal website. An accuracy of 89.92% is attained on the ITW dataset. For the large experimental dataset, CASANDRA takes only 28.23 milliseconds to detect a labeled sample. Owing to its high scalability and accurate performance, CASANDRA is mostly used in online solutions and for malware application detection tasks.
Feng et al. [20] proposed the EnDroid analysis framework, a dynamic feature based method that uses dynamic behavior features to detect malware families. The technique applies a Chi-square algorithm to the dynamic features to eliminate noise and extract critical features. The extracted features are considered more critical for supporting the detection of malware behavior. The risky behaviors are fed to EnDroid to achieve effective malware detection through a stacking ensemble. The proposed technique is evaluated on two datasets, M1 and M2. M1 consists of 8806 legitimate and 5213 malicious applications. Likewise, M2 comprises 5000 benign and 5000 malicious applications from AndroZoo. The stacking ensemble obtained good performance in malware classification.
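To make the Chi-square feature selection step mentioned above more concrete, the following minimal sketch scores one binary behavior feature against binary class labels using the standard 2x2 contingency-table statistic. It is an illustration under simplifying assumptions (binary features, binary labels) and is not the EnDroid implementation; the function name and toy data are hypothetical.

```python
def chi_square(feature, labels):
    """Chi-square statistic of a binary feature against binary class labels."""
    n = len(labels)
    # Observed counts of the 2x2 contingency table, keyed by (feature, label)
    observed = {(f, y): 0 for f in (0, 1) for y in (0, 1)}
    for f, y in zip(feature, labels):
        observed[(f, y)] += 1
    score = 0.0
    for f in (0, 1):
        for y in (0, 1):
            row_total = observed[(f, 0)] + observed[(f, 1)]
            col_total = observed[(0, y)] + observed[(1, y)]
            expected = row_total * col_total / n
            if expected > 0:
                score += (observed[(f, y)] - expected) ** 2 / expected
    return score


# Toy data: 1 = malware, 0 = benign (illustrative only)
labels = [1, 1, 1, 1, 0, 0, 0, 0]
informative = [1, 1, 1, 1, 0, 0, 0, 0]   # present exactly in the malware samples
noisy = [1, 0, 1, 0, 1, 0, 1, 0]         # unrelated to the class

print(chi_square(informative, labels))
print(chi_square(noisy, labels))
```

Features with a high score are kept as critical, while features whose score is near zero behave like noise with respect to the class and can be dropped.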
Niu et al. [21] proposed a dynamic analysis technique for malware classification using deep learning. A Function Call Graph (FCG) is used to perform behavioral analysis and capture malware features efficiently. The proposed model combines an opcode-level FCG with deep learning, using Long Short Term Memory (LSTM). The dataset, obtained from VirusShare and AndroZoo, consists of 1796 Android malware samples and 1000 benign samples. The proposed approach achieved an accuracy of 97%. However, function-based opcode has better performance for .apk files and could be employed to improve performance further. Furthermore, optimization of the deep learning model can be used to increase accuracy further.
Keyes et al. [22] introduced an entropy-based behavioral analysis method for malware classification, termed EntropLyzer. The dynamic characteristics are identified using six different classes of malware, and 147 malware families are used for analysis of the proposed model. The CCCS-CIC-AndMal2020 dataset is used, on which 98.4% precision and 98.3% recall are obtained. Real smartphone devices can be used to detect malware samples in real time.
Tables 1 and 2 present an overview of state-of-the-art malware detection and classification techniques based on static and dynamic analysis.
Previously proposed malware detection techniques utilized either API calls (dynamic analysis) only or permissions (static analysis) only. This concept is acceptable to some extent, but when a malware writer makes variations in the code, these methods break down immediately. The proposed system overcomes these difficulties by utilizing every feature of an application, such as keywords, permissions, intent filters, providers, and receivers. Similarly, combinations of constant strings in binaries are also mined. Malware detection and classification techniques of this kind are likewise intended to keep Android users safe. Additionally, the proposed framework achieves high performance in contrast to existing models.

Table 1
Static analysis based state-of-the-art techniques for malware detection and classification.

Reference/Year | Technique | Dataset | Results | Limitations
Wu et al. [12] (2016) | Static dataflow analysis with machine learning | VirusShare apps: 1160 benign, 1050 malware | 97% accuracy | Static privacy leakage analysis. Dynamic analysis should be incorporated.
Milosevic et al. [11] (2017) | SVM | MODROID | 95.1% accuracy | Takes more than 10 s on one application. Time consuming.
Kumara et al. [18] (2017) | VMM based on A-IntExt | VMM benchmark dataset | 99.55% accuracy, 0.004 FPR | Does not consider malware explosion in kernel method.
Wei et al. [23] (2017) | Affiliations in system function calls, sensitive permissions | 102 benign, 219 malicious samples | 7% FPR | Lower true positive rate. Cannot efficiently identify malware applications.
Alqatawna et al. [26] (2017) | Method-level application association relationship | Drebin (5.9K benign, 5.6K malware); AMD (20.5K benign, 20.8K malware) | 96% accuracy | Unable to analyze API calls of malicious behavior.
Li, J. et al. [32] (2018) | AndroPyTool | OmniDroid, 22,000 features | Potential usability | Semantic characterization is incorrect.
Zhang et al. [34] (2019) | MalDAE | Network, system, and process data | 94.39% accuracy | Failed to detect obfuscation malware.
Kumar et al. [35] (2019) | MalPat | 31185 benign, 15336 malicious | 98.24% F1 score | Feature overfitting problem.
Jannat et al. [36] (2019) | Mine permission features of manifest file | 504 benign, 231 malicious | 87.99% accuracy | High false alarm rate.
Massarelli et al. [38] (2020) | Detrended Fluctuation Analysis (DFA) and Pearson's correlation | Drebin and AMD datasets | 78% mean accuracy | Advanced UI tool can be used instead of Monkey Tool.

3. Materials and methods

The proposed framework is a hybrid model which introduces a novel approach based on static and dynamic analysis techniques, and it is designed to efficiently identify all kinds of malware present in the system. Static analysis scans all the application code and identifies malicious behavior without executing the code. The Android Asset Packaging Tool and the Baksmali tool are used to fetch static features such as requested hardware components, application components (for instance, services, receivers, and content providers), intent filters, suspicious API calls, and restricted API calls. Dynamic analysis, in contrast, installs and executes the application and analyzes its run-time behavior. Decrypted and dynamically loaded code is also monitored. The real-time behavior of the application is analyzed using dynamic system call features. It is observed that malicious applications issue system calls more frequently than legitimate applications. This frequent occurrence of system calls such as open, ioctl, brk, read, write, close, sendto, sendmsg, recvfrom, and recvmsg indicates malicious application behavior.
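The dynamic features described above can be pictured as a frequency vector over the monitored system calls. The sketch below is a minimal illustration that assumes a trace is already available as a list of call names; how the trace is captured is outside its scope, and the function name and sample trace are hypothetical.

```python
from collections import Counter

# System calls named in the text as indicators of malicious behavior
MONITORED_CALLS = ("open", "ioctl", "brk", "read", "write",
                   "close", "sendto", "sendmsg", "recvfrom", "recvmsg")


def syscall_features(trace):
    """Count how often each monitored system call occurs in a trace.

    `trace` is assumed to be a list of system call names recorded while the
    application runs; calls outside MONITORED_CALLS are ignored.
    """
    counts = Counter(call for call in trace if call in MONITORED_CALLS)
    return [counts[name] for name in MONITORED_CALLS]


# Hypothetical trace of one application run
trace = ["open", "read", "read", "sendto", "getpid", "close"]
print(dict(zip(MONITORED_CALLS, syscall_features(trace))))
```

The resulting fixed-length vector can be concatenated with static features before it is passed to a classifier.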
Fig. 1 shows the main components of the proposed malware detection model, and Fig. 2 presents a detailed flow. In this approach, an Android application dataset is gathered and different methods are then applied to distinguish malware from benign applications. The method involves running an application on a device and identifying its behavior using dynamic and static analysis.

Table 2
Dynamic analysis based state-of-the-art techniques for malware detection and classification.

Reference/Year | Technique | Dataset | Results | Limitations
Saracino et al. [27] (2016) | MADAM, host-based malware detection to analyze and correlate features | 2,800 apps, 125 different families | Accuracy 93%-96% | The botnet class is not discovered by MADAM (proposed model) at run time.
Narayanan et al. [19] (2017) | Online learning framework | DREBIN: 5560 malware samples, 179 families | 89.92% accuracy | 28.23 ms to detect labeled sample. System speed is slow.
Singh et al. [24] (2017) | TAPVerifier | 337 system calls of Linux | High accuracy and recall rate | Failed to detect all permissions.
Al Ali et al. [25] (2017) | Signature based, noticed an occurrence of opcode | Kaggle, Microsoft | Distinguished malware with highest accuracy | Analyzed only one dataset. More datasets should be incorporated.
Pekta et al. [31] (2017) | Dynamic approach | 11,000 APK, 123 permissions | Accuracy 97.9% | Limited to the use of only one dataset.
Rehman et al. [17] (2018) | Hybrid machine learning framework | MODROID | 85.5% accuracy | Takes 3 s to scan. Scan time should be in milliseconds, otherwise system will be slow.
Feng et al. [20] (2018) | EnDroid analysis framework that uses dynamic behavior based features | M1 (8806 legitimate, 5213 malicious); M2 (5000 benign, 5000 malicious) | Stacking outperformed in classification of malware | May fail to trigger malicious behaviors due to lack of necessary UI operations.
Calleja et al. [15] (2018) | Triage procedure to detect potential models of malware | RevealDroid, DREBIN: 1919 malware, 29 variant malware families | 90-99% accuracy | Comprehensive evaluation missing.
Yousefi-Azar et al. [16] (2018) | Malytic approach to distinguish the malware families | Drebin, DexShare, PEShare datasets | 97.21% F1-score on Android dex, 99.45% F1-score on Windows PE | Human intervention for transforming recommended changes. System should work without human intervention.
Sun et al. [28] (2018) | Behavior detection and categorization in known/unknown malware | 150 benign and 104 malware apps | Accuracy 88.2% | Analyzing a large number of apps took a lot of time. Not efficient.
Morales-Molina et al. [29] (2018) | Classify malicious apps through runtime performance | Google Play Store: 278 benign, 216 malicious | Out of 337, only 31 system calls identified correctly | Low accuracy on ransomware attacks.
Arshad et al. [30] (2018) | Merged blockchain and machine learning for efficient detection | Google Chinese play store: 6192 benign, 5560 malicious | High accuracy and robustness | Failure to manage obfuscated malware.
Du et al. [33] (2019) | SVM | 107,327 benign, 8,701 malapps | 82.93% accuracy | A_BROWSER and A_SECURITY have very poor categorization accuracy.
Burnap et al. [37] (2019) | Behavior features classified using API and parameters | 1220 malware | 88.3% TPR, 3.9% FPR | Failed to detect evasion malware.
Niu et al. [21] (2020) | Opcode-level FCG and deep learning | 1796 malware, 1000 benign | 97% accuracy | Limited by the use of function-based opcode.
Keyes et al. [22] (2021) | Entropy based behavioral analysis technique | CCCS-CIC-AndMal2020 dataset | 98.4% precision, 98.3% recall | Use of emulator reduced the detection of malware samples.
Hosseini et al. [39] (2021) | Convolutional neural network and LSTM | Drebin dataset, binary files | 98.8% accuracy | Deep neural network can be employed to obtain high accuracy.


Fig. 1. Components of Proposed Malware Classification Model.

3.1. Data collection

App Archive (.apk): Android Package (APK) files are archive files used by the Android operating system to install apps. An APK file consists of program code, resources, constant strings, and a manifest file. Malicious applications are identified from the constant strings of the downloaded APK file. Similarly, the manifest (.xml) file is an important file used to distinguish malicious applications from benign ones. This method is used to identify whether a program is malware before it is installed on an Android phone. When the APK file is decompressed using the Android Asset Packaging Tool, the following contents are extracted.
Manifest (.xml) File: This is an important file in the APK archive and is read first when an Android application is executed. It is located at the root of the Android project and describes the most important features of the application, such as intent filters, permissions, sender and receiver information, and hardware components.
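As a minimal sketch of how permissions and intent-filter actions might be read from a manifest, the example below parses a plain-text manifest with Python's standard XML parser. It assumes the manifest has already been decoded to text (inside an APK it is stored as binary XML); the function name and the sample manifest are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used for android:* attributes in AndroidManifest.xml
ANDROID_NS = "http://schemas.android.com/apk/res/android"


def manifest_features(manifest_xml):
    """Extract requested permissions and intent-filter actions from manifest text."""
    root = ET.fromstring(manifest_xml)
    name_attr = f"{{{ANDROID_NS}}}name"
    permissions = [e.get(name_attr) for e in root.iter("uses-permission")]
    actions = [e.get(name_attr) for e in root.iter("action")]
    return {"permissions": permissions, "intent_actions": actions}


# Hypothetical decoded manifest used only for illustration
SAMPLE = """<manifest xmlns:android="http://schemas.android.com/apk/res/android"
          package="com.example.app">
  <uses-permission android:name="android.permission.READ_SMS"/>
  <uses-permission android:name="android.permission.INTERNET"/>
  <application>
    <receiver android:name=".BootReceiver">
      <intent-filter>
        <action android:name="android.intent.action.BOOT_COMPLETED"/>
      </intent-filter>
    </receiver>
  </application>
</manifest>"""

print(manifest_features(SAMPLE))
```

Providers, receivers, and process names could be collected the same way by iterating over the corresponding element names.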
Classes (.dex) File: Typically, an Android application is written in the Java programming language, and when this code is compiled, the intermediate state contains .class files. The .class files are then converted into .dex format (Dalvik executable files), an optimized bytecode for Android applications. The application is assembled and disassembled using the Baksmali tool. This module extracts the classes.dex application code and disassembles it, generating smali files as output. The Java code of the Android application is represented in the smali files in the smali language. Suspicious and restricted API calls are extracted from the smali code files along with command signatures.
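The extraction of constant strings and API calls from smali output can be approximated with regular expressions, as in the sketch below. The patterns cover only the common `const-string` and `invoke-*` forms, and the list of suspicious calls is a hypothetical placeholder rather than the list used in the paper.

```python
import re

# Matches: const-string v1, "some text"   (also the const-string/jumbo variant)
CONST_STRING = re.compile(r'const-string(?:/jumbo)?\s+\w+,\s*"((?:[^"\\]|\\.)*)"')
# Matches: invoke-virtual {v0, v1}, Lpkg/Class;->method(...)
INVOKE = re.compile(r'invoke-[\w/]+\s+\{[^}]*\},\s*(L[^;]+;->[\w$<>]+)')

# Hypothetical examples of calls treated as suspicious in this sketch
SUSPICIOUS_CALLS = {
    "Landroid/telephony/SmsManager;->sendTextMessage",
    "Ljava/lang/Runtime;->exec",
}


def smali_features(smali_text):
    """Return constant strings, invoked methods, and the suspicious subset."""
    strings = CONST_STRING.findall(smali_text)
    calls = INVOKE.findall(smali_text)
    suspicious = [c for c in calls if c in SUSPICIOUS_CALLS]
    return strings, calls, suspicious


# Hypothetical smali fragment used only for illustration
SMALI = '''
.method public run()V
    const-string v1, "Device Not Rooted"
    invoke-static {}, Landroid/telephony/SmsManager;->getDefault()Landroid/telephony/SmsManager;
    invoke-virtual {v0, v1}, Ljava/lang/Runtime;->exec(Ljava/lang/String;)Ljava/lang/Process;
.end method
'''

print(smali_features(SMALI))
```

A production extractor would more likely work on the parsed Dalvik bytecode than on text patterns; the regular expressions are used here only to keep the example self-contained.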

3.2. Feature extraction (static analysis)

A robust Android malware detection framework depends on representative features. A reverse engineering process is applied to extract important features from applications, such as user permissions, provider and receiver information, process name, intent filters, and binaries. First, the APK file is converted into Java code files; after some modification, the code is converted back into an APK file. During this reverse engineering, features are extracted from the Android manifest .xml file, namely constant strings, area binaries, permissions, providers, receivers, intent filters, and processes.
When reverse engineering of APK file is performed, the binaries are stored in a folder in the form of constant strings. This constant
string may be attacked by hacker and attempts to make changes using reverse engineering of application. The decompiling of APK file
results in providing the source code of application which may also be attacked by the attacker/hacker by making changes into it (such
as [const-string v1, ‘Device Not Rooted’]). Then it is uploaded on Google play store for downloading at Android platform. Hence every
Android app has manifest .xml file that contains requested permissions. Such permissions need to be accepted by the user before
installation of apps.
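As a rough illustration of how permissions, intent actions and receivers can be read from a manifest, the sketch below uses Python's standard `xml.etree.ElementTree`. It assumes the manifest has already been decoded to plain-text XML (inside an APK it is stored in a binary form); `SAMPLE_MANIFEST` and `extract_manifest_features` are hypothetical names, not part of the proposed framework.

```python
import xml.etree.ElementTree as ET

# Attributes such as android:name live in the Android XML namespace.
ANDROID_NS = "{http://schemas.android.com/apk/res/android}"

# Hypothetical decoded manifest used only for this example.
SAMPLE_MANIFEST = """
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
          package="com.example.demo">
    <uses-permission android:name="android.permission.READ_SMS"/>
    <uses-permission android:name="android.permission.INTERNET"/>
    <application>
        <receiver android:name=".OnBootReceiver">
            <intent-filter>
                <action android:name="android.intent.action.BOOT_COMPLETED"/>
            </intent-filter>
        </receiver>
    </application>
</manifest>
""".strip()


def extract_manifest_features(xml_text):
    """Collect short names of permissions, intent actions and receivers."""
    root = ET.fromstring(xml_text)

    def short_names(tag):
        # Keep only the last dotted component, e.g. 'READ_SMS'.
        names = []
        for element in root.iter(tag):
            full_name = element.get(ANDROID_NS + "name")
            if full_name:
                names.append(full_name.rsplit(".", 1)[-1])
        return names

    return {
        "permissions": short_names("uses-permission"),
        "actions": short_names("action"),
        "receivers": short_names("receiver"),
    }


features = extract_manifest_features(SAMPLE_MANIFEST)
print(features)
```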
Fig. 2 shows that the proposed framework is divided into three main modules: data collection, feature extraction and behavior analysis. First, malware and benign application data are collected and the files are decompressed. Important features are then extracted from these files using a reverse engineering process, and static analysis is performed on these features to identify malware applications. Dynamic analysis, which is essentially behavior analysis, is performed next. Multiple machine learning classifiers and ensemble-based methods are trained and tested. The trained classifiers are then able to detect malware applications at run time.

Fig. 2. Detailed Framework of Proposed Android Malware Classification.

Keywords (manifest features) are also extracted from the application and used for malware detection. Each keyword has a specific purpose and is used in the code file. For example, ‘Read SMS’ allows an app to read received messages, and ‘Read Phone State’ allows it to read the phone number and related state. Malicious applications also contain such keywords. To distinguish malicious from legitimate files, all features extracted from the manifest are compared with the proposed keyword list. The malignancy of the application is then calculated using a threshold value, and finally machine learning algorithms are applied.
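The comparison against the keyword list and the threshold step can be pictured with the following sketch. The scoring rule (fraction of an app's manifest features that appear in the keyword list) and the threshold of 0.5 are assumptions made for illustration; the paper does not state the exact formula or threshold. `KEYWORDS` is a small subset of Table 3.

```python
# Illustrative subset of the Table 3 keyword list.
KEYWORDS = {"SEND_SMS", "READ_SMS", "RECEIVE_SMS", "READ_PHONE_STATE", "INSTALL_PACKAGES"}


def malignancy_score(features, keywords=KEYWORDS):
    """Fraction of the app's manifest features found in the keyword list.

    This scoring rule is an assumption for the example, not the paper's formula.
    """
    unique = set(features)
    if not unique:
        return 0.0
    return len(unique & keywords) / len(unique)


def is_suspicious(features, threshold=0.5):
    """Flag the app when its score reaches the (hypothetical) threshold."""
    return malignancy_score(features) >= threshold


def to_feature_vector(features, keywords=KEYWORDS):
    """Binary vector (one entry per keyword) that a classifier could consume."""
    present = set(features)
    return [1 if keyword in present else 0 for keyword in sorted(keywords)]


sms_app = ["READ_SMS", "SEND_SMS", "FOREGROUND_SERVICE", "POST_NOTIFICATIONS"]
print(malignancy_score(sms_app), is_suspicious(sms_app))
print(to_feature_vector(sms_app))
```

The binary vector is the form in which such keyword features are usually handed to the classifiers described in Section 3.3.1.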


Table 3 presents the keyword list for malicious manifest.xml files. Malicious applications are detected with the support of these extracted features. Each keyword has a specific meaning; for example, “Read SMS”, “Send SMS”, “Receive SMS” and “Write SMS” are keywords for the communications of the Android user. Similarly, READ_PHONE_STATE, READ_PROFILE, WRITE_PROFILE, READ_CONTACTS, WRITE_CONTACTS, READ_SMS, RECEIVE_SMS, WRITE_SMS, READ_CALL_LOG, WRITE_CALL_LOG and many more keywords listed in Table 3 are commonly used in malicious applications.
To analyze malicious applications, all extracted features are compared against the manifest.xml files of legitimate and malicious applications. The malignancy scores of applications are then calculated by applying different machine learning algorithms. Table 4 lists the constant strings, system calls and API calls that are built-in binaries of Android applications.

3.3. Behavior analysis (dynamic analysis)

Dynamic analysis reflects the behavior of file contents by tracking data flows, recording data functions, executing machine learning

Table 3
Extracted Keywords List from Manifest (.xml) file of Android Applications
Permissions 53. RECEIVE_WAP_PUSH 106. ACCESS_MOCK_LOCATION
1. SEND_SMS 54. CALL_PRIVILEGED 107. SET_PREFERRED_APPLICATIONS
2. GET_ACCOUNTS 55. READ_USER_DICTIONARY 108. ACCESS_WIFI_STATE
3. READ_PHONE_STATE 56. SET_TIME 109. CLEAR_APP_CACHE
4. RECEIVE_SMS 57. PROCESS_OUTGOING_CALLS 110. MODIFY_PHONE_STATE
5. READ_SMS 58. WRITE_SOCIAL_STREAM 111. READ_CONTACTS
6. USE_CREDENTIALS 59. WRITE_SETTINGS 112. HARDWARE_TEST
7. MANAGE_ACCOUNTS 60. BATTERY_STATS 113. DISABLE_KEYGUARD
8. WRITE_SMS 61. REBOOT Intent (Action)
9. READ_SYNC_SETTINGS 62. ACCESS_COARSE_LOCATION 1. PACKAGE_REPLACED
10. AUTHENTICATE_ACCOUNTS 63. BLUETOOTH_ADMIN 2. SEND_MULTIPLE
11. WRITE_HISTORY_BOOKMARKS 64. READ_SOCIAL_STREAM 3. TIME_SET
12. INSTALL_PACKAGES 66. WRITE_GSERVICES 4. PACKAGE_REMOVED
13. CAMERA 67. KILL_BACKGROUND_PROCESSES 5. TIMEZONE_CHANGED
14. READ_HISTORY_BOOKMARKS 68. STATUS_BAR 6. ACTION_POWER_DISCONNECTED
15. INTERNET 69. PERSISTENT_ACTIVITY 7. PACKAGE_ADDED
16. WRITE_SYNC_SETTINGS 70. CHANGE_NETWORK_STATE 8. ACTION_SHUTDOWN
17. RECORD_AUDIO 71. RECEIVE_MMS 9. PACKAGE_DATA_CLEARED
18. NFC 72. SET_TIME_ZONE 10. PACKAGE_CHANGED
19. BIND_REMOTEVIEWS 73. ADD_VOICEMAIL 11. NEW_OUTGOING_CALL
20. READ_PROFILE 74. BIND_APPWIDGET 12. SENDTO
21. ACCESS_LOCATION_EXTRA_COMMANDS 75. BIND_ACCESSIBILITY_SERVICE 13. CALL
22. MODIFY_AUDIO_SETTINGS 76. CALL_PHONE 14. SCREEN_ON
23. BROADCAST_STICKY 77. BROADCAST_WAP_PUSH 15. BATTERY_OKAY
24. BLUETOOTH 78. CONTROL_LOCATION_UPDATES 16. PACKAGE_RESTARTED
25. WAKE_LOCK 79. FLASHLIGHT 17. CALL_BUTTON
26. RESTART_PACKAGES 80. SET_PROCESS_LIMIT 18. SCREEN_OFF
27. WRITE_APN_SETTINGS 81. READ_LOGS 19. RUN
28. READ_SYNC_STATS 82. INSTALL_LOCATION_PROVIDER 20. SET_WALLPAPER
29. RECEIVE_BOOT_COMPLETED 83. ACCESS_SURFACE_FLINGER 21. BATTERY_LOW
30. READ_EXTERNAL_STORAGE 84. MOUNT_FORMAT_FILESYSTEMS 22. ACTION_POWER_CONNECTED
31. SUBSCRIBED_FEEDS_WRITE 85. SYSTEM_ALERT_WINDOW Receiver
32. READ_CALL_LOG 86. BIND_TEXT_SERVICE 1. ON_BOOT_RECEIVER
33. VIBRATE 87. READ_FRAME_BUFFER 2. AUTORUN_BROADCAST_RECEIVER
34. READ_CALENDAR 88. INTERNAL_SYSTEM_WINDOW 3. REMOTE
35. ACCESS_NETWORK_STATE 89. CHANGE_WIFI_STATE 4. SMS_RECEIVER
36. WRITE_CALENDAR 90. BROADCAST_SMS 5. SECURITY_RECEIVER
37. SUBSCRIBED_FEEDS_READ 91. CHANGE_CONFIGURATION 6. REPEATING_ALARM_SERVICE
38. CHANGE_WIFI_MULTICAST_STATE 92. EXPAND_STATUS_BAR 7. CHECKER
39. MASTER_CLEAR 93. CLEAR_APP_USER_DATA 8. AD_NOTIFICATION
40. WRITE_PROFILE 94. MOUNT_UNMOUNT_FILESYSTEMS 9. GCM_BROADCAST_RECEIVER
41. WRITE_CALL_LOG 95. SET_ACTIVITY_WATCHER 10. SEND
42. GLOBAL_SEARCH 96. WRITE_CONTACTS 11. MESSAGE_RECEIVER
43. GET_TASKS 97. BIND_VPN_SERVICE 12. ON_LOG_ALARAM_RECEIVER
44. REORDER_TASKS 98. WRITE_SECURE_SETTINGS 13. DATE_TIME_RECEIVER
45. DELETE_CACHE_FILES 99. DEVICE_POWER 14. AUTO_ANSER_RECEIVER
46. SET_WALLPAPER 100. ACCESS_FINE_LOCATION 15. SMS_SENDING_RECEIVER
47. DELETE_PACKAGES 101. WRITE_EXTERNAL_STORAGE 16. WIDGET-EVENT_ER
48. UPDATE_DEVICE_STATS 102. CHANGE_COMPONENT_ENABLED_STATE 17. ACION_RECEIVER
49. WRITE_USER_DICTIONARY 103. GET_PACKAGE_SIZE 18. SCURITY_RECEIVER
50. BIND_INPUT_METHOD 104. SET_ORIENTATION 19. BIND_DEVICE_ADMIN
51. BIND_WALLPAPER 105. SET_WALLPAPER_HINTS
52. DUMP


Table 4
Extracted Built-in Binaries of Android Applications.
API Calls

1. transact
2. onServiceConnected
3. bindService
4. attachInterface
5. ServiceConnection
6. android.os.Binder
7. android.telephony.SmsManager
8. android.content.pm.Signature
9. Ljava.lang.Class.getCanonicalName
10. Ljava.net.URLDecoder
11. Ljava.lang.Class.cast
12. Ljava.lang.Class.getMethods
13. getBinder
14. ClassLoader
15. System.loadLibrary
16. Ljava.lang.Class.getField
17. Landroid.content.Context.unregisterReceiver
18. Ljava.lang.Class.getDeclaredField
19. getCallingUid
20. Ljavax.crypto.spec.SecretKeySpec
21. DexClassLoader
22. android.content.pm.PackageInfo
23. HttpGet.init
24. SecretKey
25. KeySpec
26. TelephonyManager.getLine1Number
27. Landroid.content.Context.registerReceiver
28. Ljava.lang.Class.getMethod
29. android.intent.action.BOOT_COMPLETED
30. createSubprocess
31. Ljavax.crypto.Cipher
32. Runtime.getRuntime
33. TelephonyManager.getSubscriberId
34. android.telephony.gsm.SmsManager
35. Ljava.lang.Class.forName
36. Binder
37. IBinder
38. android.os.IBinder
39. android.intent.action.SEND
40. URLClassLoader
41. abortBroadcast
42. TelephonyManager.getDeviceId
43. getCallingPid
44. TelephonyManager.getCallState
45. TelephonyManager.getSimSerialNumber
46. Runtime.load
47. PathClassLoader
48. Ljava.lang.Class.getPackage
49. Ljava.lang.Class.getDeclaredClasses
50. TelephonyManager.getSimCountryIso
51. sendMultipartTextMessage
52. PackageInstaller
53. TelephonyManager.isNetworkRoaming
54. Ljava.lang.Class.getClasses
55. sendDataMessage
56. HttpPost.init
57. HttpUriRequest
58. divideMessage
59. Runtime.exec
60. TelephonyManager.getNetworkOperator
61. MessengerService
62. IRemoteService
63. SET_ALARM
64. ACCOUNT_MANAGER
65. TelephonyManager.getSimOperator
66. Ljava.lang.Class.getResource
67. Process.start
68. Context.bindService
69. ProcessBuilder
70. onBind
71. defineClass
72. findClass
73. Runtime.loadLibrary
Command Signature
1. Mount
2. Chmod
3. remount
4. chown
5. /system/bin
6. /system/app

algorithms and tracking the dynamic binaries. Dynamic analysis also keeps observing the behavior and actions of the executing code (software), even when it runs in a controlled environment.
Because classification and behavior analysis are performed by executing the code, dynamic analysis here consists of classifying the code as malware or benign. Multiple classification algorithms, namely SVM, KNN, Naïve Bayes, Bagging and Random Forest, are used to classify malware in Android apps. Static analysis extracts features from the binary code of the program, and these features are then used to build models for behavior analysis.
In summary, feature extraction is performed without executing the code (static analysis), whereas running the models on the extracted features and classifying malware with machine learning algorithms is done by executing the files (dynamic analysis). Because feature extraction (static analysis) is part of data preprocessing, and the core of the proposed methodology is algorithm selection, execution and analysis (dynamic analysis), the proposed methodology is a hybrid approach to malware classification of Android apps.
The extracted binaries contain built-in constant strings, API calls and system calls. Similarly, the keywords include broadcast receivers. The next stage is to train and test the proposed model in order to obtain the required results and perform dynamic analysis. 10-fold cross validation is applied when training and testing the classifiers. A database was created from the manifest.xml files of Android applications by collecting services, broadcasts, activities and broadcast receivers. Existing malware samples and frequent keywords are analyzed, and a data structure is then created. The following classification algorithms and their combinations are used for training and testing.

3.3.1. Classification algorithms

a) Support Vector Machine

Support Vector Machines (SVMs) are supervised learning models that analyze data for classification and regression [33]. With a given choice of hyperplane, the points $x$ in the feature space that are mapped onto the hyperplane are defined by the relation

$$\sum_i \alpha_i\, k(x_i, x) = \text{constant} \tag{1}$$

Note that if $k(x, y)$ becomes small as $y$ moves further away from $x$, each term in the sum measures the degree of closeness of the test point $x$ to the corresponding database point $x_i$. Sequential minimal optimization (SMO) is an algorithm for solving the quadratic programming (QP) problem that arises when training an SVM. Consider a binary classification problem with data $(x_1, y_1), \ldots, (x_n, y_n)$, in which $x_i$ is an input vector and $y_i \in \{-1, +1\}$ is the corresponding binary label. A soft-margin support vector machine is trained by solving the following quadratic programming problem:

$$\max_{\alpha}\; \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} y_i y_j K(x_i, x_j)\, \alpha_i \alpha_j \tag{2}$$

Subject to:

$$0 \le \alpha_i \le C, \quad \text{for } i = 1, 2, \ldots, n \tag{3}$$

$$\sum_{i=1}^{n} y_i \alpha_i = 0 \tag{4}$$

The kernel function $K(x_i, x_j)$ and the SVM hyperparameter $C$ are both supplied by the user, and the variables $\alpha_i$ are Lagrange multipliers.
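To make Eqs. (2) to (4) concrete, the sketch below evaluates the dual objective and checks the constraints for a toy two-point problem. It only evaluates a candidate solution and does not implement SMO; the linear kernel, the data and the function names are assumptions chosen for illustration.

```python
def linear_kernel(a, b):
    """K(a, b) = a . b (an assumed kernel choice for this example)."""
    return sum(ai * bi for ai, bi in zip(a, b))


def dual_objective(alphas, X, y, kernel=linear_kernel):
    """Value of Eq. (2) for the given Lagrange multipliers."""
    n = len(alphas)
    quadratic = sum(
        y[i] * y[j] * kernel(X[i], X[j]) * alphas[i] * alphas[j]
        for i in range(n)
        for j in range(n)
    )
    return sum(alphas) - 0.5 * quadratic


def is_feasible(alphas, y, C, tol=1e-9):
    """Check the box constraint of Eq. (3) and the equality of Eq. (4)."""
    in_box = all(-tol <= a <= C + tol for a in alphas)
    balanced = abs(sum(yi * ai for yi, ai in zip(y, alphas))) <= tol
    return in_box and balanced


# Two points on the x-axis, one per class.
X = [(1.0, 0.0), (-1.0, 0.0)]
y = [+1, -1]
alphas = [0.5, 0.5]

print(dual_objective(alphas, X, y))
print(is_feasible(alphas, y, C=1.0))
```

An SMO-style solver would repeatedly pick pairs of multipliers and adjust them so that the objective increases while both constraints stay satisfied.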

b) K-Nearest Neighbor


K-NN is a non-parametric algorithm used for both classification and regression. Its output depends on whether it is used for classification or regression [40].
Suppose we have pairs

$$(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n) \tag{5}$$

taking values in

$$\mathbb{R}^d \times \{1, 2\} \tag{6}$$

where $Y$ is the class label of $X$, so that

$$X \mid Y = r \sim P_r \quad \text{for } r = 1, 2 \tag{7}$$

Given some norm $\|\cdot\|$ on $\mathbb{R}^d$ and a point $x \in \mathbb{R}^d$, let

$$(X_{(1)}, Y_{(1)}), (X_{(2)}, Y_{(2)}), \ldots, (X_{(n)}, Y_{(n)}) \tag{8}$$

be a reordering of the training data such that

$$\|X_{(1)} - x\| \le \cdots \le \|X_{(n)} - x\| \tag{9}$$

The classification accuracy of K-NN can often be improved significantly by learning the distance metric with an algorithm such as Neighborhood Components Analysis or Large Margin Nearest Neighbor.
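The reordering in Eqs. (8) and (9) followed by a vote among the k closest points can be sketched as follows. The Euclidean norm, k = 3 and the toy data are assumptions for the example.

```python
import math
from collections import Counter


def knn_predict(training_pairs, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    # Reorder the training data by increasing distance to x, as in Eq. (9).
    ordered = sorted(training_pairs, key=lambda pair: math.dist(pair[0], x))
    votes = Counter(label for _, label in ordered[:k])
    return votes.most_common(1)[0][0]


# Toy data with class labels in {1, 2}, matching Eq. (6).
training_pairs = [
    ((0.0, 0.0), 1), ((0.0, 1.0), 1), ((1.0, 0.0), 1),
    ((5.0, 5.0), 2), ((5.0, 6.0), 2), ((6.0, 5.0), 2),
]

print(knn_predict(training_pairs, (0.5, 0.5)))
print(knn_predict(training_pairs, (5.2, 5.1)))
```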

c) Naïve Bayesian

Naïve Bayes is a probabilistic machine learning classification method built on Bayes' theorem [35]. The following equations are used to compute the Naïve Bayesian classification.

$$P(n \mid m) = \frac{P(m \mid n)\, P(n)}{P(m)} \tag{10}$$

$$P(n \mid m) \propto P(m_1 \mid n) \cdot P(m_2 \mid n) \cdots P(m_k \mid n) \cdot P(n) \tag{11}$$

where $P(n \mid m)$ is the posterior probability, $P(n)$ is the prior, $P(m \mid n)$ is the likelihood and $P(m)$ is the evidence. Here $m_1, \ldots, m_k$ are the individual features of $m$, which are assumed to be conditionally independent given the class $n$; Eq. (11) omits the evidence $P(m)$ because it is the same for every class. These equations are used to calculate the posterior probability for the detection and classification of malware data. The dataset is divided into malicious and benign data labelled with 0's and 1's, where 0 represents a benign application and 1 represents a malicious application.
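A small worked version of Eqs. (10) and (11) for binary keyword features is sketched below. Laplace smoothing and the toy data are assumptions added so the example runs on a handful of samples; labels follow the convention above (0 benign, 1 malicious).

```python
import math


def train_naive_bayes(X, y):
    """Estimate P(n) and P(m_j = 1 | n) for classes n in {0, 1}."""
    model = {}
    n_features = len(X[0])
    for cls in (0, 1):
        rows = [x for x, label in zip(X, y) if label == cls]
        prior = len(rows) / len(X)
        # Laplace smoothing (an assumption) avoids zero probabilities.
        likelihoods = [
            (sum(row[j] for row in rows) + 1) / (len(rows) + 2)
            for j in range(n_features)
        ]
        model[cls] = (prior, likelihoods)
    return model


def predict_naive_bayes(model, x):
    """Pick the class with the largest log of the right-hand side of Eq. (11)."""
    def log_posterior(cls):
        prior, likelihoods = model[cls]
        return math.log(prior) + sum(
            math.log(p if value else 1 - p) for p, value in zip(likelihoods, x)
        )
    return max(model, key=log_posterior)


# Each row is a binary keyword vector, e.g. [SEND_SMS, READ_SMS, CAMERA].
X = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [0, 0, 1], [0, 0, 0], [0, 1, 1]]
y = [1, 1, 1, 0, 0, 0]

model = train_naive_bayes(X, y)
print(predict_naive_bayes(model, [1, 1, 0]))
print(predict_naive_bayes(model, [0, 0, 1]))
```

Working in log space avoids numerical underflow when the number of features is large, as it is with the keyword lists in Table 3.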

d) Bagging

Bagging is also known as bootstrap aggregation. In the bagging ensemble, 10 base estimators are used for training. Each is trained on a random subset of the original dataset, and their individual results are then aggregated by averaging or voting to produce a final result that determines whether the application is malware or benign [8].

$$f(x) = \frac{1}{T} \sum_{i=1}^{T} f_i(x) \tag{12}$$

where $f(x)$ is the average of $f_i$ over the $i = 1, \ldots, T$ bootstrap samples.

$$f(x) = \operatorname{sign}\left( \sum_{i=1}^{T} f_i(x) \right) \tag{13}$$

$$f(x) = \operatorname{sign}\left( \sum_{i=1}^{T} \operatorname{sign}(f_i(x)) \right) \tag{14}$$

where $f(x)$ is calculated by the majority vote of all classifiers.

This algorithm offers a number of advantages, such as stability, robustness, parallel training of estimators, reduced training time and reduced variance.
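The bootstrap sampling and the aggregation rules of Eqs. (12) and (14) can be sketched as below. The base estimators themselves are omitted; `outputs` stands for their real-valued predictions, and the convention that +1 means malware and -1 means benign is an assumption for this example.

```python
import random


def bootstrap_sample(data, rng):
    """Draw len(data) items with replacement (one bootstrap replicate)."""
    return [rng.choice(data) for _ in data]


def sign(value):
    """Return +1 for non-negative values and -1 otherwise."""
    return 1 if value >= 0 else -1


def average_output(outputs):
    """Eq. (12): mean of the base estimator outputs."""
    return sum(outputs) / len(outputs)


def majority_vote(outputs):
    """Eq. (14): sign of the sum of the individual signed votes."""
    return sign(sum(sign(output) for output in outputs))


rng = random.Random(0)  # fixed seed so the example is repeatable
apps = ["app1", "app2", "app3", "app4"]
replicate = bootstrap_sample(apps, rng)

outputs = [0.9, -0.2, 0.4]  # hypothetical outputs of three base estimators
print(replicate)
print(average_output(outputs), majority_vote(outputs))
```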

e) Random Forest

Random forest is an ensemble classification algorithm that avoids overfitting at low cost [29]. After training, all individual regression trees make predictions, and their results are averaged using the following equation:

$$\hat{f} = \frac{1}{B} \sum_{b=1}^{B} f_b(x') \tag{15}$$

where $x'$ is the sample to be predicted, $f_b$ is the tree trained on the $b$-th bootstrap sample and $B$ is the number of times bagging is repeated. The uncertainty of the prediction can be estimated by taking the standard deviation of the predictions made by the individual decision trees, using the following equation.

$$\sigma = \sqrt{\frac{\sum_{b=1}^{B} \left( f_b(x') - \hat{f} \right)^2}{B - 1}} \tag{16}$$

For classification, each individual tree predicts a class, and the class selected by most trees is the output. As a result, random forest typically outperforms a single decision tree and achieves high classification accuracy.
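Eqs. (15) and (16) and the majority vote used for classification amount to a few lines once the per-tree predictions are available. In this sketch the tree outputs are made-up numbers; training the trees is out of scope.

```python
import statistics
from collections import Counter


def forest_mean_and_std(tree_outputs):
    """Eq. (15) and Eq. (16): mean and sample standard deviation over B trees."""
    f_hat = statistics.mean(tree_outputs)
    sigma = statistics.stdev(tree_outputs)  # divides by B - 1, as in Eq. (16)
    return f_hat, sigma


def forest_vote(tree_labels):
    """Classification: the class predicted by most trees."""
    return Counter(tree_labels).most_common(1)[0][0]


tree_outputs = [0.9, 0.8, 1.0, 0.7]  # hypothetical regression outputs of 4 trees
f_hat, sigma = forest_mean_and_std(tree_outputs)
print(f_hat, sigma)
print(forest_vote(["malware", "benign", "malware"]))
```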

f) Stacking

Stacking is also an ensemble algorithm; it combines several classifiers through a meta-model. Because the base models are distinct algorithms, the stacking ensemble is considered heterogeneous [20]. In stacking, the meta-classifier is trained on the predictions made by the base classifiers, which generally yields high accuracy. The meta-classifier can be any classifier.

g) Boosting

The most widely used form of boosting is AdaBoost [20]. The first base classifier $h_1(x)$ is trained with weighting coefficients that are equal for all training samples. The quantity $\alpha$ is the weighting coefficient of each classifier and assigns greater weight to the more accurate classifiers. Mathematically:

$$H(x) = \sum_{t=1}^{T} \alpha_t\, h_t(x) \tag{17}$$

where $x$ is the input and $h_t(x)$ is the output of weak classifier $t$.

The weight assigned to classifier $t$ is represented by $\alpha_t$ and is calculated as follows:

$$\alpha_t = 0.5 \cdot \ln\left( \frac{1 - E}{E} \right) \tag{18}$$

where $E$ is the weighted error rate of classifier $t$.
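A short numerical illustration of Eqs. (17) and (18) follows. The error rates and weak-classifier outputs are made-up values; the point is only that a lower error rate yields a larger weight and that a classifier no better than chance (E = 0.5) receives zero weight.

```python
import math


def classifier_weight(error_rate):
    """Eq. (18): alpha_t = 0.5 * ln((1 - E) / E), for 0 < E < 1."""
    return 0.5 * math.log((1 - error_rate) / error_rate)


def ensemble_score(alphas, weak_outputs):
    """Eq. (17): weighted sum of the weak classifier outputs."""
    return sum(alpha * output for alpha, output in zip(alphas, weak_outputs))


error_rates = [0.1, 0.3, 0.5]  # hypothetical weighted error rates
alphas = [classifier_weight(e) for e in error_rates]
weak_outputs = [+1, -1, -1]    # hypothetical votes (+1 malware, -1 benign)

print(alphas)
print(ensemble_score(alphas, weak_outputs))
```

Here the single accurate classifier outweighs the two weaker ones, so the weighted score stays positive even though two of the three votes are negative.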

3.4. Proposed algorithm

The first stage of the proposed framework is data collection, in which three different datasets are obtained from data repositories for malware classification and analysis.
Features are then extracted from the malicious and benign Android applications using the proposed technique. The files extracted from benign and malicious applications are binaries and manifest files. The built-in binaries consist of constant strings, API calls and system calls, while permissions, broadcast receivers, intent filters, providers and process names are grouped together as keywords. The malware samples are analyzed and the Android manifest keywords are collected, whereas constant strings are gathered from both benign and malicious applications. Malware is analyzed by creating a sample database that contains the services, activities, broadcasts and broadcast receivers from the manifest.xml files of Android applications. The malware model then analyzes this data structure for the most frequently occurring malicious keywords.
The next stage is behavior analysis. The high accuracy of the proposed malware detection system is achieved using various parameters and ensemble algorithms. Training and testing are performed with 10-fold cross validation, using either stratified sampling or shuffled sampling to build the folds.
After the supervised algorithms are implemented and their parameters set, the proposed model is trained and tested, and the results are obtained and analyzed. Classification algorithms are used for classification, and evaluation metrics such as accuracy, precision, recall and F-measure are then used to evaluate the results under 10-fold cross validation.
The proposed algorithm is elaborated step by step as follows:

Step 1: The APK file of the Android application is decompressed into keywords that help in the detection of malware.
Step 2: The permissions obtained after decompressing the APK file are used to distinguish the behavior of malware from that of benign applications.
Step 3: Apply the machine learning classifiers for classification using all keywords.
Step 4: SVM is applied to the extracted permissions to detect malicious and benign keywords.
Step 5: The Naïve Bayes algorithm is used to classify benign and malicious applications.
Step 6: The K-nearest neighbor algorithm is applied to find benign and malicious keywords.
Step 7: Ensembles are then applied to check the performance of the algorithms.
Step 8: Analyze the results of the single classifiers, combinations of classifiers and ensembles.
Step 9: Calculate the training time of each machine learning classifier.
Step 10: Calculate the testing time of all classifiers.
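The 10-fold cross validation with stratified sampling that underlies the training and testing steps above can be sketched in plain Python as follows. The round-robin assignment and the function names are illustrative assumptions; any library routine for stratified k-fold splitting would serve the same purpose.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Split sample indices into k folds that preserve the class proportions."""
    rng = random.Random(seed)
    indices_by_class = defaultdict(list)
    for index, label in enumerate(labels):
        indices_by_class[label].append(index)

    folds = [[] for _ in range(k)]
    for indices in indices_by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin into the folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


def cross_validation_splits(labels, k=10, seed=0):
    """Yield (train_indices, test_indices): k - 1 folds train, one fold tests."""
    folds = stratified_folds(labels, k, seed)
    for i, test_indices in enumerate(folds):
        train_indices = [
            index for j, fold in enumerate(folds) if j != i for index in fold
        ]
        yield train_indices, test_indices


labels = [0] * 60 + [1] * 40  # toy labels: 0 = benign, 1 = malicious
splits = list(cross_validation_splits(labels, k=10))
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

Each classifier would be trained on `train_indices` and scored on `test_indices` in every round, with the metrics averaged over the 10 rounds.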

4. Experiments, results, evaluation and discussion

The proposed technique is tested on malware detection datasets comprising different types of Android applications, such as selfie camera, weather, torch, game, map and health apps. The framework is tested against existing malware families including DroidKungFu, Plankton, FakeInstaller, Ginmaster, BaseBridge, Geinimi, Nandrobox, Nisev, imlog, Adrd, Fakerun, smFrow, DroidDream, Opfake, Kmin and many others. The proposed hybrid approach detected all of the malware families listed above with high accuracy.
The machine learning algorithms are applied separately to the Android permissions, receivers, intent filters, broadcast receivers and providers, which are referred to as keywords in the proposed technique. Three different machine learning classifiers and four ensembles are applied to benign and malicious applications. All classifiers are assessed on permissions, API calls and intents.
Three different real-world Android app datasets containing benign and malicious apps are collected. The datasets consist of applications in the form of .apk files. The first is a Kaggle dataset randomly collected from Google. It contains 50000 benign applications and 50000 malware applications of 35 different types. The second is the Drebin-215 dataset, which comprises 5560 malware and 9476 benign applications from several families. The Drebin dataset is freely available for research. The third is the Malgenome dataset, which consists of 18851 benign and 9998 malware applications collected in October 2019. According to the Android market, each of these malware and benign applications belongs to one of 30 groups, which are defined in Table 5.
The proposed model is evaluated using 10-fold cross validation. During validation, the data is divided into 10 sub-samples of equal size. In each round, K−1 folds are used for training and the remaining fold is used for testing. The evaluation metrics calculated for each algorithm are Accuracy, Precision, Recall, F-measure, True Positive Rate and False Positive Rate [41]. Table 6 shows the evaluation metrics along with their formulas, where TP, TN, FP and FN denote True Positives, True Negatives, False Positives and False Negatives respectively. Table 7 shows the results of all classifiers on the Kaggle dataset. The boosting ensemble has the highest TPR and also the highest FPR. Overall, RF achieves the highest accuracy (97.83%) and F-measure (97.1%) of all classifiers in detecting malware. It also has the lowest false positive rate (FPR), which shows the correctness of the proposed algorithm. RF outperforms all single and ensemble classifiers on the Kaggle dataset. The lowest accuracy, 59.13%, is obtained by stacking, which also has the lowest TPR, 77.2%.
Table 8 shows the results on the Drebin dataset. The highest accuracy achieved by the machine learning classifiers is 100%, and the lowest false positive rate is 0%. The single model K-NN, the combinations SVM+NB and SVM+K-NN, Bagging and RF all attained the highest accuracy of 100%. Overall, the boosting classifier had the lowest TPR and accuracy of all classifiers in detecting malware. Table 9 shows the results of the classifiers on the Malgenome dataset. The RF classifier attained the highest accuracy, 98.89%, compared with the other machine learning models. All three datasets, Kaggle, Malgenome and Drebin, are evaluated using 10-fold cross validation to examine the performance of single as well as ensemble classifiers. Table 10 shows the evaluation results on all three datasets. The highest F-measure on the Kaggle dataset, 97.1%, is achieved by RF with 97.83% accuracy. The Drebin dataset gives excellent results, with an F-measure of 100% and 100% accuracy for multiple classifiers, i.e. K-NN, Bagging, RF, SVM+NB and SVM+K-NN. Similarly, detection results on the Malgenome dataset show an F-measure of 98.5% with an accuracy of 98.89%. The analysis indicates that the Drebin dataset yields the best malware detection results, with a 0% false positive rate. To validate the significance of the presented technique, its performance is compared with state-of-the-art detection systems. Several types of Android malware detection systems are examined, from general classification-based methods to machine-learning-based methods, as shown in Tables 11–14. The proposed machine learning classifiers are compared with the classifiers of previously proposed detection systems. The measures compared with the previous frameworks are Accuracy, F-measure, Recall and Precision. This comparison confirms the performance improvement of the proposed framework over existing methods. The higher detection accuracy of the proposed framework relative to previously designed methods shows the significance of its performance with machine learning classifiers. The proposed method also outperforms other models in terms of run-time detection.
Among the single classifiers, SVM has the lowest execution time, 0.25 s, although NB, KNN, RF, bagging, boosting and stacking also execute quickly, as shown in Table 15. The other algorithms also accurately detected malicious and legitimate applications within a suitable execution time.
The analysis of results indicates that Random Forest achieves the best or joint-best accuracy among the single and ensemble classifiers on all three datasets. On the Drebin dataset it achieved 100% accuracy, precision, recall and F-measure for malware detection and classification. The execution time of Random Forest is also the lowest among the ensembles, which shows the effectiveness of the proposed model. The proposed model, consisting of the proposed feature selection and the Random Forest classifier, can be used for real-time malware detection and classification of Android apps. The state-of-the-art comparison also indicates that the proposed approach is effective in terms of accuracy, precision, recall and F-measure.
Moreover, previous techniques used four lists of features, whereas the proposed technique extends this to eight: permission, intent-filter (action), intent-filter (category), process name, provider, intent-filter (scheme), receiver and intent-filter


Table 5
Benign and Malware App Categories.
Category Benign Malware Category Benign Malware

Arcade & Action 770 737 Medical 511 42


Books & Reference 559 403 Music & Audio 628 279
Brain & Puzzle 682 501 News & Magazine 552 215
Business 535 248 Personalization 1475 631
Cards & Casino 598 183 Photography 560 157
Casual 860 363 Productivity 707 588
Comics 508 99 Racing 583 128
Communication 617 403 Shopping 540 201
Education 516 364 Social 606 235
Entertainment 674 820 Sports 603 245
Finance 538 253 Sport Games 593 143
Health & Fitness 553 243 Tools 808 1103
Libraries & Demo 509 37 Transportation 513 111
Lifestyle 567 419 Travel & Local 548 513
Media & Video 570 227 Weather 567 108

Table 6
Evaluation Metrics for Machine Learning Classifiers.
Performance Measure                Calculation                                  Description

False Positive Rate (FPR)          FPR = FP / (TN + FP)                         Benign instances incorrectly categorized as malicious, out of all benign instances
True Positive Rate (TPR)/Recall    TPR = TP / (TP + FN)                         Malicious applications correctly classified, out of all malicious applications
Precision                          P = TP / (TP + FP)                           Fraction of retrieved instances that are relevant
F-Measure                          F = 2 * (P * TPR) / (P + TPR)                Combines the Recall and Precision rates
Accuracy                           Acc = (TP + TN) / (TP + TN + FP + FN)        Percentage of all instances in the dataset that are correctly identified
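
The measures in Table 6 can be computed directly from the four confusion-matrix counts, with the F-measure taken as the harmonic mean of precision and recall. The sketch below is illustrative; the counts are made-up and the function assumes none of the denominators is zero.

```python
def evaluation_metrics(tp, tn, fp, fn):
    """Compute the Table 6 measures from confusion-matrix counts.

    Assumes every denominator is non-zero (no guarding in this sketch).
    """
    tpr = tp / (tp + fn)                 # recall / true positive rate
    fpr = fp / (tn + fp)                 # false positive rate
    precision = tp / (tp + fp)
    f_measure = 2 * precision * tpr / (precision + tpr)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {
        "TPR": tpr,
        "FPR": fpr,
        "Precision": precision,
        "F-Measure": f_measure,
        "Accuracy": accuracy,
    }


# Hypothetical counts: 100 malicious and 100 benign test applications.
metrics = evaluation_metrics(tp=90, tn=95, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```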

Table 7
Evaluation of classifiers on Kaggle Dataset.
Algorithm FPR Precision Recall F-Measure Accuracy

SVM 72.8 69.6 88.4 77.9 67.18


NB 75.9 67 81.8 73.7 61.85
K-NN 74.7 69.1 88.7 77.7 66.74
SVM+NB 97.6 65.6 98.7 78.8 65.31
SVM + KNN 97.6 65.6 98.7 78.8 65.31
NB+KNN 74.5 69 87.9 77.3 66.25
Bagging 86.2 67.3 94 78.4 66.18
Boosting 100 65.3 100 79 65.34
Stacking 75 66 77.2 71.2 59.13
RF 1.4 97.7 96.5 97.1 97.83

Table 8
Evaluation of classifiers on Drebin Dataset
Algorithm FPR Precision Recall F-Measure Accuracy

SVM 1.4 97.7 96.5 97.1 97.83


NB 2.8 97.2 99.5 98.3 98.32
K-NN 0 100 100 100 100
SVM+NB 0 100 100 100 100
SVM + KNN 0 100 100 100 100
NB+KNN 0 100 99.7 99.9 99.86
Bagging 0 100 100 100 100
Boosting 1.9 97.9 89.9 93.7 93.95
Stacking 0 100 100 100 99.99
RF 0 100 100 100 100


Table 9
Evaluation of Classifiers on Malgenome Dataset.
Algorithm FPR Precision Recall F-Measure Accuracy

SVM 1.4 97.7 96.5 97.1 97.84


NB 22.8 70.8 94 80.8 83.45
K-NN 1.1 98.1 98.5 98.3 98.75
SVM+NB 1.4 97.7 96.5 97.1 97.83
SVM+K-NN 1.4 97.7 96.5 97.1 97.84
NB+K-NN 13.6 80.7 97.1 88.1 90.31
Bagging 1.4 97.7 98.2 98 98.49
Boosting 5.6 90.4 89.2 89.8 92.49
Stacking 1.9 96.8 96 96 97.32
RF 0.3 99.4 97.6 98.5 98.89

Table 10
Malware Detection on Drebin, Malgenome and Kaggle Dataset.
Datasets Precision Recall F-Measure Accuracy

Kaggle 97.7% 96.5% 97.1% 97.83%


Drebin 100% 100% 100% 100%
Malgenome 99.4% 97.6% 98.5% 98.89%

Table 11
Comparison of SVM with other State of the Art Frameworks.
Ref/Year Dataset Accuracy Precision Recall F-Measure

Feng et al. [20] (2018) M1: 8806 benign and 5213 malicious M2: 5000 malicious and benign. 96.27% 96.16% - 94.92%
Sumaya et al. [36] (2019) MalGenome 1,260 malware apps - 92% 92% 92%
Rehman et al.[17] (2018) MODROID 204 benign & 197 malicious 83.96% 80.5% - -
Yousefi-Azar et al. [16] (2018) WinAppPE and PEShare dataset 97.78% 98.32% - 97.78%
Burnap et al. [37] (2018) 345,000 observations 67.37% 67.9% 67.4% 67.0%
Morales-Molina et al.[29] (2018) CSDMC_API_Train - 89.3% 98.3% 94.16%
Kumar et al.[35] (2019) Google Play Store & Chinese App store. 95.0% 5% - -
Singh et al.[24] (2017) 278 benign and 216 malicious 96.96% 93.89% 99.54% -
Proposed Approach
(SVM) Kaggle 67.18% 69.6% 88.4% 77.9%
Drebin 97.83% 97.7% 96.5% 97.1%
Malgenome 97.84% 97.7% 96.5% 97.1%

Table 12
Comparison of RF with other State of the Art Frameworks
Ref/Year Dataset Accuracy Precision Recall F-measure

Morales-Molina et al.[29] (2018) CSDMC_API_Train - 96.0% 98.72% 96.54%
Pektas et al.[31] (2017) Benchmark dataset obtained from VirusSharing platform 92% 92% 92% 92%
Arshad et al.[30] (2018) SAMADroid and Drebin 99.07% 99.9% - -
Feng et al. [20] (2018) M1: 8806 benign and 5213 malicious apps. The M2 contained 5000 malicious and benign. 96.34% 96.59% - 94.87%
Jian et al. [10] (2018) 43,822 benign and 8,454 malicious apps 99.12% 99.1% - 99.1%
Sumaya et al. [36] (2019) MalGenome 1,260 malware apps - 93% 93% 93%
Yousefi-Azar et al. [16] (2018) WinAppPE and PEShare dataset 96.72% 97.85% - 96.68%
Burnap et al. [37] (2018) 345,000 observations 98.54% 98.5% 98.5% 98.5%
Idrees et al. [5] (2017) 1,745 real world applications - 98.5% 98.5% 98.5%
Singh et al.[24] (2017) 278 benign and 216 malicious 95.15% 97.64% 95.83% -
Proposed Approach
(RF) Kaggle 97.83% 97.7% 96.5% 97.1%
Drebin 100% 100% 100% 100%
Malgenome 98.89% 99.4% 97.6% 98.5%


Table 13
Comparison of NB with other State of the Art Frameworks.
Ref/Year Dataset Accuracy Precision Recall F-measure

Kumar et al.[35] (2019) Chinese App store and Google Play Store. 98% 5% - 98%
Feng et al. [20] (2018) M1: 8806 benign, 5213 malicious M2: 5000 malicious and benign 84.36% 70.47% - 82.58%
Arshad et al.[30] (2018) SAMADroid and Drebin 91.6% 91.2% - -
Pektas et al.[31] (2018) Benchmark dataset obtained from VirusSharing platform 82% 83% 82% 83%
Jian et al. [10] (2018) 43,822 benign and 8,454 malicious apps 60.05% 74.7% - 60.5%
Burnap et al. [37] (2018) 345,000 observations 82.90% 83.2% 82.9% 82.9%
Idrees et al. [5] (2017) 1,745 real world applications - 98.9% 98.9% 98.8%
Al Ali et al. [25] (2017) M0droid contains 115 malware and 200 benign - 81.98% 75.31% 76.81%
Alqatawna et al. [26] (2017) 1635 instances 88.6% 85.5% 92.9% -
Mahindru et al. [42] (2017) DroidKin, Android botnet dataset 98.88% 98.8% 98.7% 98.7%
Proposed Approach
(NB) Kaggle 61.89% 67% 81.8% 73.7%
Drebin 98.32% 97.2% 99.5% 98.3%
Malgenome 83.45% 70.8% 94% 80.8%

Table 14
Comparison of KNN with other State of the Art Frameworks.
Ref/Year Dataset Accuracy Precision Recall F-measure

Zhang et al. [34] (2019) Google Play store & contagion project 96.56% 95.85% 96.30% 96%
Feng et al. [20] (2018) M1: 8806 benign, 5213 malicious. M2: 5000 malicious and benign. 87.56% 75.25% - 85.57%
Jian et al. [10] (2018) 43,822 benign, 8,454 malicious 97.66% 97.7% - 97.7%
Sumaya et al. [36] (2019) MalGenome 1,260 malware apps - 87% 87% 87%
Yousefi-Azar et al. [16] (2018) WinAppPE and PEShare dataset 97.29% 96.90% - 97.30%
Kumar et al.[35] (2019) Google Play Store & Chinese App store 92.0% 46.4% - 91%
Singh et al.[24] (2017) 278 benign and 216 malicious - 96.56% 95.85% 96.30%
Zhao et al.[43] (2018) 516 benign and 528 malicious apps 92.0% 93% - -
Darus et al.[44] (2018) 300 malware and 300 benign apps. 80.69% 80% 80% 80%
Al Ali et al. [25] (2017) Modroid 115 malware & 200 benign - 89.41% 89.57% 89.18%
Proposed Approach (KNN) Kaggle 66.74% 69.1% 88.7% 77.7%
Drebin 100% 100% 100% 100%
Malgenome 98.75% 98.1% 98.5% 98.3%

Table 15
Execution Time of Classifiers for Benign and Malicious Applications Classification.
Algorithms | Execution time (ms) | Algorithms | Execution time (ms)
SVM | 0.25 | Bagging | 1.75
NB | 0.55 | Boosting | 3.05
K-NN | 0.75 | Stacking | 1.35
SVM+NB | 1.17 | RF | 0.75
SVM+KNN | 1.05 | NB+KNN | 1.34
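
How the execution times were measured is not described in this part of the paper. As one possible way to obtain comparable per-classifier timings in milliseconds, the sketch below wraps a prediction call with a wall-clock timer. The helper `time_prediction_ms`, the stand-in classifier `toy_predict` and the sample data are all hypothetical.

```python
import time


def time_prediction_ms(predict, samples, repeats=5):
    """Return the best wall-clock time (ms) over `repeats` calls of predict(samples)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        predict(samples)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        best = min(best, elapsed_ms)
    return best


def toy_predict(samples):
    """Hypothetical stand-in for a trained classifier.

    Labels an app malicious (1) when more than half of its binary
    features are set, otherwise benign (0).
    """
    return [1 if sum(row) * 2 > len(row) else 0 for row in samples]


# Hypothetical binary feature vectors, repeated to get a measurable workload.
samples = [[1, 0, 1, 1], [0, 0, 1, 0]] * 1000
elapsed = time_prediction_ms(toy_predict, samples)
print(f"toy classifier: {elapsed:.2f} ms")
```

Taking the minimum over several repeats reduces the influence of scheduler noise. Absolute values depend on hardware and dataset size, so only the relative ordering of classifiers measured on the same machine is meaningful.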

(priority). With the enhanced set of manifest keywords, high accuracy is attained in detecting malware applications.
Furthermore, the proposed machine learning model also improves security for Android users.

5. Conclusions and future work

The proposed approach is a robust and lightweight machine learning model for identifying malware behavior. It uses
features of Android applications that are grouped into three categories: permissions, manifest.xml file keywords, and
strings from application files. The proposed technique separates malware from benign apps by feeding the extracted features to
different classifiers. First, the framework classifies the apps using features extracted from binaries; then keyword and
manifest.xml file features are used to identify the presence of malware. Finally, all extracted features are merged into a single feature set to obtain
highly accurate results. This feature extraction step increased the accuracy of the proposed technique.
Classifiers such as SVM, Naïve Bayes, K-NN and Random Forest are used to detect malware families and are evaluated through accuracy,
precision, recall and F-measure. These classifiers work better with binary features, which improves accuracy. The proposed approach
outperforms previous solutions, which are unable to identify malicious applications until an appropriate signature has been released. The
proposed scheme has the advantage of providing safety and privacy to Android users against cyber criminals. In future work, we aim to
extend the proposed framework so that the data structure of malicious keywords is updated to detect recently created malware.
Another research focus is to combine the string features and structural features into a uniform feature space that can be simultaneously

fed into a learning algorithm. Moreover, deep learning can be used for the classification task to further enhance the proposed
framework.
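
The merging step summarized above can be illustrated with a minimal sketch: each app's permissions, manifest keywords and strings are combined into one binary feature vector, which is then classified. This is not the authors' implementation. The feature values, the labels, the helper names and the choice of K-NN with Hamming distance are all assumptions made for illustration.

```python
from collections import Counter


def build_vocabulary(apps):
    """Collect every 'group:token' feature name seen in any app, sorted."""
    names = set()
    for app in apps:
        for group, tokens in app.items():
            names.update(f"{group}:{token}" for token in tokens)
    return sorted(names)


def to_vector(app, vocabulary):
    """Merge all feature groups of one app into a single binary vector."""
    present = {f"{group}:{token}" for group, tokens in app.items() for token in tokens}
    return [1 if name in present else 0 for name in vocabulary]


def hamming(a, b):
    """Number of positions at which two binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train_vectors, train_labels, query, k=3):
    """Majority vote among the k training vectors closest to the query."""
    ranked = sorted(zip(train_vectors, train_labels),
                    key=lambda pair: hamming(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training apps with three feature groups each.
train_apps = [
    {"permission": ["SEND_SMS", "READ_CONTACTS"], "manifest": ["BOOT_COMPLETED"], "string": ["su"]},
    {"permission": ["SEND_SMS"], "manifest": ["BOOT_COMPLETED"], "string": ["su", "chmod"]},
    {"permission": ["INTERNET"], "manifest": ["MAIN"], "string": []},
    {"permission": ["INTERNET", "CAMERA"], "manifest": ["MAIN"], "string": []},
]
train_labels = ["malware", "malware", "benign", "benign"]

vocabulary = build_vocabulary(train_apps)
train_vectors = [to_vector(app, vocabulary) for app in train_apps]

# Hypothetical unseen app to classify.
query_app = {"permission": ["SEND_SMS"], "manifest": ["BOOT_COMPLETED"], "string": ["chmod"]}
prediction = knn_predict(train_vectors, train_labels, to_vector(query_app, vocabulary), k=3)
print(prediction)
```

Prefixing each token with its group name keeps features from different sources distinct after merging, so a keyword that appears both as a manifest entry and as a string is counted as two separate features.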

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to
influence the work reported in this paper.

Data availability

Publicly available datasets have been used.

References

[1] W.C. Wu, S.H. Hung, DroidDolphin: a dynamic Android malware detection framework using big data and machine learning, in: Proceedings of the Conference
on Research in Adaptive and Convergent Systems, 2014, pp. 247–252.
[2] K.A. Talha, D.I. Alper, C. Aydin, APK Auditor: Permission-based Android malware detection system, Digit. Investig. 13 (2015) 1–14.
[3] A. Damodaran, F. Di Troia, C.A. Visaggio, T.H. Austin, M. Stamp, A comparison of static, dynamic, and hybrid analysis for malware detection, J. Comput. Virol.
Hack. Tech. 13 (1) (2017) 1–12.
[4] M. Fan, J. Liu, X. Luo, K. Chen, Z. Tian, Q. Zheng, T. Liu, Android malware familial classification and representative sample selection via frequent subgraph
analysis, IEEE Trans. Inf. Forensics Secur. 13 (8) (2018) 1890–1905.
[5] F. Idrees, M. Rajarajan, M. Conti, T.M. Chen, Y. Rahulamathavan, PIndroid: a novel Android malware detection system using ensemble learning methods,
Comput. Secur. 68 (2017) 36–46.
[6] F.A. Narudin, A. Feizollah, N.B. Anuar, A. Gani, Evaluation of machine learning classifiers for mobile malware detection, Soft Comput. 20 (1) (2016) 343–357.
[7] A.T. Kabakus, I.A. Dogru, An in-depth analysis of Android malware using hybrid techniques, Digit. Investig. 24 (2018) 25–33.
[8] W. Wang, Z. Gao, M. Zhao, Y. Li, J. Liu, X. Zhang, DroidEnsemble: detecting android malicious applications with ensemble of string and structural static
features, IEEE Access 6 (2018) 31798–31807.
[9] S. Chen, M. Xue, L. Fan, S. Hao, L. Xu, H. Zhu, B. Li, Automated poisoning attacks and defenses in malware detection systems: an adversarial machine learning
approach, Comput. Secur. 73 (2018) 326–344.
[10] J. Li, L. Sun, Q. Yan, Z. Li, W. Srisa-An, H. Ye, Significant permission identification for machine-learning-based android malware detection, IEEE Trans. Ind. Inf.
14 (7) (2018) 3216–3225.
[11] N. Milosevic, A. Dehghantanha, K.K.R. Choo, Machine learning aided Android malware classification, Comput. Electr. Eng. 61 (2017) 266–274.
[12] S. Wu, P. Wang, X. Li, Y. Zhang, Effective detection of android malware based on the usage of data flow APIs and machine learning, Inf. Softw. Technol. 75
(2016) 17–25.
[13] R. Kozik, Distributing extreme learning machines with Apache Spark for NetFlow-based malware activity detection, Pattern Recognit. Lett. 101 (2018) 14–20.
[14] V. Moonsamy, J. Rong, S. Liu, Mining permission patterns for contrasting clean and malicious android applications, Futur. Gener. Comput. Syst. 36 (2014)
122–132.
[15] A. Calleja, A. Martín, H.D. Menéndez, J. Tapiador, D. Clark, Picking on the family: disrupting android malware triage by forcing misclassification, Expert Syst.
Appl. 95 (2018) 113–126.
[16] M. Yousefi-Azar, L.G. Hamey, V. Varadharajan, S. Chen, Malytics: a malware detection scheme, IEEE Access 6 (2018) 49418–49431.
[17] Z.U. Rehman, S.N. Khan, K. Muhammad, J.W. Lee, Z. Lv, S.W. Baik, I. Mehmood, Machine learning-assisted signature and heuristic-based detection of malwares
in Android devices, Comput. Electr. Eng. 69 (2018) 828–841.
[18] M.A. Kumara, C.D. Jaidhar, Leveraging virtual machine introspection with memory forensics to detect and characterize unknown malware using machine
learning techniques at hypervisor, Digit. Investig. 23 (2017) 99–123.
[19] A. Narayanan, M. Chandramohan, L. Chen, Y. Liu, Context-aware, adaptive, and scalable android malware detection through online learning, IEEE Trans.
Emerg. Top. Comput. Intell. 1 (3) (2017) 157–175.
[20] P. Feng, J. Ma, C. Sun, X. Xu, Y. Ma, A novel dynamic Android malware detection system with ensemble learning, IEEE Access 6 (2018) 30996–31011.
[21] W. Niu, R. Cao, X. Zhang, K. Ding, K. Zhang, T. Li, OpCode-level function call graph based android malware classification using deep learning, Sensors 20 (13)
(2020) 3645.
[22] D.S. Keyes, B. Li, G. Kaur, A.H. Lashkari, F. Gagnon, F. Massicotte, EntropLyzer: android malware classification and characterization using entropy analysis of
dynamic characteristics, in: Proceedings of the Reconciling Data Analytics, Automation, Privacy, and Security: A Big Data Challenge (RDAAPS), IEEE, 2021,
pp. 1–12.
[23] L. Wei, W. Luo, J. Weng, Y. Zhong, X. Zhang, Z. Yan, Machine learning-based malicious application detection of android, IEEE Access 5 (2017) 25591–25601.
[24] L. Singh, M. Hofmann, Dynamic behavior analysis of android applications for malware detection, in: Proceedings of the International Conference on Intelligent
Communication and Computational Techniques (ICCT), IEEE, 2017, pp. 1–7.
[25] M. Al Ali, D. Svetinovic, Z. Aung, S. Lukman, Malware detection in android mobile platform using machine learning algorithms, in: Proceedings of the
International Conference on Infocom Technologies and Unmanned Systems (Trends and Future Directions)(ICTUS), IEEE, 2017, pp. 763–768.
[26] J.F. Alqatawna, H. Faris, Toward a detection framework for android botnet, in: Proceedings of the International Conference on New Trends in Computing
Sciences (ICTCS), IEEE, 2017, pp. 197–202.
[27] A. Saracino, D. Sgandurra, G. Dini, F. Martinelli, Madam: effective and efficient behavior-based android malware detection and prevention, IEEE Trans. Depend.
Secure Comput. 15 (1) (2016) 83–97.
[28] S. Sun, X. Fu, H. Ruan, X. Du, B. Luo, M. Guizani, Real-time behavior analysis and identification for Android application, IEEE Access 6 (2018) 38041–38051.
[29] C.D. Morales-Molina, D. Santamaria-Guerrero, G. Sanchez-Perez, H. Perez-Meana, A. Hernandez-Suarez, Methodology for malware classification using a random
forest classifier, in: Proceedings of the IEEE International Autumn Meeting on Power, Electronics and Computing (ROPEC), IEEE, 2018, pp. 1–6.
[30] S. Arshad, M.A. Shah, A. Wahid, A. Mehmood, H. Song, H. Yu, Samadroid: a novel 3-level hybrid malware detection model for android operating system, IEEE
Access 6 (2018) 4321–4339.
[31] A. Pektaş, T. Acarman, Ensemble machine learning approach for android malware classification using hybrid features, in: Proceedings of the International
Conference on Computer Recognition Systems, Springer, Cham, 2017, pp. 191–200.
[32] J. Li, Z. Wang, T. Wang, J. Tang, Y. Yang, Y. Zhou, An android malware detection system based on feature fusion, Chin. J. Electron. 27 (6) (2018) 1206–1213.
[33] D. Du, Y. Sun, Y. Ma, F. Xiao, A novel approach to detect malware variants based on classified behaviors, IEEE Access 7 (2019) 81770–81782.
[34] F. Zhang, H.A.D.E. Kodituwakku, J.W. Hines, J. Coble, Multilayer data-driven cyber-attack detection system for industrial control systems based on network,
system, and process data, IEEE Trans. Ind. Inf. 15 (7) (2019) 4362–4369.
[35] R. Kumar, X. Zhang, W. Wang, R.U. Khan, J. Kumar, A. Sharif, A multimodal malware detection technique for Android IoT devices using various features, IEEE
Access 7 (2019) 64411–64430.


[36] U.S. Jannat, S.M. Hasnayeen, M.K.B. Shuhan, M.S. Ferdous, Analysis and detection of malware in android applications using machine learning, in: Proceedings
of the International Conference on Electrical, Computer and Communication Engineering (ECCE), IEEE, 2019, pp. 1–7.
[37] P. Burnap, R. French, F. Turner, K. Jones, Malware classification using self organising feature maps and machine activity data, Comput. Secur. 73 (2018)
399–410.
[38] L. Massarelli, L. Aniello, C. Ciccotelli, L. Querzoni, D. Ucci, R. Baldoni, AndroDFA: android malware classification based on resource consumption, Information
11 (6) (2020) 326.
[39] S. Hosseini, A.E. Nezhad, H. Seilani, Android malware classification using convolutional neural network and LSTM, J. Comput. Virol. Hack. Tech. (2021) 1–12.
[40] D.T. Dehkordy, A. Rasoolzadegan, A new machine learning-based method for android malware detection on imbalanced dataset, Multimedia Tools Appl. (2021)
1–22.
[41] M. Hossin, M.N. Sulaiman, A review on evaluation metrics for data classification evaluations, Int. J. Data Min. Knowl. Manag. Process 5 (2) (2015) 1.
[42] A. Mahindru, P. Singh, Dynamic permissions based android malware detection using machine learning techniques, in: Proceedings of the 10th Innovations in
Software Engineering Conference, 2017, pp. 202–210.
[43] C. Zhao, W. Zheng, L. Gong, M. Zhang, C. Wang, Quick and accurate android malware detection based on sensitive APIs, in: Proceedings of the IEEE
International Conference on Smart Internet of Things (SmartIoT), IEEE, 2018, pp. 143–148.
[44] F.M. Darus, N.A.A. Salleh, A.F.M. Ariffin, Android malware detection using machine learning on image patterns, in: Proceedings of the Cyber Resilience
Conference (CRC), IEEE, 2018, pp. 1–2.
