A Comparative Analysis of Malware
Anomaly Detection
Priynka Sharma, Kaylash Chaudhary, Michael Wagner, M.G.M Khan
[email protected], [email protected],
[email protected], [email protected]
1 Introduction
Cybercrime damages are forecast to reach $6 trillion annually by 2021 [4]. Malware is software designed to infiltrate or damage a computer system without the owner's informed consent. Many strategies have been used to defend against different kinds of malware.
Among these, Malware Anomaly Detection (MAD) is the most promising strategy for guarding against dynamic anomalous behaviour. A MAD system classifies data into categories known as normal and anomalous [5]. Different classification algorithms have been proposed for building an effective detection model [6] [7]. The performance of the classifier is a major factor influencing the performance of the MAD model, so choosing an accurate classifier improves the malware detection framework. In this work, classification algorithms have been assessed using the WEKA tool. Four classifiers have been evaluated on Accuracy, Receiver Operating Characteristics (ROC) area, Kappa, training time, False Positive Rate (FPR) and Recall. Ranks have also been assigned to these algorithms by applying Garrett's ranking technique [8].
The rest of the paper is organised as follows. The next two subsections discuss anomaly detection and classification algorithms. Section 2 discusses related work, whereas Section 3 presents the chosen dataset and describes the WEKA tool and the classification algorithms. Results and discussion are given in Section 4. Finally, Section 5 concludes the paper.
Anomaly detection often has to operate at a level of detail where the small irregularities that matter cannot be seen by users watching datasets on a dashboard. The most practical way to get continuous responsiveness to new data patterns is therefore to apply a machine learning technique.
Bayes Classifier: Also known as Belief Networks, this family belongs to the probabilistic Graphical Models (GMs), which are used to represent knowledge about uncertain domains. In the graph, nodes denote random variables and edges denote probabilistic dependencies. A Bayes classifier predicts the class from the estimated distributions of the features [14].
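To make the idea concrete, the following is a minimal sketch of a Naive Bayes classifier over binary features (loosely in the spirit of the permission-style feature vectors used later in the paper); the toy data, feature values and labels are hypothetical, not taken from the dataset:

```python
import math
from collections import defaultdict

def train_naive_bayes(samples, labels):
    """Estimate P(class) and P(feature=1 | class) with Laplace smoothing."""
    counts = defaultdict(int)        # class -> number of samples
    feat_counts = defaultdict(int)   # (class, feature index) -> count of 1s
    n_features = len(samples[0])
    for x, y in zip(samples, labels):
        counts[y] += 1
        for i, v in enumerate(x):
            if v:
                feat_counts[(y, i)] += 1
    priors = {c: counts[c] / len(samples) for c in counts}
    likelihoods = {
        (c, i): (feat_counts[(c, i)] + 1) / (counts[c] + 2)  # Laplace smoothing
        for c in counts for i in range(n_features)
    }
    return priors, likelihoods

def predict(x, priors, likelihoods):
    """Pick the class with the highest posterior (log-space for stability)."""
    best, best_score = None, float("-inf")
    for c, p in priors.items():
        score = math.log(p)
        for i, v in enumerate(x):
            q = likelihoods[(c, i)]
            score += math.log(q if v else 1 - q)
        if score > best_score:
            best, best_score = c, score
    return best

# Hypothetical binary feature vectors (e.g. requested permissions) and labels
X = [[1, 1, 0], [1, 0, 0], [0, 1, 1], [0, 0, 1]]
y = ["malware", "malware", "benign", "benign"]
priors, lik = train_naive_bayes(X, y)
print(predict([1, 1, 0], priors, lik))  # -> "malware"
```

Note that this sketch assumes conditionally independent features, which is the simplest member of the Bayes family; WEKA's BayesNet classifiers relax this assumption via the graph structure.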
Function Classifier: Builds on the ideas of neural networks and regression. Eighteen classifiers fall under this category. Radial Basis Function (RBF) Network and Sequential Minimal Optimization (SMO) are two classifiers that perform well on the dataset used in this paper. RBF classifiers can represent any nonlinear function effectively and do not operate on the raw data directly. The drawback of RBF is its tendency to overfit the model [15].
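The core of an RBF network is a hidden layer of Gaussian basis functions whose outputs are linearly combined. A minimal sketch follows; the centres, width and weights here are hypothetical placeholders for values a learner would fit:

```python
import math

def rbf_activations(x, centres, sigma):
    """Gaussian radial basis activations of input vector x for each centre."""
    acts = []
    for c in centres:
        dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        acts.append(math.exp(-dist_sq / (2 * sigma ** 2)))
    return acts

def rbf_output(x, centres, sigma, weights):
    """Linear combination of the basis activations, as in an RBF network."""
    return sum(w * a for w, a in zip(weights, rbf_activations(x, centres, sigma)))

# Hypothetical centres and output weights
centres = [(0.0, 0.0), (1.0, 1.0)]
weights = [0.3, 0.7]
print(rbf_output((1.0, 1.0), centres, 1.0, weights))
```

An input close to a centre produces an activation near 1 for that basis function; the overfitting risk mentioned above arises when many narrow basis functions are fitted tightly around individual training points.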
Lazy Classifier: Stores the complete training data and defers computation until classification time; new instances are not incorporated into the training set while the model is being built. It is mostly used for classification on data streams [16].
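IBk, the lazy classifier evaluated later in this paper, is a k-nearest-neighbour method: it keeps the training instances and classifies a new instance by majority vote among the k closest stored instances. A minimal sketch with hypothetical two-dimensional data:

```python
import math
from collections import Counter

def knn_predict(train, labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # lazy learning: no model is built, all work happens at query time
    dists = sorted((math.dist(x, query), y) for x, y in zip(train, labels))
    top = [y for _, y in dists[:k]]
    return Counter(top).most_common(1)[0][0]

# Hypothetical feature vectors and labels
X = [(0.1, 0.2), (0.0, 0.1), (0.9, 0.8), (1.0, 1.0), (0.8, 0.9)]
y = ["benign", "benign", "malware", "malware", "malware"]
print(knn_predict(X, y, (0.95, 0.95)))  # -> "malware"
```

With k=1 this reduces to nearest-neighbour classification; WEKA's IBk additionally supports distance weighting and cross-validated selection of k.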
Meta Classifier: Finds the optimal set of attributes with which to train the base classifier; the parameters used in the base classifier are then used for prediction. There are twenty-six classifiers in this category [8].
Mi Classifier: There are twelve Multi-Instance classifiers, none of which fits the dataset used in this paper. This type of classifier is a variation of the supervised learning procedure and was originally made available through a separate software package [17].
Misc or Miscellaneous Classifier: Three classifiers fall under this category. Two of them, Hyperpipes and Voting Feature Interval (VFI), are compatible with our dataset [8].
Rules Classifier: Association rules are used to predict the correct class among all behaviours, with an associated level of accuracy. A rule may predict more than one conclusion. Rules are mutually exclusive and are learnt one at a time [17].
Trees: A popular family of classification techniques, also known as Decision Trees, in which a flowchart-like tree structure is built where every internal node denotes a test on an attribute value and each branch represents an outcome of the test. The leaves correspond to the predicted classes. Sixteen classifiers fall under this category [8].
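The node/branch/leaf flow above can be illustrated with a hand-built one-node tree (a decision stump); the attribute name and threshold are hypothetical, and real tree learners in WEKA choose both automatically from the training data:

```python
def stump_predict(instance, attribute, threshold):
    """A one-node decision tree: test one attribute value, return a leaf class."""
    # internal node: test on an attribute's value
    if instance[attribute] <= threshold:
        return "benign"    # branch 1 leads to this leaf
    return "malware"       # branch 2 leads to this leaf

# Hypothetical attribute: number of dangerous permissions an app requests
app = {"dangerous_permissions": 7}
print(stump_predict(app, "dangerous_permissions", 3))  # -> "malware"
```

A full decision tree simply nests such tests, with each branch leading either to another test node or to a class leaf.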
2 Related Work
Numerous researchers have proposed strategies and algorithms for anomaly detection based on data mining classification methods.
Lei Li et al. [19] present a rule-based technique which exploits known patterns to recognise malicious attacks [18]. Fu et al. [20] discuss the use of data mining in anomaly detection frameworks, a significant direction in Intrusion Detection System (IDS) research; the paper presents improved association-rule anomaly detection based on Frequent-Pattern Growth and Fuzzy C-Means (FCM) network anomaly detection. Wenguang et al. proposed an intelligent anomaly detection framework based on web data mining and contrasted it with other conventional anomaly detection frameworks [20]. However, for a complete detection framework, some work remains: improving the data mining algorithms, better handling the interaction between the data mining module and the other modules, improving the framework's adaptive capacity, visualising test outcomes, and improving the real-time efficiency and precision of the framework. Likewise, Panda M, & Patra M. [22] present a study of data mining techniques such as machine learning, feature selection, neural networks, fuzzy logic, genetic algorithms, support vector machines, statistical methods and immunology-based strategies [21].
Table 1 presents an overview of papers that applied machine learning techniques for
Android malware detection.
The Android malware dataset from figshare consists of 215-attribute feature vectors extracted from 15,036 applications (5,560 malware applications from the Drebin project and 9,476 benign applications). This dataset has also been used to build a multilevel classifier fusion approach [1]. Table 2 shows that the dataset contains two classes, Malware and Benign, with 5,560 instances of Malware and 9,476 instances of Benign.
WEKA is a data analysis tool developed at the University of Waikato, New Zealand, in 1997 [22]. It provides a collection of machine learning algorithms that can be used to mine data and extract meaningful information. The tool is written in Java and offers a graphical user interface for working with data files. It contains 49 data preprocessing tools, 76 classification algorithms, 15 attribute evaluators and 10 search algorithms for feature selection. It has three graphical user interfaces: "The Explorer", "The Experimenter" and "The Knowledge Flow". WEKA accepts data in the Attribute Relation File Format (ARFF), provides a set of panels for performing specific tasks, and allows new machine learning algorithms to be developed and integrated into it.
3.2 Cross-Validation
Cross-validation, like a single holdout validation set, evaluates a model's predictive performance on unseen data. It does so more robustly by repeating the trial multiple times, using each fragment of the training set in turn as the validation set. This gives a more accurate indication of how well the model generalises to unseen data, and thereby guards against overfitting.
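The fold construction described above can be sketched in a few lines of pure Python (the fold counts here are illustrative; the experiments in this paper use WEKA's CrossValidationFoldMaker):

```python
def kfold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)  # spread the remainder
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def kfold_splits(n, k):
    """Yield (train_indices, validation_indices) pairs, one per fold."""
    folds = kfold_indices(n, k)
    for i, val in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, val

# Every instance is used for validation exactly once across the k trials
for train, val in kfold_splits(10, 5):
    print(len(train), len(val))  # 8 2, five times
```

In practice the indices are shuffled (and usually stratified by class) before folding, so each fold reflects the overall Malware/Benign distribution.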
1. False Positive (FP): The model predicted a malware attack, but the instance was benign.
2. False Negative (FN): The model predicted benign, but the instance was a malware attack.
3. True Positive (TP): The model predicted a malware attack, and it was a malware attack.
4. True Negative (TN): The model predicted benign, and it was benign.
Accuracy = (TP+TN)/n
True Positive Rate (TPR) = TP/(TP+FN)
False Positive Rate (FPR) = FP/(TN+FP)
Recall = TP/(TP+FN)
Precision = TP/(TP+FP)
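These formulas translate directly into code. Given the confusion-matrix counts, the metrics can be computed as follows (the counts in the example are hypothetical, not results from this paper):

```python
def evaluation_metrics(tp, tn, fp, fn):
    """Standard metrics from confusion-matrix counts (n = total instances)."""
    n = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / n,
        "tpr": tp / (tp + fn),       # true positive rate (same as recall)
        "fpr": fp / (tn + fp),       # false positive rate
        "recall": tp / (tp + fn),
        "precision": tp / (tp + fp),
    }

# Hypothetical counts for a malware/benign classifier
m = evaluation_metrics(tp=90, tn=95, fp=5, fn=10)
print(m["accuracy"], m["fpr"])  # 0.925 0.05
```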
Figure 4 shows the predictive model evaluation using the Knowledge Flow interface. The ARFF loader was used to load the dataset and was connected to a "ClassAssigner" component (which allows choosing which column is the class) from the toolbar and placed on the layout. The class value picker selects the class value to be treated as the "positive" class. Next came the "CrossValidationFoldMaker" component from the Evaluation toolbar, as described in Section 3.2. Upon completion, the outcomes were obtained via the show-results option in the pop-up menu of the TextViewer component. Tables 3 and 5 illustrate the evaluation results of the classifiers.
Classifier       ROC Area  FPR    Accuracy  Kappa   MAE     Recall  Precision  Training Time(sec)  Rank
IBK              0.994     0.013  98.76     0.9733  0.013   0.988   0.988      0.01                1
Rotation Forest  0.997     0.020  98.51     0.9679  0.0333  0.985   0.985      98.63               2
SMO              0.976     0.027  97.84     0.9535  0.0216  0.978   0.978      34.46               3
Logistic         0.995     0.027  97.81     0.953   0.0315  0.978   0.978      22.44               4
Table 3 shows the number of "statistically significant wins" each algorithm has against all the other algorithms on the malware detection dataset used in this paper. A win means an accuracy that is superior to the accuracy of another algorithm, where the difference is statistically significant. The results show that IBK achieves notable success compared to RF, SMO, and Logistic.
Table 4. Predictive Model Evaluation Using Knowledge Flow
Each algorithm was executed ten times. The mean and standard deviation of the accuracy are shown in Table 4. The difference between the accuracy scores of RF, SMO, and Logistic and that of IBK is significant at the 0.05 level, indicating that these three techniques are statistically different from IBK. Hence, IBK leads in accuracy for malware anomaly detection on the Drebin dataset.
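The per-algorithm summary in Table 4 is simply the mean and standard deviation of the accuracy over the ten runs, which Python's standard library computes directly (the run accuracies below are hypothetical, not the paper's measurements):

```python
import statistics

# Hypothetical accuracies (%) from ten cross-validated runs of one classifier
runs = [98.7, 98.8, 98.6, 98.9, 98.7, 98.8, 98.7, 98.6, 98.9, 98.8]

mean = statistics.mean(runs)
stdev = statistics.stdev(runs)       # sample standard deviation
print(round(mean, 2), round(stdev, 2))  # 98.75 0.11
```

The significance test then compares such per-run accuracy samples between two classifiers to decide whether the observed mean difference exceeds what run-to-run variation alone would explain.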
References
1. Yerima S, & Sezer S. (2018) DroidFusion: A Novel Multilevel Classifier Fusion Approach for Android Malware Detection. IEEE Transactions on Cybernetics 49: 453-466.
2. You I, & Yim K. (2010) Malware Obfuscation Techniques: A Brief Survey. In: Proceedings of
the 5th International Conference on Broadband, Wireless Computing, Communication and Appli-
cations, Fukuoka, Japan, 4-6 November.
3. Grcar J. (2011) John von Neumann’s Analysis of Gaussian Elimination and the Origins of Modern
Numerical Analysis. Journal Society for Industrial and Applied Mathematics 53: 607–682.
4. John P & Mello J. (2014) Report: Malware Poisons One-Third of World's Computers. Retrieved
June 6, 2019, from Tech News World: https://www.technewsworld.com/story/80707.html.
5. Guofei G, & Porras A, et al. (2015) Method and Apparatus for Detecting Malware Infections.
Patent Application Publication, United States. 1-6.
6. Shamili A, & Bauckhage C, et al. (2010) Malware Detection on Mobile Devices using Distributed
Machine Learning. In: Proceedings of the 20th International Conference on Pattern Recognition,
Istanbul, Turkey, 4348-4351.
7. Hamed Y, & AbdulKader S, et al. (2019) Mobile Malware Detection: A Survey. Journal of Com-
puter Science and Information Security 17: 1-65.
8. India B, & Khurana S. (2014) Comparison of classification techniques for intrusion detection
dataset using WEKA. In: Proceedings of the International Conference on Recent Advances and
Innovations in Engineering, Jaipur, India, 9-11 May.
9. Goldstein M, & Uchida S. (2016) A Comparative Evaluation of Unsupervised Anomaly Detection
Algorithms for Multivariate Data. Journal of PLOS ONE 11: 1-31.
10. Ruff L, & Vandermeulen R, et al. (2019) Deep Semi-Supervised Anomaly Detection. ArXiv 20:
1-22.
11. Schlegl T, & Seeböck P, et al. (2017) Unsupervised Anomaly Detection with Generative Adversarial Networks to Guide Marker Discovery. In: Proceedings of the International Conference on Information Processing in Medical Imaging, Boone, United States, 25-30 June.
13. Patch A, & Park J. (2007) An Overview of Anomaly Detection Techniques: Existing Solutions
and Latest Technological Trends. The International Journal of Computer and Telecommunica-
tions Networking 51: 3448-3470.
14. Chandola V, & Banerjee A. (2009) Anomaly Detection: A Survey. Journal of ACM Computing
Surveys 50: 1557-7341.
15. Bouckaert R. (2004). Bayesian network classifiers in Weka. (Working paper series. University of
Waikato, Department of Computer Science. No. 14/2004). Hamilton, New Zealand: University of
Waikato: https://researchcommons.waikato.ac.nz/handle/10289/85.
16. Mehata R, & Bath S, et al. (2018) An Analysis of Hybrid Layered Classification Algorithms for
Object Recognition. Journal of Computer Engineering 20: 57-64.
17. Kalmegh S. (2019) Effective classification of Indian News using Lazy Classifier IB1And IBk
from weka. Journal of information and computing science 6: 160-168.
18. Pak I, & Teh P. (2016) Machine Learning Classifiers: Evaluation of the Performance in Online
Reviews. Journal of Science and Technology 45: 1-9.
19. Li L, & Yang D, et al. A novel rule-based Intrusion Detection System using data mining. In: Proceedings of the International Conference on Computer Science and Information Technology, Chengdu, China, 9-11 July.
20. Fu D, & Zhou S, et al. (2009) The Design and Implementation of a Distributed Network Intrusion Detection System Based on Data Mining. In: Proceedings of the WRI World Congress on Software Engineering, Xiamen, China, 19-21 May.
21. Chai W, & Tan C, et al. (2011) Research of Intelligent Intrusion Detection System Based on Web Data Mining Technology. In: Proceedings of the International Conference on Business Intelligence and Financial Engineering, Wuhan, China, 17-18 October.
22. Panda M, & Patra M. (2009). Evaluating Machine Learning Algorithms for Detecting Network
Intrusions. Journal of Recent Trends in Engineering 1: 472-477.