
Computers & Security 116 (2022) 102659

Contents lists available at ScienceDirect

Computers & Security


journal homepage: www.elsevier.com/locate/cose

FeSA: Feature selection architecture for ransomware detection under concept drift
Damien Warren Fernando∗, Nikos Komninos
Department of Computer Science, School of Mathematics, Computer Science and Engineering, City, University of London, UK

∗ Corresponding author. E-mail address: [email protected] (D.W. Fernando).

Article info

Article history:
Received 12 April 2021
Revised 30 November 2021
Accepted 11 February 2022
Available online 15 February 2022

Keywords:
Ransomware
Concept-drift
Detection
Learning-algorithms
Features

Abstract

This paper investigates how different genetic and nature-inspired feature selection algorithms operate in systems where the prediction model changes over time in unforeseen ways. As a result, this study proposes a feature selection architecture, namely FeSA, independent of the underlying classification algorithm, which aims to find a set of features that will improve the longevity of the machine learning classifier. The feature set produced by FeSA is evaluated by creating scenarios in which concept drift is presented to our trained model. Based on our results, the generated feature set remains robust and maintains high detection rates of ransomware malware. Throughout this paper, we will refer to the true-positive rate of ransomware as detection; this is to clearly define what we focus on, as a high true-positive rate for ransomware is the main priority. Our architecture is compared to other nature-inspired feature selection algorithms such as evolutionary search, genetic search, harmony search, best-first search and the greedy stepwise feature selection algorithm. Our results show that FeSA displays the least degradation on average when exposed to concept drift. FeSA is evaluated based on ransomware detection rate, recall, false positives and precision. The FeSA architecture provides a feature set that shows competitive recall, false positives and precision under concept drift while maintaining the highest detection rate among the algorithms it has been compared to.

Crown Copyright © 2022 Published by Elsevier Ltd. All rights reserved.
https://doi.org/10.1016/j.cose.2022.102659

1. Introduction

In recent years ransomware has emerged as one of the most potent malware threats out there. Ransomware uses tactics to reduce the victim's access to their system or to prevent access to files by encrypting them. Victims pay for various reasons, whether it is a business that needs access to its files and does not have sufficient backups (Cook, 2020a) or a single person who has "lost" personal files due to a ransomware attack. There are two types of ransomware, the first being locker ransomware. Locker-ransomware will stop users from accessing their systems by displaying a lock screen when they log into their systems. The second type of ransomware is crypto-ransomware, on which our research is focused. Crypto-ransomware is a highly sophisticated malware type and the more common form of ransomware used in ransomware attacks. Crypto-ransomware will use complex encryption schemes to encrypt a victim's files, rendering them unusable and unrecoverable unless the ransom is paid and the attacker provides the subsequent decryption keys with a decryption tool. A popular example of crypto-ransomware, CryptoWall, appeared in 2014, and it has generated approximately $320 million (De Groot, 2017). Overall, it is estimated that in 2020 organisations will pay up to $11 billion in paying ransoms or dealing with the damage caused by ransomware attacks (Sanders, 2020). Ransomware is also diverse in terms of behaviour: popular and infamous ransomware like Petya encrypts the Master Boot Record of a Windows system, while modern ransomware variants like Maze encrypt files, steal sensitive information from companies and then expose it if organisations and individuals do not pay the ransom (Saxena, 2018). Ransomware malware evolves to become more dangerous and damaging, as history has shown us. In the context of machine-learning detection systems, the constant evolution of malware can be classed as concept drift, a phenomenon that means the rules and logic learned by the classifier to classify a malware become outdated and incorrect.

1.1. Malware evolution

According to DataProt (Jovanovic, 2019), there are around 980 million malware programs on the internet today, and 350,000 new pieces of malware are detected every day. The recent boom in malware evolution is traced back to 2013, in which the number of malicious files on the web doubled; this growth may have slowed, but it has not stopped.


The statistics show that malware is not only emerging at a rapid rate; this is also acknowledged in Hayes et al. (2009), which recognised the diversity in malware in 2008 and implies malware has been evolving and changing for years. Singh et al. described three types of malware evolution, the first being natural evolution, the second being environmental evolution, and the third being polymorphic evolution. Most aspects of malware evolution are due to adaptation to avoid anti-virus (AV) detection. Environmental evolution occurs when software development changes, such as compiler changes. If malware uses different libraries to fulfil its goals, its behaviour may appear significantly different from what detection systems expect. The definition of environmental change depends heavily on compiler and library changes as defined in Singh et al. (2012), which means these changes will be far less frequent than natural evolution. Polymorphic evolution occurs in the form of transformation and obfuscation (Singh et al., 2012). The use of packers and protectors creates an artificial diversity that is designed to evade detectors. Packing will not help track drift, as packers will encrypt and compress code; drift tracking should be carried out on unpacked malware. Malware evolution poses a large threat to systems due to the rate of evolution not slowing down, according to Symantec (Cook, 2020b). Enterprise ransomware like SamSam and Dharma are coordinated hits on organisations using a manual attack methodology (Whitepaper, 2019). Doxxing is also a new methodology in ransomware attacks, threatening to expose sensitive data of attack victims (Goodchile, 2020), yet another example of ransomware's dangerous evolution. When high diversity and evolution rates exist in a destructive malware type like ransomware, the consequences for victims become severe.

1.2. Motivation

Our main motivation for this research is the need for robust features which will allow ransomware detection systems to remain effective when exposed to concept drift. It is observable that features can quickly be rendered ineffective by concept drift; therefore, this creates the need for an architecture that can create robust feature sets for ransomware detection which will not degrade excessively under concept drift. A zero-day vulnerability is a software vulnerability that attackers discover before the software vendor is aware of it. A zero-day exploit is a method of exploiting a zero-day vulnerability (Kaspersky, 2021). A zero-day malware threat is a threat that has not been seen by the detection system before and can be a variant or malware type for which no signatures exist (Carson, 2007). Machine learning classifiers have been proven effective in detecting zero-day malware threats, as shown in Shaukat and Ribeiro (2018) and Sgandurra et al. (2016b); however, zero-day ransomware is not necessarily an example of ransomware that has evolved and will be difficult for a classifier to identify correctly. Machine learning detection systems like RansomWall (Shaukat and Ribeiro, 2018) and the system designed by Sgandurra et al. use dynamic features like API calls to differentiate ransomware from benign files; therefore, machine learning detection systems effectively detect ransomware rapidly without relying on signatures or heuristics. Machine learning systems are able to identify patterns and statistical properties of malware that distinguish them from benign files, hence why they are effective at detecting zero-day threats. Zero-day ransomware may be considered a zero-day due to its method of delivery or how it is obfuscated to evade anti-virus detection; however, once it begins executing, its behaviour determines whether it is an evolved variant or not. A ransomware variant may be delivered via a zero-day attack that exploits a new vulnerability, but its behavioural patterns during execution may not deviate much or at all from the patterns of previous ransomware. Ransomware that displays behavioural patterns during execution that differ from what the machine learning classifier expects shows true evolutionary characteristics, as the dynamic behaviour of the malware has changed. The Transcend system (Jordaney et al., 2018) acknowledges that malware can evolve in ways that make it difficult for even machine learning detection systems to detect; this evolution is described as concept drift in a machine learning system. If malware's behavioural patterns and statistical properties change beyond the scope of what a machine learning system defines as malicious behaviour, detection rates will start to decrease. Changes in statistical properties and dynamic behaviour during execution are what we would classify as true malware evolution, as even machine learning systems would struggle to detect them. An example of the difference between evolution and zero-day is the WannaCry ransomware attack, considered a zero-day threat. The attack was carried out using the EternalBlue and DoublePulsar exploits. These two exploits are Windows SMB- and privilege-based and allowed the ransomware to execute; this was the zero-day aspect of the attack. If WannaCry had been loaded into a system using EternalBlue but behaved the same as previous ransomware, its characteristics would not be considered evolutionary, only that it had been propagated using the zero-day exploits EternalBlue and DoublePulsar. WannaCry could be considered evolved because of the way it encrypted files and propagated itself through networks; these aspects of the ransomware were behavioural evolutions and would display dynamic behavioural characteristics not previously associated with ransomware.

1.3. Contributions of this paper

· Behavioural analysis of ransomware characteristics that change or "evolve" over time.
· Proposal of a feature selection architecture, which provides an optimal feature set showing promising performance when exposed to concept drift. FeSA's feature set remains robust over time; the main element is maintaining a slower degradation rate in detection rate.

1.4. Paper organisation

The remainder of this paper is as follows: Section 2 covers work related to our research. Section 3 covers our proposal and the background information that accompanies our work. Section 4 describes our experiments, and Section 5 discusses the results of the experiments shown in Section 4. Section 6 concludes and expands on our work.

2. Related work

This section explores related work which has influenced our research. Section 2.1 investigates studies that address concept drift, along with their pros and cons. This study also investigates evolutionary algorithms and how they can be used in concept drift detection and adaptation. This study also investigates the use of machine learning detection for ransomware and how these systems tackle zero-day threats. The related studies identify the gaps in ransomware detection and concept drift in ransomware detection systems, and the genetic algorithms in Section 2.2 point us towards possible solutions.

2.1. Concept drift

Ransomware variants that display evolutionary qualities that are different from their predecessors are always emerging, which may be difficult for ransomware detection systems to identify. Good examples of evolving ransomware are the MedusaLocker and WannaCry ransomware families.


MedusaLocker is a ransomware variant that targets antivirus and ransomware detection modules to turn them off and disable them from running in safe mode (Collins, 2019); this variant of ransomware is extremely evasive and effective in disabling endpoint protection and preventing ransomware detection modules from working. The WannaCry ransomware variant was propagated through a Windows SMB vulnerability that the public had not seen before the infection, although it was known to the NSA at the time. The two variants mentioned behaved vastly differently from the ransomware before them and made it clear that a zero-day that shows characteristics far beyond the current behavioural profile can cause detection systems to fail.

The Transcend system proposed in Jordaney et al. (2018) is a framework that can work with any machine learning algorithm to output confidence values for predictions. Predictions can be modelled differently; confidence values can be extracted from a random forest depending on how many trees vote for the chosen prediction. Confidence values can be extracted from a Support Vector Machine by measuring a prediction's distance from the hyperplane. Obtaining confidence from a clustering approach would involve measuring the distance of a prediction from a centroid. The Transcend system aims to identify how similar the classified instance is to the rest of the instances in its class and how similar the instance is to samples in the other classes. The Transcend system measures the confidence of a prediction and combines the value with the confidence the predictor has in other classes to determine how credible the prediction is. Predictions that fall below the credibility threshold will have to be investigated manually by an IT team or some administrative presence. Transcend does not use any evolutionary feature selection algorithms to train its algorithms; however, the framework it proposes uses a similar structure and approach to an evolutionary algorithm.

The system proposed in Kantchelian et al. (2013) combines human intervention with underlying machine learning algorithms to address concept drift in an adversarial machine learning scenario; this system attempts to classify adversarial learning as an evolutionary family of the training dataset. The system proposed in Kantchelian et al. (2013) stresses the need for retraining and human interaction to handle concept drift effectively. The type of concept drift addressed in Kantchelian et al. (2013) is a type of drift introduced by adversarial techniques that is not addressed in any other related studies referenced in this study. The system proposed in Maggi et al. (2009) uses anomaly detection to distinguish between genuine changes in a web application and malicious changes; however, this system also relies on retraining to adapt to concept drift and reduce false-positive rates. The system proposed in Maggi et al. (2009) is unique because it looks specifically for malign and benign changes; despite this, the retraining of parts of the model is necessary to adapt to the detected changes. The systems that address concept drift seem to rely on retraining and human intervention instead of having a specifically constructed mechanism to counteract the effects of concept drift.

The system used in Singhal et al. (2020) uses the Heterogeneous Euclidean Overlap Metric (HEOM) to detect concept drift when detecting malicious web URLs. The system combines Gradient Boosted Trees, used to detect malicious URLs, with the HEOM measurement. The concept drift detection component of the system in Singhal et al. (2020) attempts to identify the differences between the data distribution of the old training data and the new incoming data. The distance between the training set and the newer data is calculated using the HEOM. The research presented in Tan et al. (2018) attempts to detect concept drift in malicious URL detection systems and uses the Wilcoxon Rank-Sum test. The Wilcoxon Rank-Sum test is a non-parametric test that allows the user to determine whether two samples are from the same population. In the context of malicious web URLs, the Wilcoxon Rank-Sum test allows the system to determine whether the incoming URL matches its allocated classification. Thus, if a concept drift is detected, the system will be immediately retrained.

2.2. Genetic algorithms

A genetic algorithm is a search heuristic that takes after Charles Darwin's theory of natural evolution (Fatima et al., 2019). This algorithm mimics the process of natural selection, which will select the strongest to survive and produce offspring. A genetic algorithm will apply this logic to a dataset and can be used to produce an optimal feature set. The system proposed in Vivekanandan and Nedunchezhian (2011) uses a genetic algorithm to produce an optimal feature set for malware detection. A typical genetic algorithm will repeat its evaluation and crossover phases, creating numerous feature sets to obtain optimal features. This research uses a genetic engineering approach to reduce the number of generations and features needed to produce the optimal feature set. The structure of a genetic algorithm is shown below, followed by a minimal illustrative sketch.

· Fitness Function: The fitness function determines the ability each individual has to compete; in the context of a detection system, this would be determined by how accurate a feature set is.
· Population Generation: The initial population of individuals is generated randomly from the pool of available chromosomes; in most cases, chromosomes represent features that will create a feature set.
· Selection: The selection phase is designed to take the fittest individuals and allow them to pass their genes onto the next generation. In the context of a detection system, these would be feature sets that achieve the highest accuracy.
· Crossover: Crossover is the process of two selected individuals being mated to produce a child, which will be a combination of both parents. This phase can be repeated with the offspring and so forth but can be limited to a select number of generations.
· Mutation: Genes of the offspring can be subject to mutation with a low random probability; in the context of a feature selection algorithm, this can mean inheriting a random feature that does not exist in either parent.
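The loop below is a minimal, illustrative Python sketch of this structure, not the implementation used in this paper (our experiments use WEKA; see Section 4). The feature pool, population size, set size and mutation rate are stand-in values, and the fitness function is a placeholder where a real system would train and score a classifier on the candidate feature set.

import random

POOL = [f"api_{i}" for i in range(320)]              # stand-in feature pool
SET_SIZE, POP_SIZE, GENERATIONS, MUTATION_RATE = 32, 20, 20, 0.033

def fitness(feature_set):
    # Placeholder: a real fitness function would train the underlying
    # classifier on `feature_set` and return its accuracy.
    return random.random()

def crossover(parent_a, parent_b):
    # Uniform crossover: each gene (feature slot) comes from either
    # parent with equal probability.
    return [a if random.random() < 0.5 else b for a, b in zip(parent_a, parent_b)]

def mutate(child):
    # Mutation: with low probability, swap one feature for a random
    # feature from the pool, introducing diversity.
    if random.random() < MUTATION_RATE:
        child[random.randrange(len(child))] = random.choice(POOL)
    return child

population = [random.sample(POOL, SET_SIZE) for _ in range(POP_SIZE)]  # initial population
for _ in range(GENERATIONS):
    ranked = sorted(population, key=fitness, reverse=True)
    parents = ranked[:POP_SIZE // 2]                 # selection: the fittest survive
    offspring = [mutate(crossover(random.choice(parents), random.choice(parents)))
                 for _ in range(POP_SIZE - len(parents))]
    population = parents + offspring
best_feature_set = max(population, key=fitness)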
The StreamGP algorithm (Folino et al., 2007) combines ensemble Genetic Programming with a boosting algorithm. The StreamGP system generates decision trees that are trained on different parts of a data stream. StreamGP has a concept drift detection system inbuilt, which, once triggered, will build a new classifier using CGPC, the cellular genetic programming method described in Folino et al. (2007). The populations of data in this algorithm are sets of individual data blocks which are initially drawn randomly. The newly created classifier is added to the ensemble, and the weights of each classifier are then updated; this system creates a new classifier when concept drift is detected rather than constantly adapting to the newest block of data like the EACD proposed in Ghomeshi et al. (2019).

The EACD system (Ghomeshi et al., 2019) proposes a genetic algorithm approach to combatting concept drift. This evolutionary algorithm is multi-layered with a base and a genetic layer; both layers act as a natural selection mechanism to find the strongest feature set. The base layer will select a set number of features and save them as feature sets. These feature sets are saved and evaluated. The highest performing feature sets are passed into a secondary genetic layer that will "breed" feature sets by randomly crossing strong feature sets to create strong offspring. This breeding step is carried out until the overall system's accuracy is higher on the newest data.


The number of repetitions of the breeding step is defined by the maximum number of generations the system will allow. This genetic approach produces promising results, finding optimal feature sets for systems that model scenarios that present concept drift.

The Online Genetic Algorithm (OGA) (Folino et al., 2007) is a rule-based learner that updates its ruleset based on the data stream's evolution. Like the base layer in the EACD system, the initial rulesets are chosen randomly, and the genetic algorithm is applied when a new block of data is encountered to update the rulesets. This process is repeated until the end of the data stream. Each block of data is a different iteration, which leads to a large number of iterations. OGA does not limit the number of iterations the algorithm can go through, which means it can become very expensive.

2.3. Ransomware detection

Ransomware detection research that integrates concept drift is a rarity in the research space. The Elderan system described in Sgandurra et al. (2016b) considers zero-day attacks and tests on samples that the model has not been trained on. The Elderan system's accuracy drops from 96% to 93% when exposed to zero-day threats; however, it is unclear if the zero-day threats are more than a couple of months ahead of the training set. Explicit testing on zero-day threats is explored in Takeuchi et al. (2018) and VinayKumar et al. (2017), similar to the Elderan system, which can be considered testing under concept drift; however, the zero-day samples are not guaranteed to display concept drift with regard to the training samples.

The RansHunt system described in Hasan and Rahman (2017) attempts to predict future ransomware trends by training on "Ransomwall", a ransomware hybrid that the authors predicted to be a future ransomware type. According to the creators of RansHunt, a worm component would be used to spread ransomware through a compromised network. This prediction approach and preparation for future trends could prevent models from degrading under concept drift. The system explored in [35] explores using a generative adversarial system to produce variations of ransomware that might deceive ransomware detection systems; this approach is designed to highlight the need for ransomware detection systems to be reinforced.

3. FeSA: feature selection architecture

The previous sections in this study discussed malware evolution and ransomware. This section introduces our proposal to combat concept drift in ransomware detection systems. The FeSA system proposes an architecture which generates feature sets for ransomware detection systems through information gain and a genetic algorithm. Our approach relies on the user's underlying machine learning algorithm but is compatible with any machine learning approach. The underlying machine learning algorithm will be the classifier trained on the feature set produced by the FeSA architecture. Genetic algorithms have proven effective for concept drift scenarios when used by the systems described in Folino et al. (2006, 2007) and Vivekanandan and Nedunchezhian (2011); the obtained results lead us to FeSA, which does not entirely rely on the natural selection mechanism to produce an optimal feature set.

3.1. Preliminaries

This section contains necessary background information on concept drift and genetic algorithms. This section also presents Table 1, which gives the notation of the symbols used throughout the paper.

3.1.1. Concept drift

Concept drift is defined as a change in the relationship between input and output data in the underlying problem over time (Brownlee, 2017). Concept drift will make classifiers degrade over time, leading to more incorrect classifications. Incorrect classifications in the context of malware detection can cause problems. A malware analysis team would have high standards for abandoning an ageing classification model (Jordaney et al., 2018). In the context of ransomware classification, a model would have to be constantly monitored for signs of concept drift due to the damage one ransomware infection can cause. Concept drift can occur gradually over time or artificially to cause classifier errors, as stated in Kantchelian et al. (2013).

Concept drift can fall into the following three categories:

· Gradual Concept Drift: A gradual change over time.
· Cyclical Concept Drift: A recurring or cyclical change.
· Abrupt Concept Drift: A sudden or abrupt change.

The relationship between a classifier and its predictions is defined as p(y|x), and concept drift can be defined as changes in p(x, y) (Zuhair et al., 2019). The changes in this joint probability can be identified through its components, suggesting that different detection aspects can cause concept drift.
a couple of months ahead of the training set. The explicit test- detection aspects can cause concept drift.
ing on zero-day threats is explored in Takeuchi et al. (2018) and The FeSA system is built to adapt to sudden concept drift and
VinayKumar et al. (2017), similar to the Elderan system, which can gradual concept drift. Sudden concept drift is the type of concept
be considered testing under concept drift; however, the zero-day drift that poses the biggest threat to a malware detection system.
samples are not guaranteed to display concept drift in regards to The sudden appearance of new ransomware which does not con-
the training samples. form to a model’s current configuration is a problem that cannot
The RansHunt system described in Hasan and Rah- be solved by retraining unless the retraining is done before the
man (2017) attempts to predict future ransomware trends by system is exposed to the new ransomware. FeSA is effective when
training on ”Ransomwall”, a ransomware hybrid that authors dealing with gradual concept drift because the system is built us-
predicted to be a future ransomware type. According to the cre- ing features from different distributions. Using different distribu-
ators of RansHunt, a worm component would be used to spread tions to build the FeSA feature set allows the system to capture the
ransomware through a compromised network. This prediction best possible feature set, which applies to ransomware from differ-
approach and preparation for future trends could prevent models ent eras. Capturing common features from many different types of
from degrading under concept drift. The system explored in [35]) ransomware from different periods gives FeSA the best chance of
explores using a generative adversarial system to produce vari- having features that will remain relevant in the future.
ations of ransomware that might deceive ransomware detection
systems; this approach is designed to highlight the need for 3.2. FeSA architecture
ransomware detection systems to be reinforced.
We propose FeSA, a feature selection architecture for ran-
3. FeSA- feature selection architecture somware detection under concept-drift. The FeSA architecture is
shown in Fig. 1 and is comprised of three main components. FeSA
The previous sections in this study discussed malware evolution architecture is built following the structure of a genetic algorithm.
and ransomware. This section introduces our proposal to combat The FeSA architecture needs to be provided with an initial fea-
the concept drift in ransomware detection systems. The FESA sys- ture pool to create feature sets with. The number of features in
tem proposes using an architecture, which generates feature sets this initial feature pool is user-defined. The larger the number
for ransomware detection systems through information gain and of features in the initial feature pool, the larger the number of
a genetic algorithm. Our approach relies on the user’s underlying unique and diverse feature sets the base layer can create. The fea-
machine learning algorithm but is compatible with any machine ture ranker selects a set of ”important” features from the feature
learning approach. The underlying machine learning algorithm will pool to pass onto the feature base layer. The base layer gener-
be the classifier trained on the feature set produced by the FeSA ates a set of random feature sets from the feature pool, ensuring
architecture. Genetic algorithms are proven effective for concept these feature sets include the important features. The feature sets
drift scenarios when used by the systems described in Folino et al. in the base layer are evaluated, and their detection rate and over-
(20 06, 20 07) and Vivekanandan and Nedunchezhian (2011); the all accuracy are calculated. The feature sets that achieve accuracy
obtained results lead us to FeSA, which does not entirely rely on and detection rates above the average accuracy and detection rates
the natural selection mechanism to produce an optimal feature set. of all of the feature sets in the base layer are defined as high-
performance and passed onto the genetic layer. The genetic layer
performs a breeding crossover procedure involving selecting two
3.1. Preliminaries high-performance feature sets from the base layer and combining
them to produce a new feature set; the user defines the number
This section contains necessary background information on of times the crossover process is repeated. In theory, the combina-
concept drift and genetic algorithms. This section also presents tion of high-performance feature sets from the base layer should
Table 1, which gives the notation of the symbols used throughout produce new feature sets which can achieve higher accuracy and
the paper. detection rates than feature sets combined to create them.


Table 1
Notations.

Symbol    Explanation
xi        A feature in a feature set.
x̄         The feature x does not appear.
IG(xi)    Information gain for a feature xi.
ci        The classification of an instance into category i.
p(ci|x)   Conditional probability of the ith category given that the feature x appears.
p(ci|x̄)   Conditional probability of the ith category given that the feature x does not appear.
|Z|       The size of the set of important features. The important feature set is added to every feature set produced by FeSA.
a         The proportion of features from the feature set which meet the requirements for being important features.
T(f)      Total features in the initial feature set.
|N|       The size of a feature set generated by the base layer; this feature set is part of the first generation of feature sets produced.
r         A proportion of the original feature pool.
H         High performance feature sets.
m         Maximum feature set limit.
dr        Average detection rate of feature sets in the base layer.
ar        Average accuracy of feature sets in the base layer.
Yi        A feature set produced in the base layer.
Hrand1    A selected parent feature set in the genetic layer.
Hrand2    A selected parent feature set in the genetic layer.
Oi        An offspring feature set in the genetic layer.
T         A set of offspring feature sets.

Fig. 1. FeSA: Feature Selection Architecture.

3.2.1. FeSA feature ranker algorithm

The initial population of feature sets is randomly generated; however, the FeSA architecture uses a feature ranker to identify the features with the highest information gain. Information gain reduces the complexity of generating the important features because purely random feature selection would require multiple selections to find the optimal set. FeSA controls the base each feature set is built upon, ensuring strong feature sets. Before generating the initial population, a feature ranking algorithm is proposed to decide which features are most important. The feature ranker is the base component of our system because it ranks features in order of their importance and attaches a numerical value to this ranking. Information gain is calculated according to Eq. (1). The feature importance step is designed to provide the initial "building blocks" for each feature set. The ranker algorithm uses information gain to isolate the most important features; it determines information gain and then ranks features by that value. Information gain is the reduction in entropy after a dataset is split on an attribute. Entropy is defined as a measure of randomness in information; therefore, the higher the entropy, the harder it is to draw any conclusions from the data (Zhou, 2019).

Information gain (IG) is a reduction in entropy when splitting on an attribute and is calculated in Eq. (1), where ci represents the ith class category (i.e. ransomware or benign) and p(ci) is the probability of the ith category. p(ci|x) is the conditional probability of the ith category given that the feature x appears, and p(ci|x̄) is the conditional probability of the ith category given that the feature x does not appear.

IG(xi) = − Σ_{i=1}^{m} p(ci) · log p(ci) + p(x) Σ_{i=1}^{m} p(ci|x) · log p(ci|x) + p(x̄) Σ_{i=1}^{m} p(ci|x̄) · log p(ci|x̄)    (1)

The feature set taken from the feature ranker is defined in Eq. (2). The variable a is dependent on the features defined by the feature ranker as essential. T(f) represents the total features in the original feature pool. Our FeSA implementation chooses features with an information gain equal to or greater than 0.5 as "important" features. The decision to set 0.5 as the threshold value was based on the fact that information gain is a reduction in entropy, a measure of randomness; therefore, features were chosen which took away at least half of the data's randomness. Based on experimental observations, very few features exceeded or matched this value during our experiments, which meant a threshold of 0.5 would mean only some features are selected as "important". Z is defined as the set of important features, which every feature set must contain; |Z| is the cardinality of this set. An algorithmic representation of the feature ranker is shown in Algorithm 1. In addition to defining the key features included in each feature set, the ranker eliminates features deemed to provide zero information gain. The FeSA system uses the ranker to identify features that present zero information gain and excludes them from the base layer and


subsequent genetic layer.

|Z| = a / T(f)    (2)

Algorithm 1 FeSA Feature Ranker
Input: Initial features x0, ..., xi
Output: Important feature set Z
1: for initial features x0 to xi do
2:   Calculate the information gain IG(xi) of each feature using Eq. (1)
3:   if IG(xi) ≥ 0.5 then
4:     Add xi to the important feature set Z
5:   end if
6: end for
7: Return the important feature set Z

Algorithm 1 shows the operation of the feature ranker. The feature ranker takes an initial set of features x0 to xi and calculates each feature's information gain IG. If the feature xi has an information gain value above or equal to 0.5, it is added to the important feature set Z. Each feature in the important feature set is denoted as zi.
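A minimal Python sketch of the ranker follows. It assumes binary present/absent features over a labelled dataset and illustrates Eq. (1) and Algorithm 1; it is not the WEKA-based implementation used in our experiments, and the helper names are hypothetical.

import math

def entropy(labels):
    # Shannon entropy of a list of class labels (e.g. "ransomware"/"benign").
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((labels.count(c) / total) * math.log2(labels.count(c) / total)
                for c in set(labels))

def information_gain(labels, present):
    # Eq. (1): class entropy minus the entropy remaining after splitting
    # on whether feature x appears (`present` is a 0/1 flag per sample).
    with_x = [y for y, p in zip(labels, present) if p]
    without_x = [y for y, p in zip(labels, present) if not p]
    p_x = len(with_x) / len(labels)
    return entropy(labels) - p_x * entropy(with_x) - (1 - p_x) * entropy(without_x)

def feature_ranker(dataset, labels, threshold=0.5):
    # Algorithm 1: keep features whose information gain meets the 0.5
    # threshold; zero-gain features are implicitly excluded as well.
    # `dataset` maps feature name -> list of 0/1 presence flags per sample.
    return {f for f, column in dataset.items()
            if information_gain(labels, column) >= threshold}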
3.2.2. FeSA fitness function

The fitness function used by FeSA calculates the average detection rate and accuracy amongst all feature sets in the current generation. The highest performing feature sets, which display above-average detection rates and accuracy, are passed onto the next generation by the fitness function. The fitness function is used in the base layer and the subsequent genetic layer. Our fitness function uses the ranker's values indirectly, as opposed to directly, in its calculations. The ranker will enforce features with the highest information gain and eliminate features with no information gain, thus ensuring that the feature sets produced in the base and genetic layers will provide as much information as possible while removing excess features that provide no information.

3.2.3. FeSA base layer

The FeSA base layer acts as the initial population generation required by a genetic algorithm. The base layer randomly generates feature sets from a pool of initial features. The initial population is a requirement of a genetic algorithm and is needed in order to generate strong feature sets in the genetic layer; the main difference between the base layer and a regular population layer is that the ranker has already defined a set of features that are enforced in each generated feature set. The ranker enforcing important features in the base layer feature sets means the base layer feature sets will already have higher accuracy than if the feature sets were randomly generated. The base layer follows on from the ranker.

|N| = (r/100 · T(f)) + |Z|    (3)

The number of features selected per feature set is shown in Eq. (3), where r is the proportion of the initial feature pool in each generated feature set and N is a feature set generated by the base layer. |N| is calculated by dividing r by 100 to obtain a proportion of T(f), T(f) being all of the features in the initial feature pool, plus |Z|, the size of the important feature set chosen by the ranker.

This process of generating feature sets is repeated as often as the user defines, and this defines the population size for each generation. The number of repetitions will be balanced with a defined size for each feature set. There are many features in the feature pool; therefore, the possible combinations should be heavily regulated to avoid massive computational costs. FeSA evaluates the feature sets that have been generated in the initial population with a random forest classifier. The user's choice of the underlying algorithm is flexible and determined based on their features and data. The random forest performed best with our features; therefore, it is chosen as our underlying algorithm. The feature sets are evaluated on overall accuracy and ransomware detection rate; therefore, only the most accurate feature sets with the highest detection rates are passed onto the next phase. The highest performing feature sets are determined by calculating the average accuracy and ransomware detection rates of all feature sets and passing on the feature sets with accuracy and detection rate above the average. The structure of the feature set generation layer is shown in Algorithm 2. The abbreviations used in Algorithm 2 are as follows: True Positive (TP), False Positive (FP), False Negative (FN), True Negative (TN).

Algorithm 2 FeSA Base Layer
Input: Initial features x0, ..., xi; important feature set Z
Output: High performance feature sets H
1: m → maximum feature sets
2: Average detection rate dr = 0
3: Average accuracy ar = 0
4: Total detection rate td = 0
5: while feature set count < m do
6:   Generate feature set Yi
7:   Add important features Z to Yi
8: end while
9: Calculate the detection rate of each Yi using:
10:   True Positive Rate (TPR) = TP / (TP + FN)
11: Calculate the accuracy of each Yi using:
12:   Accuracy (ACC) = (TP + TN) / (TP + TN + FP + FN)
13: Calculate the average detection rate dr using:
14:   dr = (Σ_{i=0}^{m} TPR(Yi)) / m
15: Calculate the average accuracy ar using:
16:   ar = (Σ_{i=0}^{m} ACC(Yi)) / m
17: for Y0, ..., Yi do
18:   if detection rate and accuracy of Yi > dr and ar then
19:     Add Yi to high performance feature sets H
20:   end if
21: end for
22: return H

Algorithm 2 shows the operation of the FeSA architecture base layer. The initial features are taken, and new feature sets, denoted Yi, are generated, each including the important feature set Z. The important features from the feature ranker are added to each generated feature set. The detection rate and accuracy of every generated feature set Yi are calculated, and if a feature set shows above-average performance, it is placed in H, the set of high-performance feature sets.
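A Python sketch of the base layer is shown below; `evaluate` is a hypothetical helper standing in for training the underlying random forest on a candidate feature set and returning its (detection rate, accuracy) pair, and the defaults for r and m are arbitrary illustration values.

import random

def base_layer(pool, Z, evaluate, r=10, m=20):
    # |N| = (r/100 · T(f)) + |Z| per Eq. (3): each feature set mixes a
    # proportion of the pool with the enforced important features Z.
    n_random = int(r / 100 * len(pool))
    free_pool = [f for f in pool if f not in Z]
    candidates = []
    for _ in range(m):
        Y = set(random.sample(free_pool, n_random)) | set(Z)
        detection, accuracy = evaluate(Y)
        candidates.append((Y, detection, accuracy))
    dr = sum(d for _, d, _ in candidates) / m        # average detection rate
    ar = sum(a for _, _, a in candidates) / m        # average accuracy
    # High-performance set H: above average on both criteria (Algorithm 2)
    return [Y for Y, d, a in candidates if d > dr and a > ar]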
3.2.4. FeSA genetic layer algorithm

The FeSA genetic layer acts as the crossover phase in a genetic algorithm. The genetic layer is needed to produce strong feature sets. The feature sets are expected to reach optimal performance after the crossover phase has been completed multiple times. The feature sets produced in the genetic layer will be candidates for the optimal feature set. The genetic layer contains the high-performance feature sets from the base layer and will combine these high-performance feature sets using the uniform crossover method to yield more accurate feature sets. The genetic layer has the advantage of enforcing important features in each feature set; therefore, the number of iterations needed for the feature sets to reach optimal performance is, in theory, reduced.

The genetic selection layer is a breeding mechanism for the highest performing feature sets taken from the initial feature selection layer. The genetic layer is made up of "parent" and "offspring" feature sets. The "parent" feature sets are the high-performance feature sets from the base layer. The "offspring" feature sets are produced by choosing two parent feature sets and combining them with a crossover function. High performing "parent" feature sets will be combined using uniform crossover, generating "offspring" feature sets. In theory, the offspring feature sets will display a higher performance level than the preceding generation. Uniform crossover takes two parent feature sets and combines them. For each corresponding feature in each parent feature set, the feature the offspring feature set receives is determined by a coin flip; that is, a probability of 0.5. The crossover function used by FeSA is user-defined; however, for our purpose and the need to enforce particular features into feature sets, the uniform crossover function proved to be the most efficient. The resulting offspring feature sets are evaluated, and the feature set with the highest average detection rate and overall accuracy is chosen as the optimal feature set. An important factor in this phase is that only one generation generates the optimal feature set. The structure of the genetic layer is shown in Algorithm 3.

Algorithm 3 FeSA Genetic Layer
Input: High performance feature sets H
Output: Optimal feature set Oi
1: m → Max feature set count
2: n → Current feature set count
3: Offspring feature set T
4: while n < m do
5:   Select random base feature set 1, Hrand1, from set H
6:   Select random base feature set 2, Hrand2, from set H
7:   Perform uniform crossover using Hrand1 and Hrand2 and generate mixed feature set Oi
8:   if duplicate features are detected then replace the duplicate feature with a random feature xi from the feature pool
9:   end if
10:  Add Oi to offspring set T
11: end while
12: return the optimal Oi ∈ T which has the highest average detection rate and accuracy

Algorithm 3 shows the genetic layer of the FeSA architecture. The high-performance feature sets come from the base layer. Two random high-performance feature sets, Hrand1 and Hrand2, are selected, and uniform crossover is carried out to mix the two high-performance feature sets and create a new feature set Oi. The process of mixing the high-performance feature sets is repeated m times until completion. The newly generated feature sets are stored in set T. The best performing feature set out of the newly generated feature sets in T is selected as optimal.
3.3. Mutation

The mutation function in a genetic algorithm is the introduction of diversity. In a feature selection context, a mutation would mean an offspring feature set inheriting a feature not present in either parent feature set. During the crossover phase, duplicate features are prohibited from being in a feature set. If there is a feature set with duplicate features, the duplicates will be replaced with a random feature from the feature pool, leading to a 0.01% mutation rate. The low mutation rate is used to eliminate unnecessary randomness from the FeSA architecture.
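The sketch below ties Algorithm 3 and this mutation rule together in Python; as before, `evaluate` is a hypothetical scoring helper returning a (detection rate, accuracy) pair for a feature set, and this is an illustration rather than the implementation used in our experiments.

import random

def genetic_layer(H, pool, evaluate, m=20):
    offspring = []
    for _ in range(m):
        parent1, parent2 = random.choice(H), random.choice(H)
        # Uniform crossover: a coin flip decides which parent supplies
        # each position (Algorithm 3, step 7).
        child = [a if random.random() < 0.5 else b
                 for a, b in zip(sorted(parent1), sorted(parent2))]
        # Mutation (Section 3.3): duplicates are replaced with a random
        # feature from the pool, so the child may gain a feature that
        # exists in neither parent.
        seen, fixed = set(), []
        for f in child:
            if f in seen:
                f = random.choice([g for g in pool if g not in seen])
            seen.add(f)
            fixed.append(f)
        offspring.append(set(fixed))
    # The offspring with the best (detection rate, accuracy) is optimal;
    # tuple comparison considers the detection rate first.
    return max(offspring, key=evaluate)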


Table 2
Performance metrics.

Metric                             Calculation        Description
TPR (True Positive Rate) / Recall  TP / (TP + FN)     Correct classification of ransomware.
False Positive Rate (FPR)          FP / (FP + TN)     Benign software classed as ransomware.
False Negative Rate (FNR)          1 − TPR            Ransomware classed as benign.
Precision                          TP / (TP + FP)     Proportion of ransomware classifications that are actually ransomware.

When the 2015 model is tested on ransomware from 2016–17, the predictions' credibility drops to an average of 0.74. The drop in credibility is present in every scenario in the series of experiments carried out. There is an average drop in the credibility of predictions of 0.21. The drop in credibility shows that the classifier becomes more uncertain of its predictions, which indicates the data is behaving in a way it is not prepared for; this indicates concept drift.
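A sketch of this credibility check is shown below, assuming a fitted scikit-learn RandomForestClassifier rather than the WEKA random forest used in our experiments; taking the nonconformity score from the forest's class probabilities is one common choice and an assumption here.

import numpy as np

def credibility_p_value(rf, X_reference, x_new):
    # Nonconformity score: 1 minus the probability of the predicted class,
    # so confidently classified samples score low.
    ref_scores = 1.0 - rf.predict_proba(X_reference).max(axis=1)
    new_score = 1.0 - rf.predict_proba(x_new.reshape(1, -1)).max()
    # p-value: proportion of reference samples at least as nonconforming
    # as the new instance z; a low value flags a sample the model was not
    # prepared for, i.e. evidence of drift.
    return float(np.mean(ref_scores >= new_score))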
4.1.3. Features

The feature pool of 320 features consisted of API calls used by Windows programs during execution. The 320 features fall into 16 API call categories, on which the feature set sizes are based. We aim to capture two features per category on average; however, this does not always prove to be the case due to the natural selection mechanism. We choose to capture two features from each category to limit the feature set size and the complexity of the crossover phase.

4.2. Experiments

Our experiments are set up to test the strength of the feature sets produced by FeSA in concept drift scenarios. The experiments compare the performance of the FeSA feature sets with other nature-inspired feature selection algorithms. The datasets used in these experiments are structured to display real-life concept drift scenarios, and the validity of the concept drift in these datasets is tested by p-values, as mentioned in Section 4. The concept drift effect is achieved by having ransomware and benign software from different periods. Each classifier is tested on data produced after the data the classifier is trained on. The process runs on our base dataset that contains ransomware and benign samples from 2013 to 2015. The optimal feature set is produced, FeSA trains a random forest using 10-fold cross-validation, and the results are observed. The benchmark algorithms are tuned and run by us. The settings used for the benchmark algorithms are tuned to compare them to FeSA as fairly as possible. The closest algorithms to FeSA, the genetic search and the evolutionary search, are given an advantage over FeSA, in that they use more generations to generate their feature sets. The underlying classification algorithm used with FeSA and the benchmark feature selection algorithms is the random forest classifier. FeSA's results are compared to the results obtained from the greedy stepwise algorithm, genetic search, evolutionary search, best-first search and harmony search. After observing results on data the classifiers would see as up to date, FeSA tests how the classifiers perform on ransomware and benign data from 2016 and 2017. Ransomware and benign samples from 2016 and 2017 represent concept drift as their behavioural patterns are different, as per our observations. The process of observing detection rates under concept drift is repeated by training on data from 2013–2017 and testing on data from 2018, and repeated again for data up to 2019. It is observed how the feature sets produced by the FeSA architecture perform compared to the feature sets produced by the greedy stepwise algorithm, genetic search, evolutionary search, best-first search and harmony search. The results of our experiments are shown in Section 5.
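One round of this protocol could be sketched as below, assuming a hypothetical pandas dataframe df with a year column, a binary label column (1 = ransomware) and one column per API-call feature; the paper's actual runs were performed in WEKA.

from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import recall_score

def drift_round(df, features, train_up_to, test_years):
    # Train on everything up to the cutoff year, test on later samples
    # only, so the test data can exhibit concept drift.
    train = df[df["year"] <= train_up_to]
    test = df[df["year"].isin(test_years)]
    clf = RandomForestClassifier().fit(train[features], train["label"])
    predictions = clf.predict(test[features])
    # Detection rate = TPR on the ransomware class.
    return recall_score(test["label"], predictions, pos_label=1)

# e.g. drift_round(df, feature_set, 2015, [2016, 2017]) mirrors the
# "train on 2013-15, test on 2016/17" scenario above.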


4.2.1. Detection phase

The detection phase of the framework is tested in the experimental test phase. The optimal feature set chosen by FeSA is tested on a dataset made up of ransomware and benign files from a time later than the training data. Our detection phase simulates the system coming into contact with ransomware which displays concept drift, is from a different distribution and may behave differently to what the classifier expects of ransomware. The breakdown of the datasets is described in Section 4.2. The detection phase is repeated for the algorithms FeSA is compared with.

4.2.2. Genetic search algorithm

The genetic algorithm used as a benchmark for our experiments was a basic genetic algorithm that followed the structure described in Section 3.1. The experiments used the configuration suggested by the WEKA genetic search algorithm (2020): a generation and population limit of 20, a crossover probability of 0.6, and a mutation probability of 0.033. The default settings in both the genetic search and the evolutionary search meant these algorithms had a significant advantage over FeSA in population generation and feature set size. The machine learning algorithm used with the genetic search feature selection algorithm is the random forest, to maintain consistency and fairness in comparison with the FeSA algorithm. The genetic algorithm used in the experiments used both overall accuracy and overall information gain as fitness functions, as provided by WEKA, and we found minimal difference between the two; therefore, the overall accuracy was chosen as it was closest to the fitness function used by FeSA.

4.2.3. Evolutionary search algorithm

As presented in WEKA, the evolutionary algorithm also followed a similar structure to the genetic algorithm described in Section 3.1; however, it uses a different configuration. The evolutionary algorithm used a tournament selection method with a mutation probability of 0.1. The tournament selection approach ensures that the fittest feature sets are passed onto the next generation. The generation and population limits were each set to 20, like the genetic algorithm. The machine learning algorithm used with the evolutionary feature selection algorithm is the random forest, to maintain consistency and fairness in comparison with the FeSA algorithm. The fitness function used for the evolutionary algorithm is the overall accuracy, to maintain consistency with the FeSA algorithm.

5. Experimental results and discussion

5.1. Experimental results

This section explores and elucidates our results. In our experiments, the FeSA architecture used the random forest classifier with 10-fold cross-validation. Our implementation used the WEKA API and an in-house feature extraction program to create our dataset. The experimental findings are presented in Tables 3, 4 and Fig. 2. Table 3 shows the key statistics of the detection algorithms tested, including the FeSA architecture, when concept drift is not applied.

Table 3
Experimental results without concept drift.

                         FeSA            Best First      Evolutionary    Genetic         Greedy          Harmony
                                         Search          Search          Search          Stepwise        Search
Time complexity          O(n) + O(gnm)   O(n · log(n))   O(gnm)          O(gnm)          O(n^2)          O(n)
Feature count            32              20              110             106             20              33

Trained and tested on 13–15:
  Detection              96.3%           92.0%           95.7%           93.2%           90.4%           93.0%
  FPR                    5.8%            6.7%            4.4%            6.2%            4.4%            5.9%
  Precision              0.942           0.940           0.955           0.942           0.963           0.944
  Recall                 0.941           0.940           0.955           0.942           0.936           0.944

Trained and tested on 13–17:
  Detection              96.7%           90.1%           95.3%           95.3%           90.1%           93.0%
  FPR                    5.8%            8.0%            5.2%            5.2%            8.0%            5.9%
  Precision              0.942           0.931           0.948           0.948           0.931           0.944
  Recall                 0.941           0.932           0.948           0.948           0.932           0.944

Trained and tested on 13–18:
  Detection              96.3%           91.5%           94.7%           91.5%           91.5%           95.0%
  FPR                    5.9%            7.0%            5.4%            6.5%            7.7%            5.5%
  Precision              0.941           0.935           0.946           0.927           0.927           0.945
  Recall                 0.940           0.936           0.946           0.927           0.927           0.945

Table 4
Experimental results under concept drift.

                         FeSA            Best First      Evolutionary    Genetic         Greedy          Harmony
                                         Search          Search          Search          Stepwise        Search
Feature count            32              20              110             106             20              33

13–15 tested on 2016/17:
  Detection              93.2%           77.4%           86.7%           73.5%           79.2%           83.0%
  FPR                    7.4%            2.7%            1.4%            4.4%            2.4%            1.4%
  Precision              0.913           0.940           0.968           0.961           0.946           0.961
  Recall                 0.846           0.942           0.968           0.919           0.948           0.962

13–17 tested on 2018 data:
  Detection              100%            97.8%           100%            100%            100%            54.0%
  FPR                    3.4%            2.22%           5.5%            1.4%            2.4%            1.4%
  Precision              0.917           0.979           0.943           0.989           0.982           0.922
  Recall                 0.786           0.976           0.936           0.988           0.979           0.926

13–18 tested on 2019 data:
  Detection              93.5%           87.1%           90.3%           87.1%           87.1%           83.9%
  FPR                    3.4%            11.9%           1.4%            1.4%            3.1%            1.7%
  Precision              0.917           0.968           0.979           0.975           0.963           0.969
  Recall                 0.944           0.966           0.978           0.975           0.960           0.969

Fig. 2. Detection rates drop-off.


Table 5
Experimental results with the Imperial dataset.

                         FeSA            Best First      Evolutionary    Genetic         Greedy          Harmony
                                         Search          Search          Search          Stepwise        Search
Feature count            16              258             160             106             14              36

Trained and tested on 12–13:
  Detection              99.3%           98.0%           95.5%           92.2%           98.0%           81.0%
  FPR                    5.1%            6.9%            9.6%            5.6%            6.9%            12.7%
  Precision              0.931           0.921           0.920           0.955           0.921           0.904
  Recall                 0.918           0.910           0.920           0.955           0.909           0.901

Trained and tested on 2014:
  Detection              97.5%           83.9%           92.4%           89.0%           84.7%           95.8%
  FPR                    8.4%            13.2%           6.2%            9.5%            13.5%           11.1%
  Precision              0.918           0.871           0.939           0.905           0.866           0.893
  Recall                 0.911           0.870           0.939           0.907           0.866           0.883

Trained and tested on 2015:
  Detection              97.1%           94.7%           96.2%           95.1%           94.7%           94.3%
  FPR                    12.8%           16.1%           9.5%            13.6%           16.1%           14.4%
  Precision              0.912           0.881           0.927           0.898           0.881           0.889
  Recall                 0.909           0.879           0.926           0.897           0.879           0.888

Table 6
Imperial dataset with concept drift.

Feature counts: FeSA 16; Best First Search 258; Evolutionary Search 160; Genetic Search 106; Greedy Stepwise 14; Harmony Search 36.

Trained on 2013, tested on 2014:
  FeSA:                Detection 96.6%, FPR 11.0%, Precision 0.895, Recall 0.883
  Best First Search:   Detection 95.8%, FPR 8.5%, Precision 0.151, Recall 0.911
  Evolutionary Search: Detection 87.1%, FPR 7.0%, Precision 0.941, Recall 0.953
  Genetic Search:      Detection 89.8%, FPR 5.3%, Precision 0.956, Recall 0.951
  Greedy Stepwise:     Detection 95.8%, FPR 8.5%, Precision 0.911, Recall 0.915
  Harmony Search:      Detection 94.1%, FPR 9.8%, Precision 0.902, Recall 0.899

Trained on 2014, tested on 2015:
  FeSA:                Detection 91.4%, FPR 16.0%, Precision 0.868, Recall 0.867
  Best First Search:   Detection 57.4%, FPR 21.1%, Precision 0.789, Recall 0.709
  Evolutionary Search: Detection 86.6%, FPR 9.4%, Precision 0.899, Recall 0.891
  Genetic Search:      Detection 81.3%, FPR 10.9%, Precision 0.88, Recall 0.862
  Greedy Stepwise:     Detection 59.3%, FPR 19.4%, Precision 0.805, Recall 0.726
  Harmony Search:      Detection 89.5%, FPR 15.8%, Precision 0.861, Recall 0.861

Table 7
Navarra University dataset with concept drift.

Feature counts: FeSA 32; Best First Search 97; Evolutionary Search 114; Genetic Search 109; Greedy Stepwise 106; Harmony Search 111.

Tested on the training distribution:
  FeSA:                Detection 99.9%, FPR 0.4%, Precision 0.998, Recall 0.998
  Best First Search:   Detection 99.4%, FPR 0.4%, Precision 0.998, Recall 0.998
  Evolutionary Search: Detection 99.8%, FPR 0.4%, Precision 0.998, Recall 0.998
  Genetic Search:      Detection 99.9%, FPR 0.4%, Precision 0.998, Recall 0.999
  Greedy Stepwise:     Detection 99.6%, FPR 0.4%, Precision 0.996, Recall 0.996
  Harmony Search:      Detection 99.8%, FPR 0.4%, Precision 0.998, Recall 0.998

Tested on the zero-day distribution:
  FeSA:                Detection 85.1%, FPR 14.9%, Precision 0.999, Recall 0.999
  Best First Search:   Detection 78.9%, FPR 21.1%, Precision 0.989, Recall 0.989
  Evolutionary Search: Detection 78.1%, FPR 21.9%, Precision 0.989, Recall 0.989
  Genetic Search:      Detection 78.8%, FPR 21.2%, Precision 0.956, Recall 0.951
  Greedy Stepwise:     Detection 87.7%, FPR 12.3%, Precision 0.998, Recall 0.998
  Harmony Search:      Detection 78.1%, FPR 21.9%, Precision 0.998, Recall 0.998

The experiments' first aim was to ensure that FeSA is viable as a feature selection method without considering concept drift: FeSA must function as a normal feature selection algorithm before it is tested on an evolving concept. Based on the results in Table 3, the FeSA architecture produces features that remain robust over time. Table 4 shows the performance of FeSA and of the algorithms it is compared with under concept drift. The experiments aimed to demonstrate that the FeSA architecture is superior when exposed to concept drift, and we conclude this has been achieved. Our initial observation is, as expected, that concept drift degrades a ransomware classifier, just as it would degrade any other classifier working in a rapidly changing environment. Our second observation is that using nature-based feature selection algorithms helps slow the degradation of detection rate and accuracy caused by concept drift. Fig. 2 provides a visual representation of the classifiers' performance degradation when trained and tested under concept drift. Fig. 2 does not consider the testing on 2018 data, as those results showed an increase in the detection rate. The detection rate is specifically the rate of correctly identified ransomware samples; the false-positive rate, precision and recall are based on the system's overall performance, including the benign samples.

Table 4 shows the average reduction in detection rate under concept drift; it is observed that a feature selection algorithm that pinpoints distinguishing features can significantly reduce the effects of concept drift on a classifier. Table 3 shows that the FeSA architecture maintains a detection rate above 96% and a false-positive rate close to those of the greedy stepwise algorithm, genetic search, evolutionary search, best-first search and harmony search. The first set of experiments simulates a scenario where the samples adhere to the current concept and the test samples do not stray from the statistical rules the model has created for differentiating ransomware from benign software. The results presented in Table 3 show that FeSA is a viable feature selection algorithm for training a system; on their own, however, they do not yet establish viability for systems prone to concept drift.
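These statistics can be stated precisely. The following is a minimal sketch in Python that computes them from per-sample labels; treating precision and recall as class-weighted averages over both classes (WEKA's weighted-average convention) is our assumed reading of how the tables report them, and all names are illustrative.

```python
def evaluate(y_true, y_pred):
    """Compute the four statistics used in this section from
    per-sample labels (1 = ransomware, 0 = benign)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    n_pos, n_neg = tp + fn, fp + tn
    detection = tp / n_pos   # true-positive rate on ransomware only
    fpr = fp / n_neg         # benign samples flagged as ransomware
    # Precision/recall over the whole system, benign samples included:
    # class-weighted averages over both classes (assumed convention).
    precision = (n_pos * (tp / (tp + fp)) + n_neg * (tn / (tn + fn))) / (n_pos + n_neg)
    recall = (n_pos * (tp / n_pos) + n_neg * (tn / n_neg)) / (n_pos + n_neg)
    return {"detection": detection, "fpr": fpr,
            "precision": precision, "recall": recall}

# Example: 3 ransomware and 2 benign samples, one miss and one false alarm.
print(evaluate([1, 1, 1, 0, 0], [1, 1, 0, 0, 1]))
```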


The FeSA architecture achieved stronger detection rates across the periods in which it was deployed and consistently outperforms the feature selection algorithms it is compared to. Most importantly, the FeSA architecture outperforms evolutionary search and genetic search while using significantly fewer features and fewer generations of feature sets. The emphasis is on detection because it is the most important statistic; however, our false-positive rates are competitive with those of the other feature selection algorithms. The FeSA architecture can generate a strong feature set for a random forest classifier while using only two generations, compared with the 20 generations used by the genetic and evolutionary search algorithms.

The experimental results for a detection scenario that introduces concept drift are presented in Table 4. A concept drift scenario is one in which the test samples do not adhere to the statistical rules and properties the classifier has learned for differentiating ransomware from benign software. In Table 4, it is observed that the most consistent algorithms are genetic search, evolutionary search and our feature selection architecture, FeSA. Our experiments use the same data and the same concept drift scenario for each algorithm while observing how each classifier's performance changes. The classifiers' accuracy and detection rate fell in every scenario except when trained on data from 2013 to 2017 and tested on data from 2018. The feature set generated by FeSA maintains a detection rate above 93% in all our concept drift scenarios, which is higher than all the other approaches it is compared to. The results presented in Table 4 give us a promising base to build on for dealing with concept drift in ransomware detection systems. The discrepancy in the year 2018 may be explained by behavioural changes in ransomware that inadvertently benefited our API-based feature pool. The key observation from Table 4 is that the average reduction in detection rate under concept drift is lowest for our feature selection algorithm; the other feature sets suffer a higher average reduction in detection rate, which is the key statistic. Fig. 2 shows the average reduction in detection rate for each approach; it does not consider the anomalous behaviour on the 2018 dataset. In terms of false positives, our approach appears to struggle marginally more than the other approaches, which would require further research. Nevertheless, our approach appears highly effective at generating a feature set that can identify ransomware, and maintaining a high detection rate is the key statistic in malware detection, especially for ransomware. Besides ours, the best performing algorithm is evolutionary search; however, it required 20 generations to reach its optimal solution, with a population size of 20 per generation. Our solution uses an initial population of 32 feature sets and one generation of offspring with a population size of 64 feature sets. Our feature sets are also significantly smaller than the optimal feature sets of the genetic and evolutionary search algorithms.

Our initial population's average accuracy and detection rate are 77%, and the first generation's average accuracy and detection rate rise to an average of 94%. The accuracy and detection rate are not expected to keep increasing at the same pace with further generations of feature sets; however, a marginal increase is expected if more than one generation is created. Our experiments use one initial population and one generation to demonstrate the potential of this scenario.

5.2. Discussion

Our results show promise that the FeSA architecture can provide effective and accurate machine-learning detection for evolving ransomware. In situations that do not involve concept drift, FeSA proves to be an effective feature selection algorithm. Table 3 demonstrates that the feature set provided by FeSA yields competitive figures in terms of false positives, recall and precision. The statistic to be improved is the false-positive rate: FeSA maintains a false-positive rate between 5.8% and 5.9%, whereas evolutionary search achieves a false-positive rate as low as 4.4% on the 2013–2015 data. FeSA has a more competitive false-positive rate on the 2013–2017 and 2013–2018 datasets, but it does not achieve the lowest in either test, as evolutionary search achieves the lowest false-positive rate in each. However, in all tests shown in Table 3, FeSA yields the highest detection rate, consistently remaining above 96%; this is a positive sign, as the detection rate is what we view as most important for ransomware detection. Among the algorithms compared, evolutionary search achieves the detection rate closest to FeSA's. Fig. 2 and Table 4 show that the performance drop-off when encountering concept drift is reduced compared with other popular feature selection algorithms. Table 4 shows that the FeSA architecture produces a feature set that achieves the highest or joint-highest detection rate in the three concept drift scenarios tested. In these scenarios, FeSA maintains a minimal and consistent drop-off in detection rate, whereas the algorithms FeSA is compared to behave erratically, showing a much steeper drop-off. An example of erratic behaviour is harmony search, which displays an initial detection rate of 93% on the 13–17 data, as seen in Table 3; however, when exposed to data from 2018, its detection rate plummets to 54%. Table 4 shows that the FeSA false-positive rate remains competitive, yet it is greater than the false-positive rates of genetic search and evolutionary search. We believe the false-positive rate increases because of the limits FeSA places on feature set sizes and on the number of generations of feature sets produced. The evolutionary and genetic search algorithms with which FeSA competes do not constrain feature set size and have a higher limit on the generations of feature sets they can produce, allowing them to produce feature sets with higher accuracy and reduced false-positive rates. The constraints on FeSA might limit its ability to capture a full picture of benign behaviour; however, this can be improved in the future. Fig. 2 shows the drop-off in detection rate under concept drift for a random forest trained on the feature sets suggested by the different feature selection algorithms. It is observed that the random forest trained on the FeSA features experiences an average reduction in detection rate of 3%. The x-axis of Fig. 2 shows which algorithms have been tested, and the y-axis shows the average reduction in detection rate under concept drift. The high detection rates that FeSA shows in Table 3, and the maintenance of a high detection rate shown in Table 4, arise because FeSA enforces features fundamental to distinguishing ransomware from benign software while combining them with the already proven process of natural selection via a genetic algorithm. The detection rate's importance is stressed because of the damaging effects ransomware can have on any system it infects: false negatives, in the case of ransomware, are significantly more damaging than false positives. The experiments use the random forest classifier because this algorithm in particular works well with the API-based feature set; it can distinguish benign files from ransomware effectively and consistently. Our research has explored the use of different algorithms, similar to the approaches used in Chen et al. (2019a); Clement (2019); Hwang et al. (2020); Khan et al. (2020); Seong et al. (2019); Sgandurra et al. (2016a); Shaukat and Ribeiro (2018); Zuhair et al. (2019), but the random forest proved the most effective underlying algorithm. Our use of API calls can be expanded to optimise the capabilities of a genetic approach by incorporating static and network features. Regardless of how expansive the feature set is, the main shortcoming of this approach is that it cannot actively react to concept drift, and further work is needed to incorporate a mechanism that allows the system to do so. In its current state, the system proactively combats concept drift and has shown its effectiveness; however, a mechanism that allows it to be reactive to concept drift is still necessary.
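To make the search described in this section concrete, the following sketch mirrors its shape: an initial population of 32 candidate feature subsets and a single generation of 64 offspring, scored by the detection rate of a random forest. The population sizes and the fitness choice come from the text above; the uniform crossover, the mutation rate, the number of parents kept, and all function names are illustrative assumptions rather than the paper's exact operators, and the feature matrices are assumed to be numpy arrays.

```python
import random

from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import recall_score


def fitness(mask, X_train, y_train, X_test, y_test):
    """Score a candidate feature subset by the random forest's
    detection rate (recall on the ransomware class, label 1)."""
    cols = [i for i, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X_train[:, cols], y_train)
    return recall_score(y_test, clf.predict(X_test[:, cols]), pos_label=1)


def crossover(a, b, rng, mutation_rate=0.02):
    """Uniform crossover of two bit-masks followed by bit-flip mutation
    (the mutation rate is an assumed value, not taken from the paper)."""
    child = [rng.choice(pair) for pair in zip(a, b)]
    return [bit != (rng.random() < mutation_rate) for bit in child]


def fesa_like_search(X_train, y_train, X_test, y_test, n_features, seed=0):
    rng = random.Random(seed)
    score = lambda mask: fitness(mask, X_train, y_train, X_test, y_test)
    # Initial population: 32 random bit-masks over the feature pool.
    population = [[rng.random() < 0.5 for _ in range(n_features)]
                  for _ in range(32)]
    parents = sorted(population, key=score, reverse=True)[:8]
    # A single generation of 64 offspring bred from the fittest parents.
    offspring = [crossover(rng.choice(parents), rng.choice(parents), rng)
                 for _ in range(64)]
    return max(parents + offspring, key=score)
```

Under these assumptions each candidate is a bit-mask over the feature pool, so the returned mask can be applied directly to select the final feature set.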


Overall, FeSA achieved what it set out to do; the use of FeSA provides a machine learning detection system with a robust feature set that shows consistent performance under concept drift. FeSA reduces the need for constant re-training and can increase the time intervals between re-training an intrusion detection machine-learning system.

5.3. Alternative datasets

We have carried out experiments with our framework on two alternative datasets, produced by the researchers in Sgandurra et al. (2016a) and Berrueta et al. (2020). The dataset produced by Sgandurra et al. contains ransomware files found between 2012 and 2015 and has a feature count of over 30,000. The feature set produced by Sgandurra et al. contained static strings and directory-specific features, which meant some features were exclusive to the test machines used by the researchers. The feature set also contained API calls and drops, which we retained to carry out experiments, as API calls and drops were not exclusive to the machine the ransomware had run on. The experimental results on the Imperial dataset are presented in Tables 5 and 6; they demonstrate that the binary nature of the feature set is not optimal, and the rate of false positives is high for all the feature selection algorithms used in the experiments. FeSA performs the best overall, crucially maintaining performance from year to year, unlike the majority of the other feature selection algorithms. The dataset produced by Berrueta et al. contains data from 70 ransomware strains, with features constructed from network data. The ransomware strains are taken from 2015 to 2019, and the dataset is structured to allow testing on zero-day ransomware strains. The experiments on this data are shown in Table 7, and we observe that FeSA performs well compared to all of the alternative feature selection algorithms besides the greedy stepwise approach. We observe that FeSA achieves strong detection results on zero-day ransomware in this dataset with significantly fewer features and performs consistently well across the three datasets we have evaluated.
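The evaluation protocol shared by all three datasets can be summarised as a strictly temporal split: fit on samples from earlier years and test on strictly later (or zero-day) samples, so that the test distribution is free to drift away from the training concept. A minimal sketch under that reading, assuming numpy feature matrices and per-sample year labels (all names illustrative):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def temporal_evaluation(X, y, years, train_until, test_from):
    """Train on earlier strains, test on strictly later ones, so the
    test set may drift away from the training concept."""
    train_idx = years <= train_until      # e.g. strains up to 2015
    test_idx = years >= test_from         # e.g. 2016 onwards / zero-day
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X[train_idx], y[train_idx])
    preds = clf.predict(X[test_idx])
    truth = y[test_idx]
    detection = (preds[truth == 1] == 1).mean()   # ransomware TPR
    fpr = (preds[truth == 0] == 1).mean()         # benign flagged
    return detection, fpr
```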
6. Conclusion and future work

To conclude, our research has demonstrated that using a feature selection algorithm can combat the effects of concept drift in a classification system, and that FeSA is an effective feature selection algorithm for ransomware detection under concept drift. Our research uses a wide array of benign and ransomware files to simulate concept drift, showing its existence in the ransomware detection space and how to remediate its effects. Our system is evaluated realistically, and the results produced are promising, with the FeSA system outperforming the genetic search and evolutionary search algorithms. The FeSA system can maintain a high performance level with fewer offspring feature sets and smaller set sizes. We acknowledge that the system proposed here would be one part of a complete ransomware detection system. The feature set generation is a proactive measure against concept drift in a detection system; however, FeSA still requires a mechanism to react to concept drift, as no preparation can fully anticipate every way ransomware may evolve. Our future work on this system will take the results and data from this feature engineering approach and incorporate them into a concept drift adaptation system. In the future, the aim is to combine our feature selection algorithm with a mechanism that can classify unknown and drifting samples using the measured concept drift of a sample. A secondary objective is to use non-conformity and similarity measures to aid classifiers when they are uncertain in their predictions. Our goal is to maintain, when building on our system, the same detection rates under concept drift as expected under normal conditions, with low false-positive rates.

Declaration of Competing Interest

We declare that this research paper is not under consideration for publication elsewhere, that its publication is approved by all authors and tacitly or explicitly by the responsible authorities where the work was carried out, and that, if accepted, it will not be published elsewhere in the same form, in English or in any other language, including electronically, without the written consent of the copyright-holder.

CRediT authorship contribution statement

Damien Warren Fernando: Conceptualization, Methodology, Validation, Formal analysis, Writing – original draft, Writing – review & editing, Visualization, Supervision, Project administration, Funding acquisition. Nikos Komninos: Conceptualization, Methodology, Software, Validation, Formal analysis, Resources, Data curation, Writing – original draft, Writing – review & editing, Visualization.

Acknowledgments

We would like to acknowledge the following contributors:
Funding: This work was supported by the Department of Computer Science, School of Mathematics, Computer Science and Engineering, City, University of London.
Reviewers: We would like to thank the anonymous reviewers at Elsevier for providing feedback on the initial submission of this research paper.

References

Brownlee, J., 2017. A gentle introduction to concept drift in machine learning. [Accessed 10/10/2020] Available at: https://machinelearningmastery.com/gentle-introduction-concept-drift-machine-learning/.
Carson, H., 2007. Cyberhawk. [Accessed 08/11/2021] Available at: http://www.kickstartnews.com/reviews/utilities/cyberhawkzerodaythreadetection.html.
Chen, L., Yang, C.Y., Paul, A., Sahita, R., 2019a. Towards resilient machine learning for ransomware detection. In: KDD, Alaska, Aug 04–08, 2019.
Clement, J., 2019. Ransomware – statistics & facts. [Accessed 05/10/2020] Available at: https://www.statista.com/topics/4136/ransomware/.
Collins, T., 2019. MedusaLocker ransomware will bypass most antivirus software. [Accessed 18/06/2021] Available at: https://www.secplicity.org/2020/05/19/medusalocker-ransomware-will-bypass-most-antivirus-software/.
Cook, S., 2020a. 2018–2020 ransomware statistics and facts. [Accessed 05/10/2020] Available at: https://www.comparitech.com/antivirus/ransomware-statistics/.
Cook, S., 2020b. Malware statistics and facts for 2020. [Accessed 10/10/2020] Available at: https://www.comparitech.com/antivirus/malware-statistics-facts/.
De Groot, J., 2017. A history of ransomware attacks: the biggest and worst ransomware attacks of all time. [Accessed 22/11/2018] Available at: https://digitalguardian.com/blog/history-ransomware-attacks-biggest-and-worst-ransomware-attacks-all-time.
Fatima, A., Maurya, R., Dutta, M.K., Burget, R., Masek, J., 2019. Android malware detection using genetic algorithm based optimized feature selection and machine learning. In: Proceedings of the 42nd International Conference on Telecommunications and Signal Processing (TSP).
Folino, G., Pizzuti, C., Spezzano, G., 2006. GP ensembles for large-scale data classification. IEEE Trans. Evol. Comput. 10 (5), 604–616.
Folino, G., Pizzuti, C., Spezzano, G., 2007. An adaptive distributed ensemble approach to mine concept-drifting data streams. In: Proceedings of the 19th IEEE International Conference on Tools with Artificial Intelligence (ICTAI). IEEE, pp. 183–188.
Ghomeshi, H., Gaber, M.M., Kovalchuk, Y., 2019. EACD: evolutionary adaptation to concept drifts in data streams. Data Min. Knowl. Discov. 33, 663–694.
Goodchile, J., 2020. This is not your father's ransomware. [Accessed 03/10/2020] Available at: https://www.darkreading.com/edge/theedge/this-is-not-your-fathers-ransomware/b/d-id/1337484.
Hasan, M., Rahman, M., 2017. RansHunt: a support vector machines based ransomware analysis framework with integrated feature set.
Hayes, M., Walenstein, A., Lakhotia, A., 2009. Evaluation of malware phylogeny modelling systems using automated variant generation. J. Comput. Virol. 5 (4), 335–343.
Hwang, J., Kim, J., Lee, S., 2020. Two-stage ransomware detection using dynamic analysis and machine learning techniques. Wirel. Pers. Commun. 112, 2597–2609.
Jordaney, R., Sharad, K., Dash, S.K., Wang, Z., Papini, D., Cavallaro, L., 2017. Transcend: detecting concept drift in malware classification models. In: Proceedings of the 26th USENIX Security Symposium, Vancouver, August 16–18, 2017.
Jovanovic, B., 2019. Malware statistics – you'd better get your computer vaccinated. [Accessed 11/10/2020] Available at: https://dataprot.net/statistics/malware-statistics/.
Kantchelian, A., Afroz, A., Huang, S., Islam, A., Miller, B., Tschantz, M., Greenstadt, R., Joseph, A.D., Tygar, J.D., 2013. Approaches to adversarial drift. In: AISec'13, Proceedings of the 2013 ACM Workshop on Artificial Intelligence and Security (co-located with CCS 2013), Berlin, Germany, November 4, 2013, pp. 99–110.
Kaspersky, 2021. What is a zero-day attack? Definition and explanation. [Accessed 08/10/2021] Available at: https://www.kaspersky.co.uk/resource-center/definitions/zero-day-exploit.
Khan, F., McNube, C., Lakshmana, R., Kadry, S., Nam, Y., 2020. A digital DNA sequencing engine for ransomware detection using machine learning. IEEE Access 8, 119710–119719.
Maggi, F., Robertson, W.K., Kruegel, C., Vigna, G., 2009. Protecting a moving target: addressing web application concept drift. In: Recent Advances in Intrusion Detection, 12th International Symposium (RAID 2009), Saint-Malo, France, September 23–25, 2009, pp. 21–40.
Sanders, A., 2020. 15 (CRAZY) malware and virus statistics, trends & facts. [Accessed 10/10/2020] Available at: https://www.safetydetectives.com/blog/malware-statistics/.
Saxena, P., 2018. Breed of MBR infecting ransomware: an analysis by Quick Heal Security Labs. [Accessed 02/10/2020] Available at: https://blogs.quickheal.com/breed-mbr-infecting-ransomware-analysis-quick-heal-security-labs/.
Seong, B., Gyu, L., Eul Gyu, I., 2019. Ransomware detection using machine learning algorithms. Concurr. Comput. Pract. Exp.
Sgandurra, D., Munoz-Gonzalez, L., Mohsen, R., Lupu, E., 2016a. Automated dynamic analysis of ransomware: benefits, limitations and use for detection. CoRR abs/1609.03020.
Sgandurra, D., Munoz-Gonzalez, L., Mohsen, R., Lupu, E., 2016b. Automated dynamic analysis of ransomware: benefits, limitations and use for detection. CoRR abs/1609.03020.
Shaukat, S., Ribeiro, V., 2018. RansomWall: a layered defence system against cryptographic ransomware attacks using machine learning. In: Proceedings of the 10th International Conference on Communication Systems and Networks (COMSNETS), pp. 356–363.
Singh, A., Walenstein, A., Lakhotia, A., 2012. Tracking concept drift in malware families. In: Proceedings of the 5th ACM Workshop on Security and Artificial Intelligence (AISec '12), pp. 81–92.
Singhal, S., Chawla, U., Shorey, R., 2020. Machine learning & concept drift based approach for malicious website detection. In: Proceedings of the 12th International Conference on Communication Systems & Networks (COMSNETS), Bengaluru, India.
Takeuchi, Y., Sakai, K., Fukumoto, S., 2018. Detecting ransomware using support vector machines. In: Proceedings of the 47th International Conference on Parallel Processing Companion (ICPP '18), Eugene, OR, USA, August 13–16, 2018. ACM, New York, NY, USA. 6 pages.
Tan, G., Zhang, P., Liu, Q., Liu, X., Zhu, C., Dou, F., 2018. Adaptive malicious URL detection: learning in the presence of concept drifts. In: Proceedings of the 17th IEEE International Conference on Trust, Security and Privacy in Computing and Communications / 12th IEEE International Conference on Big Data Science and Engineering, New York, NY, USA.
VinayKumar, R., Soman, K.P., Senthil Velan, K.K., Ganorkan, S., 2017. Evaluating shallow and deep networks for ransomware detection and classification. In: Proceedings of the International Conference on Advances in Computing, Communications and Informatics (ICACCI), Manipal, Karnataka.
Vivekanandan, P., Nedunchezhian, R., 2011. Mining data streams with concept drifts using genetic algorithm. Artif. Intell. Rev. 36 (3), 163–178.
WEKA, 2020. Genetic search algorithm. [Accessed 20/10/2020] Available at: https://weka.sourceforge.io/doc.stable/weka/attributeSelection/GeneticSearch.html.
Sophos Whitepaper, 2019. The rise of enterprise ransomware.
Zhou, V., 2019. A simple explanation of information gain and entropy. [Accessed 10/11/2020] Available at: https://victorzhou.com/blog/information-gain/.
Zuhair, H., Selamat, A., Krejcar, O., 2019. A multi-classifier network-based crypto ransomware detection system: a case study of Locky ransomware. IEEE Access 7, 47053–47067.

Further reading

Berrueta, E., Morato, D., Magaña, E., Izal, M., 2020. Open repository for the evaluation of ransomware detection tools. IEEE Access 8, 65658–65669. doi:10.1109/ACCESS.2020.2984187.
Gao, J., Fan, W., Han, J., Yu, P.S., 2007. A general framework for mining concept-drifting data streams with skewed distributions. In: Proceedings of the Seventh SIAM International Conference on Data Mining, Minneapolis, Minnesota, USA, April 26–28.
Mallawaarachchi, V., 2017. Introduction to genetic algorithms. [Accessed 20/10/2020] Available at: https://towardsdatascience.com/introduction-to-genetic-algorithms-including-example-code.
Thomas, K., Grier, C., Ma, J., Paxson, V., Song, D., 2011. Design and evaluation of a real-time URL spam filtering service. In: Proceedings of the 32nd IEEE Symposium on Security and Privacy (S&P 2011), Berkeley, California, USA, May 22–25, 2011, pp. 447–462.

Damien Warren Fernando received an M.Sci. in Computer Science and Cyber Security in 2017 from City, University of London. Having worked at City, University of London as a teaching assistant since late 2017, he is now a third-year Ph.D. student at City, University of London with an interest in researching ransomware. As a part-time teaching assistant on the Cyber Security course, Damien provides teaching support along with managing and upgrading the cyber security penetration-testing environment used by students.

Dr Nikos Komninos received his Ph.D. in 2003 from Lancaster University (UK) in Information Security. He is currently a Lecturer (US system: Assistant Professor) in Cyber Security in the Department of Computer Science at City, University of London. Between 2003 and 2007, he was an honorary research fellow of the Department of Communication Systems at the University of Lancaster. He was also a visiting faculty member at the University of Cyprus and a faculty member at Carnegie Mellon University in Athens (Athens Information Technology) between 2005 and 2013. Part of his research has been patented and used in mobile phones by telecommunication companies, in crypto-devices by defence companies, and in healthcare applications by national health systems. Since 2000, he has participated, as a researcher or principal investigator, in a large number of European and national R&D projects in the area of information security, systems and network security. He has authored and co-authored more than sixty journal publications, book chapters and conference proceedings publications in his areas of interest. He has been invited to give talks at conferences and governmental departments.
