Machine Learning Algorithms and Frameworks in Ransomware Detection

This paper provides a review of machine learning algorithms and frameworks that are used for ransomware detection. It discusses common ransomware types and challenges in detecting ransomware. The paper analyzes existing ransomware detection frameworks that utilize machine learning algorithms, including the algorithms used, year of creation, results and challenges. It aims to serve as a reference for future ransomware detection work by consolidating information on current frameworks.

Received 24 September 2022, accepted 27 October 2022, date of publication 1 November 2022, date of current version 11 November 2022.

Digital Object Identifier 10.1109/ACCESS.2022.3218779

Machine Learning Algorithms and Frameworks in Ransomware Detection

DARYLE SMITH, SAJAD KHORSANDROO, AND KAUSHIK ROY
Department of Computer Science, North Carolina A&T State University, Greensboro, NC 27411, USA
Corresponding author: Daryle Smith ([email protected])

ABSTRACT Ransomware has been one of the biggest cyber threats against consumers in recent years. It can leverage various attack vectors while continually finding more innovative ways to invade different cyber security systems. There have been many efforts in the workforce and academia to detect ransomware using machine learning (ML) algorithms, and these efforts have shown promising results. Accordingly, there is a considerably large body of literature addressing how ransomware threats can be detected and mitigated. This large and rapidly growing body of scientific and technical material makes it difficult to know which ML algorithm(s) are actually being used. Hence, the aim of this paper is to give insight into ransomware detection frameworks and the ML algorithms that are typically used to extract the ever-evolving characteristics of ransomware. In addition, this study provides the cyber security community with a detailed analysis of those frameworks, augmented with information such as the datasets used and the challenges each framework may face in accurately detecting a wide variety of ransomware. To summarize, this paper delivers a comparative study that peers can use as a reference for future work in ransomware detection.

INDEX TERMS Artificial Neural Network (ANN), cyber security, deep convolutional neural network (DCNN), deep neural network (DNN), Hardware Performance Counter (HPC), Long Short-Term Memory (LSTM), machine learning (ML), ransomware, Recurrent Neural Network (RNN), Sum of Product (SOP), Support Vector Machine (SVM), Term Frequency and Inverse Document Frequency (TF-IDF), The Onion Routing (TOR).

The associate editor coordinating the review of this manuscript and approving it for publication was Jun Wu.

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/

VOLUME 10, 2022

I. INTRODUCTION

Ransomware has been a threat against typical end users, business units, and the government in recent years. For example, it has targeted medical centers, schools [1], universities [2], and police departments [3], to name a few. It was even predicted that ransomware would account for around $20 billion in losses alone for organizations in 2021 [4]. Ransomware is a form of malware designed to control access to data or a system until the ransom requested by the attacker is paid [5]. Detecting ransomware is a tricky and resource-hungry task because it is hidden within the application-layer payload. Mitigation can also be difficult because of the use of encryption against the application. Although other areas of malware have received more study and evaluation, ransomware specifically has not been the focal point, and the push to improve security measures and discovery has been stagnant [5]. There are two types of ransomware: first, locker-ransomware, which is designed to lock the victim's computer to prevent them from using it; and second, and most common nowadays, crypto-ransomware, which encrypts personal files to make them inaccessible to its victims [55]. Frameworks applying static and dynamic analysis, as well as ML algorithms, have been aiding ransomware detection, and because the ransomware is actually executed during analysis, a high accuracy rate would be expected. However, analysis takes a relatively long time, leaving gaps where the malicious payload can intrude into the sandbox system without detection. This entire process alone is overly complicated. In this paper, we focus on the ML algorithms that are most used in ransomware detection. We also provide a brief review of common ransomware frameworks using such algorithms, along with their results.

When it comes to ransomware attacks, cybercriminals have perfected their techniques over the years. However, both academia and industry have been trying to address these threats and protect victims by learning from past experiences and utilizing technological advancements over time [6].
Nonetheless, these attacks are growing daily. The underlying reason is that combating ransomware is challenging [7]. Ransomware typically relies on strong encryption that is easy to adopt thanks to the vast number of open-source implementations. It also makes use of most infection techniques employed by modern malware families. Ransomware benefits from the common evasive methods used by modern malware, and it frequently uses application programming interfaces (APIs) to carry out malicious actions, which makes it difficult to differentiate from benign applications. Furthermore, it uses TOR (The Onion Routing) networks to keep its communication anonymous, and unregulated payment techniques like cryptocurrencies to get paid without easily disclosing the identity of the attackers [8].

The remainder of this paper is organized as follows, and the investigated perspectives can be found in Figure 1. Section 2 reviews the required research background and provides a comprehensive review of different ML algorithms used in ransomware detection. Section 3 provides an analysis of ransomware frameworks that use ML algorithms, along with their challenges, evaluations, and results. This section also expands on the importance of this research by consolidating all the mentioned frameworks in a table that gives each framework's name, the algorithm(s) of choice, the year it was created, overall results, and challenges. Section 4 discusses some future concerns and defense topics for ransomware, and Section 5 concludes the paper.

FIGURE 1. An overview of discussed ransomware.

II. RESEARCH BACKGROUND

This section will briefly cover different types of ransomware that are common across the cyber security community. It will also cover typical machine learning algorithms used in ransomware detection.

A. Ransom32

With the emergence of social media and its popularity in the younger generations, new ransomware families are being created by cybercriminals to target web pages with the use of JavaScript. One such example is Ransom32, which first appeared in late 2015. Until its discovery, no other ransomware attack had used that programming language [9]. This type of ransomware-as-a-service is unique because, being written in JavaScript, it uses a web browser to initiate its attack. The impact of this threat is potentially far greater because it can theoretically be used on any operating system where a web browser exists. This grants Ransom32 so-called "write-once-infect-all" capabilities [9]. Nonetheless, Ransom32 has only been detected on Windows-based platforms thus far. It can be found on most underground TOR sites and can be downloaded by the affiliated user. To download the executable, one must have a bitcoin address, as this is how ransom payments are made.

FIGURE 2. Ransom32 membership and attack process flows.

The developers of the Ransom32 software take a 25% cut of any ransom paid, and the rest goes to the user of the affiliate program [9]. When the Ransom32 executable runs, it extracts several files. During this process, a shortcut is created in the start menu so that the ransomware starts at login, which guarantees the malware will be executed every time the system is started. The shortcut points to a chrome.exe executable file that is typically an NW.js package. This package contains JavaScript code used for encryption using AES and extracts to folders such as %AppData% and %Temp%. Furthermore, this is the piece that performs the harmful actions against the compromised system [10]. Because NW.js is a legitimate framework and application, antivirus coverage in this area is still very weak. Any black hat or white hat developer can use it to create and distribute native apps that run just like normal executables [11]. Furthermore, on closer inspection, Ransom32 runs in the context of the user without requiring any administrative rights or permissions. Figure 2 gives a general idea of how a member can join the affiliate network and ultimately be granted access to download the malicious code for use. The member would also be able to see statistics related to the software such as the number of payments that