
IJCSN - International Journal of Computer Science and Network, Volume 8, Issue 3, June 2019

ISSN (Online) : 2277-5420


www.IJCSN.org
Impact Factor: 1.5

Preventing Crypto Ransomware Using Machine Learning
1 Jitti Annie Abraham; 2 Susan M George
1 CSE Department, MBCCET
APJ Abdul Kalam Technological University, Kuttikkanam, Peermade, Idukki, Kerala, India.
2 CSE Department, MBCCET
APJ Abdul Kalam Technological University, Kuttikkanam, Peermade, Idukki, Kerala, India.

Abstract- Ransomware is a kind of malware that prevents or restricts users from accessing their system, either by locking the system's screen or by locking the users' files until a ransom is paid. Due to the changing behavior of ransomware, conventional classification and detection techniques do not accurately detect new variants of ransomware. Our data set includes some of the most recent ransomware samples available, providing an assessment of the classification accuracy of machine learning algorithms on the current, evolving state of ransomware. The two main parts of this work are the identification of the behavioral attributes which can be used for optimal classification accuracy and the classification of ransomware using machine learning classification algorithms. After classifying the ransomware variants, a prevention mechanism is also applied to the cryptographic ransomware variants.

Keywords- Classification, Machine Learning, Ransomware, Ransomware prevention

1. Introduction

Ransomware attacks are becoming a serious cyber threat to organizations and individuals around the world. Ransomware is a type of malicious software from cryptovirology that threatens to publish the victim's data or perpetually block access to it unless a ransom is paid. While some simple ransomware may lock the system in a way that is not difficult for a knowledgeable person to reverse, more advanced malware uses a technique called cryptoviral extortion, in which it encrypts the victim's files, making them inaccessible, and demands a ransom payment to decrypt them.

Often, new variants of malware are not distinguished from their predecessors because of the limitations of classification systems that rely only on static analysis. As a result, static-based, signature-based and pattern-matching techniques for malware analysis are becoming less effective at identifying and classifying new variants of ransomware and at providing insight into the threat, goals and behavior of ransomware. Efforts have therefore been made to develop behavior-based classification techniques. Classification of malware samples based on their behavior requires algorithms that are able to produce models and learn through the classification process. The ability of machine learning algorithms to learn from data during classification makes them attractive and effective for malware classification [1].

Using machine learning classification algorithms, ransomware samples whose behavior differs from that of other samples in the same family can be identified. The aim of this study is to identify new, modified variants of ransomware based on their behavior using machine learning algorithms. The two main parts of the study are the identification of the behavioral attributes which can be used for optimal classification accuracy and the classification of ransomware. After classifying the ransomware variants, a prevention mechanism is also applied to the cryptographic ransomware families.

The rest of this paper is organized as follows. A brief literature review of existing work on the classification and prevention of ransomware variants is given in Section 2. The proposed method is explained in Section 3.


Copyright (c) 2019 International Journal of Computer Science and Network. All Rights Reserved.
The experimental results are discussed in Section 4. Finally, Section 5 concludes the paper.

2. Related Work

M. I. Jordan and T. M. Mitchell observe that artificial intelligence is everywhere: chances are that we use it in one way or another without even knowing it. One of the popular applications of AI is machine learning, in which computers, software, and devices perform through cognition that is very similar to that of the human brain. Machine learning is a field of artificial intelligence that uses statistical techniques to give computer systems the ability to "learn" from data without being explicitly programmed [2]. Trending applications of machine learning include virtual personal assistants, predictions while commuting, social media services, email spam and malware filtering, search engine result refining, and product recommendations. Machine learning is likely to be one of the most transformative technologies of the 21st century.

Sandhya Ndhage and Charanjeet Kaur Raina [3] describe machine learning as a multidisciplinary field drawing on artificial intelligence, probability, statistics, information theory, philosophy, psychology, and neurobiology. Machine learning solves real-world problems by building a model that is a good and useful approximation to the data. The study of machine learning has grown from efforts to explore whether computers could learn to mimic the human brain, and from a field of statistics, into a broad discipline that has produced fundamental statistical-computational theories of learning processes. The main objective and contribution of that review paper is to present an overview of machine learning and its techniques. The paper also surveys the benefits and limitations of different machine learning algorithms across diverse approaches.

Ziya Alper Genç, Gabriele Lenzini, and Peter Y. A. Ryan discuss ransomware as a class of malware whose aim is to extort money. Once installed on a system, ransomware encrypts files or blocks functionality, and when the job is done it demands a ransom. Their paper surveys current defense techniques against ransomware, discussing their strengths and weaknesses [4], and describes current mitigation techniques and their limitations. Current ransomware mitigation systems are built upon the analysis of collected samples, with the exception of the inefficient and ineffective practice of backing up and restoring files. Future threats of ransomware include rootkit-based ransomware, obfuscation, white-box cryptography, and socio-technical attacks.

D. Nieuwenhuizen [5] describes ransomware as a type of malicious software (malware) that, once executed on a computer system, hinders the user from using the computer or its data and demands a sum of money (a ransom) for its restoration. Currently, ransomware attacks impede computer operation in three ways: by blocking access to the computer (locker ransomware); by making user data unusable through encryption algorithms (crypto ransomware); and by a combination of locker and crypto ransomware, in which users are blocked from using their computer while their data is being encrypted. The paper motivates the use of machine-learned behavior for ransomware detection. The techniques it describes are used in RansomFlare, a ransomware prevention agent that uses dynamic (behavioral) analysis and machine learning techniques. It shows that signature-based detection methods have proven an insufficient defense, while static-based detection is effective against known ransomware.

R. Vijaya Kumar Reddy and U. Ravi Babu describe classification as a technique for predicting similar information from the value of a categorical target or class variable [6]. It is a useful technique for any kind of statistical data, and classification algorithms are used for diverse purposes such as image classification, predictive modelling, and data mining. The main purpose of supervised learning is to build a simple and clear model of the distribution of class labels in terms of predictor features. The resulting classifier is then used to assign class labels to testing instances where the values of the predictor features are known but the value of the class label is unknown. Their paper illustrates several classification techniques used in supervised machine learning.

Smruti Saxena and Hemant Kumar Soni [7] note that ransomware has become a dangerous tool to earn money, steal data, hack a system, or stop its normal functioning. Ransomware is malware that breaches the security of a system by using malicious code. It encrypts the data before the user notices it.
A traditional vaccine system does not cure the infected system without acquiring information on the ransomware. Since the data is encrypted, it cannot be recovered without the encryption key. Users can avoid ransomware infections by updating the vaccine system from time to time. However, this approach has limited efficacy: it cannot detect modified ransomware with a new pattern. Their paper explores various ransomware attacks, presents an analysis of ransomware and the recommended actions against ransomware attacks, and discusses ransomware removal and prevention methodology.

Daniel Gonzalez and Thaier Hayajneh [8] describe crypto-ransomware as a challenging threat that encrypts a user's files while hiding the decryption key until a ransom is paid by the victim. This type of malware is a lucrative business for cybercriminals, generating tens of millions of dollars annually. The spread of ransomware is increasing as traditional detection-based protection, such as antivirus and anti-malware, has proven ineffective at preventing attacks. Additionally, this form of malware is incorporating advanced encryption algorithms and expanding the number of file types it targets. Their paper discusses ransomware methods of infection, the technology behind it, and what can be done to help prevent becoming the next victim. It investigates the most common types of crypto-ransomware, various payload methods of infection, the typical behavior of crypto-ransomware, its tactics, how an attack is ordinarily carried out, and what files are most commonly targeted on a victim's computer, and it lists recommendations for prevention and safeguards.

3. Proposed System

Fig 1 Proposed Architecture

The proposed architecture is shown in Fig 1. The study consists of three main phases: data collection, extraction of behavioral attributes, and selection of behavioral attributes for optimal classification accuracy. In the data collection phase, we collect behavioral reports from VirusTotal for every ransomware sample. In the next step, behavioral attributes are extracted from the behavioral reports. For optimal classification accuracy, we perform a behavioral attribute selection analysis to identify the behavioral attributes which should be used for classification in the next phase. Using the selected behavioral attributes, we evaluate the classification accuracy of machine learning algorithms.

The main goal of the behavioral attributes extraction phase is to obtain a dataset which best represents the behavior of a ransomware sample without missing any relevant information. We therefore spent considerable time and effort extracting the behavioral attributes from all the behavioral reports. Each identified behavioral attribute appears in at least one of the behavioral reports. For each behavioral attribute, based on the type of information contained in the behavioral reports, we determine the attribute type to be used to assign a value to the attribute.

After calculating the accuracy of the classification algorithms, we select the algorithm with the best accuracy. We then describe a prevention mechanism for cryptographic ransomware families using machine learning techniques; the BitLocker Drive Encryption method is used as the prevention mechanism.

3.1 DARPA Dataset

The dataset used for conducting the experiment is "DARPA". The DARPA IDS evaluation dataset is useful for testing intrusion detection systems, in that good performance against it is a necessary but not sufficient condition for demonstrating the capabilities of an advanced IDS. The dataset was built for network security analysis purposes. Researchers have criticized DARPA because of issues associated with the artificial injection of attacks and benign traffic. DARPA includes activities such as sending and receiving mail, browsing websites, sending and receiving files using FTP, using telnet to log into remote computers and perform work, sending and receiving IRC messages, and monitoring the router remotely using SNMP. It contains attacks such as DoS, password guessing, buffer overflow, remote FTP, SYN flood, Nmap, and rootkit. Unfortunately, it does not represent real-world network traffic and contains irregularities such as the absence of false positives, and it is outdated for the effective evaluation of IDSs on modern networks in terms of attack types and network infrastructure. In addition, it lacks actual attack data records.

3.2 Classification Algorithms

Classification is a technique in which we categorize data into a desired and distinct number of classes, each of which can be assigned a label. The main goal of a classification problem is to identify the category or class into which new data will fall. Applications of classification include speech recognition, handwriting recognition, biometric identification, document classification, and so forth. The classification algorithms used here for identification and classification of ransomware based on behavior are:
i. Linear Regression
ii. Adaboost
iii. Random Forest
iv. Extra Trees
v. Gradient Boost
vi. Multilayer Perceptron

3.3 Modules

In programming, a module is a part of a program. Programs are composed of one or more independently developed modules that are not combined until the program is linked. A single module can contain one or several routines. The work can be divided into the following modules:
a) Data Collection
b) Classification Processing
c) Prevention

The data collection method comprises feature extraction and fitness package. Feature extraction refers to the extraction of linguistic items from the documents to provide a representative sample of their content. Feature extraction starts from an initial set of measured data and builds derived values (features) intended to be informative and non-redundant, facilitating the subsequent learning and generalization steps, and in some cases leading to better human interpretations. Feature extraction is related to dimensionality reduction. When the input data to an algorithm is too large to be processed and is suspected to be redundant (for example, the same measurement in both feet and meters, or the repetitiveness of images presented as pixels), it can be transformed into a reduced set of features (also named a feature vector). Determining a subset of the initial features is called feature selection. The selected features are expected to contain the relevant information from the input data, so that the desired task can be performed using this reduced representation instead of the complete initial data.

The classification processing module classifies ransomware variants using various machine learning classification algorithms. The accuracy of each algorithm is calculated for each model. In total, three models are evaluated with the different classification algorithms.

In the prevention module, a prevention mechanism for the crypto ransomware family is implemented. The "BitLocker Drive Encryption" method is used as the prevention technique.
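The idea of discarding redundant features, described in the data collection module above, can be shown with a minimal sketch. It assumes a binary sample-by-attribute matrix and uses a deliberately simple rule: drop columns that are constant across all samples and columns that duplicate an earlier column. This rule is a stand-in for illustration only, not the selection analysis used in this work, and the function name `select_attributes` and the attribute names are hypothetical.

```python
from typing import List, Sequence, Tuple


def select_attributes(
    matrix: Sequence[Sequence[int]], names: Sequence[str]
) -> Tuple[List[str], List[List[int]]]:
    """Keep one copy of each distinct, non-constant attribute column."""
    kept_names: List[str] = []
    kept_columns: List[Tuple[int, ...]] = []
    seen = set()
    for j, name in enumerate(names):
        column = tuple(row[j] for row in matrix)
        if len(set(column)) <= 1:
            continue  # constant column: carries no information
        if column in seen:
            continue  # exact duplicate of an earlier column: redundant
        seen.add(column)
        kept_names.append(name)
        kept_columns.append(column)
    # Transpose the kept columns back into rows (one row per sample).
    reduced = [list(row) for row in zip(*kept_columns)]
    return kept_names, reduced


# Hypothetical attribute names and values, for illustration only.
names = ["encrypts_files", "deletes_backups", "writes_ransom_note", "runs_on_windows"]
matrix = [
    [1, 1, 1, 1],
    [1, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 1, 0, 1],
]
kept, reduced = select_attributes(matrix, names)
print(kept)
print(reduced)
```

In this example `writes_ransom_note` duplicates `encrypts_files` and `runs_on_windows` is constant, so only two of the four attributes remain in the reduced representation.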

Fig 2 Feature selection of ransomware variants


3.4 BitLocker Drive Encryption

BitLocker is a full-volume encryption feature included with Microsoft Windows versions starting with Windows Vista. It is designed to protect data by providing encryption for entire volumes. By default, it uses the AES encryption algorithm in cipher block chaining (CBC) or XTS mode with a 128-bit or 256-bit key. CBC is not used over the whole disk; it is applied to each individual sector. BitLocker is a computer hard drive encryption and security program released by Microsoft Corporation as a native application in its Windows 7 Enterprise and Ultimate editions, Windows Vista Enterprise and Ultimate, and Windows Server 2008, R2 and 2012 operating system versions. It is a drive security and encryption program that protects drive contents and data from any offline attack.

BitLocker is primarily designed to prevent a user's data from being viewed, extracted or retrieved in case a drive is stolen. It does not protect a system while it is running, because online/operational/live protection is maintained by the operating system. BitLocker uses an AES encryption algorithm with a 128-bit or 256-bit key to encrypt disk volumes. It protects the data when a hard drive is stolen and is being used on another computer, or when someone has physical access to the drive. To access the drive in an offline mode, BitLocker requires a recovery key. BitLocker is generally aimed at individual users who may fall prey to PC/laptop theft.

4. Experimental Result and Discussion

This study is carried out to identify and classify ransomware variants, based on their behavior, using machine learning classification algorithms. For the implementation of the proposed system, the model is created in the Python
programming language. To find the most accurate classification algorithm, the algorithms use different features in each of the three models. The following graphs show the accuracy level in each model.

Fig 3 Accuracy of Model 1

Fig 4 Accuracy of Model 2

Fig 5 Accuracy of Final Model
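As a small illustration of how the best algorithm can be picked from the final-model results, the sketch below ranks the algorithms by the accuracy values reported in Table 1. The variable names are hypothetical; only the accuracy figures come from this paper.

```python
# Final-model accuracy values as reported in Table 1 of this paper.
final_model_accuracy = {
    "Linear Regression": 0.259637,
    "Adaboost": 0.753700,
    "Random Forest": 0.744700,
    "Extra Trees": 0.761200,
    "Gradient Boost": 0.766800,
    "Multi-Layer Perceptron": 0.716500,
}

# Sort from highest to lowest accuracy and take the top entry.
ranking = sorted(final_model_accuracy.items(), key=lambda item: item[1], reverse=True)
best_algorithm, best_accuracy = ranking[0]

for name, accuracy in ranking:
    print(f"{name:<24}{accuracy:.6f}")
print("Best:", best_algorithm)
```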

Fig 6 Confusion Matrix
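For readers unfamiliar with confusion matrices, the following minimal sketch shows how overall accuracy and per-class recall can be read from one. The 3x3 matrix of counts is hypothetical and is not the matrix shown in Fig 6; rows are assumed to be true classes and columns predicted classes, and the function names are illustrative.

```python
from typing import List, Sequence


def accuracy_from_confusion(matrix: Sequence[Sequence[int]]) -> float:
    """Overall accuracy: correctly classified samples (diagonal) over all samples."""
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("confusion matrix contains no samples")
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def recall_per_class(matrix: Sequence[Sequence[int]]) -> List[float]:
    """Per-class recall: diagonal entry over the row sum (0.0 for an empty row)."""
    return [
        (row[i] / sum(row)) if sum(row) else 0.0
        for i, row in enumerate(matrix)
    ]


# Hypothetical counts: rows are true classes, columns are predicted classes.
confusion = [
    [50, 5, 5],
    [10, 30, 0],
    [0, 0, 100],
]
print(accuracy_from_confusion(confusion))
print(recall_per_class(confusion))
```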

The accuracy values of the final model can be tabulated as follows:

Algorithm                 Accuracy
Linear Regression         0.259637
Adaboost                  0.753700
Random Forest             0.744700
Extra Trees               0.761200
Gradient Boost            0.766800
Multi-Layer Perceptron    0.716500
Table 1 Accuracy Values of Classification Algorithms

From the above experimental values and figures, the result can be summarized as follows: across the three models, the Gradient Boost classification algorithm has the greatest accuracy, so it can be chosen directly for further use. The analysis also shows that the most frequent attack in the experimental dataset is the "probe" attack, which occurs around 11760 times. Probe-response attacks are a new threat for collaborative intrusion detection systems. A probe is an attack which is deliberately crafted so that its target detects and reports it with a recognizable fingerprint in the report. The attacker then uses the collaborative infrastructure to learn the detector's location and defensive capabilities from this report [9].

5. Conclusion

Ransomware variants are increasing day by day. They generally target both users and systems. The main aim of ransomware is to extort money from the victim. This work studied the implementation of machine learning algorithms for malware classification based on the behavior of malware samples. Using an iterative approach, we determined the set of behavioral attributes which can be used for ransomware classification to achieve optimal classification accuracy. Moreover, we evaluated the classification accuracy of six machine learning algorithms. Using machine learning, we identified modified variants of ransomware samples, confirming the new trend of malware evading classification and detection systems by modifying its behavior. The identified ransomware samples come from evolving families with diverse behavior compared to their predecessors. The intention of creating malware variants with various behaviors might be to evade detection systems by presenting a rare behavior in new samples, or to mislead detection and classification systems by using behavior similar to that of other ransomware families. We then described a prevention mechanism, the BitLocker Drive Encryption method, for crypto ransomware families using machine learning techniques.

Since machine learning is an emerging trend, more accurate classification algorithms may become available in the future. Malware detection is an arms race: as defenders provide mitigations, adversaries will modify their techniques. In future, the work could also be developed as a web-based application, since the present study and implementation are Windows-based only.

References

[1] Hajredin Daku, Pavol Zavarsky, Yasir Malik, "Behavioral-Based Classification and Identification of Ransomware Variants Using Machine Learning", 2324-9013/18/31.00 © IEEE, 2018.
[2] M. I. Jordan and T. M. Mitchell, "Machine Learning: Trends, Perspectives, and Prospects", Science 349, 255, 2015.
[3] Sandhya Ndhage, Charanjeet Kaur Raina, "A Review On Machine Learning Techniques", IJRITCC, ISSN: 2321-8169, Volume: 4, Issue: 3, 395-399, 2016.
[4] Ziya Alper Genç, Gabriele Lenzini, Peter Y. A. Ryan, "The Cipher, the Random and the Ransom: A Survey on Current and Future Ransomware", CECC, November 2017.
[5] D. Nieuwenhuizen, "A Behavioural-based Approach to Ransomware Detection", Information Security, 2017.
[6] R. Vijaya Kumar Reddy, Dr. U. Ravi Babu, "A Review on Classification Techniques in Machine Learning", ICRTESM, March 2018.
[7] Smruti Saxena, Hemant Kumar Soni, "Strategies for Ransomware Removal and Prevention", 978-1-5386-4606-9 © IEEE, 2018.
[8] Daniel Gonzalez, Thaier Hayajneh, "Detection and Prevention of Crypto-Ransomware", 978-1-5386-1104-3/17/$31.00 © IEEE, 2017.
[9] Vitaly Shmatikov and Ming-Hsiu Wang, "Security Against Probe-Response Attacks in Collaborative Intrusion Detection", ACM, 2007.

Author Profile

Jitti Annie Abraham received her B.Tech (CSE) degree from University of Kerala in 2016. She is currently pursuing her Masters in Computer Science & Engineering from APJ Abdul Kalam Technological University. Her research interests include machine learning, artificial intelligence, cyber forensics and cryptography.

Susan M George is working as Assistant Professor in the Computer Science and Engineering Department. She has more than 3 years' experience in teaching. Her research interests focus on data mining, machine learning and artificial intelligence. She has published several papers in different areas.
