Final Project Report'''1

The document discusses ransomware detection and prevention using machine learning techniques. It describes the challenges in ransomware prevention and the need for detecting ransomware in zero-day attacks. Various machine learning techniques for ransomware detection are reviewed, including static analysis, dynamic behavior analysis and machine learning-based approaches.

Uploaded by

20-24-Kesaav NA

RANSOMWARE DETECTION AND

PREVENTION USING ML

A PROJECT REPORT

Submitted by

ABINESH.S(721020104003)
KESAAV.NA(721020104024)
KIRUBAKARAN.V(721020104026)

in partial fulfillment for the award of the degree

of

BACHELOR OF ENGINEERING
in
COMPUTER SCIENCE AND ENGINEERING

NEHRU INSTITUTE OF TECHNOLOGY, COIMBATORE

ANNA UNIVERSITY: CHENNAI 600 025

MAY 2024
ANNA UNIVERSITY: CHENNAI 600 025

BONAFIDE CERTIFICATE

Certified that this project report “RANSOMWARE DETECTION AND PREVENTION USING AI” is the
bonafide work of “ABINESH.S (721020104003), KESAAV.NA (721020104024),
KIRUBAKARAN.V (721020104026)” who carried out the project under my supervision.

SIGNATURE                               SIGNATURE

Dr. S. Pathur Nisha, M.E., Ph.D.,       Dr. D. Sathish Kumar, M.E., Ph.D.,

HEAD OF THE DEPARTMENT                  SUPERVISOR

Computer Science and Engineering        Computer Science and Engineering

Nehru Institute of Technology           Nehru Institute of Technology

Coimbatore 641105                       Coimbatore 641105

RANSOMWARE DETECTION AND PREVENTION USING ML


Submitted by

ABINESH.S(721020104003)
KESAAV.NA(721020104024)
KIRUBAKARAN.V(721020104026)

Viva voce held on ____________________ at NEHRU INSTITUTE OF TECHNOLOGY, Coimbatore.

INTERNAL EXAMINER EXTERNAL EXAMINER

ACKNOWLEDGEMENT

First of all, we thank the Almighty for giving us the knowledge and courage to complete this
dissertation work successfully. We express our sincere gratitude to Dr. P. Krishna
Kumar, MBA., Ph.D., CEO & Secretary, for providing us the opportunity to carry out the
undergraduate program in this reputed institution and for the support shown towards us
throughout the course. We would also take the privilege to thank our respected Principal,
Dr. M. Sivaraja, M.E., Ph.D., P.D. (USA), for being a source of inspiration throughout the
course.

We would also like to extend our sincerest gratitude to Dr. S. Pathur Nisha, M.E., Ph.D.,
Head of the Department, Department of Computer Science and Engineering, for her
constant motivation and encouragement to carry out the project work in a
spectacular fashion. We also thank all the faculty members of our department for their
timely support and generous help in the accomplishment of our work.

We would like to thank our Project Guide, Dr. D. Sathish Kumar, M.E., Ph.D.,
Professor, Department of Computer Science and Engineering, for her encouragement and
support. We also acknowledge the valuable support of our lab technicians, who extended
a helping hand whenever it was required.

We finally thank our parents and friends for their constant encouragement during our
college years.

ABSTRACT

With rapid technological advancement, security has become a major issue due to the
increase in malware activity, which poses a serious threat to the security and safety of
both computer systems and stakeholders. To maintain the security of stakeholders,
particularly end users, protecting data from fraudulent efforts is one of the most
pressing concerns. Malware refers to a set of malicious programming code, scripts, active
content, or intrusive software that is designed to destroy targeted computer systems and
programs or mobile and web applications. According to a study, naive users are unable to
distinguish between malicious and benign applications. Thus, computer systems and
mobile applications should be designed to detect malicious activities in order to protect
the stakeholders. A number of algorithms are available to detect malware activities by
utilizing novel concepts including Artificial Intelligence, Machine Learning, and Deep
Learning. In this study, we emphasize Machine Learning based techniques for detecting
and preventing malware activity. We present a detailed review of current malware
detection technologies, their shortcomings, and ways to improve efficiency. Our study
suggests that adopting forward-looking approaches in the development of malware detection
applications can provide significant advantages.

Keywords: Machine Learning, Malware, Detection System, Malware Prevention
Technology, Software Security.

TABLE OF CONTENTS

CHAPTER NO. TITLE PAGE NO


ABSTRACT v
LIST OF FIGURES viii
LIST OF ABBREVIATIONS ix
1. INTRODUCTION 1
1.1 ABOUT THE DOMAIN 1
1.2 WIRELESS SENSOR NETWORK 1
2. LITERATURE REVIEW 2
3. SYSTEM ANALYSIS 5
3.1 EXISTING SYSTEM 5
3.2 PROPOSED SYSTEM 6
4. HARDWARE SPECIFICATION 7
4.1 RFID SENSORS 7
4.1.1 Description 7
4.1.2 RFID Module 8
4.1.3 Features 9
4.2 SCANNER SENSOR 10
4.2.1 Description 10
4.2.2 Security Scanner 11
4.2.3 Feature 12

4.3 TEMPERATURE SENSOR 13


4.3.1 Description 13
4.3.2 Features 13
4.4 IR SENSOR 15
4.4.1 Description 15
4.4.2 Features 16
4.5 GSM MODEM 17
4.5.1 Description 17
4.5.2 Features 17
4.6 LIGHT DEPENDENT RESISTOR 18
4.6.1 Description 18
4.6.2 Features 18
4.7 METALLIC SENSOR 19
4.7.1 Description 19
4.7.2 Features 19
5. PROJECT DESCRIPTION 20
5.1 OVERVIEW 20
5.2 OPERATION 20
5.3 BLOCK DIAGRAM & DESCRIPTION 22
5.4 SENSORS DIAGRAM 24
5.4.1 RFID Sensor 24
5.4.2 IR Sensor 25
5.4.3 Temperature Sensor 26
5.4.4 GSM Modem 27
5.4.5 Metallic Sensor 28
5.4.6 RFID Reader 29
5.4.7 RFID Tag 30
6. SYSTEM IMPLEMENTATION AND TESTING 31
6.1 SYSTEM IMPLEMENTATION 31
6.2 SYSTEM TESTING 32
7. CONCLUSION AND FUTURE ENHANCEMENT 35
7.1 CONCLUSION 35
7.2 FUTURE ENHANCEMENT 35
REFERENCES

LIST OF FIGURES
FIGURE NO FIGURE NAME PAGE NO

5.3 BLOCK DIAGRAM 23

5.4.1 RFID SENSOR 24

5.4.2 TEMPERATURE SENSOR 25

5.4.3 GSM MODEM 26

5.4.4 METALLIC SENSOR 27

CHAPTER 1
INTRODUCTION
1.1 DOMAIN
Cybersecurity, also known as information technology security or simply IT security, is the
practice of protecting computer systems, networks, and data from theft, damage, or
unauthorized access. It encompasses a range of technologies, processes, and practices
designed to ensure the confidentiality, integrity, and availability of information in the
digital realm. The importance of cybersecurity has grown significantly with the increasing
reliance on digital technologies in all aspects of society, including businesses,
governments, and individuals. As more data is stored and transmitted electronically, the
potential risks and threats to this data have also multiplied.
1.2 MACHINE LEARNING TECHNIQUES
Machine learning techniques fall into several broad families. Classification assigns
labels to input data based on learning from labeled examples; popular algorithms include
Support Vector Machines (SVM), Decision Trees, Random Forests, and Neural Networks.
Regression predicts continuous outcomes based on input data; Linear Regression,
Polynomial Regression, Support Vector Regression, and Neural Networks are common
regression techniques. Clustering groups similar data points together based on some
similarity metric; K-Means, Hierarchical Clustering, and DBSCAN are popular clustering
algorithms. Dimensionality reduction reduces the number of features in the data while
preserving most of its variance; Principal Component Analysis (PCA) and t-distributed
Stochastic Neighbor Embedding (t-SNE) are widely used for this purpose. Semi-supervised
learning combines both labeled and unlabeled data for training, which can be particularly
useful when labeled data is scarce or expensive to obtain.
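
As a small illustration of the clustering family mentioned above, the following is a
minimal sketch of K-Means on one-dimensional data, written in plain Python. The function
name kmeans_1d, the evenly spaced starting centers, and the sample values are assumptions
made for this example; a real project would more likely use an established library.

```python
from statistics import mean


def kmeans_1d(values, k=2, iterations=10):
    """Group 1-D values into k clusters (k >= 2) with plain K-Means."""
    # Assumption: start from k evenly spaced centers between min and max.
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        # Assignment step: each value joins its nearest center.
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters


# Hypothetical feature: files written per minute by six processes.
centers, clusters = kmeans_1d([1, 2, 3, 50, 52, 54], k=2)
```

No labels are supplied here; the two groups emerge only from the similarity of the values,
which is what distinguishes clustering from classification.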

CHAPTER 2
LITERATURE REVIEW
2.1 INTRODUCTION
Preventing ransomware is challenging for several reasons. Ransomware often behaves much
like benign software while acting covertly, which makes it hard to tell the two apart.
Detecting ransomware in zero-day attacks is therefore crucial. The primary objectives are
to avoid ransomware-caused system damage, identify zero-day malware, and minimize false
detections, which means reducing the number of false positives while still detecting all
instances of ransomware. False positives are instances where the system flags a harmless
program or file as ransomware, leading to unnecessary alerts and actions. Ransomware can
be found using a variety of tools and methodologies. Methods based on static analysis
examine a program's code without running it. They generate many false positives and cannot
find ransomware that is obfuscated.

Attackers frequently create new variants and modify their code using various packing
techniques. To address these issues, researchers use dynamic behavior analysis methods
that monitor interactions between the executed code and a virtual environment. However,
these detection methods are cumbersome and memory-intensive. Machine learning is
considered well suited to analyzing the behavior of processes or applications because it
can effectively learn patterns and anomalies in large datasets, which can be difficult for
humans to detect.

In the context of ransomware detection, machine learning algorithms can be trained on
large datasets of both benign and malicious software to learn the behavioral characteristics
that distinguish ransomware from legitimate software. The trained model can then be used to
identify new and previously unseen variants of ransomware, including zero-day attacks,
based on their behavioral patterns. Moreover, machine learning can be used to
continuously learn and adapt to new threats, making it an effective approach for keeping up
with the constantly evolving tactics of ransomware attackers.

Machine learning can also reduce false positives by distinguishing more accurately between
benign software and ransomware based on their behavioral patterns. Compared with
traditional signature-based detection and static analysis methods, machine learning can
provide a more comprehensive and accurate analysis of the behavior of software, making it
a powerful tool for ransomware detection. However, it is important to note that machine
learning models need to be properly trained and validated to ensure their effectiveness
and avoid biases or errors. The following are some machine-learning-based detection
systems that follow highly traditional methodologies.

CONCLUSION

Malware, or malicious applications, may cause catastrophic damage not only to computer
systems but also to data centers and to web and mobile applications across various
industries, particularly financial and healthcare institutions. Ensuring the safety of
stakeholders’ data from malicious entities is a major challenge that leads us towards the
concept of malware detection and prevention. Machine Learning (ML) can be an effective
solution to adopt for the development of anti-malware systems. With that direction in mind,
this study presented a detailed review of malware detection techniques and approaches.
First, we attempted to provide a clear overview of malware, artificial intelligence, and
their background.

CHAPTER 3
SYSTEM ANALYSIS
3.1 EXISTING SYSTEM
Traditional antivirus software can detect known ransomware strains based on
signatures or behavioral patterns. However, it may struggle with new or evolving variants.
Endpoint Detection and Response (EDR) solutions monitor endpoint devices for suspicious
activities and behaviors indicative of ransomware. They can detect anomalies in file
access patterns, process behaviors, and network communications. Next-Generation Antivirus
(NGAV) solutions use advanced techniques such as machine learning, artificial
intelligence, and behavioral analysis to identify and block ransomware threats in real
time. They aim to detect both known and unknown ransomware variants.
Regular backups of critical data are essential for ransomware protection. Backup solutions
with features like versioning, encryption, and off-site storage can help organizations
recover from ransomware attacks without paying the ransom.

3.2 PROPOSED SYSTEM


Collect data from various sources such as endpoint devices, network traffic, and
system logs. Use machine learning algorithms to analyze historical data and identify
patterns indicative of ransomware activity.

Train machine learning models on labeled datasets to classify normal and malicious
behavior. Use supervised learning algorithms like Random Forest, Support Vector
Machines (SVM), or deep learning models such as Convolutional Neural Networks
(CNNs) or Recurrent Neural Networks (RNNs). Continuously update and refine the
models with new data to improve detection accuracy and adapt to evolving ransomware
variants.

Incorporate AI-powered educational modules to raise user awareness about ransomware
threats and best practices for prevention. Use personalized recommendations and
interactive simulations to engage users and promote proactive cybersecurity behaviors.

3.2.1 ADVANTAGES
• Early Detection, Reduced False Positives, Automated Response
• High Security.
CHAPTER 4
RANSOMWARE DETECTION
4.1 Ransomware-Detection Methods
4.1.1 Description
The two main types of ransomware-detection methods are automated and manual.
Automated methods employ technologies to identify and report ransomware attacks. These
tools are typically software programs that may also be able to stop attacks. Manual
detection techniques focus on routinely scanning data and devices for indicators of
attack. This includes checking whether malware has modified data or stopped authorized
users from accessing their devices or files, for example by looking for changes to file
extensions and verifying that devices and files remain accessible to authorized users.
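
The check for changed file extensions described above can be sketched as a comparison of
two snapshots of the file system. This is only an illustrative sketch: the function name
changed_extensions and the idea of keying each snapshot by a stable file identifier (for
example an inode number) are assumptions made for this example.

```python
from pathlib import PurePath


def changed_extensions(before, after):
    """Return files whose extension differs between two snapshots.

    Each snapshot maps a stable file identifier (hypothetical) to the
    path the file had when the snapshot was taken.
    """
    changed = {}
    for file_id, old_path in before.items():
        new_path = after.get(file_id)
        if new_path is None:
            continue  # file no longer present; not an extension change
        old_ext = PurePath(old_path).suffix.lower()
        new_ext = PurePath(new_path).suffix.lower()
        if old_ext != new_ext:
            changed[file_id] = (old_path, new_path)
    return changed


# Hypothetical snapshots taken before and after a suspected incident.
before = {1: "docs/report.docx", 2: "img/photo.jpg"}
after = {1: "docs/report.docx.locked", 2: "img/photo.jpg"}
suspect = changed_extensions(before, after)
```

A large number of files changing to the same unfamiliar extension within a short time
would be one indicator worth investigating manually.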

Manual ransomware detection refers to the process of detecting ransomware through
human analysis and intervention rather than automated systems. This approach involves
analyzing system logs, network traffic, and other indicators of compromise to identify
patterns and behaviors associated with ransomware attacks. While manual detection can
be time-consuming and resource-intensive, it can be an effective complement to
automated detection methods, as it can help identify new or unknown types of
ransomware that may not be detected by automated systems [30]. Despite its
effectiveness, manual ransomware detection has some limitations. It can be
labor-intensive and requires highly trained personnel to analyze system logs and network
traffic. Additionally, manual detection may not scale well in large organizations or
networks, where automated detection methods may be more efficient.
4.1.2 Automated Ransomware Detection

The current methods for detecting ransomware primarily involve monitoring the
system at the file system level. Automated approaches to detecting ransomware can be
categorized into two main groups: those based on machine learning (ML) and those that
are not. ML-based methods typically employ classical machine learning, deep
learning (DL), and artificial neural network (ANN) techniques to detect ransomware.
Some tools utilize variations of these techniques or a hybrid approach that combines two
or more techniques to combat the threat of ransomware attacks. Non-ML methods rely on
packet inspection and traffic analysis to detect ransomware. One of the major advantages
of automated approaches is their ability to detect, block, and recover from ransomware
attacks without human intervention. Additionally, these tools can be highly accurate and
reliable in detecting, preventing, and recovering from ransomware attacks.

ML techniques, including classical machine learning, deep learning, and
artificial neural networks, have been utilized for automated ransomware detection. These
techniques draw on behavioral analysis, as well as static and dynamic
analysis, to identify and prevent ransomware attacks. Machine learning algorithms can
learn from previous ransomware attacks and detect new variants by analyzing patterns and
behaviors. Deep learning methods, in turn, can leverage neural networks to
detect ransomware attacks by analyzing large amounts of data. Artificial neural networks
can also be used to identify ransomware by processing and analyzing multiple data
sources. These ML-based approaches offer a more efficient and reliable way to detect and
prevent ransomware attacks, reducing the potential impact on businesses and individuals.

ML-based detection has several benefits, including its ability to detect new or
unknown ransomware variants that do not match existing signatures or patterns and to
adapt to changing ransomware behavior patterns over time. Moreover, this approach is
less prone to false positives than signature-based and heuristic-based detection, as it relies
on detecting actual behavior patterns rather than static code signatures or predefined rules.
However, machine-learning-based detection is limited by its reliance on a large and
representative dataset of training samples and by its susceptibility to adversarial attacks
that can manipulate the features or behavior of the ransomware to evade detection.
4.2 Ransomware-Detection Techniques
4.2.1 Description
Ransomware detection is a critical component of cybersecurity, and various techniques
have been developed to detect ransomware attacks. This section discusses different
ransomware-detection techniques proposed in the literature and their strengths,
weaknesses, and limitations.
Heuristic-based detection is a more advanced approach that identifies ransomware behavior
patterns or anomalies indicative of malicious activity. This approach is based on creating
rules or heuristics that describe typical ransomware behavior and then monitoring the
system or network for any deviations or anomalies from these rules. If such variations or
abnormalities are detected, the program is flagged as suspicious or malicious, and
appropriate action is taken. One of the advantages of heuristic-based detection is its
ability to detect new or unknown ransomware variants that do not match any existing
signatures or patterns. Moreover, this approach is less prone to false positives than
signature-based detection, as it relies on detecting actual behavior patterns rather than
static code signatures. However, heuristic-based detection is limited by its reliance on
predefined rules or heuristics, which may capture only some of the possible ransomware
behavior patterns or anomalies. Moreover, attackers can evade heuristic-based detection by
modifying the behavior of the ransomware to avoid detection.
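
A rule-based check of this kind might look like the following minimal sketch. The
ProcessActivity fields, the rule names, and the thresholds are all hypothetical values
chosen for illustration, not settings taken from any particular product.

```python
from dataclasses import dataclass


@dataclass
class ProcessActivity:
    """Summary of what one process did in an observation window (hypothetical fields)."""
    name: str
    files_renamed: int
    files_written: int
    shadow_copies_deleted: bool


# Assumed heuristics: each rule is a description plus a predicate.
RULES = [
    ("mass file rename", lambda p: p.files_renamed > 100),
    ("mass file write", lambda p: p.files_written > 500),
    ("shadow copy deletion", lambda p: p.shadow_copies_deleted),
]


def matched_rules(activity):
    """Return the descriptions of all rules the activity triggers."""
    return [description for description, predicate in RULES if predicate(activity)]


def is_suspicious(activity, min_matches=2):
    """Flag a process only when several independent rules agree."""
    return len(matched_rules(activity)) >= min_matches


sample = ProcessActivity("unknown.exe", files_renamed=250,
                         files_written=40, shadow_copies_deleted=True)
```

Requiring more than one matching rule is a design choice that trades some sensitivity for
fewer false alarms; a backup tool that writes many files but does nothing else would
trigger only a single rule here.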

4.2.2 Signature-Based Detection


Signature-based detection is a traditional approach that relies on identifying known
ransomware signatures or patterns in the code or behavior of the malware. This approach
is based on creating a database of known ransomware signatures and scanning
the system or network for matching signatures or patterns. If a match is found, the
file is flagged as malicious. One benefit of signature-based detection is its
simplicity and effectiveness in detecting known ransomware variants. However, this
approach is limited by its inability to detect new or unknown ransomware variants that do
not match existing signatures or patterns. Moreover, attackers can evade
signature-based detection by modifying the code or behavior of the ransomware to avoid
detection.
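
One simple form of signature is a cryptographic hash of the file contents. The sketch
below assumes a signature database that is just a set of SHA-256 digests; the KNOWN_BAD
set and its single entry are made up for the example.

```python
import hashlib

# Hypothetical signature database: SHA-256 digests of known samples.
KNOWN_BAD = {hashlib.sha256(b"known ransomware sample").hexdigest()}


def sha256_of(data: bytes) -> str:
    """Return the hexadecimal SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def matches_signature(data: bytes, signatures=KNOWN_BAD) -> bool:
    """Report whether the data's digest appears in the signature set."""
    return sha256_of(data) in signatures


exact_copy = matches_signature(b"known ransomware sample")
modified_copy = matches_signature(b"known ransomware sample!")
```

The second lookup shows the limitation discussed above: changing even one byte of the
file produces a different digest, so a slightly modified variant no longer matches.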
4.3 Feature Extraction and Selection
4.3.1 Description
Machine learning techniques have been increasingly used to detect ransomware due
to their ability to learn behavior patterns and detect anomalies. In this section, we will
discuss different features used for ransomware detection using machine learning and the
techniques used for feature selection, such as principal component analysis and correlation
analysis.
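
As an illustration of correlation analysis for feature selection, the following sketch
drops any feature that is almost perfectly correlated with one already kept. The feature
names, the sample values, and the 0.95 threshold are assumptions made for this example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant feature carries no correlation information
    return cov / (spread_x * spread_y)


def drop_correlated(features, threshold=0.95):
    """Keep a feature only if it is not highly correlated with a kept one."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical per-process measurements.
features = {
    "files_written": [1, 2, 3, 4, 5],
    "bytes_written": [10, 20, 30, 40, 50],   # redundant with files_written
    "net_connections": [3, 1, 4, 1, 5],
}
selected = drop_correlated(features)
```

Here bytes_written is removed because it moves in lockstep with files_written and so adds
no new information for a classifier.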
Behavioral analysis involves monitoring the behavior of running processes on a system to
identify anomalies that indicate malicious activity. This is typically carried out in real
time, allowing the detection of ransomware as it is executed on a system. In contrast,
dynamic analysis involves running an executable file in a controlled environment, such as
a sandbox, to observe its behavior and identify any malicious activity. This is typically
conducted prior to deploying the executable file on a production system.

The confusion between static and dynamic analysis may arise from the fact that both
approaches involve the analysis of executable files, but they do so in different ways. Static
analysis involves examining the executable file’s code without running it to spot
malicious activity, while dynamic analysis involves running the executable file in a
controlled environment to observe its behavior. Dynamic analysis can be performed in real
time, but it can also be conducted in a sandbox before deploying the executable file on a
production system. In a sandbox, the file's behavior can be monitored and analyzed without
affecting the production system. Once the analysis is complete, the results can be used to
determine whether the executable file is malicious or benign.
4.4 Performance Evaluation of Machine Learning Models for Ransomware Detection
4.4.1 Description
Evaluating the performance of machine learning models for ransomware detection
is crucial to determine their effectiveness in detecting and preventing its spread. In this
section, we will discuss different evaluation metrics used for measuring the performance
of machine learning models for ransomware detection, including accuracy, precision,
recall, F1-score, and ROC curve.

Accuracy is the most straightforward evaluation metric, representing the percentage
of correct predictions made by the model. It is calculated as the ratio of correct
predictions to the total number of predictions. However, accuracy can be misleading when
dealing with imbalanced datasets, where negative samples greatly outnumber the positive
samples.

Precision is the proportion of true positives (samples correctly identified as
ransomware) among all samples predicted to be positive (flagged as ransomware by the
algorithm). It is computed as the ratio of true positives to the sum of true positives and
false positives. A model with a high precision score will have a low false-positive rate,
making it less likely to mistakenly label innocent files as ransomware.

Recall is the proportion of actual positive samples that the model correctly identifies.
It is computed as the ratio of true positives to the sum of true positives and false
negatives. A high recall score suggests that the model has a low incidence of false
negatives, which makes it less likely to fail to detect actual ransomware samples.
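
These definitions can be written directly in terms of the four confusion-matrix counts.
The sketch below also computes the F1-score as the harmonic mean of precision and recall;
the function name and the example counts are assumptions chosen to show an imbalanced
dataset.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall, and F1 from confusion counts.

    tp/fp: true/false positives, tn/fn: true/false negatives.
    Ratios with an empty denominator are reported as 0.0.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical test set: 13 ransomware samples among 1000 files.
metrics = classification_metrics(tp=8, fp=2, tn=985, fn=5)
```

With these counts the accuracy is 0.993 even though 5 of the 13 ransomware samples were
missed (recall is 8/13), which is why accuracy alone can mislead on imbalanced data.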
4.5 Hybrid Detection

4.5.1 Description

Hybrid detection is an approach that combines different ransomware-detection techniques
to improve the overall detection accuracy and speed. This approach combines the
strengths of several detection techniques, such as signature-based, heuristic-based,
machine-learning-based, and network-based detection, to create a more robust and effective
detection system. One of the advantages of hybrid detection is its ability to overcome the
limitations of individual detection approaches. Moreover, this approach is less prone to
false positives and negatives than individual detection approaches, as it combines
different sources of information and analysis. However, hybrid detection is limited by its
complexity and resource requirements, as it requires integrating and coordinating multiple
detection systems and tools.
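
One possible way to combine detectors is sketched below. It assumes three inputs: a
yes/no signature match, a heuristic score between 0 and 1, and an ML probability between
0 and 1. The combination rule and the 0.6 threshold are illustrative assumptions, not a
prescribed design.

```python
def hybrid_verdict(signature_hit, heuristic_score, ml_probability, threshold=0.6):
    """Combine three detector outputs into one yes/no decision.

    A signature match is treated as conclusive. Otherwise the heuristic
    score and the ML probability (both in [0, 1]) are averaged and
    compared with the threshold.
    """
    if signature_hit:
        return True
    combined = (heuristic_score + ml_probability) / 2
    return combined >= threshold


known_sample = hybrid_verdict(True, 0.0, 0.0)       # caught by signature
new_variant = hybrid_verdict(False, 0.9, 0.7)       # caught by behavior
benign_tool = hybrid_verdict(False, 0.2, 0.3)       # not flagged
```

The signature path handles known samples cheaply, while the averaged behavioral scores
give the system a chance to catch variants that no signature covers.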
4.6 Network-Based Detection

4.6.1 Description

Network-based detection is an approach that relies on monitoring the network
traffic for suspicious or malicious activity that may be indicative of a ransomware attack.
This approach is based on analyzing the network traffic for anomalies or patterns
characteristic of ransomware, such as large volumes of outbound traffic, unusual network
connections, or network traffic encryption.

One of the advantages of network-based detection is its ability to detect ransomware
activity even if the malware has not yet infected the system or if the ransomware is using
non-standard encryption methods. Moreover, this approach is less prone to false positives
than other detection approaches, as it relies on detecting actual network traffic patterns
rather than static code signatures or predefined rules.

However, network-based detection is limited by its reliance on network traffic analysis
tools that may not be available or may not capture all ransomware activity. Moreover,
attackers can evade network-based detection by encrypting their network traffic or
using stealthy communication channels.
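
A check for unusually large outbound traffic could be sketched as follows. The host
names, byte counts, and the rule of flagging anything above ten times the median are
hypothetical choices made for illustration.

```python
from statistics import median


def unusual_outbound(bytes_by_host, factor=10):
    """Return hosts whose outbound volume exceeds factor x the median.

    bytes_by_host maps a host name to the bytes it sent in one
    observation window (hypothetical measurement).
    """
    baseline = median(bytes_by_host.values())
    if baseline <= 0:
        return []  # no meaningful baseline to compare against
    return sorted(host for host, sent in bytes_by_host.items()
                  if sent > factor * baseline)


traffic = {"pc-1": 1000, "pc-2": 1200, "pc-3": 900, "pc-4": 50000}
flagged = unusual_outbound(traffic)
```

The median is used as the baseline so that a single very noisy host does not raise the
bar for everyone else, as it would with a plain average.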
CHAPTER 5
PROJECT DESCRIPTION
5.1 OVERVIEW
The process begins with collecting data from various sources such as endpoint
devices, network traffic, and system logs. This data provides the foundation for training
ML models. Relevant features are extracted from the raw data to capture characteristics
that are indicative of ransomware activity. These features may include file access patterns,
process behaviors, network communications, and more. Labeled datasets are prepared for
training machine learning models.
These datasets consist of examples of both normal behavior and ransomware activity.
Supervised learning algorithms, such as Random Forest, Support Vector Machines
(SVM), or deep learning models like Convolutional Neural Networks (CNNs) and
Recurrent Neural Networks (RNNs), are trained on the prepared datasets. These models
learn to differentiate between normal and malicious behavior based on the extracted
features.
The trained AI models are deployed to continuously monitor incoming data streams in
real-time. They analyze the data for signs of ransomware activity and raise alerts when
suspicious behavior is detected.
5.2 OPERATION

In this project, the process begins with collecting data from various sources within
the organization's IT infrastructure, including endpoint devices, network traffic, system
logs, and security event logs. This data serves as input for ML-driven analysis. Before
feeding the data into ML algorithms, preprocessing steps may be required to clean and
prepare the data for analysis. This could involve tasks such as data normalization, feature
scaling, and handling missing values.
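
Two of the preprocessing tasks named above, handling missing values and feature scaling,
can be sketched in a few lines. Filling gaps with the mean and using min-max scaling are
assumptions made for this example; other imputation and scaling methods are equally
possible.

```python
def fill_missing(values):
    """Replace None entries with the mean of the observed values."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("cannot impute a column with no observed values")
    average = sum(present) / len(present)
    return [average if v is None else v for v in values]


def min_max_scale(values):
    """Rescale values linearly so the smallest is 0.0 and the largest is 1.0."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]  # constant column: no spread to scale
    return [(v - low) / (high - low) for v in values]


# Hypothetical feature column with one missing reading.
raw = [10, None, 30]
clean = min_max_scale(fill_missing(raw))
```

Scaling matters mainly for distance-based models, where a feature measured in bytes would
otherwise dominate one measured in counts.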

Supervised learning models are trained using labeled datasets that contain examples
of normal and malicious behavior. The models learn to classify new data instances as
either benign or potentially malicious based on the patterns they've learned during
training. Unsupervised learning techniques may also be employed for anomaly detection.
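
To make the supervised step concrete, here is a minimal sketch that uses a
nearest-centroid classifier as a simple stand-in for the heavier models named in this
report (Random Forest, SVM, CNNs, RNNs). The two features and the training samples are
hypothetical.

```python
from math import dist


def train_centroids(samples, labels):
    """Learn one mean feature vector (centroid) per label."""
    sums, counts = {}, {}
    for features, label in zip(samples, labels):
        total = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def predict(centroids, features):
    """Assign the label whose centroid is closest to the feature vector."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical features: [files written per minute, shadow copies deleted (0/1)].
samples = [[5, 0], [8, 0], [400, 1], [650, 1]]
labels = ["benign", "benign", "ransomware", "ransomware"]
model = train_centroids(samples, labels)
verdict = predict(model, [500, 1])
```

In practice the features would be scaled before training so that no single feature
dominates the distance computation.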

Anomaly detection techniques are used to identify deviations from normal behavior
that may indicate ransomware activity. ML algorithms detect anomalies by comparing
current data patterns to historical norms or by learning patterns from unlabeled data.
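
Comparing a current observation with historical norms can be sketched with a z-score
test. The choice of metric (file writes per minute), the sample history, and the
threshold of three standard deviations are assumptions made for illustration.

```python
from statistics import mean, pstdev


def is_anomalous(history, current, z_threshold=3.0):
    """Flag a value that lies far from the historical mean.

    The distance is measured in population standard deviations
    (a z-score); values beyond z_threshold are treated as anomalies.
    """
    average = mean(history)
    spread = pstdev(history)
    if spread == 0:
        return current != average  # any change from a constant history
    return abs(current - average) / spread > z_threshold


# Hypothetical history of file writes per minute on one endpoint.
history = [10, 12, 11, 9, 13, 10, 11]
burst = is_anomalous(history, 300)
normal = is_anomalous(history, 12)
```

This needs no labeled examples, which is why such checks are often paired with supervised
models to catch behavior that was absent from the training data.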

The ML-driven ransomware detection and prevention system is integrated with
existing security infrastructure, such as endpoint protection platforms, firewalls, and
SIEM solutions. This enables seamless communication and sharing of threat intelligence
to enhance overall security posture and response capabilities.
5.3 BLOCK DIAGRAM AND DESCRIPTION

Fig: 5.3 Block Diagram

Machine learning (ML) is a technological phenomenon that all industries wish to
exploit to benefit from efficiency gains and cost reductions, because of its capability of
replacing humans by undertaking intelligent tasks that were once limited to the human
mind. Nones et al. define ML as the rapidly growing development of computer systems
that are able to perform tasks that previously only human intelligence could accomplish.
However, from the perspective of scholars, AI can be used for intelligence augmentation
(IA) instead of being a replacement for the human mind, which gives it strategic
importance as a potential key driver of the current technological revolution.
Thus, AI can be widely used in developing projects based on intellectual processes,
including the capacity for augmentation, conception, consciousness, investigation,
enthusiastic information, thinking, arranging, innovation, and problem-solving in different
sectors including Big Data, Security, Business Analytics, and many more domains.

5.4 CONCEPT DIAGRAM


5.4.1 MALWARE
FIG.5.4.1 MALWARE

Malware, a contraction of “malicious software”, is malicious programming code, scripts,
active content, or intrusive software that is designed to destroy targeted computer
systems and programs or mobile and web applications. It takes different forms, including
computer viruses, worms, ransomware, rootkits, trojans, dialers, adware, spyware,
keyloggers, and malicious Browser Helper Objects (BHOs). Malware is not limited to
computer systems but extends to the internet and related fields.

5.4.2 RANSOMWARE DETECTION

FIG. 5.4.2 RANSOMWARE DETECTION


Ransomware attacks have emerged as a major cybersecurity threat in which user data is
encrypted upon system infection. The latest ransomware strains, which use advanced
obfuscation techniques along with offline C2 server capabilities, are hitting individual
users and big corporations alike. This problem has caused business disruption and, of
course, financial loss.

5.4.3 RANSOMWARE DETECTION CIRCUIT

FIG.5.4.3 RANSOMWARE DETECTION CIRCUIT

The rise of ransomware since it first appeared in 1989 is attributed to many different
factors. The emergence of ransomware as a service has also increased the availability of
ransomware to potential criminals who are less technically gifted. CryptoLocker,
CryptoWall, and Locky offer this type of service, with the CryptoWall variant generating
more than 320 million dollars in revenue during its lifespan.
5.4.4 RANSOMWARE METHODOLOGY
FIG.5.4.4 RANSOMWARE METHODOLOGY
Installation occurs after the payload has been dropped into the system. One prominent
method of installation is the download dropper. This approach uses an initial file
containing a small piece of code that evades detection and reaches out to the command and
control (C&C) center. Ransomware authors will attempt to split execution into different
scripts and processes to avoid AV (Anti-Virus) signature-based detection. When an
organization is targeted in an attack, ransomware will spread through the network,
determining file share locations and infecting them to maximize disruption and increase
the possible ransom. The executables will not run until multiple machines have been
infected.

CHAPTER 6
SYSTEM IMPLEMENTATION AND TESTING
6.1 SYSTEM IMPLEMENTATION

Ransomware, like most malware, progresses through several phases. QRadar can spot known
and unknown ransomware across these phases. Early detection can help prevent the damage
done in later phases. QRadar provides content extensions that include hundreds of use
cases to generate alerts across these phases. Content extensions are delivered through the
App Exchange and provide the ability to get the latest use cases. IBM Security® X-Force®
Threat Intelligence collections are used as references in use cases to help find the
latest known indicators of compromise (IOC), such as IP addresses, malware file hashes,
URLs and more.
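As a rough illustration of how indicators of compromise such as IP addresses and file hashes can be matched against observed events, the following is a minimal sketch. It is not the QRadar implementation: the indicator sets, the event fields (host, dst_ip, md5) and the function name match_iocs are all hypothetical, and the addresses come from ranges reserved for documentation.

```python
import hashlib

# Hypothetical indicator sets; a real deployment would load these from a
# threat-intelligence feed instead of hard-coding them.
KNOWN_BAD_IPS = {"203.0.113.7", "198.51.100.23"}
KNOWN_BAD_HASHES = {hashlib.md5(b"sample-malicious-payload").hexdigest()}


def match_iocs(event):
    """Return the list of indicator types ("ip", "hash") that the event matches."""
    hits = []
    if event.get("dst_ip") in KNOWN_BAD_IPS:
        hits.append("ip")
    file_hash = event.get("md5")
    if file_hash is not None and file_hash.lower() in KNOWN_BAD_HASHES:
        hits.append("hash")
    return hits


bad_hash = hashlib.md5(b"sample-malicious-payload").hexdigest()
events = [
    {"host": "pc-01", "dst_ip": "203.0.113.7", "md5": None},
    {"host": "pc-02", "dst_ip": "192.0.2.10", "md5": bad_hash.upper()},
    {"host": "pc-03", "dst_ip": "192.0.2.11", "md5": None},
]

# Keep only the events that match at least one indicator.
alerts = [(e["host"], match_iocs(e)) for e in events if match_iocs(e)]
print(alerts)
```

Set lookups keep each check constant-time, which matters when the indicator lists grow to many thousands of entries.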

• Initial Access
• Execution, Persistence
• Discovery, Lateral Movement, Collection
• Exfiltration, Impact

Initial Access

The ransomware scans the machine to determine which administrative rights it can obtain,
makes itself run at boot, disables recovery mode, deletes shadow copies, and more.
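The behaviors named above (running at boot, disabling recovery mode, deleting shadow copies) often surface as recognizable Windows command lines. The sketch below shows one way to flag them with regular expressions. The pattern list and the function name staging_indicators are illustrative assumptions, not a complete or authoritative rule set.

```python
import re

# Illustrative patterns only; real rule sets are broader and are tuned to
# reduce false positives from legitimate administration.
STAGING_PATTERNS = {
    "shadow_copy_deletion": re.compile(
        r"vssadmin(\.exe)?\s+delete\s+shadows", re.IGNORECASE),
    "recovery_disabled": re.compile(
        r"bcdedit(\.exe)?\s+/set\s+.*recoveryenabled\s+no", re.IGNORECASE),
    "run_key_persistence": re.compile(
        r"reg(\.exe)?\s+add\s+.*\\CurrentVersion\\Run", re.IGNORECASE),
}


def staging_indicators(command_line):
    """Return the sorted names of all patterns found in the command line."""
    return sorted(
        name for name, pattern in STAGING_PATTERNS.items()
        if pattern.search(command_line)
    )


samples = [
    "vssadmin.exe Delete Shadows /All /Quiet",
    "bcdedit /set {default} recoveryenabled No",
    "notepad.exe report.txt",
]
for line in samples:
    print(line, "->", staging_indicators(line))
```

Matching on command lines is cheap, but attackers can rename tools or call the same functions through APIs, so this kind of rule works best as one signal among several.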

Execution, Persistence
This is the moment the stopwatch starts. Ransomware is now in your environment. If the
ransomware used a “dropper” to avoid detection in the distribution phase, this is when the
dropper calls home, downloads the “real executable”, and runs it.

Discovery, Lateral Movement, Collection

Now that the ransomware owns the machine, it begins a phase of reconnaissance of the
network (attack paths), of folders and files with predefined extensions, and of other
resources.

Exfiltration, Impact

The real damage begins now. Typical actions include: create a copy of each file, encrypt
the copies, place the new files at the original location. The original files might be
exfiltrated and deleted from the system, which allows the attackers to extort the victim
with threats of making their breach public, or even to leak stolen documents.
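Because encrypted copies look statistically random, one commonly used signal for this phase is the byte entropy of newly written files. The report does not state that the project uses this technique; the sketch below is an added illustration, and the 7.5 bits-per-byte threshold and the function names are assumptions.

```python
import math
import os
from collections import Counter


def shannon_entropy(data):
    """Shannon entropy in bits per byte: 0.0 for empty input, at most 8.0."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


ENTROPY_THRESHOLD = 7.5  # assumed cut-off; tune on real data


def looks_encrypted(data, threshold=ENTROPY_THRESHOLD):
    """Flag content whose byte distribution is close to uniform."""
    return shannon_entropy(data) >= threshold


plain = b"Quarterly report: revenue increased in all regions. " * 200
random_like = os.urandom(len(plain))  # stands in for ciphertext

print(round(shannon_entropy(plain), 2), looks_encrypted(plain))
print(round(shannon_entropy(random_like), 2), looks_encrypted(random_like))
```

Compressed formats such as ZIP or JPEG also have high entropy, so a check like this is better treated as one feature for a model than as a verdict on its own.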

6.2 SYSTEM TESTING

Step 1: Problem Definition and Planning


Define the objectives and scope of the ransomware detection and prevention system.
Determine the types of ransomware threats to be addressed and the data sources available
for analysis.

Step 2: Data Collection

Gather data from diverse sources such as endpoint devices, network traffic, system logs,
and security event logs. Ensure the collected data is representative of normal and
malicious activities.

Step 3: Data Preprocessing

Clean the data to remove noise, handle missing values, and standardize formats. Perform
tasks such as data normalization, feature scaling, and feature engineering to prepare the
data for analysis.
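The cleaning and scaling steps can be sketched in plain Python as follows. This is a minimal illustration, assuming mean imputation for missing values and min-max scaling to the range [0, 1]; the report does not say which methods were actually used, and the function names and sample column are hypothetical.

```python
def impute_mean(column):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    mean = sum(observed) / len(observed) if observed else 0.0
    return [mean if v is None else v for v in column]


def min_max_scale(column):
    """Rescale values linearly to [0, 1]; a constant column maps to zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


# Hypothetical feature column: file writes per minute, one value missing.
raw_file_writes = [10, None, 30, 50]
clean = impute_mean(raw_file_writes)
scaled = min_max_scale(clean)
print(clean)
print(scaled)
```

In practice the mean, minimum and maximum should be computed on the training data only and then reused for test and production data, so that no information leaks from the test set.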

Step 4: Feature Extraction

Extract relevant features from the preprocessed data that can help differentiate between
normal and ransomware activities. Features may include file access patterns, process
behaviors, network traffic characteristics, and system resource usage.
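To make the idea of file access pattern features concrete, the sketch below turns a list of file-system events into a small feature dictionary. The event format (an "op" field and a "path" field, where a rename event carries the new path) and the chosen features are assumptions made for illustration.

```python
import os
from collections import Counter


def extract_features(events):
    """Summarize a window of file events as numeric features."""
    ops = Counter(e["op"] for e in events)
    # Extensions that files were renamed to; a single new extension applied
    # to many files is a typical ransomware trait.
    rename_exts = {
        os.path.splitext(e["path"])[1].lower()
        for e in events if e["op"] == "rename"
    }
    total = len(events)
    return {
        "n_events": total,
        "n_writes": ops["write"],
        "n_renames": ops["rename"],
        "n_deletes": ops["delete"],
        "write_ratio": ops["write"] / total if total else 0.0,
        "n_rename_extensions": len(rename_exts),
    }


events = [
    {"op": "read", "path": "docs/a.docx"},
    {"op": "write", "path": "docs/a.docx"},
    {"op": "rename", "path": "docs/a.docx.locked"},
    {"op": "write", "path": "docs/b.xlsx"},
    {"op": "rename", "path": "docs/b.xlsx.locked"},
]
features = extract_features(events)
print(features)
```

Each dictionary can then become one row of the training table, with one such row per process and time window.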

Step 5: Labeling Data:

Annotate the dataset with labels indicating whether each data instance represents normal
behavior or ransomware activity. This labeled dataset will be used for training the
machine learning models.

Step 6: Model Selection:

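Choose appropriate machine learning algorithms based on the problem at hand and the
characteristics of the data. Common ML algorithms for ransomware detection include
supervised learning algorithms such as decision trees, ensemble methods, logistic
regression, Gaussian naive Bayes, and support vector machines.
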
Step 7: Model Training:

Split the labeled dataset into training and testing sets. Train the selected machine learning
models on the training set using appropriate techniques and algorithms. Validate the
trained models using the testing set to ensure generalization and robustness.
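Since ransomware samples are usually much rarer than benign ones, the split is often stratified so that both sets keep the same class ratio. The sketch below shows the idea in plain Python. The function name, the 25% test fraction and the toy data are assumptions; scikit-learn's train_test_split offers the same behavior through its stratify parameter.

```python
import random
from collections import defaultdict


def stratified_split(samples, labels, test_fraction=0.25, seed=0):
    """Split samples so that each class keeps roughly the same test share."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    train_idx, test_idx = [], []
    for indices in by_label.values():
        rng.shuffle(indices)
        # At least one test sample per class, otherwise round to the fraction.
        n_test = max(1, round(len(indices) * test_fraction))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])

    def pick(idx, seq):
        return [seq[i] for i in idx]

    return (pick(train_idx, samples), pick(test_idx, samples),
            pick(train_idx, labels), pick(test_idx, labels))


# Toy data: 8 samples of class 0 and 4 of class 1; samples are just ids.
samples = list(range(12))
labels = [0] * 8 + [1] * 4
X_train, X_test, y_train, y_test = stratified_split(samples, labels, 0.25, seed=42)
print(len(X_train), len(X_test), y_test.count(0), y_test.count(1))
```

Fixing the seed makes the split reproducible, which helps when several models are compared on the same data.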

Step 8: Evaluation Metrics:

Evaluate the performance of the trained models using relevant evaluation metrics such as
accuracy, precision, recall, F1-score, and ROC-AUC.
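The threshold-based metrics can be computed directly from the four confusion-matrix counts: precision = TP / (TP + FP), recall = TP / (TP + FN), and F1 is the harmonic mean of the two. The sketch below assumes the label convention of the dataset used later in this report (1 means legitimate, 0 means malware), so the positive class is passed as 0. The function name and the sample predictions are hypothetical, and ROC-AUC is left out because it needs prediction scores rather than hard labels.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# 0 = malware (positive class here), 1 = legitimate.
y_true = [0, 0, 0, 1, 1, 1, 1, 1]
y_pred = [0, 0, 1, 1, 1, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred, positive=0)
print(metrics)
```

For ransomware detection, recall on the malware class usually deserves the most attention, because a missed sample is far more costly than a false alarm.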

Step 9: Model Deployment:

Deploy the trained machine learning models into the production environment for real-time
ransomware detection. Integrate the models with existing security infrastructure such as
endpoint protection systems, network intrusion detection systems (NIDS), and Security
Information and Event Management (SIEM) platforms.

Step 10: Real-time Monitoring:

Continuously monitor incoming data streams in real-time using the deployed machine
learning models. Analyze data patterns and identify anomalies indicative of ransomware
activity.
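One simple way to monitor a stream in real time is a sliding window over event timestamps: if a process modifies more files within the window than a threshold allows, it is flagged. The sketch below is an illustration under assumed values. The class name, the 10-second window and the limit of three events are hypothetical, and a deployed system would combine a rule like this with the trained model's output.

```python
from collections import deque


class WriteRateMonitor:
    """Flag bursts of file-modification events within a sliding time window."""

    def __init__(self, window_seconds=10.0, max_events=5):
        self.window_seconds = window_seconds
        self.max_events = max_events
        self._timestamps = deque()

    def observe(self, timestamp):
        """Record one event; return True if the current rate is anomalous."""
        self._timestamps.append(timestamp)
        # Drop events that have fallen out of the window.
        while timestamp - self._timestamps[0] > self.window_seconds:
            self._timestamps.popleft()
        return len(self._timestamps) > self.max_events


monitor = WriteRateMonitor(window_seconds=10, max_events=3)
timestamps = [0, 20, 40, 100, 101, 102, 103]  # seconds; a burst starts at 100
flags = [monitor.observe(t) for t in timestamps]
print(flags)
```

The deque keeps memory bounded by the number of events inside the window, so the monitor can run continuously on a busy endpoint.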

Step 11: Response Mechanisms:

Implement automated response mechanisms to mitigate ransomware threats detected by the
system. Response actions may include isolating affected devices, quarantining malicious
files, and alerting security personnel for further investigation.
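The response actions listed above can be graded by how confident the detector is, so that disruptive steps such as isolating a device are reserved for the strongest detections. The following sketch assumes a model score between 0 and 1; the thresholds, action names and function name are hypothetical, and the actions are returned as strings rather than executed.

```python
def choose_response(score, alert_threshold=0.5,
                    quarantine_threshold=0.8, isolate_threshold=0.95):
    """Map a detection score in [0, 1] to an escalating list of actions."""
    actions = []
    if score >= alert_threshold:
        actions.append("alert_security_team")
    if score >= quarantine_threshold:
        actions.append("quarantine_file")
    if score >= isolate_threshold:
        actions.append("isolate_host")
    return actions


for score in (0.30, 0.60, 0.85, 0.99):
    print(score, choose_response(score))
```

Keeping the decision logic separate from the code that carries out each action makes the thresholds easy to review and adjust as false-positive rates become known.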

Step 12: Model Maintenance and Updates:


Regularly monitor the performance of the deployed models and update them as needed to
adapt to new ransomware variants and evolving attack techniques. Retrain the models
periodically using fresh data to ensure their effectiveness over time.
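A possible way to decide when retraining is due is to track recall over the most recent confirmed ransomware samples and trigger retraining when it drops below a target. The sketch below illustrates this under assumed values; the class name, window size and minimum recall are hypothetical.

```python
from collections import deque


class RecallTracker:
    """Track detection outcomes for recent confirmed ransomware samples."""

    def __init__(self, window=100, min_recall=0.9):
        # Each entry is True if the deployed model detected the sample.
        self.outcomes = deque(maxlen=window)
        self.min_recall = min_recall

    def record(self, detected):
        self.outcomes.append(bool(detected))

    def recall(self):
        """Recall over the window, or None if nothing has been recorded."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        current = self.recall()
        return current is not None and current < self.min_recall


tracker = RecallTracker(window=4, min_recall=0.75)
for detected in (True, True, True, True):
    tracker.record(detected)
print(tracker.recall(), tracker.needs_retraining())

# Two missed samples push the oldest outcomes out of the window.
tracker.record(False)
tracker.record(False)
print(tracker.recall(), tracker.needs_retraining())
```

A bounded window makes the tracker react to recent variants instead of averaging over the model's whole history.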

CHAPTER 7
CONCLUSION AND FUTURE ENHANCEMENT

7.1 CONCLUSION

Malware, or malicious applications, may cause catastrophic damage not only to computer
systems but also to data centers and to web and mobile applications across various
industries, particularly financial and healthcare institutions. Ensuring the safety of
stakeholders’ data from malicious entities is a major challenge that leads us towards the
concept of malware detection and prevention. Artificial Intelligence (AI) can be an
effective solution to adopt for the development of anti-malware systems. With this
direction, this study presented a detailed review of malware detection techniques and
approaches. At first, we attempted to provide a clear overview of malware and artificial
intelligence. An overview of existing malware detection systems was discussed in section
III (B), followed by identifying the limitations of existing applications. Like every
system, malware detection approaches have a number of limitations alongside the
facilities and improvements they bring over previous versions. So far, our findings
indicate that AI is a promising domain for the development of anti-malware systems that
detect and prevent malware attacks or security risks in software applications. To
conclude, we discussed several ideas to overcome the identified limitations, and we aim
to continue our effort towards significant accomplishments in the domain of malware
detection and prevention.
7.2 FUTURE ENHANCEMENT

Data quality and quantity: A vast amount of high-quality data is needed to train machine
learning models effectively. However, obtaining high-quality data for ransomware
detection is challenging due to the limited availability of labeled ransomware samples.

Rapidly evolving ransomware: Ransomware is a constantly changing threat, with new
variants and attack techniques being developed regularly. This makes it challenging to
build machine learning models that can detect all ransomware accurately and quickly.

Preprocessing data for ransomware detection also presents several challenges. Ransomware
often employs obfuscation techniques to evade detection, such as encrypting the payload
or using anti-analysis mechanisms. This can make it difficult to extract relevant data
features and to identify patterns that distinguish ransomware from benign software. In
addition, ransomware may use legitimate system functions that are difficult to
distinguish from malicious behavior, requiring advanced feature engineering and modeling
techniques.

Developing more robust and accurate models: Researchers must build more robust and
precise machine learning models that detect a wide range of ransomware variants and
attack techniques. This can be achieved through advanced techniques such as deep
learning and ensemble learning.

Incorporating real-time detection capabilities: Ransomware-detection systems must
incorporate real-time detection capabilities to quickly identify and prevent ransomware
attacks. This can be achieved through the use of real-time monitoring and analysis
techniques.

Collaboration and sharing of data: Collaboration and sharing of data among researchers
and organizations can help develop more effective ransomware-detection systems. This can
help build more comprehensive datasets for training and testing machine learning models.

Developing effective machine-learning-based ransomware-detection systems is challenging
for the reasons above. However, with advanced techniques and collaboration among
researchers and organizations, it is possible to develop more robust and accurate
ransomware-detection systems.
# Notebook setup (Jupyter magics): install dependencies and pin scikit-learn.
%pip install pefile
%pip install mlxtend
%pip install tpot
%pip uninstall scikit-learn -y
%pip install scikit-learn==0.23.1

# Standard library
import os
import pickle
import warnings
from sys import getsizeof

# Data handling and plotting
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
import seaborn as sns

# PE file parsing and model persistence
import pefile
import joblib

# Machine learning
import sklearn.ensemble as ek
from sklearn import tree, linear_model
from sklearn.feature_selection import SelectFromModel
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import confusion_matrix
from sklearn.pipeline import make_pipeline
from sklearn import preprocessing
from sklearn import svm
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from statsmodels.stats.outliers_influence import variance_inflation_factor as vif
from mlxtend.plotting import plot_confusion_matrix

warnings.filterwarnings("ignore")

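# Load the dataset; fields are separated by '|'.
df = pd.read_csv("Ransomware.csv", sep='|')

# In-memory size of the DataFrame in GB.
initial_size = getsizeof(df) / (1024.0 ** 3)
print("Size of DataFrame: {} GB".format(initial_size))

# Class balance: 1 means legitimate, 0 means malware.
df.legitimate.value_counts()

# Convert the label column to the categorical dtype.
df.legitimate = df.legitimate.astype('category')
df.legitimate

# Pie chart of the class balance. The labels are derived from the class values
# because value_counts() orders by frequency, not by class.
counts = df.legitimate.value_counts()
labels = ['Safe' if int(c) == 1 else 'Ransomware' for c in counts.index]
plt.pie(counts.values.tolist(), labels=labels, autopct='%.2f%%')
plt.legend()
plt.show()

# Basic shape and schema checks.
df.md5.nunique()  # number of distinct file hashes
df.md5.shape[0]   # number of rows
df.shape[1]       # number of columns
df.columns
df.dtypes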

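# NOTE: the listing omits the step that defines X_train, X_test, y_train and y_test.
# The three lines below are an assumed reconstruction: numeric columns as features,
# 'legitimate' as the target, and a hypothetical 80/20 split.
X = df.drop(columns=['md5', 'legitimate']).select_dtypes(include='number')
y = df['legitimate']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# The two parts should add up to the number of rows in df.
print(X_test.shape[0] + X_train.shape[0])
print('Training labels shape:', y_train.shape)
print('Test labels shape:', y_test.shape)
print('Training features shape:', X_train.shape)
print('Test features shape:', X_test.shape)

i = 0  # assumed example index; the original listing leaves i undefined
X_test.iloc[i]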

# Ransomware simulation: encrypts a single demo file created by this script.
import os
import getpass
from cryptography.fernet import Fernet

# Generate a fresh symmetric key for this run.
key = Fernet.generate_key()

# Work on the current user's Desktop (Windows path layout).
username = getpass.getuser()
url = 'C:\\Users\\' + username + '\\Desktop'
print(url)
os.chdir(url)
print(os.getcwd())

# Create the demo file that will be encrypted.
with open("demo.txt", 'w') as f:
    f.write("hello world")

getPermission = input('--> Enter Yes to give permission: ')
if 'yes' in getPermission.lower():
    # Save the key so that the file can be decrypted later.
    with open('filekey.txt', 'wb') as filekey:
        filekey.write(key)

    with open("demo.txt", 'r') as file:
        original = file.read()

    # Encrypt the contents and overwrite the demo file with the ciphertext.
    fernet = Fernet(key)
    encrypted = fernet.encrypt(original.encode())
    with open('demo.txt', 'wb') as encrypted_file:
        encrypted_file.write(encrypted)
else:
    print('thank you')
