
Final Synopsis

Uploaded by

Sanjana.S
VISVESVARAYA TECHNOLOGICAL UNIVERSITY

Jnana Sangama, Belagavi – 590018.

A Project Synopsis on

ML-Based Cross-Platform Malware Detection
Submitted in partial fulfilment of the requirement for the Eighth semester
Bachelor of Engineering
In
Computer Science and Engineering

Submitted by:
BINDHU SHREE G V 1SJ20CS027
CHANDAN GOWDA N 1SJ20CS032
SANJANA S 1SJ20CS128
SHWETHASHREE K V 1SJ20CS140

Under the guidance of


BHAVYA R A
Assistant Professor
Dept. of CSE, SJCIT

SJC INSTITUTE OF TECHNOLOGY


DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
CHICKBALLAPUR-562101
2023-2024
ABSTRACT

Malware is one of the most significant threats facing internet users today. Malware
is any software intentionally designed to cause damage to a computer, server, client, or
computer network. Many types of malware exist, including computer viruses,
worms, Trojan horses, ransomware, spyware, adware, rogue software, wipers, and
scareware. Polymorphic malware is a kind of malicious software that is more
adaptable than previous generations of viruses: it constantly modifies its signature
traits to avoid being identified by traditional signature-based malware detection
models. To identify malicious threats, we will use a number of machine learning
techniques, which can detect malware by identifying its behaviour and other
characteristics. The proposed approach is based on computing the difference in
correlation symmetry integrals. We aim to demonstrate that machine learning
algorithms can effectively detect malware, including polymorphic malware, and so
help to improve the security of computer systems and networks for internet users.
TABLE OF CONTENTS
Sl. No. Chapters
1. Introduction
2. Literature Survey
3. Objectives
4. Study area and Methodology
5. Software and Hardware Requirements
6. Expected Outcome
References
Chapter 1

INTRODUCTION

Malware is a major threat to the security of computer systems and networks, and cyberattacks are
currently one of the most pressing concerns in modern technology. A cyberattack exploits a system’s
flaws for malicious purposes, such as stealing from it, changing it, or destroying it. Malware is one
example of a cyberattack: it is any program or set of instructions that is designed to harm a
computer, user, business, or computer system. The term “malware” encompasses a wide range of threats,
including viruses, Trojan horses, ransomware, spyware, adware, rogue software, wipers, and scareware.
Malicious software typically runs without the user’s knowledge or consent. Traditional
signature-based malware detection methods are becoming increasingly ineffective against new and
emerging malware strains. Machine learning (ML) algorithms have the potential to overcome these
limitations by detecting malware based on its behaviour and other characteristics. Both static and
dynamic analysis may be used to identify behavioural similarities between members of the same family
of malware. Static analysis examines a suspicious file’s contents without actually running it, whereas
dynamic analysis takes the file’s behaviour into account by tracking data flows, recording function
calls, and adding monitoring code to running binaries. Machine learning algorithms can leverage such
static and behavioural artefacts to describe the ever-evolving structure of contemporary malware,
allowing them to identify increasingly complex malware attacks that could otherwise evade
signature-based techniques. Because machine learning-based solutions do not rely on signatures, they
are more effective against newly released malware. Deep learning algorithms, which can perform
feature engineering on their own, can be used to obtain and represent features more accurately.
Our synopsis presents a comprehensive survey of ML algorithms for malware analysis and detection. In
the introduction, we establish the need for ML-based malware detection and discuss the advantages of
ML over traditional signature-based methods. We then give a brief overview of the types of ML
algorithms that can be used for malware detection and of the features that can be extracted from
malware samples for classification. Finally, we review state-of-the-art ML-based malware detection
systems and their performance.
Chapter 3
OBJECTIVES

• To investigate how machine learning can be applied to malware detection in order
to detect unknown malware.
• To develop malware detection software that uses machine learning to detect
unknown malware.
• To validate that malware detection based on machine learning can achieve a high
accuracy rate with a low false positive rate.
• To detect malware effectively in specific types of files, such as executable files,
PDFs, or images.
Chapter 4

STUDY AREA AND METHODOLOGY

• Dataset:

Collect a diverse dataset of malware samples. The collection consists of many data
files containing log data for various types of malware. The features recovered from
these logs can be used to train a wide variety of models.

• Pre-Processing:

The data are stored in the file system as binary code, and the files themselves are
unprocessed executables. Unpacking the executables requires a protected
environment, such as a virtual machine (VM).
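
As a rough illustration of working with unprocessed executables as raw bytes, the sketch below computes a few simple static features without executing the file. The function name `byte_features` and the three features are assumptions chosen for illustration; they are not the project's actual feature set.

```python
import math
from collections import Counter


def byte_features(data: bytes) -> dict:
    """Compute simple static features from raw bytes without executing them."""
    total = len(data)
    if total == 0:
        return {"size": 0, "entropy": 0.0, "printable_ratio": 0.0}
    counts = Counter(data)  # maps each byte value (0-255) to its frequency
    # Shannon entropy in bits per byte: 0 for constant data, 8 for uniform data.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    # Share of bytes that fall in the printable ASCII range.
    printable = sum(c for b, c in counts.items() if 32 <= b < 127)
    return {"size": total, "entropy": entropy, "printable_ratio": printable / total}


# Toy inputs only; a real sample would be read with open(path, "rb").read()
# inside the protected environment.
print(byte_features(b"AAAA"))
print(byte_features(bytes(range(256))))
```

Byte entropy is a commonly used hint that a file may be packed or encrypted, although a high value on its own does not show that a file is malicious.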

• Feature Extraction:

We will build a smaller set of features from a larger set; this technique is
commonly used to maintain the same degree of accuracy while using fewer features.
The goal is to refine the existing dataset of dynamic and static features by keeping
those that are most helpful and eliminating those that are not valuable for data
analysis.
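
One simple way to eliminate features that carry no information is to drop columns whose values never change across samples. The sketch below does this with a variance threshold. This is a swapped-in baseline technique, not necessarily the reduction method the project will use, and the feature names and values are made up.

```python
from statistics import pvariance


def drop_low_variance(rows, names, threshold=0.0):
    """Keep only feature columns whose population variance exceeds threshold."""
    columns = list(zip(*rows))  # transpose: one tuple per feature column
    keep = [i for i, col in enumerate(columns) if pvariance(col) > threshold]
    reduced = [[row[i] for i in keep] for row in rows]
    return reduced, [names[i] for i in keep]


# Hypothetical feature names and toy values, for illustration only.
names = ["entropy", "has_header", "import_count"]
rows = [
    [7.9, 1, 3],
    [4.2, 1, 40],
    [7.5, 1, 5],
    [5.0, 1, 55],
]
reduced, kept = drop_low_variance(rows, names)
print(kept)  # "has_header" is constant across samples, so it is dropped
```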

• Feature Selection:

Feature selection is performed after feature extraction, which may surface
additional features. It is a crucial step for improving accuracy, simplifying the
model, and reducing overfitting, as it involves choosing a subset of features from
the pool of newly identified attributes.
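
To make the selection step concrete, the sketch below ranks discrete features by information gain with respect to the malware/benign label and keeps the top k. Information gain is one common selection criterion and is assumed here for illustration; the synopsis does not state which criterion will be used, and the feature names and data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature_values):
        subset = [y for x, y in zip(feature_values, labels) if x == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


def select_top_k(features, labels, k):
    """Rank named discrete features by information gain and keep the top k."""
    ranked = sorted(
        features,
        key=lambda name: information_gain(features[name], labels),
        reverse=True,
    )
    return ranked[:k]


# Toy data: 1 = malware, 0 = benign. Feature names are hypothetical.
labels = [1, 1, 1, 0, 0, 0]
features = {
    "calls_crypto_api": [1, 1, 1, 0, 0, 0],  # separates the classes perfectly
    "has_gui": [1, 0, 1, 0, 1, 0],           # weakly informative
    "is_signed": [0, 0, 1, 1, 1, 1],         # partly informative
}
print(select_top_k(features, labels, 2))
```

Features with near-zero gain, such as `has_gui` in this toy data, are the ones a selection step would discard.
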
Chapter 5
SOFTWARE AND HARDWARE REQUIREMENTS
5.1 Hardware Requirements:
1. Processor: Intel Core Duo, 2.0 GHz or higher.
2. RAM: 8 GB or more.
3. Hard disk: 80 GB or more.
4. Monitor: 15" CRT or LCD monitor.

5.2 Software Requirements:


1. Operating System: Windows 10 or later.
2. IDE used: Jupyter Notebook.
3. Programming Language Used: Python 3.9.

5.3 Functional Requirements:

1. Malware Detection Algorithm: The system must employ machine learning techniques for
malware detection, including Naive Bayes, support vector machine (SVM), J48 (a C4.5
decision tree), random forest (RF), and a proposed approach.
2. High Detection Accuracy: The selected algorithm must achieve a high detection ratio,
ensuring accurate identification of malicious threats.
3. Confusion Matrix: The system should generate a confusion matrix to measure false
positives and false negatives, providing additional performance insights.
4. Comparison of Classifiers: The system must compare the performance of decision tree (DT),
convolutional neural network (CNN), and SVM algorithms in terms of detection accuracy,
particularly at a small false positive rate (FPR).
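
The confusion matrix and classifier comparison requirements above can be sketched as follows. The classifier names, labels, and predictions are made-up illustrations and are not project results; the helper names `confusion_matrix` and `summarize` are likewise assumptions.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 = malware, 0 = benign."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def summarize(y_true, y_pred):
    """Derive accuracy, detection rate (TPR), and false positive rate (FPR)."""
    tp, fp, tn, fn = confusion_matrix(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "detection_rate": tp / (tp + fn) if tp + fn else 0.0,
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
    }


# Made-up labels and predictions, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predictions = {
    "classifier_a": [1, 1, 1, 0, 0, 0, 0, 0, 0, 1],
    "classifier_b": [1, 1, 1, 1, 0, 0, 1, 1, 0, 0],
}
for name, y_pred in predictions.items():
    print(name, confusion_matrix(y_true, y_pred), summarize(y_true, y_pred))
```

In this toy data both classifiers have the same accuracy but different false positive rates, which is why the comparison reports FPR alongside accuracy.
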
5.4 Non-Functional Requirements:

1. Performance: The system should demonstrate high performance in terms of
accuracy and efficiency in detecting polymorphic malware.
2. Security: The system must enhance the security of computer networks by
effectively identifying and mitigating malicious threats.
3. Robustness: The system should be robust enough to handle increasingly common and
complex malicious software.
4. Technological Innovation: The system should incorporate innovative techniques and
approaches to keep up with evolving cyber threats and malware.
Chapter 6
EXPECTED OUTCOME

I. Improved ability to detect new and emerging malware strains.

II. Prevention of user access to malicious websites.

III. Improved detection accuracy.

IV. Reduced false positive rate.



REFERENCES

[1] Akhtar, M.S.; Feng, T., “Malware Analysis and Detection Using Machine Learning
Algorithms” (2022), DOI: 10.3390/sym14112304.
[2] Akshit Kamboj, Priyanshu Kumar, Amit Kumar Bairwa, “Detection of Malware in
Downloaded Files Using Various Machine Learning Models” (2022),
DOI: 10.1016/j.eij.2022.12.002.
[3] Raj Sinha, “Study of Malware Detection Using Machine Learning”,
DOI: 10.13140/RG.2.2.11478.16963.
[4] Souri, Hosseini, “A State-of-the-Art Survey of Malware Detection Approaches
Using Data Mining Techniques” (2018), Hum. Cent. Comput. Inf. Sci.,
DOI: 10.1186/s13673.
