Naal

This project report details the development of a malware detection system using machine learning, focusing on the increasing cyber threats faced by the banking sector. It covers various malware detection techniques, methodologies for system creation, and the tools and technologies required for implementation. The report emphasizes the importance of proactive defense strategies and the need for continuous adaptation to evolving malware tactics.

Uploaded by

Arushi

Project Report:

Malware detection system using machine learning

Abstract

The banking sector has become increasingly reliant on digital infrastructure, which, while
enhancing efficiency and accessibility, has also exposed it to a range of cyber threats. This
project explores the nature, scope, and impact of cyber crimes targeting banking institutions.
It highlights key types of cyber attacks such as phishing, identity theft, ATM fraud,
ransomware, and insider threats. Through detailed case studies, the report examines how
these crimes are executed and their devastating effects on financial stability and public trust.

The study also investigates the advanced tools and technologies used to safeguard banking
operations, including firewalls, intrusion detection systems, encryption, and AI-based threat
monitoring. Furthermore, it reviews the legal and regulatory frameworks governing
cybersecurity in the financial sector and evaluates the role of law enforcement and forensic
teams in investigating such crimes.

By examining these threats, safeguards, and practices, this project aims to present a
comprehensive understanding of how cyber security is implemented in the banking domain,
the challenges involved, and the emerging trends that will shape its future. It emphasises
the importance of proactive defense strategies, robust policy enforcement, and inter-agency
collaboration in mitigating the risks posed by cyber crimes.
2. Literature Review

 Previous Research: Summarise key research works on malware detection techniques
such as behaviour-based analysis, signature-based detection, and heuristic-based
approaches.
 Malware Evolution: Discuss how malware has evolved over the years and the
challenges in detecting newer types of malware.
 Current Malware Detection Tools: Briefly explain existing malware detection tools
like ClamAV, Windows Defender, McAfee, Sophos, etc.
3. Malware Detection Techniques

 Signature-Based Detection: Explain how malware is detected based on predefined
patterns or signatures.
 Heuristic-Based Detection: Describe how behaviour patterns are analysed to identify
malware without relying on signatures.
 Machine Learning-Based Detection: Discuss the growing use of machine learning
algorithms for detecting malware, including supervised and unsupervised learning
models.
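
As a rough illustration of signature-based detection, the sketch below compares the SHA-256 digest of a sample against a set of known-bad digests. The names KNOWN_BAD_SHA256, sha256_of, and is_known_malware are hypothetical, and the single "signature" is the digest of a toy byte string, not a real malware signature.

```python
import hashlib

# Hypothetical signature database: SHA-256 digests of known-malicious files.
# The only entry is the digest of a toy payload, not a real signature.
KNOWN_BAD_SHA256 = {
    hashlib.sha256(b"toy-malicious-payload").hexdigest(),
}


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def is_known_malware(data: bytes, signatures=KNOWN_BAD_SHA256) -> bool:
    """Flag the sample if its digest matches a stored signature."""
    return sha256_of(data) in signatures


print(is_known_malware(b"toy-malicious-payload"))   # exact match with the database
print(is_known_malware(b"toy-malicious-payload!"))  # one extra byte, so no match
```

Changing a single byte produces a different digest, which is one reason exact-match signatures are usually combined with heuristic or machine learning approaches.
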
Methodology for Malware Detection System

The methodology for creating an effective malware detection system involves several key
stages, from data collection and preprocessing to model selection, training, evaluation, and
deployment. Below is a detailed explanation of each step involved in the process.

1. Data Collection

The first step in creating a malware detection system is to gather a comprehensive dataset
containing both malicious and benign samples. The quality and diversity of the dataset are
crucial to training a robust and accurate model. Common sources of malware datasets
include:

 CICIDS 2017 Dataset: Contains features extracted from network traffic to classify
benign and malicious activities.
 Kaggle Datasets: Publicly available datasets with both benign and malware samples.
 MalwareBusters Dataset: A dataset containing various types of malware samples,
often used for testing malware detection systems.

Data collection can also include other forms of malware such as worms, viruses, ransomware,
and trojans. These datasets typically contain characteristics such as:

 File metadata (size, creation date, etc.)
 Behaviour patterns (API calls, system changes, etc.)
 Network traffic characteristics (if malware is network-based)

2. Data Preprocessing

Before using the dataset for training a machine learning model, the data must undergo several
preprocessing steps:

Feature Extraction:

Feature extraction is the process of identifying and isolating relevant attributes from raw data
to aid in identifying patterns that represent malicious behaviour. Common features extracted
in malware datasets include:


 Static Features: Features derived from the file without executing it, such as file size,
file type, and hash values.
 Dynamic Features: Features collected when a file is executed, such as system calls,
file system changes, and network traffic.
 Behavioural Features: These include system and process behaviour during execution,
like memory consumption, process spawning, and API calls.
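
A minimal sketch of static feature extraction is shown below: it computes the size and hash mentioned above without executing the file. Byte entropy is added as an assumption (it is not listed in this report, but it is a commonly used static feature because packed or encrypted content tends to have high entropy). The function names are hypothetical.

```python
import hashlib
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def static_features(data: bytes) -> dict:
    """Collect simple static features from raw file bytes."""
    return {
        "size_bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
        "entropy": byte_entropy(data),  # assumption: not a feature named in the report
    }


# Toy input: every byte value repeated four times, so entropy is at its maximum.
sample = bytes(range(256)) * 4
features = static_features(sample)
print(features["size_bytes"], round(features["entropy"], 3))
```

For a real file, the bytes could be read with open(path, "rb").read() before calling static_features.
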
Data Normalisation:

Normalisation ensures that the input features are on a similar scale to help machine learning
models converge more quickly. Methods like Min-Max Scaling or Z-Score Standardisation
are commonly used.
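
The two scaling methods can be sketched in a few lines of plain Python. Min-Max Scaling maps each value v to (v - min) / (max - min), and Z-Score Standardisation maps it to (v - mean) / std. The feature values below are made up; scikit-learn's MinMaxScaler and StandardScaler provide equivalent transformations for real datasets.

```python
import statistics


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Centre values on mean 0 with unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


file_sizes_kb = [10, 20, 30, 40, 50]  # made-up feature column
print(min_max_scale(file_sizes_kb))   # [0.0, 0.25, 0.5, 0.75, 1.0]
print([round(z, 3) for z in z_score(file_sizes_kb)])
```

In practice the minimum, maximum, mean, and standard deviation should be computed on the training data only and then reused for the test data.
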

Handling Imbalanced Data:

In many malware detection datasets, the number of benign files typically outweighs the
number of malicious files. This class imbalance can bias the model toward predicting benign
files. Techniques to handle this include:
 Oversampling: Generating more samples for the minority class (malware).
 Under-sampling: Reducing the number of benign samples in the dataset.
 Synthetic Data Generation: Using techniques like SMOTE (Synthetic Minority Over-
sampling Technique) to generate new malware samples.
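
The simplest of these techniques, random oversampling, can be sketched as follows. This is not SMOTE: it duplicates existing minority samples rather than synthesising new ones. The function name and toy data are hypothetical.

```python
import random


def random_oversample(samples, labels, minority_label, seed=0):
    """Duplicate randomly chosen minority samples until both classes are equal in size."""
    rng = random.Random(seed)
    minority = [s for s, y in zip(samples, labels) if y == minority_label]
    if not minority:
        raise ValueError("no minority samples to draw from")
    majority_count = sum(1 for y in labels if y != minority_label)
    extra = max(0, majority_count - len(minority))
    new_samples = list(samples) + [rng.choice(minority) for _ in range(extra)]
    new_labels = list(labels) + [minority_label] * extra
    return new_samples, new_labels


X = [[1.0], [1.1], [0.9], [1.2], [5.0]]  # toy one-feature samples
y = [0, 0, 0, 0, 1]                      # 0 = benign, 1 = malware
X_bal, y_bal = random_oversample(X, y, minority_label=1)
print(y_bal.count(0), y_bal.count(1))    # 4 4
```

Resampling is normally applied to the training split only, so that duplicated samples cannot leak into the test set.
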
3. Model Selection

Selecting an appropriate model is critical for effective malware detection. Several machine
learning techniques can be employed for this purpose, each having its own strengths:

3.1 Supervised Learning Algorithms

Supervised learning requires a labeled dataset where both malicious and benign samples are
identified. The following models are commonly used for malware detection:

 Decision Trees: Decision trees work by making a series of binary decisions based on
feature values. These models are interpretable, which can be useful for understanding
how the system makes decisions.
 Random Forest: An ensemble method that combines multiple decision trees to
improve classification accuracy and reduce overfitting. It is often effective in
distinguishing malware from benign files.
 Support Vector Machines (SVM): SVMs are classifiers that work well in
high-dimensional feature spaces. They detect malware by finding the hyperplane that
best separates malicious and benign samples.
 Logistic Regression: Although a simpler algorithm, logistic regression is well suited
to binary classification tasks, particularly when the classes are close to linearly separable.
 K-Nearest Neighbours (KNN): This algorithm classifies a sample based on the
majority class of its nearest neighbours. It’s particularly useful when the decision
boundaries between classes are not easily definable.
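
To make one of these algorithms concrete, here is a minimal K-Nearest Neighbours sketch in plain Python. The two features (byte entropy and a count of suspicious API calls) and all values are made up for illustration; a real project would more likely use a library implementation such as scikit-learn's KNeighborsClassifier on scaled features.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training points closest to query."""
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical features: [byte entropy, suspicious API call count]
train_X = [[4.1, 0], [4.5, 1], [5.0, 2], [7.6, 9], [7.8, 12], [7.2, 8]]
train_y = ["benign", "benign", "benign", "malware", "malware", "malware"]

print(knn_predict(train_X, train_y, [7.5, 10]))  # malware
print(knn_predict(train_X, train_y, [4.4, 1]))   # benign
```
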
3.2 Deep Learning Models

In recent years, deep learning techniques have gained popularity due to their ability to learn
complex patterns in large datasets. Some examples include:

 Convolutional Neural Networks (CNNs): Used to detect malware in binary files,
CNNs are good at learning hierarchical patterns in data.
 Recurrent Neural Networks (RNNs): Can be used when sequential or time-series data,
such as network traffic or system logs, are involved.
 Auto-encoders: Used for anomaly detection where the model learns a compressed
representation of benign behaviour and flags deviations as potential malware.


4. Model Training

Once the dataset is preprocessed and the model is selected, the next step is training the
machine learning model. This involves the following sub-steps:

4.1 Training the Model

 Split the dataset into training and test sets (typically an 80/20 or 70/30 split).
 The model is trained using the training data, and the learning algorithm updates the
model parameters based on the features and labels in the dataset.
 For deep learning models, training may require specialised hardware like GPUs to
handle the complexity and size of the data.
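
The split described above can be sketched as a shuffled hold-out. The function name split_train_test and the toy data are hypothetical; scikit-learn's train_test_split offers the same idea plus a stratify option that keeps class proportions equal in both sets.

```python
import random


def split_train_test(samples, labels, test_ratio=0.2, seed=42):
    """Shuffle the indices, then hold out test_ratio of the data for testing."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    n_test = int(round(len(samples) * test_ratio))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    X_train = [samples[i] for i in train_idx]
    X_test = [samples[i] for i in test_idx]
    y_train = [labels[i] for i in train_idx]
    y_test = [labels[i] for i in test_idx]
    return X_train, X_test, y_train, y_test


X = [[i] for i in range(10)]          # toy samples
y = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]    # 0 = benign, 1 = malware
X_train, X_test, y_train, y_test = split_train_test(X, y, test_ratio=0.2)
print(len(X_train), len(X_test))      # 8 2
```
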

4.2 Hyper-parameter Tuning

 Hyper-parameters are parameters that are not learned from the data, such as the
learning rate, batch size, and tree depth. These hyper-parameters can be tuned using
techniques like Grid Search or Random Search to find the best combination that
maximises the model’s performance.
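
Grid Search simply tries every combination of candidate values and keeps the best one. The sketch below assumes a toy scoring function in which a single entropy threshold stands in for a model's hyper-parameters; the validation data and all names are hypothetical.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Evaluate every combination in param_grid and return the best-scoring one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Toy validation set: (entropy, label) with 1 = malware. Values are made up.
VALIDATION = [(3.9, 0), (4.8, 0), (5.5, 0), (6.9, 1), (7.4, 1), (7.9, 1)]


def accuracy_for(threshold):
    """Accuracy of the rule 'entropy > threshold means malware' on VALIDATION."""
    correct = sum(int(e > threshold) == label for e, label in VALIDATION)
    return correct / len(VALIDATION)


best, score = grid_search({"threshold": [4.0, 5.0, 6.0, 7.0]}, accuracy_for)
print(best, score)  # {'threshold': 6.0} 1.0
```

Random Search follows the same pattern but samples a fixed number of combinations instead of enumerating all of them, which scales better when the grid is large.
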
4.3 Cross-Validation

To ensure that the model is generalising well and not overfitting the training data,
cross-validation is used. In k-fold cross-validation, the dataset is split into k subsets (folds);
the model is trained on k-1 folds and tested on the remaining fold. The process is repeated
so that each fold serves once as the test set, and the scores are averaged.
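
A minimal sketch of how k-fold cross-validation partitions the data is given below. It yields index lists only; training and scoring a model on each pair is left out. The function name is hypothetical, and in practice the indices are usually shuffled first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


for train_idx, test_idx in k_fold_indices(6, 3):
    print(train_idx, test_idx)
```

Each sample appears in exactly one test fold, so every sample is used for testing once and for training k-1 times.
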

5. Model Evaluation

After training the model, it is essential to evaluate its performance on the test dataset. The
following metrics are commonly used to evaluate malware detection systems:
5.1 Accuracy

The percentage of correct predictions made by the model. However, in imbalanced datasets,
accuracy alone may not be sufficient.
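
A short worked example shows why accuracy can mislead on imbalanced data. The test set below is made up: 95 benign files and 5 malware files, scored against a "model" that never flags anything.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up test set: 95 benign files (0) and 5 malware files (1).
y_true = [0] * 95 + [1] * 5
always_benign = [0] * 100  # a "model" that never flags anything

detected = sum(1 for t, p in zip(y_true, always_benign) if t == 1 and p == 1)
print(accuracy(y_true, always_benign))  # 0.95
print(detected)                         # 0 malware samples detected
```
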

5.2 Precision, Recall, and F1-Score

 Precision: The percentage of true positives (correctly identified malware) among all
predicted positives.
 Recall: The percentage of true positives among all actual positives (i.e., how many
actual malware samples the model detected).
 F1-Score: The harmonic mean of precision and recall, offering a balance between the
two.

5.3 Confusion Matrix

A confusion matrix provides a detailed breakdown of model performance, showing the true
positives, false positives, true negatives, and false negatives.
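
The four confusion-matrix counts, and the precision, recall, and F1-score derived from them, can be computed as in the sketch below. Labels and predictions are made up, and the function names are hypothetical; scikit-learn's confusion_matrix and precision_recall_fscore_support provide the same quantities.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN, FN, treating `positive` as the malware class."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        else:
            if t == positive:
                fn += 1
            else:
                tn += 1
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn}


def precision_recall_f1(counts):
    """Derive precision, recall, and F1 from confusion-matrix counts."""
    tp, fp, fn = counts["tp"], counts["fp"], counts["fn"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # 1 = malware, 0 = benign
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
counts = confusion_counts(y_true, y_pred)
print(counts)                       # {'tp': 3, 'fp': 1, 'tn': 5, 'fn': 1}
print(precision_recall_f1(counts))  # precision, recall, and F1 are all 0.75 here
```
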

6. Deployment and Real-Time Detection

Once the model is trained and validated, it can be deployed into a production environment to
detect malware in real time. This involves:

 Integrating the detection system into network security infrastructure or endpoint
security solutions.
 Real-time monitoring of system behaviours and network traffic to identify malware
activities as they happen.
 Building automated response systems to isolate and neutralise malware once
detected.
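
As a loose sketch of the monitoring and response steps, the loop below scores each incoming event and collects those that cross a threshold for quarantine. Everything here is hypothetical: the event fields, the threshold of 0.8, and toy_score, which stands in for a trained model.

```python
def monitor(events, score_fn, threshold=0.8):
    """Score each event as it arrives and collect those to quarantine."""
    quarantined = []
    for event in events:
        score = score_fn(event)
        if score >= threshold:
            quarantined.append((event["id"], score))
    return quarantined


def toy_score(event):
    """Stand-in for a trained model: a score in [0, 1] from a made-up feature."""
    return min(1.0, event["suspicious_api_calls"] / 10)


stream = [
    {"id": "proc-1", "suspicious_api_calls": 1},
    {"id": "proc-2", "suspicious_api_calls": 9},
    {"id": "proc-3", "suspicious_api_calls": 12},
]
print(monitor(stream, toy_score))  # [('proc-2', 0.9), ('proc-3', 1.0)]
```

The threshold trades missed detections against false positives, so it would normally be chosen from validation results rather than fixed in advance.
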

7. Challenges and Future Work

 Evasion Techniques: Malware creators are constantly developing new techniques to
bypass detection. Future systems will need to adapt to these new strategies.
 Polymorphic Malware: The ability of malware to change its code to avoid detection is
a significant challenge that machine learning systems will need to address.
 False Positives: Minimising the number of benign files wrongly flagged as malware is
critical to ensuring the reliability of the system.
Tools and Technologies for Malware Detection Project

In a Malware Detection project, various tools, technologies, and frameworks are required to
effectively implement and deploy the system. Below are the key tools and technologies that
can be used for this project:

1. Programming Languages

The choice of programming languages is crucial for building and implementing the malware
detection system. Common programming languages used in malware detection projects
include:

 Python:
o Widely used for its simplicity and extensive support for machine learning and
data analysis libraries.
o Libraries such as Scikit-learn, TensorFlow, Keras, and PyTorch allow easy
implementation of machine learning models.
o Python is also useful for data preprocessing, feature extraction, and integration
with other tools.
 R:
o R is a powerful language for statistical computing and is useful for data
analysis and visualisation.
o Commonly used in academic settings for modelling and statistical analysis.
 Java:
o Used in enterprise-level applications for building scalable malware detection
systems.
o Java is robust and often used in network security tools.
 C/C++:
o Often used for developing low-level system tools such as antivirus engines,
malware analysis tools, and performance optimisation.
2. Machine Learning Frameworks

Machine learning forms the core of modern malware detection systems. These frameworks
help implement machine learning algorithms and deep learning models.

 Scikit-learn:
o A Python library that provides simple tools for data analysis and machine
learning. It supports various algorithms for classification, regression,
clustering, and dimensionality reduction, including Decision Trees, Random
Forest, KNN, and SVM.
o It is useful for traditional machine learning models in malware detection.

 TensorFlow:
o An open-source framework developed by Google that facilitates the
development of deep learning models. It is well-suited for larger datasets and
complex models, such as CNNs and RNNs.
o TensorFlow is particularly useful for developing malware detection systems
that use Deep Learning for feature extraction and classification.
 Keras:
o A high-level neural networks API written in Python, running on top of
TensorFlow. It simplifies the creation and training of deep learning models
such as Convolutional Neural Networks (CNNs), Recurrent Neural Networks
(RNNs), and Auto-encoders.
 PyTorch:
o Another popular open-source machine learning library, especially useful for
deep learning. PyTorch provides flexibility in building complex neural
network architectures and is suitable for research and development of
advanced malware detection systems.


5. Feature Extraction Tools

Feature extraction is essential for converting raw malware data (such as binary files or
network traffic) into features that a machine learning model can process.

 pefile:
o A Python library used for extracting metadata from Windows executable files
(PE files). It is used to extract static features such as file headers, section
names, and size, which are helpful for detecting malicious executables.
 Radare2:
o An open-source reverse engineering tool that can be used to analyse the
behaviour and structure of malware. It is useful for static analysis and feature
extraction from malware binaries.
 YARA:
o A tool for identifying and classifying malware through rules based on string
matching. It is often used in malware detection systems for signature-based
detection.

 Cuckoo Sandbox:
o An open-source automated malware analysis system. It is used to analyse the
behaviour of suspected malware in a controlled environment (sandbox),
providing dynamic features like system changes, API calls, and network
activity.
6. Evaluation Tools

Once the machine learning model is trained, it needs to be evaluated for performance. These
tools help in evaluating the efficiency and accuracy of malware detection systems:

 Scikit-learn:
o The same library used for model training also provides tools for model
evaluation, including functions for cross-validation, confusion matrices, and
performance metrics like accuracy, precision, recall, and F1-score.
 TensorBoard:
o A tool for visualising the training process of TensorFlow models. It helps
track the loss and accuracy of deep learning models and enables better
understanding and tuning of the model.

 Weka:
o A popular open-source tool for data mining and machine learning, useful for
evaluating classifiers and visualising results in a more user-friendly interface.
 XGBoost:
o A scalable, high-performance gradient boosting library widely used for
classification problems. It is especially effective for datasets with a large
number of features.
7. Deployment and Real-time Monitoring Tools

Once the malware detection system is developed and evaluated, it must be deployed into real-
world environments for continuous monitoring.

 Docker:
o A containerisation platform used to package the malware detection system and
all its dependencies into portable containers for deployment.
 Kubernetes:
o An open-source platform for managing containerised applications. It can help
deploy and scale malware detection models across multiple nodes in a
production environment.

 Apache Kafka:
o A distributed event streaming platform used to handle real-time data and
integrate malware detection systems into an organisation’s security
infrastructure.
 Splunk:
o A platform for searching, monitoring, and analysing machine-generated big
data. It is widely used for monitoring network activity and deploying security
information and event management (SIEM) systems.
6. Results and Discussion

 Detection Accuracy: Show the detection accuracy of your model compared to other
approaches.
 Case Study: Discuss real-world cases where your malware detection system would
have been useful.
 Limitations: Address any limitations or challenges you encountered, such as false
positives or difficulty in detecting polymorphic malware.
7. Conclusion

 Summary of Findings: Summarise the key outcomes of the project.

 Future Work: Suggest possible improvements, such as integrating more advanced
machine learning techniques or using hybrid models.
 Impact: Reflect on how effective malware detection systems can improve security in
banking, government, and other sectors.
