Automation Detection of Malware and Stenographical Content Using Machine Learning

Contact us for project abstract, enquiry, explanation, code, execution, documentation. Phone/Whatsap : 9573388833 Email : [email protected] Website : https://dcs.datapro.in/contact-us-2 Tags: btech, mtech, final year project, datapro, machine learning, cyber security, cloud computing, blockchain,

Uploaded by

dataprodcs


ABSTRACT

In recent times, malware attacks have been increasing. Image-based malware
attacks in particular are spreading worldwide, and many people receive harmful
images that carry malware hidden through a technique called steganography. The
existing system identifies only open malware and files from the internet.

Image-based malware cannot be identified or detected by such a system, so many
phishers use this technique to exploit their targets. This makes social media
platforms harmful to their users. To avoid these difficulties, we use machine
learning to find steganographic malware images (contents).

Our proposed methodology develops automated detection of malware and
steganographic content using machine learning. Steganography is the field of
hiding messages in apparently innocuous media (e.g., images), and steganalysis
is the field of detecting such covert content.

We propose a machine learning (ML) approach to steganalysis. The existing
system identifies only open malware and files from the internet, but in recent
times many people receive harmful malware-based images through steganography,
which makes social media platforms harmful to their users.

To avoid these difficulties, we apply machine learning to find steganographic
malware images (contents). We use a steganalysis method based on machine
learning with logistic classification. With it, users can spot and avoid malware
images shared on social media such as WhatsApp and Facebook without
downloading them. It can also be used for photo-sharing sites such as Google
Photos.

LIST OF FIGURES

Figure no.    Name of the Figure

4.1    Input JPG image
4.2    Output image
4.3    Change in Output image
4.4    Malware Detection simulation
4.5    RGB Layer Identification Step
5.1    LSB Graph
5.2    False rate graph
5.3    Output image
5.4    Output image
5.5    Binary code image

TABLE OF CONTENT

CHAPTER NO.    TITLE

1    INTRODUCTION
2    LITERATURE SURVEY
     2.1 Survey Walk Through
     2.2 TensorFlow
     2.3 OpenCV
     2.4 Keras
     2.5 NumPy
     2.6 Neural Networks
     2.7 Convolutional Neural Network
3    IMPLEMENTATION
     3.1 Image Processing
         3.1.1 Digital Image Processing
         3.1.2 Pattern Recognition
     3.2 Basic approaches to malware detection
     3.3 Machine learning
     3.4 Unsupervised Learning
     3.5 Supervised Learning
     3.6 Deep Learning
     3.7 Machine Learning Applications
4    METHODOLOGY
     4.1 Methodology
         4.1.1 Training Model
         4.1.2 Segmentation
     4.2 Classification
     4.3 Testing
5    RESULT
     5.1 Result
     5.2 Performance Analysis
6    CONCLUSION AND FUTURE SCOPE
     6.1 Future Scope
     6.2 Conclusion
7    APPENDIX
     a) Sample code

CHAPTER 1
INTRODUCTION

By definition, steganography is the technique or art of concealing one type of
data within another type of data. The word steganography derives from the Greek
words steganos (covered or concealed) and graphein (to write), thus meaning
"covered writing". The technique was historically used by governments to hide
sensitive information. One interesting form of steganography sends and receives
secret messages publicly.

Ideally, no one except the sender and the receiver can discover the hidden
message. Because the secret message is embedded in the cover file, anyone who
observes it as an ordinary file does not notice that it contains secret
information, which is what makes steganography secure.

Only a person who knows that the cover file contains secret information can
attempt to steal it. Machine learning is the main domain used for modern
steganography detection, because a modern problem needs a modern solution. Its
powerful prediction algorithms help to find stego content, and it can also be
useful for filtering content in transmission.
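As a rough illustration of how a prediction algorithm can flag stego content, the sketch below trains a one-feature logistic classifier in plain Python. Everything in it is an assumption made for the example, not the project's actual model or data: the synthetic "cover" rows are slow intensity ramps, the "stego" rows have every least significant bit overwritten with random payload bits, and the hypothetical feature `lsb_transition_rate` measures how often neighbouring LSBs differ.

```python
import math
import random

def lsb_transition_rate(pixels):
    """Fraction of adjacent pixel pairs whose least significant bits differ."""
    flips = sum((a ^ b) & 1 for a, b in zip(pixels, pixels[1:]))
    return flips / (len(pixels) - 1)

def make_cover(rng, n=256):
    """Synthetic cover (assumption): a slow ramp, so neighbouring LSBs rarely differ."""
    base, step = rng.randrange(256), rng.choice((8, 16))
    return [(base + i // step) % 256 for i in range(n)]

def embed_random_bits(pixels, rng):
    """Synthetic stego (assumption): overwrite every LSB with a random payload bit."""
    return [(p & 0xFE) | rng.getrandbits(1) for p in pixels]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(xs, ys, lr=1.0, epochs=500):
    """Fit p(stego | x) = sigmoid(w * x + b) by batch gradient descent."""
    w = b = 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= lr * sum(e * x for e, x in zip(errors, xs)) / n
        b -= lr * sum(errors) / n
    return w, b

def build_dataset(rng, n_pairs):
    """Return features and labels (0 = cover, 1 = stego) for n_pairs image pairs."""
    feats, labels = [], []
    for _ in range(n_pairs):
        cover = make_cover(rng)
        stego = embed_random_bits(cover, rng)
        feats += [lsb_transition_rate(cover), lsb_transition_rate(stego)]
        labels += [0, 1]
    return feats, labels

rng = random.Random(0)
train_x, train_y = build_dataset(rng, 50)
test_x, test_y = build_dataset(rng, 50)

# Centre the feature using training data only, then fit and evaluate.
mean = sum(train_x) / len(train_x)
w, b = train_logistic([x - mean for x in train_x], train_y)
predictions = [int(sigmoid(w * (x - mean) + b) >= 0.5) for x in test_x]
accuracy = sum(p == y for p, y in zip(predictions, test_y)) / len(test_y)
print(f"held-out accuracy: {accuracy:.2f}")
```

The synthetic classes are deliberately easy to separate. Real photographs have noisy LSBs of their own, so a practical detector would likely need richer features and real training images.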

Image steganography is one type of steganography. In template-based detection,
a common template describing the stego content is programmed in advance, and the
software identifies the hidden text by matching against that template.[5]

The techniques covered in "A review of LSB image steganography techniques" are
suited to small texts and URLs. Compared with other techniques they cannot find
large texts; they are mostly based on the LSB algorithm, and their accuracy
level is very low.[6]

"Detection of LSB alternate and LSB identical Steganography Using Gray Level
Run Length Matrix" uses an older model system that is useful for encrypting the
system. Grayscale image recognition is useful for encrypting text alone and is
not useful against malware attacks.[7]

"Enhance security and ability for Arabic text steganography using 'Kashida'
extensions" is very time-consuming for encrypting texts.
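For readers unfamiliar with the LSB algorithm these reviews refer to, the sketch below shows the basic idea in plain Python: each message bit replaces the least significant bit of one pixel value. It is a generic illustration and not the method of any cited paper. The function names, the flat list of grey levels standing in for an image, and the sample message are all assumptions made for the example.

```python
def text_to_bits(text):
    """Encode text as UTF-8 and list its bits, most significant bit first."""
    return [(byte >> shift) & 1
            for byte in text.encode("utf-8")
            for shift in range(7, -1, -1)]

def bits_to_text(bits):
    """Group bits into bytes (8 bits each) and decode them as UTF-8."""
    data = bytes(
        int("".join(str(bit) for bit in bits[i:i + 8]), 2)
        for i in range(0, len(bits), 8)
    )
    return data.decode("utf-8")

def embed_lsb(pixels, message):
    """Hide the message in the least significant bits of the first pixels."""
    bits = text_to_bits(message)
    if len(bits) > len(pixels):
        raise ValueError("message does not fit in the cover")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # clear the LSB, then set it to the bit
    return stego

def extract_lsb(pixels, n_bytes):
    """Read n_bytes of hidden data back from the least significant bits."""
    return bits_to_text([p & 1 for p in pixels[:n_bytes * 8]])

# A made-up 64-pixel "image" of grey levels in the range 0..255.
cover = [(37 * i) % 256 for i in range(64)]
message = "stego"
stego = embed_lsb(cover, message)
recovered = extract_lsb(stego, len(message.encode("utf-8")))
largest_change = max(abs(a - b) for a, b in zip(cover, stego))
print(recovered, largest_change)
```

Each pixel changes by at most one grey level, which is why the change is hard to see. The capacity is one bit per pixel, which is consistent with the remark that the technique suits short texts and URLs.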

CHAPTER 2
LITERATURE SURVEY

2.1 SURVEY WALKTHROUGH:

The domain analysis that we carried out for the project mainly involved
understanding neural networks.

2.2 TensorFlow:

TensorFlow is a free and open-source software library for dataflow and
differentiable programming across a range of tasks. It is a symbolic math
library, and is also used for machine learning applications such as neural
networks. It is used for both research and production at Google.

Features: TensorFlow provides stable Python (for version 3.7 across all
platforms) and C APIs, and, without an API backwards-compatibility guarantee,
C++, Go, Java, JavaScript and Swift (early release) APIs. Third-party packages
are available for C#, Haskell, Julia, MATLAB, R, Scala, Rust, OCaml, and
Crystal. "New language support should be built on top of the C API. However,
not all functionality is available in C yet." Some more functionality is provided
by the Python API.

Application: Among the applications for which TensorFlow is the foundation
are automated image-captioning software, such as DeepDream.

2.3 OpenCV:

OpenCV (Open Source Computer Vision Library) is a library of programming
functions mainly aimed at real-time computer vision.[1] Originally developed
by Intel, it was later supported by Willow Garage then Itseez (which was later
acquired by Intel[2]). The library is cross-platform and free for use under the
open-source BSD license.

OpenCV's application areas include:

• 2D and 3D feature toolkits
• Egomotion estimation
• Facial recognition system
• Gesture recognition
• Human–computer interaction (HCI)
• Mobile robotics
• Motion understanding
• Object identification
• Segmentation and recognition
• Stereopsis (stereo vision): depth perception from 2 cameras
• Structure from motion (SFM)
• Motion tracking
• Augmented reality

To support some of the above areas, OpenCV includes a statistical machine
learning library that contains:

• Boosting
• Decision tree learning
• Gradient boosting trees
• Expectation-maximization algorithm
• k-nearest neighbor algorithm
• Naive Bayes classifier
• Artificial neural networks
• Random forest
• Support vector machine (SVM)
• Deep neural networks (DNN)

Related software includes:

• AForge.NET, a computer vision library for the Common Language Runtime
  (.NET Framework and Mono).
• ROS (Robot Operating System). OpenCV is used as the primary vision
  package in ROS.
• VXL, an alternative library written in C++.
• Integrating Vision Toolkit (IVT), a fast and easy-to-use C++ library with an
  optional interface to OpenCV.
• CVIPtools, a complete GUI-based computer-vision and image-processing
  software environment, with C function libraries, a COM-based DLL, along with
  two utility programs for algorithm development and batch processing.
• OpenNN, an open-source neural networks library written in C++.

OpenCV Functionality:

• Image/video I/O, processing, display (core, imgproc, highgui)
• Object/feature detection (objdetect, features2d, nonfree)
• Geometry-based monocular or stereo computer vision (calib3d,
  stitching, videostab)
• Computational photography (photo, video, superres)
• Machine learning & clustering (ml, flann)
• CUDA acceleration (gpu)

Image Processing:

Image processing is a method of performing operations on an image in order to
get an enhanced image or to extract some useful information from it.

A basic definition is: "Image processing is the analysis and manipulation of a
digitized image, especially in order to improve its quality".

Digital Image:

An image may be defined as a two-dimensional function f(x, y), where x and y
are spatial (plane) coordinates, and the amplitude of f at any pair of coordinates
(x, y) is called the intensity or grey level of the image at that point.

In other words, an image is nothing more than a two-dimensional matrix
(three-dimensional in the case of coloured images) defined by the mathematical
function f(x, y), whose value at any point gives the pixel value at that point
of the image. The pixel value describes how bright that pixel is and what colour
it should be.
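The matrix view can be made concrete with a small plain-Python sketch. Nested lists stand in for the arrays that NumPy or OpenCV would normally provide, and the pixel values are made up. The helper names `f`, `split_channels` and `luma` are illustrative assumptions. The brightness weights are the standard ITU-R BT.601 luma coefficients.

```python
# A 2-row, 3-column greyscale image: one intensity (0..255) per pixel.
gray = [
    [0, 64, 128],
    [255, 32, 16],
]

def f(image, x, y):
    """Return the pixel value at column x, row y (rows are stored first)."""
    return image[y][x]

# A 2x2 colour image: each pixel holds three values (red, green, blue),
# which is what adds the third dimension.
colour = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]

def split_channels(image):
    """Separate an RGB image into three single-channel planes."""
    return [[[pixel[c] for pixel in row] for row in image] for c in range(3)]

def luma(pixel):
    """Approximate perceived brightness of an RGB pixel (BT.601 weights)."""
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)

red, green, blue = split_channels(colour)
print(f(gray, 2, 0))            # grey level at x=2, y=0
print(red)                      # the red plane as a 2-D matrix
print(luma(f(colour, 1, 0)))    # brightness of the pure green pixel
```

Note the indexing order: with row-major storage the row index y comes first, so f(x, y) reads `image[y][x]`.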

Image processing is basically signal processing in which the input is an image
and the output is an image or a set of characteristics associated with that
image, according to the requirement. Image processing basically includes the
following three steps: (1) importing the image; (2) analysing and manipulating
the image; (3) output, in which the result can be an altered image or a report
based on image analysis.
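The three steps can be sketched end to end in plain Python. This is only an illustration under stated assumptions: the image is a tiny made-up plain-text PGM (magic number P2) without comment lines, the manipulation chosen is a simple linear contrast stretch, and the function names `load_pgm`, `stretch_contrast` and `describe` are hypothetical. A real pipeline would more likely read files with OpenCV.

```python
PGM_TEXT = """P2
4 2
255
50 60 70 80
90 100 110 120
"""

def load_pgm(text):
    """Step 1, import: parse a plain-text PGM into rows of integers."""
    tokens = text.split()
    if tokens[0] != "P2":
        raise ValueError("not a plain PGM image")
    width, height, max_value = (int(t) for t in tokens[1:4])
    values = [int(t) for t in tokens[4:]]
    if len(values) != width * height:
        raise ValueError("pixel count does not match the header")
    rows = [values[r * width:(r + 1) * width] for r in range(height)]
    return rows, max_value

def stretch_contrast(image, max_value):
    """Step 2, manipulate: map the darkest pixel to 0 and the brightest to max_value."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:
        return [row[:] for row in image]  # flat image: nothing to stretch
    return [[(p - lo) * max_value // (hi - lo) for p in row] for row in image]

def describe(image):
    """Step 3, output as a report: summarise the image instead of returning pixels."""
    flat = [p for row in image for p in row]
    return {
        "width": len(image[0]),
        "height": len(image),
        "min": min(flat),
        "max": max(flat),
        "mean": sum(flat) / len(flat),
    }

image, max_value = load_pgm(PGM_TEXT)
enhanced = stretch_contrast(image, max_value)   # output as an altered image
report = describe(enhanced)                     # output as a report
print(enhanced)
print(report)
```

Here the output takes both forms named in step 3: `enhanced` is the altered image and `report` is a summary based on image analysis.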

Applications of Computer Vision:

Some of the major domains where computer vision is heavily used are listed
below.

• Robotics Application
    • Localization: determine robot location automatically
    • Navigation
    • Obstacles avoidance
    • Assembly (peg-in-hole, welding, painting)
    • Manipulation (e.g. PUMA robot manipulator)
    • Human Robot Interaction (HRI): intelligent robotics to interact with and
      serve people
• Medicine Application
    • Classification and detection (e.g. lesion or cells classification and tumor

coding necessary for writing deep neural network code. The code is hosted on
GitHub, and community support forums include the GitHub issues page, and a
Slack channel.

In addition to standard neural networks, Keras has support for convolutional
and recurrent neural networks. It supports other common utility layers like
dropout, batch normalization, and pooling.
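To show what two of these utility layers compute, here is a plain-Python sketch of 2x2 max pooling and of dropout in its "inverted" form, where surviving values are scaled up by 1 / (1 - rate). It illustrates the operations only and is not Keras code. The function names and the sample feature map are assumptions made for the example.

```python
import random

def max_pool_2x2(feature_map):
    """Downsample by keeping the maximum of each non-overlapping 2x2 window.

    A trailing odd row or column is dropped, as with "valid" padding.
    """
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1])
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]

def dropout(values, rate, rng):
    """Zero each value with probability `rate` and rescale the survivors.

    Used only during training; at inference time values pass through unchanged.
    """
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]

feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [3, 2, 1, 6],
]
pooled = max_pool_2x2(feature_map)
print(pooled)

activations = [1.0, 2.0, 3.0, 4.0]
dropped = dropout(activations, rate=0.5, rng=random.Random(0))
print(dropped)
```

Pooling halves each spatial dimension while keeping the strongest response in every window. The rescaling in dropout keeps the expected activation the same with and without the layer.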

Keras allows users to productize deep models on smartphones (iOS and
Android), on the web, or on the Java Virtual Machine. It also allows
distributed training of deep-learning models on clusters of graphics
processing units (GPU) and tensor processing units (TPU), principally in
conjunction with CUDA.

The Keras applications module provides pre-trained models for deep neural
networks. Keras models are used for prediction, feature extraction and
fine-tuning. This section explains Keras applications in detail.

Pre-trained models

A trained model consists of two parts: the model architecture and the model
weights. The weights are large files, so they have to be downloaded; the
pre-trained weights come from training on the ImageNet database. Some of the
popular pre-trained models are listed below:

• ResNet
• VGG16
• MobileNet
• InceptionResNetV2
• InceptionV3

2.5 NumPy:

NumPy (pronounced /ˈnʌmpaɪ/ (NUM-py) or sometimes /ˈnʌmpi/ (NUM-pee)) is a

library for the Python programming language, adding support for large, multi-
