Sat - 48.Pdf - Malicious Attacks Detection Using Machine Learning

The document proposes BotChase, a two-phased graph-based bot detection system that leverages both unsupervised and supervised machine learning. The first phase prunes presumable benign hosts using unsupervised learning techniques like k-means clustering. The second phase then uses this information to perform bot detection on the remaining hosts using supervised learning classifiers with high precision. Experimental results show that BotChase is able to detect multiple types of bots, is robust against zero-day attacks, and scales well to large networks. It outperforms existing flow-based detection methods.

Uploaded by

Vj Kumar

ABSTRACT

Bot detection using machine learning (ML), with network flow-level features, has
been extensively studied in the literature. However, existing flow-based
approaches typically incur a high computational overhead and do not completely
capture the network communication patterns, which can expose additional aspects
of malicious hosts. Recently, bot detection systems that leverage communication
graph analysis using ML have gained attention to overcome these limitations.

A graph-based approach is rather intuitive, as graphs are a true representation of
network communications. In this paper, we propose BotChase, a two-phased
graph-based bot detection system that leverages both unsupervised and
supervised ML. The first phase prunes presumable benign hosts, while the second
phase achieves bot detection with high precision. Our prototype implementation of
BotChase detects multiple types of bots and exhibits robustness to zero-day
attacks. It also accommodates different network topologies and is suitable for
large-scale data. Compared to the state-of-the-art, BotChase outperforms an end-
to-end system that employs flow-based features and performs particularly well in
an online setting.

TABLE OF CONTENTS
CHAPTER NO  TITLE

ABSTRACT
LIST OF FIGURES
LIST OF ABBREVIATIONS
1 INTRODUCTION
1.1 OVERVIEW
1.2 OBJECTIVE
1.3 SCOPE
2 LITERATURE SURVEY
3 METHODOLOGY
3.0 EXISTING SYSTEM
3.1 EXISTING SYSTEM DISADVANTAGES
3.2 PROPOSED WORK
3.3 SUPERVISED LEARNING
3.3.1 DECISION TREE
3.4 UNSUPERVISED LEARNING
3.4.1 K-MEANS
3.5 ADVANTAGES
3.6 SOFTWARE AND HARDWARE
3.7 SYSTEM STUDY
3.7.1 ECONOMICAL FEASIBILITY
3.7.2 TECHNICAL FEASIBILITY
3.7.3 SOCIAL FEASIBILITY
3.8 DATA FLOW DIAGRAM
3.8.1 INTRODUCTION TO UML
3.8.2 GOAL OF UML
3.9 UML DIAGRAM
3.9.1 USE CASE DIAGRAM
3.9.2 CLASS DIAGRAM
3.9.3 OBJECT DIAGRAM
3.9.4 STATE DIAGRAM
3.9.5 ACTIVITY DIAGRAM
3.9.6 SEQUENCE DIAGRAM
3.9.7 COLLABORATION DIAGRAM
3.9.8 COMPONENT DIAGRAM
3.9.9 DEPLOYMENT DIAGRAM
3.10 MODULES
3.10.1 ALGORITHM
4 RESULTS AND DISCUSSION, PERFORMANCE ANALYSIS
5 SUMMARY AND CONCLUSION
REFERENCES
APPENDICES
A. SCREENSHOTS
B. SOURCE CODE
C. PLAGIARISM REPORT

LIST OF FIGURES
FIGURE NO  NAME OF THE FIGURE

1 SYSTEM ARCHITECTURE
2 DATA FLOW DIAGRAM
3 USE CASE DIAGRAM
4 CLASS DIAGRAM
5 OBJECT DIAGRAM
6 STATE DIAGRAM
7 ACTIVITY DIAGRAM
8 SEQUENCE DIAGRAM
9 COLLABORATION DIAGRAM
10 COMPONENT DIAGRAM
11 DEPLOYMENT DIAGRAM

LIST OF ABBREVIATIONS

MATLAB Matrix Laboratory

PD Pandas

NLTK Natural Language Tool Kit


Computing

NX NetworkX

TTK Tkinter

CHAPTER 1
INTRODUCTION
Nowadays, everyone stores their information on their own systems, which raises the
problem of securing those systems. At the same time, cyber-attacks that can expose
personal data such as photos, social media accounts and chats are increasing, and bot
attacks have increased worldwide. Servers holding the data of lakhs of people are also
being hacked, and hacking one such server is equivalent to hacking the data of lakhs
of people.
A botnet, a collection of internet-connected devices each of which is called a bot, is
also used to carry out cyber-attacks. Using these bots, an attacker can hack even large
servers. Together the bots are called a bot army, and this army lets a botnet make
time-consuming tasks easier. Botnets can perform helpful tasks, but people use them
for malicious work, and they are a source of many malicious activities. Botnet models
include centralized, client-server, decentralized and peer-to-peer, and botnet attacks
include DDoS, phishing, cryptojacking, snooping, bricking, brute force and spambots.
Common botnet actions are email spam, financial breaches and targeted intrusions. A
bot herder can direct a collective of hijacked devices using remote commands. Once
your machine is infected it becomes a bot, and you may not even know. Botnets lead
to financial theft, information theft, sabotage of services and the selling of access to
other criminals. The 3 main components of botnet are the bots,
Botnet attacks have increased in recent years, and at the same time the number of
botnet detection frameworks has also increased.

The hacker can access a device only when his application is on the device. Once his
application starts running on the device, he can steal, change or destroy information.
The hacker can also steal money, usernames and passwords, change your confidential
data, and install and run any application he wants on your system. Any device
connected to the internet can be hacked. The most targeted devices are desktops and
laptops running Windows or macOS. Mobile phones are the next target, as more people
use them connected to the internet. In recent years the number of devices connected to
the internet has increased rapidly, and botnets built from connected devices have
become more prominent.
First, the hacker sends download links to the target device in order to infect it with
malware. An example is a Trojan horse ("Happy New Year! Click here to see magic"). If
the owner of the device does not know that the download link comes from an attacker
and clicks on it, the hacker's application is downloaded to the device and waits for
commands from the main system (the hacker's system). The hacker can then access
everything on the device. To avoid being attacked, the owner would need to recognize
every malware link. To stay away from malware links, the device should be able to
detect them, prevent the initial infection, or identify an existing infection. Botnet
attacks are hard to detect, and preventing them is more difficult. Yet we can still
take certain measures to prevent botnet attacks.

1.1 OVERVIEW
Cyberattacks are on the rise these days, and many systems are getting infected. In
the past, signature-based techniques were used to counter these attacks. However, as
technology developed, attacks became more sophisticated, so we use k-means and
decision trees to determine how many hosts were targeted as bots and how many were
not. If there is an attack, we find how many bots were detected and report the
number.
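
The role k-means plays in this approach can be sketched as follows. This is a minimal illustration, not the project's implementation: the single feature per host (the number of distinct peers it contacts), the sample hosts, the choice of k = 2, and the rule that the largest cluster is treated as presumably benign are all assumptions made for the example.

```python
import random
from collections import Counter

def kmeans_1d(values, k=2, iters=50, seed=0):
    """Plain Lloyd's k-means on a list of floats; returns (centroids, labels).

    Assumes the input contains at least k distinct values.
    """
    rng = random.Random(seed)
    centroids = rng.sample(sorted(set(values)), k)
    labels = None
    for _ in range(iters):
        # Assignment step: each value goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: abs(v - centroids[c]))
                      for v in values]
        if new_labels == labels:
            break  # assignments stopped changing, so the centroids are stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return centroids, labels

def prune_presumed_benign(host_features, k=2):
    """Return the hosts that fall outside the largest cluster.

    Treating the largest cluster as benign is an assumption of this sketch.
    """
    hosts = list(host_features)
    values = [float(host_features[h]) for h in hosts]
    _, labels = kmeans_1d(values, k)
    benign_cluster = Counter(labels).most_common(1)[0][0]
    return [h for h, lab in zip(hosts, labels) if lab != benign_cluster]

# Hypothetical data: host -> number of distinct peers contacted.
hosts = {
    "10.0.0.1": 3, "10.0.0.2": 4, "10.0.0.3": 2, "10.0.0.4": 5,
    "10.0.0.5": 3, "10.0.0.6": 120, "10.0.0.7": 135,
}
suspects = prune_presumed_benign(hosts)
print(suspects)
```

Only the hosts left in `suspects` would then be passed to a supervised classifier such as a decision tree, which keeps the second stage small.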

1.2 OBJECTIVE
A botnet is a collection of bots, agents in compromised hosts, controlled by
botmasters via command and control (C2) channels. A malevolent adversary controls
the bots through the botmaster, which could be distributed across several agents that
reside within or outside the network. Hence, bots can be used for tasks ranging from
distributed denial-of-service (DDoS), to massive-scale spamming, to fraud and
identity theft. While bots thrive for different sinister purposes, they exhibit a similar
behavioral pattern when studied up close. The intrusion kill chain dictates the
general phases a malicious agent goes through in order to reach and infest its target.

1.3 SCOPE
For this phase in BotChase, we evaluate four supervised learning (SL) techniques,
namely decision tree (DT), logistic regression (LR), support vector machine (SVM) and
feed-forward neural network (FNN). We use DT with the Gini instance split rule
algorithm, LR without regularization, and SVM with the Gaussian kernel and a soft
margin penalty of 1. Moreover, FNN is configured to use cross entropy as an error
function and 10 hidden layers of 1000 units each. The DT classifier shows the best
performance with the small dataset, as depicted in Table IV. It successfully detects
all bots in the test dataset, with only a single false positive (FP) out of the 366871
benign hosts. In contrast, all other classifiers are lackluster and unable to recall
even a single bot from the dataset. We believe this is because all classifiers, except
DT, rely on gradient descent for error correction. This implies that every single
node in the dataset will affect the end-hypothesis function. Thus, with a dataset that
is unbalanced, the hypothesis will be biased towards the benign hosts, which is the
case for LR, SVM and FNN.
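
The imbalance argument can be made concrete with a few lines of arithmetic. The sketch below is illustrative only: the benign-host count (366871) and the single false positive come from the text, while the number of bots in the test set (`N_BOTS = 10`) is a hypothetical placeholder, since the text does not state it. Modelling the other classifiers as labelling every host benign is likewise an assumption that matches "unable to recall even a single bot".

```python
def precision_recall_accuracy(tp, fp, fn, tn):
    """Return (precision, recall, accuracy) from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, accuracy

N_BENIGN = 366871  # benign hosts in the test dataset (from the text)
N_BOTS = 10        # hypothetical: the bot count is not given in the text

# Decision tree as described: every bot detected, one benign host misflagged.
dt = precision_recall_accuracy(tp=N_BOTS, fp=1, fn=0, tn=N_BENIGN - 1)

# A classifier biased towards the majority class: every host labelled benign.
all_benign = precision_recall_accuracy(tp=0, fp=0, fn=N_BOTS, tn=N_BENIGN)

for name, (p, r, a) in (("DT", dt), ("all-benign", all_benign)):
    print(f"{name}: precision={p:.4f} recall={r:.4f} accuracy={a:.6f}")
```

Under these assumptions the all-benign classifier still scores an accuracy above 99.99% while recalling no bots, which is why precision and recall, rather than accuracy, are the meaningful measures on such an unbalanced dataset.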

CHAPTER 2
LITERATURE SURVEY

2.1 Effective Botnet Detection Through Neural Networks on Convolutional Features
AUTHORS: Shao-Chien Chen, Yi-Ruei Chen, Wen-Guey Tzeng
ABSTRACT: Botnet is one of the major threats on the Internet for committing
cybercrimes, such as DDoS attacks, stealing sensitive information, spreading spams,
etc. It is a challenging issue to detect modern botnets that are continuously improving
for evading detection. In this paper, we propose a machine learning based botnet
detection system that is shown to be effective in identifying P2P botnets. Our
approach extracts convolutional version of effective flow-based features, and trains a
classification model by using a feed-forward artificial neural network. The
experimental results show that the accuracy of detection using the convolutional
features is better than the ones using the traditional features. It can achieve 94.7% of
detection accuracy and 2.2% of false positive rate on the known P2P botnet datasets.
Furthermore, our system provides an additional confidence testing for enhancing
performance of botnet detection. It further classifies the network traffic of insufficient
confidence in the neural network. The experiment shows that this stage can increase
the detection accuracy up to 98.6% and decrease the false positive rate up to 0.5%.
2.2 An approach for host based botnet detection system
AUTHORS: Yulia ALEKSIEVA, Hristo VALCHANOV, Veneta ALEKSIEVA.
ABSTRACT: Most serious occurrence of modern malware is Botnet. Botnet is a
rapidly evolving problem that is still not well understood and studied. One of the main
goals for modern network security is to create adequate techniques for the detection
and eventual termination of Botnet threats. The article presents an approach for
implementing a host-based Intrusion Detection System for Botnet attack detection.
The approach is based on a variation of a genetic algorithm to detect anomalies in a
case of attacks. An implementation of the approach and experimental results are
presented.
2.3 Towards using transfer learning for Botnet Detection
AUTHORS: Prapa Rattadilok, Basil Alothman
ABSTRACT: Botnet Detection has been an active research area over the last
decades. Researchers have been working hard to develop effective techniques to
detect Botnets. From reviewing existing approaches it can be noticed that many of
them target specific Botnets. Also, many approaches try to identify any Botnet activity
by analysing network traffic. They achieve this by concatenating existing Botnet
datasets to obtain larger datasets, building predictive models using these datasets
and then employing these models to predict whether network traffic is safe or
harmful. The problem with the first approaches is that data is usually scarce and
costly to obtain. By using small amounts of data, the quality of predictive models will
always be questionable. On the other hand, the problem with the second approaches
is that it is not always correct to concatenate datasets containing network traffic from
different Botnets. Datasets can have different distributions which means they can
downgrade the predictive performance of machine learning models. Our idea is
instead of concatenating datasets, we propose using transfer learning approaches
to carefully decide what data to use. Our hypothesis is “Predictive Performance can
be improved by using transfer learning techniques across datasets containing
network traffic from different Botnets”.
2.4 Development of an Intrusion Detection System Using a Botnet with the R
Statistical Computing System
AUTHORS: Takashi Yamanoue, Junya Murakami
ABSTRACT: Development of an intrusion detection system, which tries to detect
signs of technology of malware, is discussed. The system can detect signs of
technology of malware such as peer to peer (P2P) communication, DDoS attack,
Domain Generation Algorithm (DGA), and network scanning. The system consists of
beneficial botnet and the R statistical computing system. The beneficial botnet is a
group of Wiki servers, agent bots and analyzing bots. The script in a Wiki page of the
Wiki server controls an agent bot or an analyzing bot. An agent bot is placed between
a LAN and its gateway. It can capture every packet between hosts in the LAN and
hosts behind the gateway from the LAN. An analyzing bot can be placed anywhere in
the LAN or WAN if it can communicate with the Wiki server for controlling the
analyzing bot. The analyzing bot has R statistical computing system and it can
analyze data which is collected by agent bots.
2.5 An efficient botnet detection system for P2P botnet
AUTHORS: M. Thangapandiyan, P. M. Rubesh Anand
ABSTRACT: Peer-to-Peer (P2P) botnets are exploited by the botmasters for their
resiliency against the take down efforts. As the modern botnets are stealthier, the
traditional botnet detection approaches are not suitable for the botnet detection. In
this paper, an efficient botnet detection system is proposed for detecting the P2P
botnet. The proposed botnet detection system estimates the flow export using
NetFlow protocol. The packet flow is analyzed using three main components namely,
Exporter, Collector, and Analyzer. The exporter captures the packet and monitors the
contents of the packet. The collector captures the flow traffic and the analyzer
component initiates an automated analysis of traffic with the captured packet
information. The packet flow information is collected by virtual interface and physical
probe. The virtual interface is used for collecting the malicious traffic information
between the Virtual Machines (VMs) and the physical probe gathers malicious traffic
information between the network bridges connecting VMs. The information collected
from these techniques are analyzed for detecting the botnets in inter VM and intra
VM. Compared to the existing Dendritic Cell Algorithm (DCA), the proposed VM
based botnet detection system has minimal time consumption, increased detection
speed, and higher attack prevention ratio.
2.6 Overview of Botnet Detection Based on Machine Learning
AUTHORS: Xiaxin Dong, Jianwei Hu, Yanpeng Cui
ABSTRACT: With the rapid development of the information industry, the applications
of Internet of things, cloud computing and artificial intelligence have greatly affected
people's life, and the network equipment has increased with a blowout type. At the
same time, more complex network environment has also led to a more serious