Semi Supervised Machine Learning Approach For DDOS Detection

This document proposes a semi-supervised machine learning approach for detecting DDoS attacks. Existing approaches are divided into supervised, unsupervised, and semi-supervised categories. The proposed approach uses entropy estimation, co-clustering, information gain ratio, and extra trees algorithms. It filters irrelevant normal traffic to reduce false positives and increase accuracy. The unsupervised part reduces noise while the supervised part further decreases false positives and accurately classifies attacks. Experiments on public datasets achieved over 93% accuracy and less than 0.5% false positive rates, outperforming other methods.

Uploaded by

Tunnu Sunny

Semi-supervised Machine Learning Approach for DDoS Detection

ABSTRACT:
Even though advanced Machine Learning (ML) techniques have been adopted for DDoS detection,
the attack remains a major threat to the Internet. Most existing ML-based DDoS detection
approaches fall into two categories: supervised and unsupervised. Supervised ML approaches
rely on the availability of labeled network traffic datasets, whereas unsupervised ML
approaches detect attacks by analyzing the incoming network traffic. Both are challenged by
the large amount of network traffic data, low detection accuracy and high false positive
rates. In this paper, we present an online sequential semi-supervised ML approach for DDoS
detection based on network entropy estimation, co-clustering, information gain ratio and the
Extra-Trees algorithm. The unsupervised part of the approach filters out normal traffic data
that is irrelevant for DDoS detection, which reduces false positive rates and increases
accuracy. The supervised part further reduces the false positive rates of the unsupervised
part and accurately classifies the DDoS traffic. Various experiments were performed to
evaluate the proposed approach using three public datasets, namely NSL-KDD, UNB ISCX12 and
UNSW-NB15. Accuracies of 98.23%, 99.88% and 93.71% are achieved on the NSL-KDD, UNB ISCX12
and UNSW-NB15 datasets respectively, with false positive rates of 0.33%, 0.35% and 0.46%.
Keywords: DDoS detection · Co-clustering · Entropy analysis · Information gain ratio ·
Feature selection · Extra-Trees
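
The two metrics reported in the abstract can be made concrete with a short sketch. It assumes
the usual confusion-matrix definitions (accuracy = (TP + TN) / all samples, false positive
rate = FP / (FP + TN)). The function name and the counts below are hypothetical and are not
taken from the experiments.

```python
def accuracy_and_fpr(tp, tn, fp, fn):
    """Return (accuracy, false positive rate) from confusion-matrix counts.

    tp: attack records classified as attack
    tn: normal records classified as normal
    fp: normal records wrongly classified as attack
    fn: attack records wrongly classified as normal
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    false_positive_rate = fp / (fp + tn)
    return accuracy, false_positive_rate


# Hypothetical counts, for illustration only.
acc, fpr = accuracy_and_fpr(tp=480, tn=495, fp=5, fn=20)
print(f"accuracy={acc:.3f} false_positive_rate={fpr:.3f}")
```

A low false positive rate matters here because normal traffic usually dominates: even a small
fraction of misclassified normal records can produce a large number of false alarms.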

EXISTING SYSTEM:
The existing Machine Learning based DDoS detection approaches can be divided into three
categories. The first category consists of supervised ML approaches, which use generated
labeled network traffic datasets to build the detection model. Supervised approaches face two
major issues. First, the generation of labeled network traffic datasets is costly in terms of
computation and time. Without a continuous update of their detection models, supervised
machine learning approaches are unable to predict new legitimate and attack behaviors.
Second, the large amount of irrelevant normal data in the incoming network traffic acts as
noise and reduces the performance of supervised ML classifiers. Unlike the first category,
unsupervised approaches need no labeled dataset to build the detection model. DDoS and normal
traffic are distinguished based on the analysis of their underlying distribution
characteristics. However, the main drawback of unsupervised approaches is their high false
positive rate. In high-dimensional network traffic data, the distances between points tend to
homogenize and become meaningless. The third category, semi-supervised approaches, combines
supervised and unsupervised approaches, which increases accuracy and decreases false positive
rates. However, semi-supervised approaches are also challenged by the drawbacks of both.
Hence, semi-supervised approaches require a sophisticated implementation of their components
in order to overcome the drawbacks of supervised and unsupervised approaches. In this thesis,
we present an online sequential semi-supervised ML approach for DDoS detection. The network
traffic data clusters that produce a high information gain ratio are considered anomalous and
are selected for preprocessing and classification using ensemble classifiers based on the
Extra-Trees algorithm. The first phase of their approach consists of dividing the incoming
network traffic into three types of protocols: TCP, UDP or Other. Two public datasets are
used for experiments in this thesis, namely UNSW-NB15 and NSL-KDD. Several approaches have
been proposed for detecting DDoS attacks. In general, they rely on the distribution
characteristics of the underlying network traffic data used for assessment. The DDoS
detection approaches in the literature fall into two main categories: unsupervised approaches
and supervised approaches. Depending on the benchmark datasets used, unsupervised approaches
often suffer from high false positive rates.
PROPOSED SYSTEM:
This section introduces our methodology to detect DDoS attacks. The methodology follows the
five-step application process of data mining techniques in network systems. The main aim of
combining the algorithms used in the proposed approach is to reduce noisy and irrelevant
network traffic data before the preprocessing and classification stages of DDoS detection.
Incoming network traffic within time windows that have abnormal entropy values is suspected
to contain DDoS traffic. Focusing only on the suspected time windows filters out a
significant amount of network traffic data, so only relevant data is selected for the
remaining steps of the proposed approach. Also, significant resources are saved when no
abnormal entropy occurs. In order to determine the normal cluster, we estimate the
information gain ratio based on the average entropy of the FSD features between the received
subset of network data and each of the obtained clusters. Used alone, unsupervised approaches
suffer from a high false positive rate, and supervised approaches cannot handle a large
amount of network traffic data; their performance is often limited by noisy and irrelevant
network data. Therefore, the need arises to combine supervised and unsupervised approaches to
overcome DDoS detection issues.
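
The entropy-based filtering of time windows can be sketched as follows. This is a minimal
illustration under stated assumptions, not the implementation used in the experiments: it
uses fixed, non-overlapping windows rather than a sliding window, it computes the Shannon
entropy of a single feature (for example a source address), and the function names, the
window length and the thresholds low and high are all hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of values."""
    counts = Counter(values)
    total = len(values)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def suspicious_windows(packets, window_seconds, low, high):
    """Return (window_index, entropy) for windows whose entropy is abnormal.

    packets: iterable of (timestamp_seconds, feature_value) pairs.
    low, high: assumed bounds of the normal entropy range.
    """
    windows = {}
    for timestamp, value in packets:
        # Assign each packet to a fixed, non-overlapping time window.
        windows.setdefault(int(timestamp // window_seconds), []).append(value)

    flagged = []
    for index in sorted(windows):
        entropy = shannon_entropy(windows[index])
        if entropy < low or entropy > high:
            flagged.append((index, entropy))
    return flagged


# Hypothetical traffic: window 0 has four distinct sources, window 1 is
# dominated by a single source, so its entropy collapses to zero.
packets = [(0, "a"), (1, "b"), (2, "c"), (3, "d")]
packets += [(10 + 0.1 * i, "x") for i in range(8)]
flagged = suspicious_windows(packets, window_seconds=10, low=1.0, high=3.0)
print(flagged)
```

Only the flagged windows would be passed on to clustering and classification; windows with
entropy inside the normal range are skipped, which is where the resource saving comes from.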

Advantages:
• In the information gain ratio computation, subset represents the subset of network data
received during the time window w, Ci (i = 1, 2, 3) are the clusters obtained from subset,
and |Ci| is the size of cluster Ci. avgH(subset) is the average entropy of the FSD features
of the input subset and |subset| represents the size of the subset.
• The clustering of the incoming network traffic data removes a significant amount of
normal and noisy data before the preprocessing and classification steps. More than 6% of a
whole traffic dataset can be filtered.
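
The quantities defined in the first item can be combined as in the sketch below. The exact
formula of the proposed approach is not reproduced in this document, so the code uses the
textbook gain ratio (information gain divided by split information) with the average entropy
avgH in place of the usual entropy. That substitution, the function name and the numbers are
assumptions made for illustration only.

```python
import math


def gain_ratio(avg_h_subset, clusters):
    """Textbook gain ratio of a partition of subset into clusters.

    avg_h_subset: avgH(subset), average entropy of the whole window subset.
    clusters: list of (|Ci|, avgH(Ci)) pairs, one per cluster Ci.
    """
    subset_size = sum(size for size, _ in clusters)
    weights = [size / subset_size for size, _ in clusters]

    # Information gain: entropy of the subset minus the size-weighted
    # average entropy of the clusters.
    weighted_entropy = sum(w * h for w, (_, h) in zip(weights, clusters))
    gain = avg_h_subset - weighted_entropy

    # Split information: entropy of the cluster-size distribution.
    split_info = sum(w * math.log2(1 / w) for w in weights if w > 0)
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical window of 100 records split into three clusters.
ratio = gain_ratio(2.0, [(50, 1.0), (25, 2.0), (25, 2.0)])
print(round(ratio, 4))
```

In this example the information gain is 0.5 bits and the split information is 1.5 bits, so
the ratio is one third.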

Architecture Diagram:
REQUIREMENT SPECIFICATION

Functional Requirements

• Graphical user interface with the user.

Software Requirements:

For developing the application the following are the Software Requirements:

1. Python

2. Django

3. MySQL

4. WampServer

Operating Systems supported

1. Windows 7

2. Windows XP

3. Windows 8

Technologies and Languages used to Develop

1. Python

Debugger and Emulator

• Any browser (particularly Chrome)

Hardware Requirements

For developing the application the following are the Hardware Requirements:

• Processor: Pentium IV or higher
• RAM: 256 MB
• Space on hard disk: minimum 512 MB

CONCLUSION
In this thesis, we have proposed a semi-supervised DDoS detection approach based on
entropy estimation, co-clustering, information gain ratio, and Extra-Trees ensemble
classifiers. The entropy estimator estimates and analyzes the entropy of the network traffic
data over a time-based sliding window. When the entropy exceeds its limits, the network
traffic received during the current time window is split into three clusters using the
co-clustering algorithm. Then, an information gain ratio is computed based on the average
entropy of the network header features between the current time window subset and each of
the obtained clusters. The clusters that produce a high information gain ratio are considered
anomalous and are selected for preprocessing and classification using ensemble classifiers
based on the Extra-Trees algorithm. Various experiments were conducted to assess the
performance of the proposed method using three public benchmark datasets, namely NSL-KDD,
UNB ISCX12, and UNSW-NB15. The experimental results, in terms of accuracy and false positive
rate, are satisfactory when compared with state-of-the-art DDoS detection methods. Although
the proposed approach performs well on the public benchmark datasets, it is important to
evaluate its performance in real-world scenarios. For future work, we plan to deploy the
proposed approach in a real-world setting and evaluate it against several DDoS tools.
