Semi Supervised Machine Learning Approach For DDOS Detection

This document proposes a semi-supervised machine learning approach for detecting DDoS attacks. Existing approaches are divided into supervised, unsupervised, and semi-supervised categories. The proposed approach uses entropy estimation, co-clustering, information gain ratio, and extra trees algorithms. It filters irrelevant normal traffic to reduce false positives and increase accuracy. The unsupervised part reduces noise while the supervised part further decreases false positives and accurately classifies attacks. Experiments on public datasets achieved over 93% accuracy and less than 0.5% false positive rates, outperforming other methods.

Uploaded by

Tunnu Sunny


Semi-supervised Machine Learning Approach for DDoS Detection

ABSTRACT:
Even though advanced Machine Learning (ML) techniques have been adopted for DDoS detection,
the attack remains a major threat to the Internet. Most existing ML-based DDoS detection
approaches fall under two categories: supervised and unsupervised. Supervised ML approaches
for DDoS detection rely on the availability of labeled network traffic datasets, whereas
unsupervised ML approaches detect attacks by analyzing the incoming network traffic. Both
approaches are challenged by large amounts of network traffic data, low detection accuracy and
high false positive rates. In this paper, we present an online sequential semi-supervised ML
approach for DDoS detection based on network entropy estimation, co-clustering, information
gain ratio and the Extra-Trees algorithm. The unsupervised part of the approach filters out
normal traffic data that is irrelevant for DDoS detection, which reduces false positive rates
and increases accuracy. The supervised part, in turn, reduces the false positive rates of the
unsupervised part and accurately classifies the DDoS traffic. Various experiments were
performed to evaluate the proposed approach using three public datasets, namely NSL-KDD,
UNB ISCX12 and UNSW-NB15. An accuracy of 98.23%, 99.88% and 93.71% is achieved for the
NSL-KDD, UNB ISCX12 and UNSW-NB15 datasets respectively, with false positive rates of
0.33%, 0.35% and 0.46% respectively.
Keywords: DDoS detection · Co-clustering · Entropy analysis · Information gain ratio ·
Feature selection · Extra-Trees
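
The two figures reported in the abstract, accuracy and false positive rate, follow the usual confusion-matrix definitions. The sketch below shows how they are computed; the counts are hypothetical and are not taken from the paper's experiments.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all records classified correctly."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def false_positive_rate(fp, tn):
    """Fraction of normal records wrongly flagged as attack."""
    negatives = fp + tn
    return fp / negatives if negatives else 0.0


# Hypothetical confusion-matrix counts (assumption, not from the source).
tp, tn, fp, fn = 480, 497, 3, 20
acc = accuracy(tp, tn, fp, fn)
fpr = false_positive_rate(fp, tn)
print(f"accuracy={acc:.2%}  false positive rate={fpr:.2%}")
```

A low false positive rate matters here because normal traffic usually far outnumbers attack traffic, so even a small rate can translate into many false alarms.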

EXISTING SYSTEM:
The existing Machine Learning based DDoS detection approaches can be divided into three
categories. The first category, supervised ML approaches, uses generated labeled network
traffic datasets to build the detection model. Supervised approaches face two major issues.
First, the generation of labeled network traffic datasets is costly in terms of computation
and time, and without a continuous update of their detection models, supervised approaches
are unable to predict new legitimate and attack behaviors. Second, the large amount of
irrelevant normal data in the incoming network traffic is noisy and reduces the performance
of supervised ML classifiers. Unlike the first category, unsupervised approaches need no
labeled dataset to build the detection model. DDoS and normal traffic are distinguished by
analyzing their underlying distribution characteristics. However, the main drawback of
unsupervised approaches is their high false positive rates: in high-dimensional network
traffic data, the distances between points tend to homogenize and become meaningless. The
third category, semi-supervised approaches, combines supervised and unsupervised approaches
to increase accuracy and decrease false positive rates. However, semi-supervised approaches
are also challenged by the drawbacks of both, and hence require a sophisticated
implementation of their components in order to overcome those drawbacks.

In this thesis, we present an online sequential semi-supervised ML approach for DDoS
detection. The network traffic data clusters that produce a high information gain ratio are
considered anomalous and are selected for preprocessing and classification using an ensemble
classifier based on the Extra-Trees algorithm. The first phase of their approach consists of
dividing the incoming network traffic into three types of protocols: TCP, UDP or Other. Two
public datasets are used for experiments in this thesis, namely UNSW-NB15 and NSL-KDD.

Several approaches have been proposed for detecting DDoS attacks. In general, they rely on
the distribution characteristics of the underlying network traffic data used for assessment.
The DDoS detection approaches in the literature fall under two main categories: unsupervised
approaches and supervised approaches. Depending on the benchmark datasets used, unsupervised
approaches often suffer from high false positive rates.
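
The remark in this section that distances lose meaning in high-dimensional traffic data can be illustrated with a small experiment. The sketch below is an assumption-laden toy, not part of the source: it draws uniformly random points (not real traffic records) and measures the relative gap between the nearest and farthest point from a query. The function name `relative_contrast` and all parameters are hypothetical.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """(farthest - nearest) / nearest distance from a random query point."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    nearest, farthest = min(dists), max(dists)
    return (farthest - nearest) / nearest


low = relative_contrast(dim=2)
high = relative_contrast(dim=1000)
print(f"dim=2: {low:.2f}   dim=1000: {high:.2f}")
```

In two dimensions the farthest point is many times farther away than the nearest one; in a thousand dimensions the two distances are nearly equal, which is why purely distance-based clustering tends to produce more false positives on wide feature sets.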

PROPOSED SYSTEM:
This section introduces our methodology to detect the DDoS attack. The five-step application
process of data mining techniques in network systems characterizes the followed methodology.
The main aim of combining the algorithms used in the proposed approach is to reduce noisy and
irrelevant network traffic data before the preprocessing and classification stages of DDoS
detection. Incoming network traffic within time windows having abnormal entropy values is
suspected to contain DDoS traffic. Focusing only on the suspected time windows filters out a
large amount of network traffic data, so only relevant data is selected for the remaining
steps of the proposed approach. Also, significant resources are saved when no abnormal entropy
occurs. In order to determine the normal cluster, we estimate the information gain ratio based
on the average entropy of the FSD features between the received subset and each of the
obtained clusters. Taken separately, unsupervised approaches suffer from high false positive
rates, and supervised approaches cannot handle a large amount of network traffic data; their
performance is often limited by noisy and irrelevant network data. Therefore, the need to
combine both supervised and unsupervised approaches arises to overcome DDoS detection issues.
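
The time-window entropy check described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' estimator: the entropy is plain Shannon entropy over one categorical feature, the bounds `low` and `high` are hypothetical fixed thresholds, and the sample windows are invented.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of values."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def suspicious_windows(windows, low, high):
    """Return indices of windows whose entropy lies outside [low, high]."""
    flagged = []
    for index, values in enumerate(windows):
        if not values:
            continue  # nothing received in this window
        h = shannon_entropy(values)
        if h < low or h > high:
            flagged.append(index)
    return flagged


# Hypothetical per-window destination addresses, one entry per packet.
windows = [
    ["a", "b", "c", "d", "a", "b", "c", "d"],  # evenly spread traffic
    ["a", "b", "c", "d", "b", "c", "d", "a"],  # evenly spread traffic
    ["a", "a", "a", "a", "a", "a", "a", "b"],  # concentrated on one target
]
print(suspicious_windows(windows, low=1.5, high=2.5))
```

Only the flagged windows would be passed on to clustering and classification; windows with normal entropy are skipped, which is where the resource saving mentioned above comes from.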

Advantages:
• In the information gain ratio computation, subset represents the subset of network data
received during the time window w, Ci (i = 1, 2, 3) are the clusters obtained from subset,
and |Ci| is the size of cluster Ci. avgH(subset) is the average entropy of the FSD features
of the input subset, and |subset| represents its size.
• Clustering the incoming network traffic data greatly reduces the amount of normal and
noisy data before the preprocessing and classification steps. More than 6% of a
whole traffic dataset can be filtered.
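
The formula that the first bullet's symbols belong to did not survive in this copy. As a stand-in, the sketch below uses the textbook information gain ratio (gain divided by split information) expressed with the same quantities: avgH(subset), |Ci| and |subset|. It is an assumption that the source's formula has this shape, and the source appears to compute a ratio per cluster rather than one aggregate value. The function name and the sample numbers are hypothetical.

```python
import math


def information_gain_ratio(avg_h_subset, clusters):
    """Textbook gain ratio for splitting a subset into clusters.

    clusters: list of (size, avg_entropy) pairs, one per cluster C_i.
    """
    total = sum(size for size, _ in clusters)  # |subset|
    weights = [size / total for size, _ in clusters if size > 0]
    # Expected entropy after the split, weighted by |C_i| / |subset|.
    remainder = sum((size / total) * h for size, h in clusters)
    gain = avg_h_subset - remainder
    # Split information penalises splits into many small clusters.
    split_info = -sum(w * math.log2(w) for w in weights)
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical values: a window of 100 records split into three clusters.
ratio = information_gain_ratio(
    avg_h_subset=2.0,
    clusters=[(50, 1.0), (25, 1.0), (25, 1.0)],
)
print(round(ratio, 4))
```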

Architecture Diagram:
REQUIREMENT SPECIFICATION

Functional Requirements

• Graphical user interface with the user.

Software Requirements:

For developing the application the following are the Software Requirements:

1. Python

2. Django

3. MySQL

4. WampServer

Operating Systems supported

1. Windows 7

2. Windows XP

3. Windows 8

Technologies and Languages used to Develop

1. Python

Debugger and Emulator

• Any browser (particularly Chrome)

Hardware Requirements

For developing the application the following are the Hardware Requirements:

• Processor: Pentium IV or higher

• RAM: 256 MB

• Space on hard disk: minimum 512 MB

CONCLUSION
In this thesis, we have proposed a semi-supervised DDoS detection approach based on
entropy estimation, co-clustering, information gain ratio, and the Extra-Trees ensemble
classifier. The entropy estimator estimates and analyzes the network traffic data entropy over
a time-based sliding window. When the entropy exceeds its limits, the network traffic received
during the current time window is split into three clusters using the co-clustering algorithm.
Then, an information gain ratio is computed based on the average entropy of the network header
features between the current time window subset and each one of the obtained clusters. The
network traffic data clusters that produce a high information gain ratio are considered
anomalous and are selected for preprocessing and classification using an ensemble classifier
based on the Extra-Trees algorithm. Various experiments were conducted in order to assess the
performance of the proposed method using three public benchmark datasets, namely NSL-KDD,
UNB ISCX12, and UNSW-NB15. The experimental results, in terms of accuracy and false positive
rate, are satisfactory when compared with state-of-the-art DDoS detection methods. Although
the proposed approach performs well on the public benchmark datasets, it is important to
evaluate its performance in real-world scenarios. For future work, we plan to deploy the
proposed approach in the real world and evaluate it against several DDoS tools.
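
The Extra-Trees classifier named in the conclusion differs from ordinary decision trees mainly in that split cut-points are drawn at random instead of being optimised. The sketch below keeps only that idea plus majority voting: an ensemble of one-level stumps on a single feature. It is a deliberately reduced illustration, not the full Extra-Trees algorithm (which grows deep trees and compares random cut-points across several candidate features) and not the authors' implementation. All names and data are hypothetical.

```python
import random
from collections import Counter


def majority(labels, default):
    """Most common label, or default when the list is empty."""
    return Counter(labels).most_common(1)[0][0] if labels else default


def fit_random_stump(xs, ys, rng):
    """One-feature stump whose cut-point is drawn at random, not optimised."""
    threshold = rng.uniform(min(xs), max(xs))
    fallback = majority(ys, None)
    left = majority([y for x, y in zip(xs, ys) if x < threshold], fallback)
    right = majority([y for x, y in zip(xs, ys) if x >= threshold], fallback)
    return threshold, left, right


def fit_ensemble(xs, ys, n_stumps=51, seed=0):
    """Fit n_stumps independent random stumps on the full training set."""
    rng = random.Random(seed)
    return [fit_random_stump(xs, ys, rng) for _ in range(n_stumps)]


def predict(ensemble, x):
    """Majority vote over all stumps."""
    votes = [left if x < t else right for t, left, right in ensemble]
    return majority(votes, None)


# Hypothetical feature: low values are normal (0), high values are attack (1).
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
model = fit_ensemble(xs, ys)
print(predict(model, 0), predict(model, 9))
```

Individual stumps are weak because their cut-points are arbitrary, but averaging many of them smooths out the randomness, which is the variance-reduction argument usually given for this family of ensembles.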
