Intrusion Detection in Software Defined Network Using Machine Learning
TABLE OF CONTENTS
ABSTRACT
LIST OF FIGURES
LIST OF ABBREVIATIONS
LIST OF TABLES
1. INTRODUCTION
1.1 OUTLINE
2. LITERATURE REVIEW
3. METHODOLOGY
MACHINE LEARNING BASED APPROACH
WORKFLOW OF ML ALGORITHMS
ALGORITHMS USED
GENETIC ALGORITHM
DT RESULTS
EVALUATION METRICS
ACCURACY
CROSS VALIDATION
APPENDIX
A. SAMPLE CODE
B. SCREENSHOTS
C. PLAGIARISM REPORT
LIST OF FIGURES
LIST OF ABBREVIATIONS
ABBREVIATION EXPANSION
DT Decision Tree
LIST OF TABLES
different Algorithms 23
sub-feature dataset 24
CHAPTER 1
INTRODUCTION
OUTLINE
For the past few years, networks have played a significant role in communication.
Computer networks allow computing devices to exchange information among different
systems and individuals. The services of various organizations, companies, colleges
and universities are accessed through computer networks, which has led to massive
growth in the networking field. The accessibility of the internet has attracted a
lot of interest among individuals. In this context, security of information has
become a great challenge in the modern era. The information or data that we send
should be secured in such a way that a third party cannot take control of it. When
we talk about security, we have to keep three basic factors in mind:
confidentiality, integrity and availability. Confidentiality means privacy of
information. It gives only authorized users the right to access the system via the
internet. This can be combined with accountability services in order to identify
the authorized individuals. The second key factor is integrity. Integrity means
exactness of information. It gives users assurance that the information passed is
correct and has not been changed by an unauthorized individual. The third factor is
availability, which means that information and services remain accessible to
authorized users when they are needed.
An Intrusion Detection System (IDS) is used to watch for malicious activity on the
network. It classifies previously unseen records as either normal or attack. The
network traffic is monitored first, and the IDS then sorts the traffic records into
either the malicious class or the regular class. It acts as an alarm system that
reports when illegal activity is detected. The accuracy of an IDS depends on its
detection rate: the higher the detection rate, the more reliable the detection.
Some intrusion detection systems are marketed with the ability to stop attacks
before they succeed, and are used to shield an organization from attack. Intrusion
detection is a reactive concept that tries to identify a hacker when an intrusion
is attempted. Ideally, such a system will only alarm when a successful attack is
made. An intrusion detection system is not a perfect solution to all attack types.
The goals that can be accomplished with an Intrusion Detection System include the
following:
IDS detects attacks.
IDS traces user activity from point of entry to.
IDS generates alerts when required.
IDS detects errors in system configuration.
IDS provides security for the system without the need for expert staff.
IDS can detect when the system is under attack.
IDS provides evidence of an attack.
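The dependence of IDS accuracy on the detection rate, mentioned above, can be made concrete with a small sketch. It assumes the usual confusion-matrix definitions (detection rate as the share of attacks that are flagged, false alarm rate as the share of normal records that are flagged). The function name and the counts are hypothetical and are not results from this work.

```python
def detection_metrics(tp, fp, tn, fn):
    """Return (detection_rate, false_alarm_rate, accuracy) from confusion counts.

    tp: attacks flagged as attacks      fn: attacks missed
    fp: normal records flagged          tn: normal records passed
    """
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_alarm_rate = fp / (fp + tn) if (fp + tn) else 0.0
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    return detection_rate, false_alarm_rate, accuracy


# Hypothetical counts, chosen only to illustrate the arithmetic.
dr, far, acc = detection_metrics(tp=90, fp=5, tn=95, fn=10)
print(f"detection rate={dr:.3f}, false alarm rate={far:.3f}, accuracy={acc:.3f}")
```

A high detection rate alone is not sufficient: an IDS that flags every record detects every attack but also raises a false alarm on every normal record, which is why the two rates are usually reported together.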
DATA OVERVIEW
The dataset used is the KDD'99 dataset for network intrusion detection. It is a
well-known dataset that many researchers have used for intrusion detection with
various learning techniques. Each record is labeled either as normal (no attack) or
as an attack from one of four categories: DOS (denial of service), U2R (user to
root), R2L (remote to local) and Probe. There are 21 types of attacks within these
main categories.
The dataset contains a total of 41 attributes, which can be used to determine
whether a record is an attack or normal traffic.
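As a minimal sketch of how individual attack labels roll up into the main categories, the mapping below covers only a small illustrative subset of KDD'99 labels, not the full list. The function names are hypothetical, and treating unrecognized labels as attacks in the binary case is an assumption made for this example.

```python
# Illustrative subset of KDD'99 attack labels (assumption: not the full list).
ATTACK_CATEGORY = {
    "neptune": "DOS",
    "smurf": "DOS",
    "ipsweep": "Probe",
    "portsweep": "Probe",
    "guess_passwd": "R2L",
    "warezclient": "R2L",
    "buffer_overflow": "U2R",
    "rootkit": "U2R",
}


def to_category(label):
    """Map a raw label (e.g. 'smurf.') to normal, DOS, Probe, R2L, U2R or unknown."""
    # Raw KDD'99 labels end with a trailing dot, which is stripped here.
    label = label.strip().rstrip(".").lower()
    if label == "normal":
        return "normal"
    return ATTACK_CATEGORY.get(label, "unknown")


def to_binary(label):
    """Collapse a label to 'normal' or 'attack' for binary classification."""
    # Assumption: anything that is not 'normal' counts as an attack.
    return "normal" if to_category(label) == "normal" else "attack"


for raw in ["normal.", "smurf.", "rootkit.", "guess_passwd."]:
    print(raw, "->", to_category(raw), "/", to_binary(raw))
```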
PROBLEM DESCRIPTION
With the rapid development of information technology in the past two decades,
computer networks have come to be widely used in industry, business and various
fields of human life. Building reliable networks is therefore a very important task
for IT administrators. At the same time, the rapid development of information
technology has produced several challenges that make building reliable networks a
very difficult task. There are many types of attacks threatening the availability,
integrity and confidentiality of computer networks. The denial-of-service (DOS)
attack is considered one of the most common harmful attacks.
CHAPTER 2
LITERATURE REVIEW
Software-defined networking (SDN) has recently emerged as a feasible and promising
solution for the future of the Internet. With SDN, networks are managed,
centralized, monitored and adjusted more flexibly. These advantages, on the other
hand, bring a more vulnerable environment and threats such as network crashes,
system paralysis, online banking fraud and theft. These issues can have a
detrimental effect on families, organizations and the economy. Accuracy, high
performance and real-time operation are essential for a system that defends against
these threats. The extension of intelligent machine learning algorithms into the
network intrusion detection system (NIDS) through a software-defined network (SDN)
has attracted considerable attention over the past decade. The availability of
data, the diversity of data analysis techniques and the many advances in machine
learning algorithms help to build a better, more dependable and more reliable
system for detecting the various types of network attacks. This review forms part
of the survey of NIDS in SDN.
Network Intrusion Detection Systems (NIDSs) are important tools that network system
administrators use to detect security breaches in a network. An NIDS monitors and
analyzes the traffic entering and leaving an organization's network devices and
raises an alarm when an intrusion is detected. Based on the method of detection,
NIDSs are divided into two classes: i) signature (misuse) based NIDS (SNIDS), and
ii) anomaly detection based NIDS (ADNIDS). In an SNIDS such as Snort, attack
signatures are installed in the NIDS in advance, and traffic is pattern-matched
against them to detect an intrusion in the network. In contrast, an ADNIDS
classifies network traffic as an intrusion when it deviates from the normal traffic
pattern. SNIDS is effective in detecting well-known attacks. However, it has great
difficulty detecting unknown or new attacks, because only a limited set of
signatures can be pre-installed in the IDS. ADNIDS, in contrast, is well suited to
detecting unknown and new attacks. Although ADNIDS tends to produce false
positives, its ability to identify new attacks has led to its wide acceptance. Two
main challenges arise in developing an NIDS. First, selecting the right features
from the network traffic dataset for anomaly detection is difficult. Because
attacks change and evolve constantly, the features selected for one class of attack
may not be suitable for other attack classes. Second, there is a lack of labeled
traffic datasets from real networks for NIDS development. It takes a great deal of
work to produce a labeled dataset from raw traffic traces collected over a period
or in real time.
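The difference between the two classes can be illustrated with a toy sketch. It is a deliberately simplified illustration and not how a production NIDS works: the signatures, record fields, byte counts and the three-standard-deviation threshold are all hypothetical values chosen for the example.

```python
from statistics import mean, stdev

# Hypothetical signatures: (protocol, destination port) pairs treated as known-bad.
SIGNATURES = {("tcp", 31337), ("udp", 1434)}


def signature_alert(record):
    """Signature-based check: alert only when a pre-installed signature matches."""
    return (record["protocol"], record["dst_port"]) in SIGNATURES


def fit_baseline(normal_byte_counts):
    """Learn the mean and standard deviation of one feature from normal traffic."""
    return mean(normal_byte_counts), stdev(normal_byte_counts)


def anomaly_alert(record, baseline, threshold=3.0):
    """Anomaly-based check: alert when the record deviates from the baseline
    by more than `threshold` standard deviations."""
    mu, sigma = baseline
    if sigma == 0:
        return record["bytes"] != mu
    return abs(record["bytes"] - mu) / sigma > threshold


baseline = fit_baseline([500, 520, 480, 510, 490])

# A known attack that looks ordinary in size, and a new attack with no signature.
known = {"protocol": "tcp", "dst_port": 31337, "bytes": 505}
novel = {"protocol": "tcp", "dst_port": 80, "bytes": 50000}

print("known:", signature_alert(known), anomaly_alert(known, baseline))
print("novel:", signature_alert(novel), anomaly_alert(novel, baseline))
```

In this toy setting the signature check catches only the attack it already knows, while the anomaly check catches only the record that deviates from normal traffic, which mirrors the trade-off described above.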
Different intrusion detection techniques have been explained and suggested over the
last two decades (Catania and Garino 2012). To build robust and efficient IDSs,
many studies have proposed machine learning approaches and made improvements in
this area. Ingre and Yadav (2015) introduced an ML-based IDS that combined the
Decision Tree (DT) classifier with a correlation-based input selection method for
the classification task. A filter-based feature selection method was applied before
training and testing their models. For their experiments, the NSL-KDD dataset was
analyzed. The filter-based feature selection technique selected 14 significant
features to reduce the time complexity. The study covered both binary and
multiclass classification tasks. The results showed an accuracy of 83.66% for the
multiclass classification task, which successfully identified five different
attacks, and 90.30% accuracy for binary classification.
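To illustrate what a filter-based, correlation-driven feature selection step can look like, the sketch below ranks features by the absolute Pearson correlation between each feature column and the class label and keeps the top k. This is an assumed, simplified variant and not necessarily the procedure used by Ingre and Yadav; the function names and the toy data are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def select_top_k(rows, labels, k):
    """Return the indices of the k features most correlated with the label."""
    scores = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, labels)), j))
    # Highest score first; ties are broken by the lower feature index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return sorted(j for _, j in scores[:k])


# Toy data: feature 0 tracks the label, feature 1 is constant, feature 2 is unrelated.
rows = [
    [0.1, 5.0, 1.0],
    [0.2, 5.0, 0.0],
    [0.9, 5.0, 1.0],
    [0.8, 5.0, 0.0],
]
labels = [0, 0, 1, 1]

selected = select_top_k(rows, labels, k=1)
print("selected feature indices:", selected)
```

Because a filter method scores features independently of the classifier, the reduced feature set can then be passed to any model, such as a decision tree, which is what lowers the training time.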