
Network Intrusion Detection System

This document discusses developing a network intrusion detection system using machine learning algorithms. It proposes using machine learning methods like support vector machines, naive bayes, and k-nearest neighbors algorithms on the NSL-KDD dataset to classify network traffic as normal or an intrusion, and to identify the type of intrusion. The system would monitor network traffic, extract features from the data, train a decision tree model on the training data to learn normal behavior and intrusion types, and then use the model to classify new network data as normal or an intrusion.

Uploaded by

Ojaswi Yerpude
Copyright
© All Rights Reserved

Network Intrusion Detection System

Using Machine Learning


Christeena
Dublin Business School
Dublin, Ireland

1. ABSTRACT
Rapid advances in the internet and communication fields have resulted in a massive increase in network size and the corresponding data. As a result, many novel attacks are being generated, which makes it challenging for network security to detect intrusions accurately. Furthermore, the presence of intruders who aim to launch various attacks within the network cannot be ignored. An intrusion detection system (IDS) is one tool that protects the network from possible intrusions by inspecting the network traffic to ensure its confidentiality, integrity, and availability. Recently, machine learning (ML) and deep learning (DL)-based network intrusion detection systems (NIDS) have been deployed as potential solutions to detect intrusions across the network efficiently. We are developing a system that works on basic machine learning algorithms. A network intrusion detection system monitors a network or system for malicious activity and protects a computer network from unauthorized access by users, possibly including insiders. We use the NSL-KDD dataset from the University of New Brunswick. Machine learning includes a number of advanced statistical methods for handling regression and classification tasks with multiple dependent and independent variables. The methods we use include Support Vector Machine (SVM) for prediction, Naive Bayes for classification, and k-Nearest Neighbors (KNN) for regression and classification.

Keywords: Machine learning, Intrusion, NSL-KDD, Network Intrusion Detection System (NIDS).

2. LITERATURE REVIEW

2.1. INTRODUCTION
After the rapid development of technology and the spread of internet networks across the world, cybercrime has increased dramatically. According to the Internet Security Threat Report (ISTR), around 430 million new kinds of malware were recognized in 2015, 362 of which were crypto-ransomware. Cybercrime was estimated to have generated US$1.5 trillion in 2018. If there was one clear conclusion in 2021, it was that no organization, large or small, was safe from a cyber-attack. Cyber-attacks are more advanced, evasive, and targeted than ever before. Security strategies should therefore evolve continuously. A network intrusion detection system (NIDS) plays a significant part in network security, where it detects intrusions and communicates with the appropriate authority.

An intrusion detection system (IDS) is an important tool used in cyber security to monitor for and identify intrusion attacks. There are three types of IDS: network IDS, host IDS, and application IDS. A network IDS monitors network packets to detect intrusion attacks, while a host IDS monitors a single host (server or computer). Lastly, an application IDS monitors several known high-risk applications. To determine whether an intrusion attack has occurred, an IDS depends on a few approaches.

ML algorithms can be grouped into different categories, but we are using classification, decision tree, and regression models. We use the SVM model for training and testing separately. The technique is effective at reducing the space density of the data. The experiments compare the results and the false positive rate and increase the accuracy. Bayesian methods use Bayes' theorem of probability, which determines the probability of a specific outcome occurring. The most popular algorithm in this category is Naive Bayes. The decision tree has a tree-like structure that starts from the root node, which is the best predictor. It then proceeds through its branches until it reaches a leaf node, which gives the classification decision. Among instance-based methods, the most popular algorithm is k-Nearest Neighbor (kNN). This category of algorithm finds the most similar cases in the training data that match the new data in order to make a prediction. Regression algorithms attempt to build a model that represents the relationship between variables. They are derived from statistical analysis.

Figure 1. Passive Installation of Network Intrusion Detection System

2.2. RESEARCH OVERVIEW
Applying machine learning techniques to intrusion detection means automatically building the model based on the training data set. This data set contains a collection of data instances, each of which can be described using a set of attributes (features). NSL-KDD does not include redundant records in the train set, so the classifiers will not be biased towards more frequent records. There are no duplicate records in the proposed test sets; therefore, the performance of the learners is not biased by methods that have better detection rates on frequent records. The number of selected records from each difficulty-level group is inversely proportional to the percentage of records in the original KDD data set. As a result, the classification rates of distinct machine learning methods vary over a wider range, which makes it more efficient to have an accurate evaluation of different learning techniques. The number of records in the train and test sets is reasonable, which makes it affordable to run the experiments on the complete set without the need to randomly select a small portion. Consequently, the evaluation results of different research works will be consistent and comparable.

2.3. PROPOSED SYSTEM
In the proposed architecture, the data is gathered from the publicly available NSL-KDD dataset. The categorical features of the data are then encoded and extracted. Not all of the features of the data are required for developing the model; therefore, a few of the best features are selected using selectors. The data is then divided into training and testing data. The training data is used to build a decision tree model. A decision tree is a supervised learning technique that requires training data to build a model, and the testing data is evaluated against this trained model. The decision tree uses a tree-like structure in which each leaf represents a possible outcome; in this case, each leaf represents a type of attack or normal (benign) behavior. Test data is passed through the trained model to determine whether it is benign or an attack, and if it resembles an attack, the model returns the type of attack. Decision-tree algorithms work through recursive partitioning of the training set to obtain subsets that are as pure as possible with respect to a given target class. Each node of the tree is associated with a particular set of records. Then, with the help of the regression model, we identify the attacks and their types and show the results in graphical form. We can also export the results in CSV format.

3. STATE OF THE ART

DEFINITION OF THE PROBLEM
Design and implement an intrusion detection system using supervised and unsupervised machine learning methods on cloud computing.

3.1 RESEARCH QUESTION
This systematic review aims to answer the following research questions.
RQ1: How can attacks be identified and intrusions detected on networks?

RQ2: What ML/DL-based methods can be used to detect intrusions?

RQ3: What are the accuracy, strengths, and limitations of the proposed models?

4. METHODOLOGY AND RESEARCH TIMELINE

4.1 METHODOLOGY

4.1.1 Support Vector Machine (SVM): This method performs regression and classification tasks by constructing nonlinear decision boundaries. Because of the nature of the feature space in which these boundaries are found, Support Vector Machines can exhibit a large degree of flexibility in handling classification and regression tasks of varied complexities. There are several types of Support Vector models, including linear, polynomial, RBF, and sigmoid.

4.1.2 Naive Bayes: This is a well-established Bayesian method primarily formulated for performing classification tasks. Given its simplicity, i.e., the assumption that the independent variables are statistically independent, Naive Bayes models are effective classification tools that are easy to use and interpret. Naive Bayes is particularly appropriate when the dimensionality of the independent space (i.e., number of input variables) is high (a problem known as the curse of dimensionality). For the reasons given above, Naive Bayes can often outperform other more sophisticated classification methods. A variety of methods exist for modeling the conditional distributions of the inputs, including normal, lognormal, gamma, and Poisson.

4.1.3 k-Nearest Neighbour Algorithm: k-Nearest Neighbors is a memory-based method that, in contrast to other statistical methods, requires no training (i.e., no model to fit). It falls into the category of Prototype Methods. It functions on the intuitive idea that close objects are more likely to be in the same category. Thus, in KNN, predictions are based on a set of prototype examples that are used to predict new (i.e., unseen) data based on the majority vote (for classification tasks) and averaging (for regression) over a set of k-nearest prototypes (hence the name k-nearest neighbors).

4.2. DESIGN AND IMPLEMENTATION

We applied various classification algorithms to the NSL-KDD dataset and will compare their results to build a predictive model.

Dataset: https://www.kaggle.com/datasets/hassan06/nslkdd

1. Dataset: We use the NSL-KDD dataset. First, we analyzed the dataset to determine the number of examples and features it contains, and to identify the number of categorical versus continuous features.

2. Pre-processing: We pre-process the dataset and extract the data from both the train and test modules.

3. After data extraction, the data is ready and can be used to train a machine learning model. We applied different machine learning methods, such as SVM, Naive Bayes, and KNN, to this pre-processed dataset to train models that classify normal and attack network traffic. We then used these trained models on a test dataset to classify normal and attack network traffic.

4. We then analyzed the accuracy achieved by each of these machine learning models to identify the best model and compare their results.

5. Model Deployment: This step is the last phase of the project. It includes the automation of data extraction and data pre-processing, re-training of the model on the new dataset, and finally generating the detection for the incoming matching data points.
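
The k-nearest-neighbour rule of Section 4.1.3 and the train, test, and compare-accuracy loop of Section 4.2 can be illustrated with a minimal, dependency-free sketch. Everything below is an assumption made for illustration: the helper names `knn_predict` and `accuracy` are hypothetical, and the two-feature records are toy values, not NSL-KDD data or the results of this project.

```python
from collections import Counter
from math import dist  # Euclidean distance between two points (Python 3.8+)


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # No model is fitted: the stored training examples act as the prototypes.
    nearest = sorted(range(len(train_X)),
                     key=lambda i: dist(train_X[i], query))[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=3):
    """Fraction of test records whose predicted label equals the true label."""
    hits = sum(knn_predict(train_X, train_y, x, k) == y
               for x, y in zip(test_X, test_y))
    return hits / len(test_y)


# Hypothetical toy data: two scaled features per connection record.
train_X = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.25),
           (0.90, 0.80), (0.80, 0.90), (0.85, 0.95)]
train_y = ["normal", "normal", "normal", "attack", "attack", "attack"]
test_X = [(0.12, 0.18), (0.88, 0.90)]
test_y = ["normal", "attack"]

acc = accuracy(train_X, train_y, test_X, test_y, k=3)
print(f"kNN accuracy on toy test set: {acc:.2f}")
```

Under these assumptions, the same accuracy computation would be repeated for each trained classifier (SVM, Naive Bayes, KNN) so that their results can be compared on the held-out test set.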

Figure 2. System Architecture

4.2.1. Hardware requirements
● Processor: Any up-to-date processor
● RAM: Min 8 GB
● Hard Disk: Min 100 GB

4.2.2. Software requirements
● Operating System: Windows family
● Technology: Python 3.8
● IDE: Jupyter Notebook

4.2.3. Server Deployment
● PythonAnywhere

RESEARCH TIMELINE

Expected to be completed in 1 month with the help of the available dataset and research documents.

5. CONCLUSION

Soft computing techniques are attracting attention from researchers in IDS. This is because these techniques are easy to apply and often produce better results than a single algorithm. The proper combination of multiple algorithms is the way forward. Most researchers are focusing on the classification of IDS, which is useful in identifying known intrusion attacks. However, it might pose a problem in detecting anomalous intrusions, which may include new or modified intrusion attacks. Through this experimental analysis of multi-class classification on the NSL-KDD dataset, we have shown that supervised learning models are capable of reaching close to 100% accuracy. DL schemes perform much better than ML-based methods in terms of their ability to learn features by themselves and their stronger model-fitting capabilities. However, these schemes are very complex and require extensive computing resources in terms of processing power and storage capacity. These challenges should be addressed to meet the real-time requirements of NIDS and thereby further improve NIDS performance.
