
Volume 9, Issue 6, June – 2024 International Journal of Innovative Science and Research Technology

ISSN No:-2456-2165 https://doi.org/10.38124/ijisrt/IJISRT24JUN659

Intrusion Detection System with Ensemble Machine Learning Approaches using Voting Classifier

Karuna G. Bagde1 (Research Scholar)
Department of Computer Science & Engg.
Sant Gadge Baba Amravati University
Amravati, India

Atul D. Raut2
Department of Computer Science & Engg.
P. R. Pote Patil College of Engg & Mgmt
Amravati, India

Abstract:- The Internet has become part of everyday life thanks to advances in electronics and signal processing over the past decades. This rapid growth has also increased exposure to network threats. Firewalls and antivirus software often fail to protect a network on their own, which is where an Intrusion Detection System (IDS) helps. Ensemble methods in machine learning fuse multiple classifiers to improve predictive performance, and a voting classifier combines the predictions of individual models to reach a final decision. This paper builds an IDS around a voting ensemble that combines decision tree, logistic regression, and support vector machine classifiers. We test the proposed model on the NSL-KDD dataset, where the ensemble produces good results.

Keywords:- Intrusion Detection System, Ensemble Algorithm, Machine Learning.

I. INTRODUCTION

The web has changed our lives, and its pervasiveness exposes computer systems to a growing number of threats. Although research and technological development are advancing quickly, complete cyber security remains a challenge.

Intrusion Detection Systems (IDSs) detect attacks against a given set of computer assets, ranging from a single desktop PC to a major corporate enterprise network. Attacks are detected by looking for a predetermined set of criteria that is not present during normal daily use.

Intrusion detection systems observe and analyze network traffic to identify anomalies in network behavior and potential unauthorized access. Because IDSs are designed to monitor the network constantly, they consume resources even when there are no attacks.

• Previous Work:

The paper [1] presents a cloud-based intrusion detection model using random forest and feature engineering, achieving high accuracy in detecting abnormal activities in network traffic.

The paper [2] proposes a prediction-level fusion model for intrusion detection and classification using machine learning techniques.

The paper [3] proposes a combination of ant colony optimization and the firefly approach for feature selection in intrusion detection using machine learning algorithms such as AdaBoost, gradient boost, and Bayesian network.

The paper [4] proposes a combination of ant colony optimization and the firefly approach for feature selection in intrusion detection using machine learning algorithms such as AdaBoost, gradient boost, and Bayesian network. Gradient boost performs better in recognizing and classifying intrusions.

The paper [5] explores the use of machine learning algorithms for intrusion detection systems, specifically focusing on dataset selection, machine learning algorithms, and performance metrics.

The paper [6] discusses the development of an Intrusion Detection System (IDS) that uses machine learning techniques such as Support Vector Machines, Random Forest, and K-Nearest Neighbor to automatically identify attacks on complex networks and systems.

II. DATASET

• NSL-KDD Dataset:

The NSL-KDD data set is an improved version of the KDD'99 intrusion data set. The data were captured from an evaluation test bed and included large numbers of virtual hosts and user automata. NSL-KDD is a randomly selected subset of KDD'99 with redundant records removed, and it is a widely used benchmark for evaluating anomaly detection techniques. The NSL-KDD dataset captures TCP, UDP, and Internet Control Message Protocol (ICMP) traffic collected using the tcpdump utility. It contains four types of intrusion attacks, DoS, U2R, R2L, and Probe, described in Table 1.


Table 1 NSL-KDD Data Set

Type    Intrusion attacks
DoS     back, land, neptune, pod, smurf, teardrop, mailbomb, processtable, udpstorm, apache2, worm
U2R     buffer-overflow, loadmodule, perl, rootkit, sqlattack, xterm, ps
R2L     ftp-write, guess-passwd, imap, multihop, phf, spy, warezmaster, xlock, xsnoop, snmpguess, snmpgetattack, httptunnel, sendmail, named
Probe   ipsweep, nmap, portsweep, satan, mscan, saint
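
As a small illustration of Table 1, the grouping of attack names into the four categories could be expressed as a lookup table. This is only a sketch: the names `ATTACK_CATEGORIES`, `LABEL_TO_CATEGORY`, and `categorize` are hypothetical, the attack names are taken from Table 1, and the handling of a `normal` label and of underscore spellings (such as `guess_passwd`) is an assumption about how labels appear in the dataset files.

```python
# Attack names grouped by category, following Table 1.
ATTACK_CATEGORIES = {
    "DoS": ["back", "land", "neptune", "pod", "smurf", "teardrop", "mailbomb",
            "processtable", "udpstorm", "apache2", "worm"],
    "U2R": ["buffer-overflow", "loadmodule", "perl", "rootkit", "sqlattack",
            "xterm", "ps"],
    "R2L": ["ftp-write", "guess-passwd", "imap", "multihop", "phf", "spy",
            "warezmaster", "xlock", "xsnoop", "snmpguess", "snmpgetattack",
            "httptunnel", "sendmail", "named"],
    "Probe": ["ipsweep", "nmap", "portsweep", "satan", "mscan", "saint"],
}

# Invert the table so that each attack name points to its category.
LABEL_TO_CATEGORY = {
    attack: category
    for category, attacks in ATTACK_CATEGORIES.items()
    for attack in attacks
}


def categorize(label):
    """Map a raw record label to DoS, U2R, R2L, Probe, Normal, or Unknown."""
    # Assumption: raw labels may use underscores where Table 1 uses hyphens.
    key = label.strip().lower().replace("_", "-")
    if key == "normal":  # assumption: benign traffic is labelled "normal"
        return "Normal"
    return LABEL_TO_CATEGORY.get(key, "Unknown")


print(categorize("neptune"))       # DoS
print(categorize("guess_passwd"))  # R2L
```

Collapsing the individual attack names into categories in this way is one option for turning the labels into a five-class target (or, by merging all attack categories, a binary attack/normal target).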

III. EXISTING SYSTEM

• Available Methods:

• Support Vector Machines (SVM):
A Support Vector Machine (SVM) is a classification algorithm that identifies the hyperplane maximizing the margin between distinct classes in the dataset. It handles both linear and non-linear data through kernel functions, such as linear, radial basis function, polynomial, or sigmoid, which map the data into a higher-dimensional space. SVM training is computationally expensive, so feature reduction methods such as Principal Component Analysis are worth considering to improve efficiency. Key hyperparameters to tune include the kernel type (for instance, linear, rbf, poly, sigmoid) and the regularization parameter (C value).

• Decision Trees (DT):
Decision Trees (DTs) are tree-structured models that iteratively partition the dataset on individual features to form decision rules. They can be used for both classification and regression. Decision trees tend to overfit the training data, which motivates ensemble methods that mitigate this drawback.

• Logistic Regression (LR):
Logistic Regression (LR) is a simple yet effective classification method. It predicts the probability of a binary outcome from the input features. LR is known for its interpretability and performs well when the relationship between the predictors and the response variable is close to linear.

• Proposed System:
In this work, to improve the efficiency of the intrusion detection system, an ensemble algorithm based on decision tree, support vector machine, and logistic regression is used. Ensemble methods work best when the predictors are as independent from one another as possible. One way to get diverse classifiers is to train them using very different algorithms, which increases the chance that they improve the ensemble's accuracy. The NSL-KDD data set is used to verify the performance of the algorithm.

The proposed intrusion detection system uses an ensemble method that combines the best available algorithms. Ensemble learning is a powerful technique that combines multiple machine learning models to create a stronger, more robust predictor.

• Proposed Algorithm 1: Intrusion Detection Model Using Ensemble Method.
Input: Dataset
Output: Model for Intrusion Detection

• Take the dataset.
• Preprocess the data.
• Select features: Cc = correlation computed on the data components, keeping the components with high correlation values.
• Classify Cc using the training data with Logistic Regression, Decision Tree, and SVM.
• Use the ensemble voting algorithm.
• Propose the ensemble model.
• Test the proposed ensemble model using the test data.
• Compute the accuracy, precision, and recall.
• Return the model.

• Performance Analysis:

• True Positive (TP). A true positive outcome is one where the model predicts a positive outcome correctly.
• False Positive (FP). A false positive outcome is one where the model predicts a positive outcome incorrectly.
• True Negative (TN). A true negative outcome is one where the model predicts a negative outcome correctly.
• False Negative (FN). A false negative outcome is one where the model predicts a negative outcome incorrectly.
• Accuracy. Accuracy is the fraction of all samples that the model classifies correctly:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
• Precision. Precision is the proportion of predicted positives that are actually positive:
Precision = TP / (TP + FP)
• Recall. Recall is the proportion of actual positives that the model identifies correctly:
Recall = TP / (TP + FN)
• F1-Score. The F1 score is often a better metric than accuracy because it balances precision and recall, especially when the class distribution is uneven. The F1 score is given by:
F1 = 2 × (Precision × Recall) / (Precision + Recall)

Fig 2 Performance Comparison with Previous Works
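
The voting step of Proposed Algorithm 1 and the performance measures can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the helper names (`majority_vote`, `confusion_counts`, `scores`) and the toy predictions are hypothetical, and the three prediction lists simply stand in for the outputs of already trained Decision Tree, Logistic Regression, and SVM models. The metrics use the standard confusion-matrix formulas. In scikit-learn, the same hard-voting idea is provided by `VotingClassifier` with `voting="hard"`.

```python
from collections import Counter


def majority_vote(*model_predictions):
    """Hard voting: for each sample, keep the label most models predicted."""
    # With an odd number of voters and two classes there are no ties.
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*model_predictions)]


def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, TN, FN) with `positive` as the attack class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return tp, fp, tn, fn


def scores(tp, fp, tn, fn):
    """Accuracy, precision, recall, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy data: 1 = attack, 0 = normal traffic.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
dt_pred = [1, 1, 0, 1, 0, 0, 1, 0]   # stand-in for Decision Tree output
lr_pred = [1, 0, 1, 0, 0, 1, 0, 0]   # stand-in for Logistic Regression output
svm_pred = [1, 1, 1, 0, 0, 0, 0, 1]  # stand-in for SVM output

ensemble_pred = majority_vote(dt_pred, lr_pred, svm_pred)
result = scores(*confusion_counts(y_true, ensemble_pred))
print(ensemble_pred)  # [1, 1, 1, 0, 0, 0, 0, 0]
print(result)
```

On this toy data the individual models score accuracies of 0.75, 0.625, and 0.75, while the majority vote reaches 0.875, because the three models make their mistakes on different samples. This is the effect the proposed system relies on when it combines classifiers trained with very different algorithms.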

IV. RESULTS

Fig 1 Accuracy Result of Various Classifiers

The algorithm was run on a machine with an Intel® Core™ i5-5200U CPU @ 2.20GHz × 4 and 7.7 GiB of memory, running Ubuntu 20.04.6 LTS.

The existing classifiers, Logistic Regression, Support Vector Machine, and Decision Tree, were trained and tested on the NSL-KDD dataset; the results are shown in Fig 1. The proposed ensemble algorithm achieved an accuracy of 99.46%, followed by Decision Tree at 99.44%, SVM at 99.39%, and Logistic Regression at 94.41%. The ensemble model therefore achieved the best result.

The proposed ensemble model shows promising results in comparison with the SVM+RF, IntruDTree, and PCA-FELM techniques. The precision of the proposed model is 99.64% and its recall is 99.1%, which compares well with existing methods.

Table 2 Comparison with Existing Work
Accuracy Precision Recall
Proposed 0.9946 0.9964 0.991
SVM+RF 0.675 0.636 0.426
IntruDTree 0.98 0.98 0.98
PCA-FELM 0.998 0.92

V. CONCLUSIONS

In this study, we proposed an ensemble intrusion detection model for detecting widely known network attacks. The model uses correlation-based methods to select the best features and then applies an ensemble of classification algorithms, i.e., Decision Tree, Logistic Regression, and SVM, for a better accuracy rate. Results show that our model performs better on the NSL-KDD dataset than existing methods.

REFERENCES

[1]. Hanaa Attou, Azidine Guezzaz, Said Benkirane, Mourade Azrour, Yousef Farhaoui (2023), "Cloud-Based Intrusion Detection Approach Using Machine Learning Techniques", Big Data Mining and Analytics, doi: 10.26599/bdma.2022.9020038
[2]. Ramesh Boraiah (2023), "Network intrusion detection and classification using machine learning predictions fusion", Indonesian Journal of Electrical Engineering and Computer Science, doi: 10.11591/ijeecs.v31.i2.pp1147-1153
[3]. Mutyalaiah Paricherla, Mahyudin Ritonga, Sandip R. Shinde, Smita M. Chaudhari, Rahmat Linur, Abhishek Raghuvanshi (2023), "Machine learning techniques for accurate classification and detection of intrusions in computer network", Bulletin of Electrical Engineering and Informatics, doi: 10.11591/beei.v12i4.4708
[4]. "Machine learning techniques for accurate classification and detection of intrusions in computer network", Bulletin of Electrical Engineering and Informatics, doi: 10.11591/eei.v12i4.4708
[5]. Pierpaolo Dini, Abdussalam Elhanashi, Andrea Begni, Sergio Saponara, Qinghe Zheng, Kaouther Gasmi (2023), "Overview on Intrusion Detection Systems Design Exploiting Machine Learning for Networking Cybersecurity", Applied Sciences, doi: 10.3390/app13137507
[6]. Ch. Sai Sampath, Dr. P. Anuradha (2023), "Intrusion Detection using Machine Learning: A Random Forest-based Approach", International Journal For Multidisciplinary Research, doi: 10.36948/ijfmr.2023.v05i03.3408


[7]. D. Xuan, H. Hu, B. Wang and B. Liu, "Intrusion Detection System Based on RF-SVM Model Optimized with Feature Selection", 2021 International Conference on Communications, Computing, Cybersecurity, and Informatics (CCCI), Beijing, China, 2021, pp. 1-5, doi: 10.1109/CCCI52664.2021.9583206.
[8]. Sarker, I.H.; Abushark, Y.B.; Alsolami, F.; Khan, A.I., "IntruDTree: A Machine Learning Based Cyber Security Intrusion Detection Model", Symmetry 2020, 12, 754, https://doi.org/10.3390/sym12050754
[9]. E. Vishnu Balan, M.K. Priyan, C. Gokulnath, G. Usha Devi, "Fuzzy Based Intrusion Detection Systems in MANET", Procedia Computer Science, Volume 50, 2015, Pages 109-114, ISSN 1877-0509, https://doi.org/10.1016/j.procs.2015.04.071.
