Enhancing Cybersecurity With ML: A Multi-Algorithm Approach To Anomaly-Based Intrusion Detection
Keywords: IDS (Intrusion Detection System), Anomaly Detection, False Alarm Rate (FAR) and Machine Learning algorithms.
Fig 1. IDS Taxonomy
Department of Computer Science and Engineering, Sipna College of
Engineering and Technology, Amravati, Maharashtra, India.
International Journal of Intelligent Systems and Applications in Engineering IJISAE, 2024, 12(21s), 1111–1116 | 1111
Fig 2. Anomaly Detection classification
Anomaly detection is another useful approach to intrusion detection. Since it was first proposed in [3], anomaly detection in relation to vulnerability scanning and information security has been a focus of research. In anomaly-based IDSs, the normal behavior of the system and of the network traffic is modeled, and any behavior that deviates beyond a certain threshold is flagged as anomalous activity. On the other hand, compared with signature-based IDSs, anomaly-based IDSs produce more false positives. How these systems should learn, that is, how to determine what constitutes normal behavior of a system or network environment and how to represent this behavior computationally, is an important issue in anomaly-based IDSs.
Anomaly detection methods can be divided into three main groups [4]: statistical, knowledge-based, and machine learning-based, depending on the type of processing applied to the "behavioral" model of the target system. Intrusion prevention methodologies and comparisons of various approaches are discussed in [18], along with the advantages and drawbacks of each.
2.1. Statistical anomaly-based IDS
A statistical anomaly-based IDS monitors normal network activity, such as the bandwidth commonly used, the protocols used, and the ports and devices that are usually connected to each other. It alerts the administrator or user when anomalous (not normal) traffic is detected [5] [7]. Statistical approaches are further divided into univariate, multivariate, and time series models. The univariate model treats its parameters as independent Gaussian random variables, which defines the permissible range of values for every variable. The multivariate model takes into account the correlations between two or more variables. The time series model uses an interval timer together with an event counter or resource measure, and takes into account the order and inter-arrival times of the observations, as well as their values, which may then be labeled as anomalous.
2.2. Knowledge-Based Techniques
Knowledge-based systems keep track of subject-specific data. The knowledge base contains symbolic representations of experts' rules of judgment, in a format that enables the inference engine to perform deduction on it. The expert system approach is one of the most frequently used knowledge-based IDS techniques. The three categories of knowledge-based methodologies are expert systems, rule-based models, and frame-based models. Rule-based models are a modified form of grammar-based production rules. A frame-based model localizes a full body of expected data and behaviors into a single structure. In an expert system, the classification of audit data according to a set of rules involves three steps. The training data is first used to identify the various attributes and classes.
2.3. Machine learning-based IDS
The foundation of machine learning approaches is the creation of an explicit or implicit model. A distinctive feature of these approaches is the need for labeled data to train the model, a procedure that places heavy demands on resources. The application of statistical methods and of machine learning principles often overlaps, although the latter emphasizes building a model that improves its performance based on previous results. As a result, a machine learning-based IDS is able to change its execution strategy. This feature could make it attractive to use such systems in all situations.
3. Intrusion Detection and Machine Learning
Using machine learning techniques for intrusion detection involves building a model automatically from training data. Each data instance in this set can be described by a set of attributes (features) and the labels associated with them. The attributes can be of different types, for instance continuous or categorical. The suitability of an anomaly detection approach depends on the nature of the attributes. For example, distance-based methods are generally unsatisfactory when applied to categorical attributes, since they were originally designed to work with continuous features. Typically, labels for data instances take the form of binary
values, such as normal and anomalous. Instead of the term "anomaly", other researchers have used labels for specific attack types such as DoS, U2R, R2L, and Probe. In this way, learning techniques can provide more detail about the different kinds of anomalies. However, experimental findings show that existing learning methods are not accurate enough to identify the specific types of anomalies. Obtaining a correctly labeled data set that is representative of all types of activity is relatively expensive, because labeling is often done manually by human experts. For this reason, three operating modes are defined for anomaly detection techniques: supervised learning, unsupervised learning, and semi-supervised learning.
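As a minimal sketch of the semi-supervised mode, assuming (as is common in the anomaly detection literature) that the training data contains only normal traffic, the following fits one independent Gaussian per feature, in the spirit of the univariate model of Section 2.1, and flags any observation that lies too many standard deviations from the mean. The feature names, the sample values, and the threshold of 3 are illustrative assumptions and are not taken from the study.

```python
from statistics import mean, stdev

def fit_univariate(normal_rows):
    """Estimate (mean, std) per feature from normal-only training rows."""
    features = normal_rows[0].keys()
    return {f: (mean(r[f] for r in normal_rows),
                stdev(r[f] for r in normal_rows)) for f in features}

def is_anomalous(row, model, z_threshold=3.0):
    """Flag the row if any feature deviates more than z_threshold std devs."""
    for f, (mu, sigma) in model.items():
        if sigma > 0 and abs(row[f] - mu) / sigma > z_threshold:
            return True
    return False

# Hypothetical normal traffic: bytes per second and connections per minute.
normal_traffic = [
    {"bytes_per_s": 1000, "conns_per_min": 10},
    {"bytes_per_s": 1100, "conns_per_min": 12},
    {"bytes_per_s": 900, "conns_per_min": 9},
    {"bytes_per_s": 1050, "conns_per_min": 11},
    {"bytes_per_s": 950, "conns_per_min": 8},
]
model = fit_univariate(normal_traffic)

typical = {"bytes_per_s": 1020, "conns_per_min": 10}
burst = {"bytes_per_s": 5000, "conns_per_min": 10}
print(is_anomalous(typical, model))  # close to the training mean on both features
print(is_anomalous(burst, model))    # bytes_per_s is far outside the fitted range
```

Treating each feature independently keeps the model simple but ignores correlations between features, which is the gap the multivariate model is meant to cover.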
4. System Design For Intrusion Detection System
Figure 3. System Design for IDS
1. Preprocessing Phase
Using packet sniffing tools (such as Wireshark and Capsa), packet features such as the IP/TCP/ICMP headers are extracted from every packet during this phase of packet capture and packet analysis. The packet header is then split into source addresses, destination addresses, and so on. Certain techniques are needed in this step to select the key features used to determine whether a packet is normal or an intrusion. In this stage, packets are taken from datasets (such as the KDD dataset/NSL-KDD) that act as the data source for the IDS.
1.1. Classification
The classification phase uses the data from the preceding step to decide whether a packet is a normal packet or an attack packet. The corresponding algorithms assign the packet to the relevant class based on its feature values. During the training phase, answer classes and packet features are provided to help derive the rules governing the mapping from packet features to answer classes. These rules can be modified or amended through additional training. Each algorithm has its own classification approach. During the testing phase, unseen data is sent to the system to check whether correct responses are returned. In operation, the system accepts input packets without being given the answer class.
2. Post Processing
The output of the preceding phases is compared with the answer class, and system performance is measured in terms of accuracy and false alarms, using the counts of true positives, true negatives, false positives, and false negatives.
1.2. Reducing False Alarms
Further training is needed if the system still raises some false alarms across all the algorithms. Under the machine learning approach, the system continues to learn on its own, without human intervention. Consequently, there is no need for manual updating. Different classification machine learning algorithms were tested on a real-time data set. As shown in the figure below, the decision tree achieves 98% accuracy among the classifiers compared, the others being logistic regression (LR), random forest (RF), naive Bayes (NB), AdaBoost, and multilayer perceptron (MLP).
V. Performance Evaluation Metric And Results
Different classification machine learning algorithms were tested on a real-time data set, with the results shown in the figures below.
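The accuracy and false alarm figures used to compare classifiers derive from the four confusion-matrix counts (true positive, true negative, false positive, false negative). A minimal sketch of that calculation follows. The function name and the example counts are hypothetical and are not results from this study, and the false alarm rate is taken here as FP / (FP + TN), its usual definition.

```python
def evaluate(tp, tn, fp, fn):
    """Return accuracy, detection rate, and false alarm rate from counts.

    Assumes the test set contains at least one attack and one normal packet,
    so that none of the denominators is zero.
    """
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "detection_rate": tp / (tp + fn),     # share of attacks caught
        "false_alarm_rate": fp / (fp + tn),   # share of normal traffic flagged
    }

# Hypothetical counts for one classifier on a test set of 1000 packets.
metrics = evaluate(tp=450, tn=480, fp=20, fn=50)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting the false alarm rate alongside accuracy matters for anomaly-based detection, because a classifier can reach high accuracy on imbalanced traffic while still flagging an impractical share of normal packets.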
Fig.5 Evaluation of RF
Fig.7 Evaluation of Logistic Regression
5. Conclusion
In this exploration of “Enhancing Cybersecurity with ML”, we have examined the intersection of machine learning and cybersecurity, focusing specifically on anomaly-based intrusion detection. Through the integration of multiple ML algorithms, our study demonstrates the potential to significantly enhance cybersecurity measures by detecting intrusions effectively. By leveraging a multi-algorithm approach, we have shown the versatility and adaptability of machine learning techniques in identifying network anomalies. This comprehensive strategy not only enhances the detection capabilities of Intrusion Detection Systems but also strengthens the overall resilience of cybersecurity frameworks against evolving threats.
References:
[1] Manan J, Ahmed A, Ullah I, Merghem-Boulahia L, Gaiti D (2019) Distributed intrusion detection scheme for next generation networks. J Netw Comput Appl 147.
[2] Nadiammai G, Hemalatha M (2014) Effective approach toward Intrusion detection system using data mining techniques. Egypt Inform J 15:37–50.
[3] Almseidin M, Alzudi M, Kovacs S, Alkasassbeh M (2017) Evaluation of machine learning algorithms for intrusion detection. In: 15th International symposium on intelligent systems and informatics, Subotica, Serbia, pp 14–16.
[4] Vinayakumar R, Alazab M, Soman K, Poornachandran P, Al-Nemrat A, Venkatraman S (2019) Deep learning approach for intelligent intrusion detection system. IEEE Access 7:14525–41550.
[5] Butun I, Morgera S, Sankar R (2014) A survey of intrusion detection systems in wireless sensor networks. IEEE Commun Surv Tutorials 16(1):266–282.
[6] Alazab A, Hobbs M, Abawajy J, Khraisat A, Alazab M (2014) Using response action with intelligent intrusion detection and prevention system against web application malware. Inf Manage Comput Secur 22(5):431–449.
[7] Aburomman, Reaz M. A survey of intrusion detection systems based on ensemble and hybrid classifiers. Comput Secur 65:135–152.
[8] Buczak, Guven E (2016) A survey of data mining and machine learning methods for cyber security intrusion detection. IEEE Commun Surv Tutorials 18(2):1153–1176.
[9] Qassim Q, Zin A, Aziz M (2016) Anomalies classification approach for network-based intrusion detection system. Int J Netw Secur 18(6):1159–1172.
[10] Vimala S, Khanaa V, Nalini C (2019) A study on supervised machine learning algorithm to improvise intrusion detection systems for mobile ad hoc networks. Clust Comput 22:4065–4074.
[11] Ahmed M, Mahmood AN, Hu J (2016) A survey of network anomaly detection techniques. J Netw Comput Appl 60:19–31.
[12] Feng W, Zhang Q, Hu G, Huang JX (2014) Mining network data for intrusion detection through combining svms with ant colony networks. Futur Gener Comput Syst 37:127–140.
[13] Li L, Yu Y, Bai S, Hou Y, Chen X (2017) An effective two-step intrusion detection approach based on binary classification and k-NN. IEEE Access 6:12060–12073.
[14] Liu J, He J, Zhang W, Ma T, Tang Z, Niyoyita JP, Gui W (2019) ANID-SEoKELM: adaptive network intrusion detection based on selective ensemble of kernel ELMs with random features. Knowl Based Syst 177:104–116.
[15] Khonde SR, Ulagamuthalvi V (2022) Blockchain: secured solution for signature transfer in distributed intrusion detection system. Comput Syst Sci Eng 40(1):37–51.
[16] Khonde SR, Ulagamuthalvi V (2022) Hybrid intrusion detection system using blockchain framework. Eurasip J Wirel Commun Netw 58.
[17] Ferrag MA, Maglaras L, Moschoyiannis S, Janicke H (2020) Deep learning for cyber security intrusion detection: approaches, datasets, and comparative study. J Inf Secur Appl 50:102–419.
[18] Garg S, Kaur K, Batra S, Aujla GS, Morgan G, Kumar N, Zomaya AY, Ranjan R. En-abc: an ensemble artificial bee colony based anomaly detection scheme for cloud environment. J Parallel Distrib Comput 135:219–233.
[19] Wu K, Chen Z, Li W (2018) A novel intrusion detection model for a massive network using convolutional neural networks. IEEE Access 6:50850–50859.
[20] Xiao Y, Xing C, Zhang T, Zhao Z (2019) An intrusion detection model based on feature reduction and convolutional neural networks. IEEE Access 7:42210–42219.
[21] R. Chitrakar and H. Chuanhe, "Anomaly detection using support vector machine classification with k-medoids clustering," in Proc. 3rd Asian Himalayas Int. Conf. Internet, Nov. 2012, pp. 1–5.
[22] I. P.-B. A. Syarif and G. Wills, "Unsupervised clustering approach for network anomaly detection," in Proc. Int. Conf. Netw. Digit. Technol., Berlin, Germany, 2012, pp. 135–145.
[23] K. Moh, M. Aung, and N. N. Oo, "Association rule pattern mining approaches network anomaly detection," in Proc. Int. Conf. Future Comput. Technol., Singapore, 2015, pp. 164–170.
[24] A. H. Hamamoto, L. F. Carvalho, L. D. H. Sampaio, T. Abrão, and M. L. Proença, "Network anomaly detection system using genetic algorithm and fuzzy logic," Expert Syst. Appl., vol. 92, pp. 390–402, Feb. 2018.
[25] N. T. Pham, E. Foo, S. Suriadi, H. Jeffrey, and H. F. M. Lahza, "Improving performance of intrusion detection system using ensemble methods and feature selection," in Proc. Australas. Comput. Sci. Week Multiconference, Jan. 2018, pp. 1–6.
[26] I. Sharafaldin, A. Gharib, A. H. Lashkari, and A. A. Ghorbani, "Towards a reliable intrusion detection benchmark dataset," Softw. Netw., vol. 2017, no. 1, pp. 177–200, 2017.
[27] A. M. Al Tobi and I. Duncan, "KDD 1999 generation faults: A review and analysis," J. Cyber Secur. Technol., vol. 2, nos. 3–4, pp. 164–200, Oct. 2018.
[28] N. Moustafa and J. Slay, "UNSW-NB15: A comprehensive data set for network intrusion detection systems," in Proc. Mil. Commun. Inf. Syst., 2015, pp. 1–6.
[29] I. Sharafaldin, A. Habibi Lashkari, and A. A. Ghorbani, "Toward generating a new intrusion detection dataset and intrusion traffic characterization," in Proc. 4th Int. Conf. Inf. Syst. Secur. Privacy, 2018, pp. 108–116.
[30] Hulk - Packet Storm. Accessed: Aug. 22, 2020. [Online]. Available: https://packetstormsecurity.com/files/112856/HULK-Http-UnbearableLoad-King.html
[31] N. Moustafa and J. Slay, "The evaluation of network anomaly detection systems: Statistical analysis of the UNSW-NB15 data set and the comparison with the KDD99 data set," Inf. Secur. J., Global Perspective, vol. 25, nos. 1–3, pp. 18–31, Apr. 2016.
[32] Cyber Kill Chain - Lockheed Martin. Accessed: Aug. 27, 2020. [Online]. Available: https://www.lockheedmartin.com/en-us/capabilities/cyber/cyber-kill-chain.html
[33] A. Divekar, M. Parekh, V. Savla, R. Mishra, and M. Shirole, "Benchmarking datasets for anomaly-based network intrusion detection: KDD CUP 99 alternatives," in Proc. IEEE 3rd Int. Conf. Comput., Commun. Secur. (ICCCS), Kathmandu, Nepal, Oct. 2018, pp. 1–8.
[34] P. Gil. Cleaning Big Data - Forbes. Accessed: Aug. 26, 2020. [Online]. Available: https://www.forbes.com/sites/gilpress/2016/03/23/data-preparation-most-time-consuming-least-enjoyable-datascience-task-survey-says/#79e15eaa6f63
[35] Documentation - Argus. Accessed: Aug. 27, 2020. [Online]. Available: https://openargus.org/documentation
[36] Online Manual - Tcptrace. Accessed: Aug. 27, 2020. [Online]. Available: http://www.tcptrace.org/manual.html
[37] M. Alkasassbeh, "An empirical evaluation for the intrusion detection features based on machine learning and feature selection methods," J. Theor. Appl. Inf. Technol., vol. 95, no. 22, pp. 5962–5976, 2017.
[38] M. A. Ambusaidi, X. He, P. Nanda, and Z. Tan, "Building an intrusion detection system using a filter-based feature selection algorithm," IEEE Trans. Comput., vol. 65, no. 10, pp. 2986–2998, Oct. 2016.
[39] CybersecurityUpdate - WebProNews. Accessed: Aug. 21, 2020. [Online]. Available: https://www.webpronews.com/cisco-cybersecurity-threats/
[40] Z. M. Smith, E. Lostri, and J. A. Lewis, "The hidden costs of cybercrime," in Proc. McAfee, 2020, p. 3.