Journal Publication of International Research for Engineering and Management (JOIREM)

Volume: 10 Issue: 06 | June-2024

NEXT-GEN NETWORK ATTACK DETECTION WITH MACHINE LEARNING AND DEEP LEARNING TECHNIQUES

Dr. M. Deepa1, M. P. Venkat Vijay2, S. SriRanjani3, V. Sowmiya4, V. Ramya5
1Assistant Professor, Department of Computer Science, Pavai Arts & Science College for Women.
2,3,4,5Student, Department of Computer Science & Engineering, Paavai College of Engineering

---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - Anomaly-based network intrusion detection systems are very important. This research proposes robust machine learning and deep learning models for classifying different forms of network intrusions and attacks. The proposed models were evaluated on the 49-feature UNSW-NB15 dataset, which covers nine distinct attack categories. Among the machine learning models, the Decision Tree classifier yielded the highest accuracy of 99.05%, followed by Random Forest (98.96%), XGBoost (98.08%), and Adaboost (97.87%). The K-Nearest Neighbour classifier was trained for a range of K values, with K=7 yielding the best accuracy of 95.58%. For binary classification, a Deep Learning model with two dense layers activated by ReLU and a third dense layer activated by Sigmoid was created. It yielded an accuracy of 98.44% when used with the ADAM optimizer and an 80:20 Train-Test split ratio. XGBoost detects exploit attacks with 95% accuracy, Random Forest detects fuzzer attacks with 90% accuracy and generic attacks with 99% accuracy, and Decision Tree detects reconnaissance attacks with 79% accuracy. Detecting network attacks requires no feature selection because all features are powerful and important.

KeyWords: optics, photonics, light, lasers, templates, journals

1. INTRODUCTION

The Internet of Things (IoT), cloud-based services, and massive networking equipment are responsible for the exponential growth in network data [1]. Attacks are growing rapidly along with network data, posing a serious risk to network security. Although more security devices can be added to a network to increase security, total protection cannot be guaranteed. Future threats must be addressed in addition to current ones. At the system, network, application, and transmission levels, current Network Intrusion Detection Systems (NIDS) offer a layered security defense for the network [12]. When an attacker gets past one layer of defense, layered security ensures that additional layers will stop them. Inadequate precision, dynamic network traffic behavior, low-frequency network attacks, adaptation to software-defined networks, and the large amount of stored data are major issues with current NIDS.

The majority of NIDS in use today are anomaly-based or signature-based detection systems. Signature-based NIDS cover only the list of known threats and their associated indicators of compromise [13]. For known attacks, they are quite accurate and process data quickly. However, they are unable to recognize zero-day attacks and raise unnecessary alarms regardless of the outcome, as when a Windows worm attempts to infiltrate a Linux machine. Because they depend on the operating system, version, and applications, they are impractical for internal attacks [16]. Anomaly-based NIDS can find novel, suspicious activity that deviates from typical behavior, and are therefore effective at identifying zero-day attacks. Nevertheless, the greater chance of false positives means that more time and money are needed to investigate every warning of a possible threat.

NIDS based on machine learning [18] can learn classification models from training sets. Training the model with large and varied samples of network data makes it robust in categorizing attacks into their likely groups. Deep learning models, which identify attack patterns based on network characteristics, are also essential to NIDS. Additionally, they remove the need for feature representation, selection, and correlation [19]. A deep learning model learns the hidden network behavior and detects attacks effectively with fewer false alarms. Unfortunately, attackers are using advanced methods to exploit weaknesses in computing resources, and the number of computing infrastructure breaches is rising at an exponential rate. This research proposes a robust NIDS with excellent accuracy and F-Measure using machine learning and deep learning approaches.
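The contrast between signature-based and anomaly-based detection described above can be illustrated with a minimal sketch. It is not the system proposed in this paper: the signature set, the traffic measurements, and the z-score threshold of 3 are all hypothetical values chosen for illustration.

```python
from statistics import mean, stdev
from typing import Sequence

# Hypothetical signature database: fingerprints of known attacks (assumed values).
KNOWN_SIGNATURES = {"deadbeef", "cafebabe"}


def signature_alert(fingerprint: str) -> bool:
    """Signature-based check: alert only on an exact match with a known threat."""
    return fingerprint in KNOWN_SIGNATURES


def anomaly_alert(value: float, baseline: Sequence[float], threshold: float = 3.0) -> bool:
    """Anomaly-based check: alert when a value deviates from the baseline
    by more than `threshold` sample standard deviations (a simple z-score rule)."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical bytes-per-flow measurements taken from normal traffic.
normal_bytes = [500.0, 520.0, 480.0, 510.0, 495.0, 505.0]

zero_day_seen = signature_alert("0badf00d")          # unknown fingerprint, so no alert
known_seen = signature_alert("deadbeef")             # known fingerprint, so alert
burst_flagged = anomaly_alert(5000.0, normal_bytes)  # far outside the baseline
typical_flagged = anomaly_alert(502.0, normal_bytes) # close to the baseline
```

The signature check misses the unseen fingerprint, while the anomaly check flags the unusual flow without any prior knowledge of the attack. A looser threshold would catch more attacks at the cost of more false alarms, which is the trade-off noted above.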

2. RELATED WORKS

Moustafa et al. used multivariate skewness, multivariate kurtosis, and the Kolmogorov-Smirnov test to statistically analyze the observations and features. To assess the relationships between features, supervised feature correlation using the Gain Ratio and unsupervised feature correlation using Pearson's correlation coefficient were also carried out. Lastly, the complexity of the UNSW-NB15 dataset was assessed using existing classifiers, with accuracy and false alarm rate as metrics. The decision tree classifier performed best, yielding an accuracy of 85.56% and a false alarm rate of 15.78% [13].

An intrusion detection system, a frequent signature database, an updating agent, and a complementary signature database make up the four components of the NIDS that Almutairi et al. proposed [1]. The IDS extracts signatures from network packets, compares them with the signature databases, and raises an alarm if there is a match. This four-component approach provides early and accurate attack detection with fewer false positives. Moreover, attacks with rare signatures are detected using the signatures stored in the complementary database. The primary problem in signature-based detection is false alarm minimization, which can be addressed by stateful signatures, vulnerability signatures, and signature augmentation.

Meftah et al. proposed using machine learning techniques in anomaly-based NIDS. A random forest with 10-fold cross-validation was used to rank features by their importance in reducing impurity across the entire forest. ct_dst_src_ltm, ct_srv_dst, ct_dst_sport_ltm, ct_src_dport_ltm, and ct_srv_src are the top features of the UNSW-NB15 dataset. In the binary classification model for attack detection, Support Vector Machine surpassed Logistic Regression and Gradient Boost Machine with an accuracy of 82.11%. In the multi-class model, Decision Tree C5.0 beat Naive Bayes and Support Vector Machine in identifying the type of attack.

To detect attacks (Normal, DoS, Probe, R2L, and U2R categories), Peng et al. proposed a Deep Neural Network (DNN) with five hidden layers using the NSL-KDD dataset. They then compared the network's performance with machine learning models (Support Vector Machine, Random Forest, and Linear Regression). The DNN yielded acceptable results in classifying the Normal, DoS, and Probe categories. SVM did a good job of identifying four attacks in addition to Normal. Random Forest and Linear Regression were also effective in detecting network threats [24].

Earlier research on the UNSW-NB15 dataset trained machine learning and deep learning models on a subset of features. This reduced model performance because the feature set had only 47 features, which is a relatively small number given the importance of each feature. There is a dearth of literature on deep neural networks, and what little there is has addressed only a small number of attacks. To detect network threats, this study uses four classical machine learning models, three ensemble machine learning models, and deep multi-layer perceptron models.

3. PROPOSED METHODOLOGY

The Australian Centre for Cyber Security created the UNSW-NB15 dataset [22], which has 49 features (five flow features, thirteen basic features, eight content features, nine time features, five additional generated general-purpose features, seven additional generated connection features, and two label features) and nine families of attacks, on which the proposed method has been tested. These features capture the patterns of modern network traffic gathered from packet headers and server-to-client communications [21]. The dataset comprises 2,540,044 samples containing network traffic related to Backdoors (2,329), DoS (16,353), Exploits (44,525), Normal (2,218,761), Fuzzers (24,246), Analysis (2,677), Reconnaissance (13,987), Shellcode (1,511), and Worms (174).

Support Vector Machine (SVM), Adaboost, XGBoost, Random Forest, K-Nearest Neighbor (KNN), Decision Tree (DT), Multi-Layer Perceptron (MLP), and Deep Multi-Layer Perceptron (Deep MLP) are the binary classification models used to identify network threats [26]. SVM is a statistical learning classifier that distinguishes between network attack patterns and regular traffic patterns using multidimensional hyperplanes [33]. SVM with a linear kernel allows for quick

training and accommodates a large feature set. KNN is a memory-based classifier that predicts the class label of an unknown sample from the majority class label of its K nearest neighbors [5]. The DT classifier is a multistage decision-making approach that works with both nominal and numerical data [2]. Because it consists of only a small number of nested simple conditional expressions, it generates decisions quickly. The Random Forest classifier is an ensemble classification model that uses bagging to train a set of CART trees in order to produce predictions. This model employs out-of-bag error estimation, random feature selection to determine the split at each node, and averaging of the class probabilities assigned by each tree to determine the final class [4]. Adaboost combines a number of weak models to create a strong prediction model. It is often described as the best out-of-the-box classifier, requires no parameter tuning, and is less prone to overfitting [9]. Extreme Gradient Boosting (XGBoost) is an ensemble classifier that boosts weak classifiers using a parallel implementation. Additionally, it facilitates hardware optimization, avoids overfitting, handles missing data, and prunes trees.

Deep neural networks are essential for mapping input features to class labels. Two models trained with backpropagation weight updates are built to predict class labels: Multi-Layer Perceptron (MLP) and Deep MLP. The MLP uses the quasi-Newton lbfgs optimizer, 15 hidden neurons, a regularization penalty parameter alpha of 1e-5, and a random state of 1. The proposed Deep MLP consists of two dense layers with ReLU activation followed by a dense layer with a sigmoid activation function. The dense layers with ReLU activation combine the input features, and the sigmoid layer predicts the target class from the outputs of the preceding layer.

4. RESULTS & DISCUSSION

The proposed network intrusion detection system was implemented on Ubuntu 20.04.4 with an 11th-generation Intel Core i7 CPU, 16 GB of RAM, and a 1 TB hard drive. The machine learning and deep learning models were implemented in Python using the Keras 2.3.1 and TensorFlow 2.2.0 libraries.

Accuracy, Precision, Recall, F1-Measure, Receiver Operating Characteristic Curve (ROC), and Area under ROC (AUC) are the metrics used to assess the proposed system [8]. True positive (TP), False negative (FN), False positive (FP), and True negative (TN) values make up the 2*2 confusion matrix. True Positive is the number of attack samples that are classified as attacks. False Negative is the number of attack samples that are incorrectly identified as normal. False Positive is the number of normal samples that are incorrectly classified as attacks, while True Negative is the number of normal samples that are correctly classified. Accuracy is the percentage of samples that are correctly classified. Recall is the ratio of correctly classified attacks to the total number of attack samples. Precision is the fraction of samples classified as attacks that are real attacks.

The machine learning models were compared on their performance in detecting network attacks. Decision Tree yielded the best results, with an accuracy of 99.05% and an F1-Measure of 0.99. SVM also performed well, with an accuracy of 95.17% and an F1-Measure of 0.94. The KNN models were trained for K values from 2 to 9, with K=7 yielding the best accuracy of 95.58%. The features used to train the models are highly significant and differ considerably between normal network traffic and attacks. Thus, ensemble learning techniques like bagging in Random Forest and boosting in Adaboost and XGBoost are inferior to simple decision trees.
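The confusion-matrix metrics used in this evaluation can be computed as in the following minimal sketch. It assumes binary labels with 1 for attack and 0 for normal; the function names and the toy labels are hypothetical and are not taken from the experiments reported here.

```python
def confusion_counts(y_true, y_pred):
    """Count (TP, FN, FP, TN) for binary labels: 1 = attack, 0 = normal."""
    tp = fn = fp = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1   # attack correctly flagged
        elif truth == 1 and pred == 0:
            fn += 1   # attack missed
        elif truth == 0 and pred == 1:
            fp += 1   # normal traffic raised a false alarm
        else:
            tn += 1   # normal traffic correctly passed
    return tp, fn, fp, tn


def summarize(tp, fn, fp, tn):
    """Derive accuracy, precision, recall, and F1 from the four counts."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy labels: four attacks followed by six normal samples.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

counts = confusion_counts(y_true, y_pred)
scores = summarize(*counts)
```

On a dataset dominated by normal traffic, accuracy alone can look high even when many attacks are missed, which is why precision, recall, and the F1-Measure are reported alongside it.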

Figure 1: ROC & Precision Recall curve for all ML Models

With an 80:20 Train-Test ratio and the Adam optimizer, the Deep Multi-Layer Perceptron performs exceptionally well, achieving an accuracy of 98.44% and an F1-Measure of 0.98. Adam outperforms the stochastic gradient descent optimizer. For the 80:20 Train-Test split, the Deep MLP model with the Adam optimizer yielded the best results since it accurately reflects the ratio of attack network traffic to normal system traffic. By comparison, the network anomaly detection method of Moustafa et al. achieved 85.56% accuracy for Decision Trees and 81.34% accuracy for Artificial Neural Networks.

5. CONCLUSION

Deep learning and machine learning models for NIDS are covered in this work. In terms of performance, the Decision Tree classifier outperformed the Random Forest, Adaboost, and XGBoost ensemble models, with an accuracy of 99.05%. The UNSW-NB15 dataset has robust and highly relevant features for detecting network attacks. With seven neighbors, the KNN machine learning model yielded an accuracy of 95.58%. With high true positives and low false negatives, the deep learning model using the ADAM optimizer and an 80:20 Train-Test split yielded an accuracy of 98.44%. Machine learning models are also useful for classifying several types of attacks, including reconnaissance, fuzzers, exploits, and generic attacks.

The UNSW-NB15 features are not strong enough to capture the variations among DoS, Worms, Backdoors, and Shellcode attacks. In the future, the contents of network traffic pcap files can be used to directly train deep learning models, such as one-dimensional Convolutional Neural Networks (CNN), to recognize attacks. With its convolutional layers, a CNN automatically extracts features from pcap data, removing the need for intricate feature engineering. Visual representation is a new technique being used for malware identification and network attack prevention. Network pcap files can be represented as greyscale, RGB, or Markov images. These images are used to train a two-dimensional CNN, which can identify network attacks with a high degree of covariance.

REFERENCES

[1] Almutairi, A.H., Abdelmajeed, N.T., 2017. Innovative signature based intrusion detection system: Parallel processing and minimized database, in: 2017 International Conference on the Frontiers and Advances in Data Science (FADS), IEEE. pp. 114–119.
[2] Ammar, A., et al., 2015. A decision tree classifier for intrusion detection priority tagging. Journal of Computer and Communications 3, 52.
[3] Arce, I., 2004. The shellcode generation. IEEE Security & Privacy 2, 72–76.
[4] Belgiu, M., Drăguţ, L., 2016. Random forest in remote sensing: A review of applications and future directions. ISPRS Journal of Photogrammetry and Remote Sensing 114, 24–31.
[5] Cunningham, P., Delany, S.J., 2021. k-nearest neighbour classifiers - a tutorial. ACM Computing Surveys (CSUR) 54, 1–25.

[6] Dada, E.G., Bassi, J.S., Chiroma, H., Adetunmbi, A.O., Ajibuwa, O.E., et al., 2019. Machine learning for email spam filtering: review, approaches and open research problems. Heliyon 5, e01802.
[7] De Canniere, C., Biryukov, A., Preneel, B., 2006. An introduction to block cipher cryptanalysis. Proceedings of the IEEE 94, 346–356.
[8] Dhanya, K., Dheesha, O., Gireesh Kumar, T., Vinod, P., 2020. Detection of obfuscated mobile malware with machine learning and deep learning models, in: Symposium on Machine Learning and Metaheuristics Algorithms, and Applications, Springer. pp. 221–231.
[9] Freund, Y., Schapire, R.E., 1997. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences 55, 119–139.
[10] Friedman, J.H., 2001. Greedy function approximation: a gradient boosting machine. Annals of Statistics, 1189–1232.
[11] Gandhi, M., Srivatsa, S., 2008. Detecting and preventing attacks using network intrusion detection systems. International Journal of Computer Science and Security 2, 49–60.
[12] Garuba, M., Liu, C., Fraites, D., 2008. Intrusion techniques: Comparative study of network intrusion detection systems, in: Fifth International Conference on Information Technology: New Generations (ITNG 2008), IEEE. pp. 592–598.
[13] Gascon, H., Orfila, A., Blasco, J., 2011. Analysis of update delays in signature-based network intrusion detection systems. Computers & Security 30, 613–624.
[14] Hubballi, N., Suryanarayanan, V., 2014. False alarm minimization techniques in signature-based intrusion detection systems: A survey. Computer Communications 49, 1–17.
[15] Jing, D., Chen, H.B., 2019. SVM based network intrusion detection for the UNSW-NB15 dataset, in: 2019 IEEE 13th International Conference on ASIC (ASICON), IEEE. pp. 1–4.
[16] Kumar, V., Sangwan, O.P., 2012. Signature based intrusion detection system using Snort. International Journal of Computer Applications & Information Technology 1, 35–41.
[17] Lee, B., Amaresh, S., Green, C., Engels, D., 2018. Comparative study of deep learning models for network intrusion detection. SMU Data Science Review 1, 8.
[18] Lee, C.H., Su, Y.Y., Lin, Y.C., Lee, S.J., 2017. Machine learning based network intrusion detection, in: 2017 2nd IEEE International Conference on Computational Intelligence and Applications (ICCIA), IEEE. pp. 79–83.
[19] Li, P., Salour, M., Su, X., 2008. A survey of internet worm detection and containment. IEEE Communications Surveys & Tutorials 10, 20–35.
[20] Meftah, S., Rachidi, T., Assem, N., 2019. Network based intrusion detection using the UNSW-NB15 dataset. International Journal of Computing and Digital Systems 8, 478–487.
[21] Moustafa, N., Slay, J., 2015a. The significant features of the UNSW-NB15 and the KDD99 data sets for network intrusion detection systems, in: 2015 4th International Workshop on Building Analysis Datasets and Gathering Experience Returns for Security (BADGERS), IEEE. pp. 25–
[22] Moustafa, N., Slay, J., 2015b. UNSW-NB15: a comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set), in: 2015 Military Communications and Information Systems Conference (MilCIS), IEEE. pp. 1–6.
[23] Moustafa, N., Slay, J., 2016. The evaluation of network anomaly detection systems: Statistical analysis of the UNSW-NB15 data set and the comparison with the KDD99 data set. Information Security Journal: A Global Perspective 25, 18–31.
[24] Peng, Y., Su, J., Shi, X., Zhao, B., 2019. Evaluating deep learning based network intrusion detection system in adversarial environment, in: 2019 IEEE 9th International Conference on Electronics Information and Emergency Communication (ICEIEC), IEEE. pp. 61–66.
[25] Rieger, P., Nguyen, T.D., Miettinen, M., Sadeghi, A.R., 2022. Deepsight: Mitigating backdoor attacks in federated learning through deep model inspection. arXiv preprint arXiv:2201.00763.
[26] Sugunan, K., Gireesh Kumar, T., Dhanya, K., 2018. Static and dynamic analysis for Android malware detection, in: Advances in Big Data and Cloud Computing. Springer, pp. 147–155.
