A Data-Driven Approach For Classifying and Predicting DDoS Attacks With Machine Learning
Abstract:- The importance of IoT security is growing as a result of the growing number of IoT devices and their many applications. According to recent research on network security, distributed denial of service (DDoS) attacks on IoT systems have become more frequent, more sophisticated, and more varied, making DDoS one of the most formidable threats. DDoS attacks are used to carry out real, lucrative, and efficient cybercrimes, and they are among the most dangerous types of attack in network security. ML-based DDoS-detection systems continue to face obstacles that negatively impact their accuracy. AI, which incorporates ML to detect cyberattacks, is the most commonly used approach for this purpose. This study proposes using ML approaches to identify and counter DDoS attacks in Software-Defined Networking. The proposed model compares the F1-score, recall, accuracy, and precision of several ML techniques, including Cat Boost and the Extra Tree classifier. DDoS-Net is designed to handle data imbalance effectively and incorporates thorough feature analysis to enhance the model's detection capabilities. Evaluation on the UNSW-NB15 dataset demonstrates the performance of DDoS-Net: the highest accuracies achieved by the Cat Boost and Extra Tree classifiers are 90.78% and 90.27%, respectively. This work presents a strong and precise approach for DDoS attack detection, which improves the cybersecurity environment and strengthens digital infrastructures against these ubiquitous threats.

Keywords:- Denial-of-Service (DoS), Attack, Classification, Identification, Machine Learning.

I. INTRODUCTION

These days, almost every aspect of contemporary life is impacted by the "IoT" [1]. The IoT comprises a diverse array of devices, each with a different technical background, which leaves them open to potential security risks. Each entity has different security basics and qualities, so it has become difficult to find a single solution that can safely solve every issue. Attackers may choose to target IoT devices because of their insufficient security infrastructure. Furthermore, the Internet's service offering makes it possible to conduct banking and financial operations, communicate, engage in e-commerce, shop, make payments online, access healthcare, and get an education online [2]. These services are particularly susceptible to cyberattacks due to their extensive use. The most prevalent and deadly kind of cyberattack is the DDoS attack [3]. Numerous services are being interrupted.

Denial of service, or DoS, describes what happens when a system delivers malicious messages to a server. When several hacked systems or computers launch DoS attacks against a single application, it is known as a DDoS attack. A deluge of packets from all corners of the globe is then sent towards the targeted network. The proliferation of disruptive Internet technologies is causing DDoS attacks to evolve and grow in both number and sophistication [4][5]. Cyber threats that might seriously affect a business's operations include ransom demands from attackers, data theft, and disruptions.

Responding quickly to DDoS attacks is the best way to limit their impact. Internet-connected devices have become more appealing targets for cyberattacks due to the expanding use of the internet. As ML and DL [6][7] reveal their enormous potential in multiple areas, academics and industry are investigating the use of these technologies for DDoS detection. Traditional approaches are slower and less accurate when it comes to risk detection. Using an ML method, threats may be identified. DL may thus be a useful DDoS detection technique.

Contribution of Study
This research contributes to the field of cybersecurity by implementing ML techniques for the classification and prediction of DDoS attacks. The main contributions of this study are:

• Implementation of ML models for DDoS attack detection and classification with the UNSW-NB15 dataset.
• Feature selection using the Select K-Best method with the ANOVA F-test to identify relevant features.
• Data normalization using Min-Max Scaler to ensure consistent data scaling.
• Application of Cat Boost and ETC for robust prediction performance.
• Metrics for assessing the model's efficacy, including F1-score, recall, accuracy, and precision.

Structure of Paper
The rest of this study is organised as follows: In Section 2, the study's context is examined. Section 3 presents the full methodology of this investigation. In Section 4,
the study's results and assessments are discussed. Section 5 presents the findings of the study and recommendations for future work.

II. LITERATURE REVIEW

Machine learning/deep learning (ML/DL) has previously been shown to be an effective method for identifying DDoS attacks. Some of the previous research is summarised below.

Jiyad et al. (2024) present a novel ensemble model that can identify DDoS attacks. The approach leverages ML algorithms such as LR, RF, DT, and XGBoost classifiers to detect and classify these malicious attacks effectively. The authors use the explainable Artificial Intelligence (XAI) models SHAP and LIME to improve the ML models' readability and transparency, giving a better understanding of difficult predictions and model behavior. The evaluation results demonstrate that the XGBoost ensemble model outperforms the other classifiers, achieving an accuracy of 97% and an F-score of 97%. The precision and recall are 98% and 96%, respectively [8].

Al-Eryani, Hossny and Omara (2024) provide a comparative study of recent ML algorithms tested on the CICDoS2019 dataset. The objective of this comparison is to determine the most effective ML algorithm for DDoS detection. The comparative study finds that the Gradient Boosting (GB) and XGBoost algorithms are highly accurate, correctly predicting the type of network traffic with 99.99% and 99.98% accuracy, respectively, in addition to a low false alarm rate of approximately 0.004 for GB [9].

Kaur, Sandhu and Bhandari (2023) developed effective ML classifiers utilising attributes from the SDN dataset to identify DDoS attacks at the application layer. To narrow down the feature set, they used ICA, PCA, and LDA. ML classifiers are then built on the extracted features, and DDoS attack prediction is carried out at the application layer. Out of 13 features, one was extracted using the LDA model, which provides the highest detection accuracy for the classifiers in use. Results are analysed by comparing the proposed work to earlier research. The result analysis using DT, RF, and SVC reaches an accuracy of up to 99.6% [10].

Patil et al. (2022) create an ML-based model to forecast DDoS flooding attacks of several kinds. These attacks were classified using ML models such as decision tree classifiers (DTC), MLP, KNN, and LR. The implementation used a Jupyter notebook with the necessary Python libraries loaded. Of these four classifiers, KNN and DTC showed almost identical performance, with the highest accuracy of 99.98 percent, in predicting TCP and ICMP flooding attacks. For UDP flooding attacks, the DTC performed best, with an accuracy of 77.23 percent [11].

Cybersecurity is a critical topic in the field of internet security (Tufail, Batool and Sarwat, 2022). Cyberattacks affect many industries, with thousands occurring every year. DDoS and FDIA are two of the most deadly cyberattacks. Two machine learning techniques, LR and SNN, were compared in this research in order to predict DDoS attacks. An accuracy of 99.85% was attained for SNN and 98.63% for logistic regression. The analysis reveals that, in contrast to logistic regression, SNN required a significantly longer training period [12].

Despite significant advancements in machine learning techniques for DDoS attack detection and classification, several gaps remain in the current research. While numerous studies have demonstrated high accuracy using various algorithms, there is a lack of comprehensive comparison across diverse datasets and attack types. Although some studies showcase impressive performance with XGBoost and Gradient Boosting, they do not address performance consistency across different attack scenarios. Additionally, existing research focuses on specific attack types or datasets but lacks a holistic approach incorporating a wide range of attacks and feature reduction techniques. Furthermore, the computational efficiency and scalability of models are not thoroughly explored. Closing these gaps could improve DDoS detection systems' resilience and applicability. For a detailed overview of related work, refer to Table 1: Related work on DDoS Attacks using ML and DL techniques.
Table 1 Related Work on DDoS Attacks using Machine and Deep Learning Techniques

| Ref | Methods | Dataset | Performance | Limitation/Remarks |
|---|---|---|---|---|
| Jiyad et al. (2024) | LR, RF, DT, XGBoost + SHAP, LIME (XAI tools) | Custom dataset | XGBoost: Accuracy 97%, F-score: 97%, Precision: 98%, Recall: 96% | Limited to a specific dataset, lacks real-time implementation analysis |
| Al-Eryani, Hossny, and Omara (2024) | Gradient Boosting, XGBoost | CICDoS2019 | GB Accuracy: 99.99%, XGBoost Accuracy: 99.98% | Focuses only on ML algorithms, no DL models explored |
| Kaur, Sandhu, and Bhandari (2023) | PCA, LDA, ICA with Decision Tree, Random Forest, SVM | SDN dataset | LDA Accuracy: 99.6% with ML classifiers | Limited to application-layer DDoS attacks, lacks DL exploration |
| Patil et al. (2022) | LR, KNN, MLP, DT | Custom dataset | KNN & Decision Tree: 99.98% (TCP/ICMP attacks), Decision Tree: 77.23% (UDP attacks) | Lower accuracy for UDP attack prediction (77.23%), only classical ML methods |
| Tufail, Batool, and Sarwat (2022) | Logistic Regression, Shallow Neural Network (SNN) | Custom dataset | SNN Accuracy: 99.85%, Logistic Regression: 98.63% | High training time for SNN, no other DL models evaluated |
1 https://www.kaggle.com/datasets/mrwellsdavid/unsw-nb15?select=UNSW_NB15_training-set.csv
Label Encoding on the Categorical Column
Categorical variables are those that can take on a small, fixed range of values. Examples include colour (red, blue, green), size (small, medium, big), and location (city, suburban, rural, etc.) [13]. Categorical variables may be encoded in a number of ways.

Label Encoding is one approach; it assigns a numeric value to each distinct category. For a colour feature with the categories green, blue, and red, for example, the corresponding encoded values would be 0, 1, and 2, respectively. Keep in mind that this method may mislead the model, because the numeric codes can unintentionally imply an ordinal relationship among the categories.

Feature Selection using Select k-Best with Anova f-Test
The first step is to partition the dataset into the features and the target variable [14]. After that, the most significant features are found by using the SelectKBest technique combined with the ANOVA F-test, given the desired number of features to keep. SelectKBest scores each feature relative to the target variable and uses that score to choose the top k features [15]. To improve the model's performance, this method focuses on the features that are most strongly related to the dependent variable.

Normalization with Minmax Scaler
Normalisation, or Min-Max scaling, is a commonly used method. This approach shifts and rescales the values so that they lie between 0 and 1 [16]. Formula (1) is used to perform the transformation.

The ETR and RF systems differ from one another in two important ways. First, the ETR splits nodes by randomly selecting among all the cutting points. Secondly, to reduce bias, it grows the trees using all of the learning samples. The splitting procedure is controlled by two parameters: k, the number of attributes that are randomly picked for each node, and nmin, the minimum sample size needed to split a node. k and nmin, respectively, dictate the strength of the attribute selection and the strength of the averaging of output noise. These two parameters increase the ETR model's accuracy and reduce overfitting [18][19].
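The pre-processing steps described above (label encoding, ANOVA F-test scoring for top-k selection, and Min-Max scaling) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the function names and the toy data are hypothetical, the label encoder numbers categories in order of first appearance to match the green/blue/red example (library encoders such as scikit-learn's sort the categories instead), and the scaling uses the standard form x' = (x - min) / (max - min), which is assumed to be what formula (1) refers to.

```python
from collections import defaultdict


def label_encode(values):
    """Assign 0, 1, 2, ... to categories in order of first appearance."""
    mapping = {}
    for value in values:
        mapping.setdefault(value, len(mapping))
    return [mapping[v] for v in values], mapping


def min_max_scale(column):
    """Rescale a numeric column to [0, 1]; a constant column maps to zeros."""
    lo, hi = min(column), max(column)
    span = hi - lo
    if span == 0:
        return [0.0 for _ in column]
    return [(x - lo) / span for x in column]


def anova_f_score(feature, labels):
    """One-way ANOVA F statistic of one feature grouped by class label."""
    groups = defaultdict(list)
    for x, y in zip(feature, labels):
        groups[y].append(x)
    n, k = len(feature), len(groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two classes and more samples than classes")
    grand_mean = sum(feature) / n
    means = {y: sum(g) / len(g) for y, g in groups.items()}
    ss_between = sum(len(g) * (means[y] - grand_mean) ** 2 for y, g in groups.items())
    ss_within = sum((x - means[y]) ** 2 for y, g in groups.items() for x in g)
    if ss_within == 0:
        return float("inf")
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_k_best(columns, labels, k):
    """Rank named feature columns by F score and keep the top k names."""
    scores = {name: anova_f_score(col, labels) for name, col in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Toy data (hypothetical, not taken from UNSW-NB15).
encoded, mapping = label_encode(["green", "blue", "red", "blue"])
labels = [0, 0, 1, 1]
columns = {
    "feature_a": [1.0, 2.0, 9.0, 10.0],  # separates the two classes well
    "feature_b": [5.0, 6.0, 5.0, 6.0],   # same distribution in both classes
}
selected, scores = select_k_best(columns, labels, k=1)
scaled = min_max_scale(columns[selected[0]])
print(encoded, mapping)
print(selected, scores)
print(scaled)
```

Scoring each feature independently keeps the selection step cheap, but it also means that redundant features can both score highly; that trade-off is inherent to univariate selection rather than specific to this sketch.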
Keras, Pandas, NumPy, Seaborn, Matplotlib, Scikit-learn, and TensorFlow, enabling efficient model development and data processing. The hardware setup for the pre-processing phase includes a system equipped with an Intel(R) Core(TM) i3-6100U CPU @ 2.30GHz, 2304 MHz, 2 Cores, and 4 Logical Processors, along with 8 GB of RAM and a 256 GB SSD. Additionally, for computationally intensive tasks, Google Research provides access to dedicated GPUs and TPUs, enhancing the performance of the ML models used in this project.

Exploratory Data Analysis
This section uses exploratory data analysis (EDA) to examine the data closely. To facilitate understanding, this study employs graphical representations of the data. EDA is used to explore the data and summarise its most important characteristics; its statistical insights and visualizations help reveal patterns or trends. The following data visualization graphs are provided in this section.
Figure 2 shows the count plot for the distribution of service on the UNSW_NB15 data. The "count" y-axis is marked up to 40,000, and the "service" x-axis ranges from 0 to 6. The tallest bar corresponds to service value "0", which has the highest count (well above 40,000).
The distribution of seven network traffic states is shown in figure 3 by the count plot of the UNSW_NB15 dataset. The x-axis represents "state," and the y-axis indicates "COUNT." The first two states have significantly higher counts (around 40,000 and 35,000), while the remaining states range from 10,000 to 5,000, and the last state has a count of 0.
The bar graph in figure 4, Distribution of attack cat on UNSW_NB15 data, displays 9 different attack categories on the x-axis and their respective counts on the y-axis. The first bar is significantly taller, indicating a higher frequency for that attack category. Although the exact labels for the categories are not visible, the graph effectively shows the overall distribution of cyber-attacks within the dataset.
The box plot in figure 5 displays various features of the UNSW_NB15 dataset on the x-axis, such as 'dur', 'spkts', 'dpkts', and 'sbytes', while the y-axis, scaled logarithmically, shows the values of these features. Each box represents the distribution of a feature, indicating the median (line inside the box), quartiles (box edges), and potential outliers (dots beyond the whiskers). This visualization facilitates quick comparison of central tendency, variability, and outliers across different features.
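The quantities a box plot draws (median, quartiles, and outliers beyond the whiskers) can also be computed directly. The sketch below is illustrative only: the function name and sample values are hypothetical, and it assumes the common convention of placing the whisker limits 1.5 interquartile ranges beyond the box edges, which the source does not state.

```python
import statistics


def box_plot_summary(values):
    """Return the median, quartiles, and outliers a box plot would show."""
    # method="inclusive" treats the data as the full population of interest.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr  # assumed whisker convention (1.5 * IQR)
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "outliers": outliers,
    }


# Hypothetical packet counts with one extreme value.
summary = box_plot_summary([1, 2, 3, 4, 100])
print(summary)
```

Heavy-tailed traffic features such as byte counts tend to produce many such outliers, which is one reason a logarithmic axis is often used when plotting them.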
Figure 6 displays the feature importance scores generated by SelectKBest. The y-axis lists the features (such as ‘ct_dst_sport_ltm’, ‘ct_src_dport_ltm’, etc.). The x-axis shows the importance scores, ranging from 0 to 8000. Each feature has a corresponding bar whose length indicates its importance score.

Evaluation Parameter
Model performance may be better understood with the use of evaluation metrics. A key feature of evaluation metrics is their ability to differentiate between different model outputs. In general, the values used to compute these measures are obtained from the confusion matrix (see figure 7 below), which displays the correctness of the model in a very intuitive way. This matrix is N × N, where N is the number of predicted classes.

The four-class classification system divides instances (examples) into four separate groups: Class A, Class B, Class C, and Class D. Positive (1) and negative (0) stand for the predicted values, whereas true (1) and false (0) indicate the actual values. Classification models are evaluated using the confusion matrix terms TP, TN, FP, and FN.

Accuracy
Accuracy is the percentage of correct predictions out of the total number of predictions. Equation (5) was used to calculate accuracy.

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{5}$$

Recall
Recall is the ratio of correctly classified positive samples (TP) to the total number of samples in the actual positive class (TP and FN samples together), as given by equation (6).

$$\text{Recall} = \frac{TP}{TP + FN} \tag{6}$$
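Equations (5) and (6) can be checked numerically with a few lines of Python. The sketch below is a minimal illustration with hypothetical counts; precision and F1-score are included using their standard definitions (TP / (TP + FP) and the harmonic mean of precision and recall), which are assumptions here because their equations do not appear in this section.

```python
def accuracy(tp, tn, fp, fn):
    """Equation (5): correct predictions over all predictions."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def recall(tp, fn):
    """Equation (6): true positives over all actual positives."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def precision(tp, fp):
    """Standard definition (assumed): true positives over predicted positives."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def f1_score(p, r):
    """Standard definition (assumed): harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical confusion-matrix counts for a binary attack/normal split.
tp, tn, fp, fn = 80, 90, 10, 20
acc = accuracy(tp, tn, fp, fn)
rec = recall(tp, fn)
prec = precision(tp, fp)
f1 = f1_score(prec, rec)
print(f"accuracy={acc:.4f} recall={rec:.4f} precision={prec:.4f} f1={f1:.4f}")
```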
Figure 8 shows a bar graph of the proposed models' performance. Both ETC and Cat Boost demonstrate strong capabilities across accuracy, precision, recall, and F1-score. Cat Boost slightly outperforms ETC in accuracy (90.78% vs. 90.27%) and precision (90.58% vs. 89.86%), showing a slight edge in correctly predicting positive instances and minimizing false positives. Recall is 90.78% for Cat Boost and 90.27% for ETC, as reported in Figures 11 and 9, so Cat Boost also captures slightly more true positive instances. F1-scores also favor Cat Boost slightly, at 90.37% compared to ETC's 89.89%, reflecting a better balance between precision and recall. Overall, while both models perform well, Cat Boost demonstrates slightly superior performance in accuracy and F1-score, making it a favorable choice for tasks requiring robust predictive performance.
Figure 9 displays the ETC's classification report, which includes a total of ten categories. The classifier's accuracy is 90.27%, showing a good match between model predictions and labels. The precision of ETC is 89.86%, recall is 90.27%, and F1-score is 89.89%. The model displays varied performance across different classes: it excels in precision for classes 0, 5, and 6 but struggles with recall in classes 0, 8, 1, and 9. Classes 3, 4, and 7 show moderate to good performance with balanced precision and recall. The overall accuracy is 0.90 on a support of 15124 samples.
The confusion matrix of the ETC is shown in Fig. 10, where the actual class labels (0–9) are shown on the y-axis and the predicted class labels on the x-axis. Deeper hues in a cell indicate more predictions for that true-predicted label pair. Diagonal cells hold each class's correct predictions, also known as true positives.
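The layout described above, with actual labels on the rows and predicted labels on the columns, can be reproduced with a small Python sketch. The labels and function names below are hypothetical and only illustrate how per-class precision, recall, and support follow from the matrix; the sketch also shows that a support-weighted average of per-class recall equals overall accuracy, which would explain a reported recall that matches accuracy if the report uses weighted averaging (an assumption, since the averaging method is not stated).

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are actual classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for actual, predicted in zip(y_true, y_pred):
        matrix[actual][predicted] += 1
    return matrix


def per_class_report(matrix):
    """Precision, recall, and support for each class."""
    n = len(matrix)
    report = {}
    for c in range(n):
        tp = matrix[c][c]                                # diagonal cell
        support = sum(matrix[c])                         # row total
        predicted = sum(matrix[r][c] for r in range(n))  # column total
        report[c] = {
            "precision": tp / predicted if predicted else 0.0,
            "recall": tp / support if support else 0.0,
            "support": support,
        }
    return report


def weighted_recall(report):
    """Support-weighted average of per-class recall."""
    total = sum(row["support"] for row in report.values())
    return sum(row["recall"] * row["support"] for row in report.values()) / total


# Hypothetical three-class labels (the study itself uses ten classes).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
matrix = confusion_matrix(y_true, y_pred, n_classes=3)
report = per_class_report(matrix)
overall_accuracy = sum(matrix[c][c] for c in range(3)) / len(y_true)
print(matrix)
print(report)
print(overall_accuracy, weighted_recall(report))
```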
Figure 11 illustrates the Cat Boost classifier's classification report, which includes 10 classes. The classifier's accuracy is 90.78%, showing a good match between model predictions and labels. The precision of the Cat Boost classifier is 90.58%, recall is 90.78%, and F1-score is 90.37%. The model displays varied performance across different classes: it excels in precision for classes 0, 5, and 6 but struggles with recall in classes 0, 8, 1, and 9. Classes 3, 4, and 7 show moderate to good performance with balanced precision and recall. The overall accuracy is 0.91 on a support of 15124 samples.
Figure 12 displays the confusion matrix for the Cat Boost classifier. In this figure, the y-axis displays the actual labels while the x-axis displays the predicted labels. Both axes range from 0 to 9. Correct predictions lie along the diagonal, with darker blue indicating higher counts, such as 7058 for class 6. Off-diagonal cells show misclassifications, such as 55 instances where true label 0 was predicted as 1. This matrix helps identify correct classifications and common confusions, guiding model improvements.

Comparative Study
This section compares the performance of the base and proposed models across the performance parameters. The model performance comparison in Table 3 below demonstrates how well the proposed model performs in contrast to the base models.
[8]. Z. M. Jiyad, A. Al Maruf, M. M. Haque, M. Sen Gupta, A. Ahad, and Z. Aung, “DDoS Attack Classification Leveraging Data Balancing and Hyperparameter Tuning Approach Using Ensemble Machine Learning with XAI,” in 2024 Third International Conference on Power, Control and Computing Technologies (ICPC2T), 2024, pp. 569–575. doi: 10.1109/ICPC2T60072.2024.10475035.
[9]. A. M. Al-Eryani, E. Hossny, and F. A. Omara, “Efficient Machine Learning Algorithms for DDoS Attack Detection,” in 2024 6th International Conference on Computing and Informatics (ICCI), 2024, pp. 174–181. doi: 10.1109/ICCI61671.2024.10485168.
[10]. S. Kaur, A. K. Sandhu, and A. Bhandari, “Feature Extraction and Classification of Application Layer DDoS Attacks using Machine Learning Models,” in 2023 International Conference on Communication, Security and Artificial Intelligence, ICCSAI 2023, 2023. doi: 10.1109/ICCSAI59793.2023.10421652.
[11]. P. S. Patil, S. L. Deshpande, G. S. Hukkeri, R. H. Goudar, and P. Siddarkar, “Prediction of DDoS Flooding Attack using Machine Learning Models,” in Proceedings of the 3rd International Conference on Smart Technologies in Computing, Electrical and Electronics, ICSTCEE 2022, 2022. doi: 10.1109/ICSTCEE56972.2022.10100083.
[12]. S. Tufail, S. Batool, and A. I. Sarwat, “A Comparative Study Of Binary Class Logistic Regression and Shallow Neural Network For DDoS Attack Prediction,” in Conference Proceedings - IEEE SOUTHEASTCON, 2022. doi: 10.1109/SoutheastCon48659.2022.9764108.
[13]. W. Yustanti, N. Iriawan, and Irhamah, “Categorical encoder based performance comparison in preprocessing imbalanced multiclass classification,” Indones. J. Electr. Eng. Comput. Sci., 2023, doi: 10.11591/ijeecs.v31.i3.pp1705-1715.
[14]. V. Rohilla, S. Chakraborty, and R. Kumar, “Deep learning based feature extraction and a bidirectional hybrid optimized model for location based advertising,” Multimed. Tools Appl., vol. 81, no. 11, pp. 16067–16095, May 2022, doi: 10.1007/s11042-022-12457-3.
[15]. R. C. Chen, C. Dewi, S. W. Huang, and R. E. Caraka, “Selecting critical features for data classification based on machine learning methods,” J. Big Data, 2020, doi: 10.1186/s40537-020-00327-4.
[16]. A. Bhandari, “Feature Engineering: Scaling, Normalization and Standardization,” Analytics Vidhya.
[17]. P. Geurts, D. Ernst, and L. Wehenkel, “Extremely randomized trees,” Mach. Learn., vol. 63, no. 1, pp. 3–42, 2006, doi: 10.1007/s10994-006-6226-1.
[18]. V. John, Z. Liu, C. Guo, S. Mita, and K. Kidono, “Real-time lane estimation Using Deep features and extra trees regression,” in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2016. doi: 10.1007/978-3-319-29451-3_57.
[19]. G. Mishra, D. Sehgal, and J. K. Valadi, “Quantitative Structure Activity Relationship study of the Anti-Hepatitis Peptides employing Random Forest and Extra Tree regressors,” Bioinformation, 2017, doi: 10.6026/97320630013060.
[20]. A. V. Dorogush, V. Ershov, and A. Gulin, “CatBoost: gradient boosting with categorical features support,” pp. 1–7, 2018.
[21]. L. Prokhorenkova, G. Gusev, A. Vorobev, A. V. Dorogush, and A. Gulin, “Catboost: Unbiased boosting with categorical features,” in Advances in Neural Information Processing Systems, 2018.
[22]. H. Liu, L. Guo, H. Li, W. Zhang, and X. Bai, “Matching Areal Entities with CatBoost Ensemble Method,” J. Geo-Information Sci., 2022, doi: 10.12082/dqxxkx.2022.220050.