Hybrid Feature Selection
In the realm of business, it is crucial to establish robust security mechanisms for identifying web attacks. Advanced Intrusion Detection Systems (IDSs) can effectively fortify network security by operating at the network's perimeter. However, the presence of a vast number of features within the system presents a significant challenge in accurately detecting web attackers. To address this issue, a technique known as Recursive Feature Elimination and Mutual Information (RFEMI) is proposed. This technique aims to select the most essential features while eliminating duplicates, thereby simplifying attack detection and reducing computational time. The study conducted experiments to assess the effectiveness of the proposed feature selection technique in detecting web attacks using various Machine Learning Algorithms (MLAs) such as Decision Tree Classifier (DTC), XGB Classifier (XGB), Gradient Boosting Classifier (GBC) and K-Nearest Neighbor (KNN) for Network-based Intrusion Detection Systems (NIDS). The results demonstrate that the proposed system can accurately identify web attacks using the CICIDS-2017 dataset. Among the classifiers, the DTC classifier exhibited the highest accuracy at 0.9961, with a False Positive Rate (FPR) of 0.054.

Keywords: Computer networks, Intrusion Detection Systems, Machine Learning Algorithms, High dimensionality, web attack detection.

I. INTRODUCTION

Security mechanisms to detect web attacks are an essential part of the design of network devices; new technologies such as IoT and mobile devices pose new threats and challenges to web security. This is because web applications are widely used in our daily lives and across communication networks for client-server communication. Despite the efforts of researchers and developers who have implemented security measures like firewalls, data encryption, and user authentication, web attacks still pose a significant challenge to web applications. A study by Symantec in 2019 found that every month around 4,818 websites were compromised with formjacking code, and that cybercriminals could make up to $2.2 million per month by stealing credit card information from those compromised websites [3].

Furthermore, as stated by Symantec, almost one million new threats are distributed daily against web applications [5]. The WannaCry attack on the UK National Health Service in 2017 is a prominent example of a significant cyber-attack. The attack used a Windows Server Message Block protocol vulnerability to install backdoor tools and execute a ransomware package [6]. In another attack, on November 7, 2016, an online fraudster targeted Tesco Bank in the UK, resulting in cash being taken from approximately 40,000 accounts and causing a loss of about £600 per account [7].

A web application uses a web browser and its technology to complete precise tasks across the internet. Web applications use a combination of client-side scripts and servers to distribute data to legitimate users through a web browser. The client-side scripts are used to present specific data to users, while the server-side scripts help handle, store, and retrieve the data. This model allows for the efficient and effective sharing of data with people and businesses [2].

Web traffic is examined through the requests exchanged between servers and clients over the HTTP and HTTPS protocols. Web application security is essential because web applications store large amounts of user data and provide access to many assets such as online banking, online purchasing, and other services provided by financial organizations. Such applications need web systems to handle and store an enormous amount of data, including personally sensitive data such as username, password, date of birth, and credit card information, thereby eliciting varying degrees of cyber security threats. For example, a service cannot be considered reliable if it fails to protect sensitive data such as bank details. Consequently, web applications are vulnerable to a wide range of security attacks that can compromise Confidentiality, Integrity, and Availability, collectively known as the CIA triad. These attacks can occur at different levels of the web application's architecture, posing a significant challenge to its security.

Even though web applications have similarities with traditional IT systems, they have unique characteristics that make them more susceptible to attacks. This vulnerability requires special attention to secure web applications from various security threats [8]. For instance, remote code execution attacks target file systems; however, SQL injection
On the other hand, Cross Site Scripting (XSS) is another kind of web attack. Improperly written code gives attackers an opportunity to exploit known weaknesses. The attack takes advantage of the victim's exposure when visiting a vulnerable website, which serves as a medium for delivering malicious code to the target's browser [13].

The exposure of weak web applications and the essential data they hold, together with a sharp increase in the number of attacks on web-based systems, has highlighted the need to explore network security [14].

Among the works that deal with the web attack issue, several techniques have been suggested to identify web attacks, mainly using IDS. For example, reference [7] applied end-to-end deep learning to identify attacks autonomously in real time. They explore the potential of end-to-end deep learning in IDS and use deep learning in the whole process, starting with feature engineering and ending with prediction. Their approach works without users manually selecting features or constructing big labelled training sets. The authors evaluate unsupervised and semi-supervised learning methods to identify web attacks. Their methodology is based on the Robust Software Modelling Tool (RSMT), and their results show that RSMT can effectively identify web attacks.

The reference work [15] proposed a detection method using a new ensemble deep learning approach to detect web attacks; they first built three models to identify the attacks. After that, they used an ensemble classifier to make the final determination from the results of the three models. They performed experiments with real-world datasets running in a distributed environment, and with CSIC 2010 as a benchmark, to evaluate the proposed Web Attack Detection System (WADS). The investigation shows that the suggested method can identify web attacks correctly, with a performance of 99.47% accuracy, 99.29%, and 99.70% precision. Although the work achieved good accuracy, it seemed to overlook the performance cost of the ensemble methods, which consumed more time classifying the data.

As recommended by [9], a single classifier can detect anomalies with a reduced attack sample in the training dataset; the authors evaluated different classification algorithms, including the Random Forest (RF) classifier, Logistic Regression (LR), and Naïve Bayes (NB). The results showed that NB identified attacks with few training samples, such as R2L and U2R attacks, whereas RF and J48 recognized attacks such as DoS and Probe, with J48 showing slightly lower results than RF. They first evaluated the result using the complete NSL-KDD dataset, then used 20% of it in the second stage, and compared the results using Precision, Recall, and F-Score. Although the result shows a high precision score of 98.28% using 20% of the data, it may be inaccurate, as there is no evidence that this model will work with other data or real-time traffic, and it was not tested on novel attacks.

On the other hand, researchers [16] introduced an architectural scheme to develop a threat intelligence strategy to detect web attacks using a four-step approach as follows: (1) gathering web attack data by crawling websites, (2) extracting essential features using Association Rule Mining, (3) utilizing the obtained features to simulate web attack data, and (4) offering a new Outlier Gaussian Mixture (OGM) method to detect known and novel attacks using an AD method. They propose a method to capture network traffic data automatically, and they use the UNSW-NB15 dataset to compare results. Their result achieves a detection rate of 97.28% using the OGM method on the web attack dataset and 95.56% using the UNSW-NB15 dataset.

As previously mentioned, a majority of the studies and related research have failed to consider computer system utilization or the influence of big data on the detection of web attacks or zero-day attacks. Utilizing Machine Learning (ML) algorithms could prove advantageous, not only for traffic classification but also for investigating the underlying causes of features to determine their normality or anomaly. Initially, intrusion detection systems (IDS) operated on the principles of "deep packet inspection," involving the examination of packet payloads. The limited studies discussing web and zero-day attacks primarily focused on achieving high accuracy and assessing the vulnerability of such attacks. Consequently, there exists a significant gap in the existing literature regarding solutions for detecting and preventing web attacks [12]. This article aims to bridge this knowledge gap, which is crucial for the development of an efficient system, whereas most existing works concentrate on analyzing the content of payloads to identify web attacks.

C. Anomaly Detection

Anomaly detection (AD) is a technique used to identify patterns in the data that do not conform to the defined features of standard patterns in the data. An anomaly is defined as "an observation that deviates so much from other observations as to arouse suspicions that a different mechanism generated it" [14]. Because web attacks, credit card fraud, and many other kinds of attack produce unusual activity, AD techniques are valuable precisely because they point out unusual events, so that key actions can be taken across a wide range of application areas [15]. The concept of anomaly detection was introduced by Denning [16], and since then many researchers worldwide have published work on it.

Figure 1 Anomaly based IDS using Machine Learning

AD is also a data analysis technique that identifies data patterns that do not conform to the standard patterns [17, 18]. It is a broad and dominant category of IDS that helps to detect unusual events, such as web attacks or credit card fraud. AD works by building a "norm profile": a baseline or standard behavior pattern created for an entity,
which is used to compare and identify any observed behaviors that deviate from the norm profile [19]. AD looks for patterns in data that do not match the usual or normal patterns, as a way to identify unusual or suspicious events. It does this by comparing the observed behaviors to a normal behaviour profile and identifying data entries that do not fit with the rest of the dataset.

Figure 1 shows some ML techniques used in AD.

D. Supervised ML and Classification Techniques

Classification is a method used to differentiate unusual patterns; it requires a fully labelled training dataset and a test dataset. The classifier is trained first and then tested with the test dataset, making it a good choice for detecting unseen attack patterns. This approach is effective and has a high detection rate for known attacks [20]. The methods used for IDS can be updated with new information and strategies, making them adaptable. They can also identify unusual data patterns and are ideal for detecting new or "zero-day" attacks. Most IDS models use a single classifier, such as Support Vector Machines, Genetic Algorithms, Logistic Regression, KNN, and Random Forest. This research uses four machine learning algorithms, which are discussed in detail in the following subsections.

1) Decision Tree Classifier
Decision Tree learning is a commonly used method for categorizing data based on different attributes. Decision trees are useful for processing large amounts of data and are often used in data mining applications. They do not require any prior knowledge or specific settings and are suitable for exploratory knowledge discovery. Decision trees are represented in a tree-like structure, which makes the acquired knowledge easy to understand [21].

2) XGB Classifier
XGBoost is a popular boosting algorithm that aims to achieve high efficiency, flexibility, and portability. This algorithm generates decision trees sequentially and assigns weights to all independent variables. The model then combines the various classifiers/predictors to form a more powerful and precise model. XGBoost can solve problems including regression, classification, ranking, and user-defined prediction. It includes a sparsity-aware split discovery algorithm to handle different forms of sparsity patterns in data, and a distributed weighted quantile sketch approach to determine the optimal split points across weighted datasets [22].

3) Gradient Boosting Classifier
The Gradient Boosting Classifier (GBC) is a machine learning algorithm used for classification and regression models. It builds a gradual sequence of weak prediction models, such as regression decision trees, to optimize the learning process. These models are combined in an ensemble to improve their accuracy, with each new model correcting errors from the previous ones. The nodes and leaves in the model make predictions based on decision nodes, and the accuracy of the model is improved as more weak models are added to the ensemble [23].

4) K-Nearest Neighbour
The k-Nearest Neighbor algorithm is a machine learning tool that predicts class labels for different instances by measuring the shortest Euclidean distance from other instances. This is done by considering all the features or attributes as dimensions in the calculation of Euclidean distances [5].

Leveraging these methods, with their distinct strengths in pattern recognition, classification, and anomaly detection, enhances the system's ability to accurately detect web attacks. The evaluation of the proposed system on the CICIDS-2017 dataset demonstrates promising results, indicating improved classification accuracy and reduced model building time.

E. The CICIDS-2017 dataset

Many published works in AD and feature selection use the DARPA'98 and KDD'99 Cup datasets. However, well-known critiques advise against their use because the data are out of date and do not represent actual network traffic [24]. In this work, the more recent CICIDS2017 dataset, proposed by the Canadian Institute for Cybersecurity in 2017, is used; it includes normal traffic and attacks that were new when the data were collected (Catillo et al., 2021). The data is accessible in bidirectional labelled flow form (CSV) and packet format (pcap).

This dataset contains massive traffic and a large number of features for anomaly detection. It includes recent and challenging attacks, including Brute Force, Infiltration, Botnet, DDoS, DoS, Web, and PortScan (Cybersecurity, 2017). The data capture period is five days. Monday is the "normal day" and contains benign traffic. On Thursday there were web attacks: Brute Force (9:20 - 10 a.m.), XSS (10:15 - 10:35 a.m.), and SQL Injection (10:40 - 10:42 a.m.). The attacker was a Kali Linux node, and the victim was an Ubuntu web server [25].

Feature Selection and data pre-processing

In Machine Learning, the process of isolating predictive features from unwanted features is known as Feature Selection (FS). It is widely agreed that information theory has become a powerful tool for this task. This is justified by the fact that a measure of dependence, namely mutual information, can capture predictive power. The literature contains numerous FS techniques, and a considerable number of researchers have employed filter and wrapper methods. The difficulty of keeping the data's dimensionality manageable, and of targeting the most suitable features on which to drive the learning process, increases as datasets continue to grow. Note that datasets grow both in the number of features and in the number of samples. FS is nevertheless unavoidable if one is to enhance prediction accuracy, contain the otherwise unstoppable growth of training complexity, and understand the model more deeply.

The two best-known ways to reduce dimensionality are either transforming the feature space through feature extraction, which maps the original features into new ones, or following the different strategy of FS, which chooses a subset of features. The second category branches into three methods: embedded, filter, and wrapper approaches. Notably, the filter strategy is more advantageous since it does not depend on the classifier, is more dynamic in
dealing with overfitting risks, and is more responsive to a structured method. More interestingly, many researchers have successfully applied information-theoretic techniques and concepts, for example [26, 27].

As datasets become larger, it becomes more difficult to manage data dimensionality and select suitable features for learning. This has directed our focus to a filtering method designed for classifiers that can potentially classify these samples. Since classes have a fixed entropy, the main target is formulated in relation to the conditional entropy of the classification given the set of features. The goal is then to select the minimal set of features that reduces the classification uncertainty to the required level. For this reason, the Random Forest classifier (RFC) has been used to find the eight most critical features, as shown in Fig. 2, and the proposed FS method is used to confirm the results. This problem has attracted the attention of many researchers, for example [28], [29].

Stage 3 employs four other Machine Learning Algorithms (MLAs) to classify the final optimized subset C, generated by the RFEML method. This step focuses on improving the overall performance of the computer network, in addition to enhancing the accuracy of attack detection. Previous studies in this field have primarily concentrated on enhancing the detection accuracy without adequate consideration for the network's overall performance.

Overall, this hybrid approach combines the strengths of both RFE and MI methods, leading to more accurate results compared to using either method alone. This approach is particularly effective when dealing with large datasets containing numerous potential variables or factors that influence outcomes, such as classification accuracy rates. Figure 3 depicts the first part of the proposed system, which represents the described method (RFEML) for feature selection, while Table 1 shows the features selected by each method separately.

The second part of the framework (Fig. 4) uses the four MLAs to classify the dataset: first on the whole feature set, and then on the thirteen selected features to test the RFEML system. Next, compare and analyze the accuracy, precision, FNR, TPR, FPR, TNR, and percentage

Table 1 Selected Features by (RFEML)

Selected Features by Recursive Feature Elimination | Selected Features by Mutual Information
Destination Port | Destination Port
Total Length of Bwd Packets | Total Length of Bwd Packets
Bwd Packet Length Mean | Bwd Packet Length Mean
Bwd Packet Length Std | Packet Length Mean
Packet Length Mean | Packet Length Std
Average Packet Size | Packet Length Variance
Avg Bwd Segment Size | Average Packet Size
Fwd Header Length.1 | Avg Bwd Segment Size
Subflow Fwd Bytes | Subflow Bwd Bytes
Init_Win_bytes_forward | Destination Port
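As a rough illustration of how the two columns of Table 1 relate to the thirteen features used in the second part of the framework, the sketch below merges the two lists and counts the unique names. It assumes that the hybrid subset is simply the order-preserving union of the RFE and MI selections; the combination rule is not spelled out in this section, so this is an assumption, and `ordered_union` is a hypothetical helper name.

```python
# Feature names as listed in Table 1 (quotes and padding removed).
rfe_features = [
    "Destination Port",
    "Total Length of Bwd Packets",
    "Bwd Packet Length Mean",
    "Bwd Packet Length Std",
    "Packet Length Mean",
    "Average Packet Size",
    "Avg Bwd Segment Size",
    "Fwd Header Length.1",
    "Subflow Fwd Bytes",
    "Init_Win_bytes_forward",
]
mi_features = [
    "Destination Port",
    "Total Length of Bwd Packets",
    "Bwd Packet Length Mean",
    "Packet Length Mean",
    "Packet Length Std",
    "Packet Length Variance",
    "Average Packet Size",
    "Avg Bwd Segment Size",
    "Subflow Bwd Bytes",
    "Destination Port",  # listed twice in the MI column of Table 1
]


def ordered_union(*feature_lists):
    """Merge feature lists, keeping first-seen order and dropping duplicates.

    Assumption: the hybrid subset is the plain union of both selections.
    """
    seen = set()
    merged = []
    for features in feature_lists:
        for name in features:
            if name not in seen:
                seen.add(name)
                merged.append(name)
    return merged


hybrid_features = ordered_union(rfe_features, mi_features)
# Features that both methods agree on.
shared_features = [f for f in rfe_features if f in set(mi_features)]

print(len(hybrid_features), "unique features in the union")
print(len(shared_features), "features selected by both methods")
```

Under this assumption the union contains thirteen distinct names, which is consistent with the thirteen features mentioned in the text.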
Figure 4 The second part of the proposed method (RFEML) for FS

The False Negative Rate (FNR) refers to situations where an attack has occurred, but the system fails to identify it as such, giving a wrong prediction [5].

$$\mathrm{FNR} = \frac{FN}{FN + TP} \tag{1}$$

The True Positive Rate (TPR) is a measure of how accurately a system can identify attacks. It measures the proportion of correctly identified attack instances out of all actual attack instances.

$$\mathrm{TPR} = \frac{TP}{TP + FN} \tag{2}$$

The False Positive Rate (FPR) refers to the situation when a system detects an attack when there is actually no attack. In other words, the system identifies a data point as an attack, but it is actually not an attack.

$$\mathrm{FPR} = \frac{FP}{FP + TN} \tag{3}$$

The True Negative Rate (TNR) refers to the proportion of cases where the model correctly predicts a negative outcome when the actual outcome is also negative. In other words, TNR considers cases where both the prediction and actual outcome are negative, and the model makes a correct prediction.

$$\mathrm{TNR} = \frac{TN}{FP + TN} \tag{4}$$

Recall measures how well the model identifies all positive cases and is defined as follows:

$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{6}$$

F. Results Evaluation

Table 3 presents the results using exclusive features on the dataset, while Table 4 displays the results using the proposed method. Both tables compare the performance of different classifiers based on various metrics such as False Negative Rate (FNR), True Positive Rate (TPR), False Positive Rate (FPR), Accuracy (AC), Precision (PC), and Recall (RC). Comparing the two tables, it can be observed that the classifiers' performance generally improves when using the proposed method (Table 4) compared to using exclusive features (Table 3). The FNR values decrease for all classifiers in Table 4, indicating a reduction in the rate of false negatives. Similarly, the TPR values increase, indicating an improvement in the rate of true positives.

Table 2 Results using the whole features on the dataset

MLA   FNR(%)   FPR(%)   TPR(%)   TNR(%)   AC(%)    PC     RC
DTC   0.0019   0.1203   0.998    0.879    0.9961   0.88   0.88
      6 6
XGB   0.0072   0.0043   0.9956   0.983    0.9924   0.99   0.83
      9 8 5 4
KNN   0.0025   0.1035   0.9974   0.896    0.9952   0.98   0.84
      9 4
FNR = False Negative Rate, TPR = True Positive Rate, FPR = False Positive Rate, AC = Accuracy, RC = Recall

These improvements in FNR and TPR suggest that the proposed method enhances the classifiers' ability to detect web attacks accurately. Additionally, the Accuracy (AC) values generally remain high or improve slightly in Table 4 compared to Table 3, indicating that the proposed method maintains or improves overall classification accuracy.

Figure 5 displays the accuracy of each classifier on the whole dataset when using exclusive features. Figure 6, on the other hand, illustrates the accuracy of each classifier when utilizing the proposed method. Comparing the two figures, it is evident that the classifiers' accuracy tends to improve or remain consistent when employing the proposed method (Figure 6) compared to using exclusive features (Figure 5).

KNN   0.003    0.0149   0.997    0.9851   0.9952   1.0    0.9
      8

[Figure: chart titled "Evaluation method on…"; vertical axis from 0 to 1; series: Decision Tree Classifier, XGBClassifier, KNeighbors Classifier]
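The rates defined in Eqs. (1)-(4) and the recall in Eq. (6) can be computed directly from confusion-matrix counts. The following is a minimal sketch on made-up labels, not on CICIDS-2017 results; the function names are hypothetical, and it assumes binary labels (1 = attack, 0 = benign) with both classes present so that no denominator is zero.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FN, FP, TN, treating `positive` as the attack class."""
    tp = fn = fp = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # attack correctly flagged
            else:
                fn += 1  # attack missed
        else:
            if pred == positive:
                fp += 1  # benign traffic flagged as attack
            else:
                tn += 1  # benign traffic correctly passed
    return tp, fn, fp, tn


def detection_rates(tp, fn, fp, tn):
    """Evaluate Eqs. (1)-(4) and (6) from the four counts."""
    return {
        "FNR": fn / (fn + tp),     # Eq. (1)
        "TPR": tp / (tp + fn),     # Eq. (2)
        "FPR": fp / (fp + tn),     # Eq. (3)
        "TNR": tn / (fp + tn),     # Eq. (4)
        "Recall": tp / (tp + fn),  # Eq. (6), identical to TPR
    }


# Toy labels for illustration only (assumed data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

counts = confusion_counts(y_true, y_pred)
rates = detection_rates(*counts)
for name, value in rates.items():
    print(f"{name}: {value:.4f}")
```

By construction FNR + TPR = 1 and FPR + TNR = 1, which gives a quick sanity check when reading reported result tables.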
With respect to this objective, upon examining the comparison with related work, it becomes evident that the suggested system, which employs the RFEML technique and identifies the 13 most influential features, exhibits the highest level of classification accuracy when compared to similar studies. This proposed approach not only enhances the accuracy of classification but also significantly diminishes the time required for model development, thereby delivering an overall superior performance in contrast to previous methodologies. The comparison presented in Table 4 supports this conclusion.

Table 4 Comparison with Other Work

References | Dataset | Classifiers used | Advantages/Results | Limitation
[1] | UNSW-NB15 dataset | Outlier Gaussian Mixture method | Achieved 97.28% accuracy using the OGM method on the web attack dataset and 95.56% using the UNSW-NB15 dataset. | Requires validation with another dataset; not effective for detecting unknown attacks.
[2] | CICIDS 2017 | RF, Decision Stump, J48, Hoeffding tree, and REP | Attained 99.6161% accuracy by J48 | To validate the result with another dataset
[4] | NSL-KDD dataset | NB, RF, J48, and ML | Precision of 98.28% using 20% of the data | Only 20% of testing data used when comparing to training data, which can affect the accuracy.
The proposed System | CICIDS-2017 | DTC, XGB, GBC, and KNN | Achieved accuracy and less computing process. | Detect more various web attacks.

V. CONCLUSIONS AND FUTURE WORK

The findings of this study demonstrate that utilizing the proposed method to select 13 key features from the dataset significantly enhances the performance of the classifier by eliminating data redundancy. The primary objective of this work was to achieve optimal performance and security within a system. By focusing on these significant features and employing supervised machine learning classifiers such as DTC, XGB, GBC, and KNN algorithms, web attacks can be efficiently and effectively detected without incurring unnecessary computational costs.

Table 3 and Table 4 present the comparative performances of the classifiers based on metrics such as False Negative Rate, True Negative Rate, True Positive Rate, False Positive Rate, Recall, and accuracy. Among the classifiers, XGB and KNN exhibited superior performance in detecting web attacks on the CICIDS2017 dataset, warranting further investigation. However, the DTC classifier achieved the highest accuracy at 0.999, surpassing the other methods.

On the other hand, the XGB and GBC algorithms demonstrated good accuracy at 0.992 and 0.950, respectively. The proposed system, employing the RFEML method, effectively reduces training and testing time while improving overall accuracy. Furthermore, the proposed approach can be enhanced in future research to detect a wider range of web attacks. Additionally, this technical method holds potential for deployment in IoT environments, where different feature selection techniques can be employed to construct an optimal feature selection mechanism.

References
[1] N. Moustafa, G. Misra, and J. Slay, "Generalized outlier gaussian mixture technique based on automated association features for simulating and detecting web application attacks," IEEE Transactions on Sustainable Computing, 2018.
[2] D. Kshirsagar and S. Kumar, "Towards an intrusion detection system for detecting web attacks based on an ensemble of filter feature selection techniques," Cyber-Physical Systems, pp. 1-16, 2022, doi: 10.1080/23335777.2021.2023651.
[3] Symantec, "ISTR Internet Security Threat Report," 2019.
[4] A. S. A. Aziz, E. Sanaa, and A. E. Hassanien, "Comparison of classification techniques applied for network intrusion detection and classification," Journal of Applied Logic, vol. 24, pp. 109-118, 2017.
[5] I. Abobaker and A. Musa, "Machine Learning for Intrusion Detection and Network Performance," in 2021 8th International Conference on Future Internet of Things and Cloud (FiCloud), 23-25 Aug. 2021, pp. 86-91, doi: 10.1109/FiCloud49777.2021.00020.
[6] M. H. Kamarudin, C. Maple, T. Watson, and N. S. Safa, "A LogitBoost-Based Algorithm for Detecting Known and Unknown Web Attacks," IEEE Access, vol. 5, pp. 26190-26200, 2017, doi: 10.1109/access.2017.2766844.
[7] C. Agrawal and Z. Hasan, "Analysis of Major Security Attacks in Recent Years."
[8] Quartz. "Data is expected to double every two years for the next decade." https://qz.com/472292/data-is-expected-to-double-every-two-years-for-the-next-decade/#:~:text=Thanks%20to%20advancements%20in%20technology,hitting%2045%2C000%20exabytes%20in%202020. (accessed.
[9] Y. Pan et al., "Detecting web attacks with end-to-end deep learning," Journal of Internet Services and Applications, vol. 10, no. 1, pp. 1-22, 2019.
[10] M. Ahmad, Q. Riaz, M. Zeeshan, H. Tahir, S. A. Haider, and M. S. Khan, "Intrusion detection in internet of things using supervised machine learning based on application and transport layer features using UNSW-NB15 data-set," EURASIP Journal on Wireless Communications and Networking, vol. 2021, no. 1, 2021, doi: 10.1186/s13638-021-01893-8.
[11] S. M. Kasongo and Y. Sun, "Performance analysis of intrusion detection systems using a feature selection method on the UNSW-NB15 dataset," Journal of Big Data, vol. 7, pp. 1-20, 2020.
[12] E. Levy, "Approaching Zero," IEEE Security & Privacy, vol. 2, no. 4, pp. 65-66, 2004, doi: 10.1109/MSP.2004.33.
[13] P. R. McWhirter, K. Kifayat, Q. Shi, and B. Askwith, "SQL Injection Attack classification through the feature extraction of SQL query strings using a Gap-Weighted String Subsequence Kernel," Journal of Information Security and Applications, vol. 40, pp. 199-216, 2018.
[14] F. Cavallin and R. Mayer, "Anomaly Detection from Distributed Data Sources via Federated Learning," in Advanced Information Networking and Applications: Proceedings of the 36th International Conference on Advanced Information Networking and Applications (AINA-2022), Volume 2, 2022: Springer, pp. 317-328.
[15] A. Smiti, "A critical overview of outlier detection methods," Computer Science Review, vol. 38, p. 100306, 2020.
[16] D. E. Denning, "An intrusion-detection model," IEEE Transactions on Software Engineering, no. 2, pp. 222-232, 1987.
[17] Y. M. Tukur, D. Thakker, and I. U. Awan, "Edge-based blockchain enabled anomaly detection for insider attack prevention in Internet of Things," Transactions on Emerging Telecommunications Technologies, vol. 32, no. 6, p. e4158, 2021.
[18] S. Garg and S. Batra, "A novel ensembled technique for anomaly detection," International Journal of Communication Systems, vol. 30, no. 11, p. e3248, 2017.
[19] M. H. Kamarudin, C. Maple, T. Watson, and N. S. Safa, "A New Unified Intrusion Anomaly Detection in Identifying Unseen Web Attacks," Security and Communication Networks, vol. 2017, pp. 1-18, 2017, doi: 10.1155/2017/2539034.
[20] S. Bahl and S. K. Sharma, "Improving Classification Accuracy of Intrusion Detection System Using Feature Subset Selection," presented at the 2015 Fifth International Conference on Advanced Computing & Communication Technologies, 2015.
[21] B. Gupta, A. Rawat, A. Jain, A. Arora, and N. Dhami, "Analysis of various decision tree algorithms for classification in data mining," International Journal of Computer Applications, vol. 163, no. 8, pp. 15-19, 2017.
[22] K. Konar, S. Das, and S. Das, "Employee attrition prediction for imbalanced data using genetic algorithm-based parameter optimization of XGB Classifier," in 2023 International Conference on Computer, Electrical & Communication Engineering (ICCECE), 2023: IEEE, pp. 1-6.
[23] N. Chakrabarty, T. Kundu, S. Dandapat, A. Sarkar, and D. K. Kole, "Flight arrival delay prediction using gradient boosting classifier," in Emerging Technologies in Data Mining and Information Security: Proceedings of IEMIS 2018, Volume 2, 2019: Springer, pp. 651-659.
[24] J. McHugh, "Testing intrusion detection systems: a critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln Laboratory," ACM Transactions on Information and System Security (TISSEC), vol. 3, no. 4, pp. 262-294, 2000.
[25] UNP. "Canadian Institute for Cybersecurity." https://www.unb.ca/cic/datasets/ids-2017.html (accessed 15/05/2023).
[26] V. Bolon-Canedo, N. Sanchez-Marono, and A. Alonso-Betanzos, "Feature selection and classification in multiple class datasets: An application to KDD Cup 99 dataset," Expert Systems with Applications, vol. 38, no. 5, pp. 5947-5957, 2011.
[27] N. Acharya and S. Singh, "An IWD-based feature selection method for intrusion detection system," (in English), Soft Computing (Berlin, Germany), vol. 22, no. 13, pp. 4407-4416, 2017, doi: 10.1007/s00500-017-2635-2.
[28] E. Jaw and X. Wang, "Feature Selection and Ensemble-Based Intrusion Detection System: An Efficient and Comprehensive Approach," (in English), Symmetry (Basel), vol. 13, no. 10, p. 1764, 2021, doi: 10.3390/sym13101764.
[29] H. B. M. Rais and T. Mehmood, "Feature selection in intrusion detection, state of the art: A review," (in English), Journal of Theoretical and Applied Information Technology, vol. 94, no. 1, pp. 30-43, 2016. [Online]. Available: https://go.exlibris.link/yjmgnwPK.