
Journal of Intelligent Systems 2023; 32: 20230063

Research Article

Omar Salah F. Shareef*, Rehab Flaih Hasan, and Ammar Hatem Farhan

Analyzing SQL payloads using logistic


regression in a big data environment
https://doi.org/10.1515/jisys-2023-0063
received May 15, 2023; accepted July 28, 2023

Abstract: Protecting big data from attacks on large organizations is essential because of how vital such data
are to organizations and individuals. Moreover, such data can be put at risk when attackers gain unauthorized
access to information and use it in illegal ways. One of the most common such attacks is the structured query
language injection attack (SQLIA). This attack is a vulnerability attack that allows attackers to illegally access a
database quickly and easily by manipulating structured query language (SQL) queries, especially when dealing with
a big data environment. To address these risks, this study aims to build an approach that acts as a middle protection
layer between the client and database server layers and reduces the time consumed to classify the SQL payload sent
from the user layer. The proposed method involves training a model by using a machine learning (ML) technique
for logistic regression with the Spark ML library that handles big data. An experiment was conducted using the
SQLI dataset. Results show that the proposed approach achieved an accuracy of 99.04%, a precision of 98.87%, a recall of 99.89%, and an F-score of 99.04%. The time taken to identify and prevent SQLIA is 0.05 s. Our approach can protect
the data by using the middle layer. Moreover, using the Spark ML library with ML algorithms gives better accuracy
and shortens the time required to determine the type of request sent from the user layer.

Keywords: big data, logistic regression, spark ML, SQL injection.

1 Introduction
Security has become a crucial component when developing web apps because of the massive amount of data
sent between businesses and the rising number of everyday users in various areas. Therefore, enterprises’ big
data require a web application architecture that can detect and stop application flaws. The Open Web
Application Security Project considers structured query language injection attack (SQLIA) among the most
dangerous threats to enterprise-scale databases [1,2].
The big data discipline takes a multidisciplinary approach to analyzing and forecasting data, combining computer science, mathematical modeling, and statistics. Access to data, and methods for working with it, have emerged as critical factors. Companies can reliably manage big data by implementing
artificial intelligence and machine learning (ML) techniques [3].
A growing number of security risks are associated with the widespread use and storage of data online.
These risks arise from the proliferation of attacks that try to gain unauthorized access to the private informa-
tion of people and organizations [4].
SQLIA is among the most harmful assaults on database servers. By taking advantage of security holes, attackers
may compromise users’ and businesses’ data by tampering with, reading, erasing, or making copies of it [5].


* Corresponding author: Omar Salah F. Shareef, Computer Center, University of Fallujah Anbar, Fallujah 55621, Iraq,
e-mail: [email protected]
Rehab Flaih Hasan: Computer Sciences Department, University of Technology Baghdad, Baghdad 19006, Iraq,
e-mail: [email protected]
Ammar Hatem Farhan: Computer Center, University of Fallujah Anbar, Fallujah 55621, Iraq, e-mail: [email protected]

Open Access. © 2023 the author(s), published by De Gruyter. This work is licensed under the Creative Commons Attribution 4.0
International License.

Structured query language (SQL) injection flaws can exist in any parameter that an application passes to a database, giving an attacker a channel through which to deliver an attack. An attacker may use various techniques in this kind of attack to gain unauthorized entry to databases and extract information. The injection mechanism describes these procedures. The techniques used
are primarily divided into four categories: injection through cookies, injection through user input, injection
through server variables, and second-order or stored injections [6].
The increasing data exchange between individuals and institutions and daily transactions in various fields
have made data vulnerable to many attacks, such as illegal access. One of the most well-known attacks is
SQLIA. Standard methods for detecting and preventing these attacks can provide good results when dealing
with small data. However, these approaches do not work effectively with big data. Hence, another approach
must be developed to deal with big data and detect attacks against them.
This study was conducted to overcome the problems in previous works, which often did not report the time taken during the testing phase to detect the type of request sent by the user and whether it contains harmful or benign payloads. In addition, previous works did not address how data are protected when the protection model sits alongside the user layer or the data layer. Accordingly, the aims of this study are as follows:
• To create a layer that separates the user layer from the data layer to increase data protection and prevent
unauthorized access.
• To protect user and institutional data, ensuring confidentiality, integrity, and prompt availability of data.
• To reduce the time required to classify the payloads sent to the data layer.

In this research, we present an approach for detecting SQLIA in real time by applying logistic regression (LR) in a big data environment, using the distributed Spark ML library as a middle protection layer between the client and the database server to increase data protection and the classification accuracy of the sent payload. The contributions of this model are as follows:
• The first contribution of this model is that it proposes an approach that uses a middle layer between the data
layer and the user layer to receive SQL payloads from the user layer and analyze them to classify whether
the request is harmful or benign by using the LR approach with the big data framework Spark ML library.
This layer prevents users from directly accessing data, thereby further protecting the data layer from
unauthorized access and from violations of the principles of basic information security.
• The second contribution is that the time taken to classify the request type is reduced by using the Spark ML
library because Spark ML works in the memory in a distributed way, thereby reducing the time taken to
classify the payload type.

The subsequent sections are organized as follows. The second section of this study will address the
proposed methodology for identifying and mitigating SQLIA within a big data setting. The third section
presents the outcomes. The final section provides the conclusions.

2 Methodology
The proposed framework for detecting SQL injection attacks in a big data environment consists of three layers.
The first layer is the user layer, through which user requests are sent. The second layer represents the
protection layer, which includes the proposed framework for classifying requests sent from the first layer.
This layer consists of several stages, as illustrated in the following.
Stage 1: Data that contain both malicious and benign payloads are collected to train the proposed model.
Stage 2: Pre-processing is applied to the acquired data.
Stage 3: The acquired data are divided into two sets for training and testing.
Stage 4: The first dataset is used to train the proposed model using LR.
Stage 5: The second dataset is used to test the model.
Stage 6: The model is evaluated using a confusion matrix and a set of metrics to measure the performance
efficiency.

In the second layer, the LR approach determines whether incoming payloads are harmful or not. The third layer represents the data layer, which must be protected from attacks by unauthorized individuals who try to access this data.
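As an illustration of how such a gating layer might operate, the following minimal sketch wraps a fitted Spark ML pipeline behind a simple request handler. The model path, the input column name "payload," and the helper function names are assumptions made for this sketch rather than details taken from the paper.

```python
from pyspark.sql import SparkSession
from pyspark.ml import PipelineModel

spark = SparkSession.builder.appName("sqli-protection-layer").getOrCreate()

# Hypothetical path to a pipeline fitted as described in Sections 2.2-2.4.
model = PipelineModel.load("models/sqli_lr_pipeline")

def is_malicious(payload: str) -> bool:
    """Classify one SQL payload sent from the user layer (1 = malicious, 0 = benign)."""
    request_df = spark.createDataFrame([(payload,)], ["payload"])
    return model.transform(request_df).select("prediction").first()[0] == 1.0

def forward_to_database(payload: str) -> str:
    # Placeholder for the data layer (third layer); the real system would run the query.
    return "forwarded to data layer"

def handle_request(payload: str) -> str:
    # Middle protection layer: only benign payloads reach the database server.
    if is_malicious(payload):
        return "blocked: suspected SQL injection"
    return forward_to_database(payload)

print(handle_request("SELECT name FROM users WHERE id = 7"))
print(handle_request("' OR '1'='1' --"))
```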
The LR approach involves three variables. The first variable, "features," represents the condition feature. The second variable, "label_col," represents the decision feature. The third variable, maxIteration, is used as a stopping criterion for model training. The flow of the experiment is shown in Figure 1.

Figure 1: Proposed system to classify submitted queries.
The proposed approach is described in the following subsections.

2.1 SQL-i datasets

Collecting data pertaining to the research subject matter is essential in developing an ML approach. This study used a dataset comprising 109,518 instances categorized into two groups on the basis of their payloads: malicious and non-malicious. The dataset is further divided into two parts for training and testing. Table 1 provides an overview of the dataset.

Table 1: Summary of the dataset

| Name of dataset | Number of cases | Learning step | Testing step | Normal | Malicious |
|---|---|---|---|---|---|
| SQLIA | 109,518 | 76,670 | 32,848 | 52,213 | 57,305 |

A sample of the dataset used in this research is shown in Figure 2.

Figure 2: Dataset before pre-processing.

The data were collected from the Kaggle website [7]. The Kaggle challenge originally provided 109,518 samples containing harmful and benign payloads, but the raw data were not clean, so a filtering process was performed to delete invalid entries; without this filtering, pre-processing could not convert the initial dataset into a form usable by ML algorithms.

2.2 Data pre-processing

In the second stage, the dataset is pre-processed and prepared to be used by the learning techniques. Data
preparation aims to reduce data volume; create connections between datasets; standardize data; and eliminate
outliers, duplicates, and missing values [8].
CountVectorizer is a tool used to pre-process the dataset by transforming textual data into numerical vectors. For instance, the terms in a document can indicate the characteristics of a specific class, and a single vector can represent all of its phrases. This process is known as vectorization. Common text vectorization techniques include CountVectorizer and TF-IDF Vectorizer, both of which convert textual data into vector format [9].
CountVectorizer is frequently used to derive numerical properties from texts and generate class features. In the training text, only frequently occurring words are taken into account. Through its fit and transform operations, CountVectorizer converts the text into a word-occurrence matrix, enabling users to calculate the frequency of each word [10].

Algorithm 1: CountVectorizer for data pre-processing

Input: Dataset prior to initial processing


Output: array of words
Begin:
Stage 1: Transform text into a collection of words by using CountVectorizer.
Stage 2: Eliminate frequently used terms.
Stage 3: Remove the least frequently used terms.
Stage 4: Eliminate all stop words.
Stage 5: Convert every word into lowercase letters.
Stage 6: Arrange the vocabulary in ascending order.
If the term is present, then it is indicated by a 1 in the text; if it is absent, then it is indicated by a 0.
Stage 7: Repeat stages 1–6 to convert the text dataset into numbers.
End

The example below shows the process of converting text into numbers using an algorithm to convert text
into an array of words.
Sample 1 “Convert text to an array using CountVectorizer using text datasets”
Sample 2 “Convert text to an array using” (Table 2)

Table 2: Dataset after pre-processing

| Sample | Array | Convert | CountVectorizer | Datasets | Text | Using |
|---|---|---|---|---|---|---|
| Sample 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| Sample 2 | 1 | 1 | 0 | 0 | 1 | 1 |
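As a hedged sketch of this step, the code below applies Spark ML's Tokenizer and a binary CountVectorizer to the two sample sentences above. Removing the most and least frequent terms (e.g., via minDF/maxDF or StopWordsRemover) is omitted here, and the column names are assumptions for the sketch.

```python
from pyspark.sql import SparkSession
from pyspark.ml.feature import Tokenizer, CountVectorizer

spark = SparkSession.builder.appName("countvectorizer-demo").getOrCreate()

samples = spark.createDataFrame([
    (1, "Convert text to an array using CountVectorizer using text datasets"),
    (2, "Convert text to an array using"),
], ["sample", "text"])

# Tokenizer lowercases the text and splits it into words (Stages 1 and 5).
tokenizer = Tokenizer(inputCol="text", outputCol="words")
words = tokenizer.transform(samples)

# binary=True records only presence (1) or absence (0) of each vocabulary term,
# matching the 0/1 encoding shown in Table 2.
vectorizer = CountVectorizer(inputCol="words", outputCol="features", binary=True)
cv_model = vectorizer.fit(words)

print(cv_model.vocabulary)  # learned vocabulary
cv_model.transform(words).select("sample", "features").show(truncate=False)
```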

2.3 Training and testing

The third stage in developing an ML approach is to divide the data into two distinct categories: the training
group and the testing group. The holdout method was used in this investigation, with 80% of the dataset used
for training and 20% for testing and evaluation [11]. The dataset used is approximately balanced, containing 45,051 benign payloads and 40,923 malicious payloads, for a total of 85,974 samples.
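A minimal sketch of this split, assuming the filtered dataset has been saved to a CSV file (the file name here is hypothetical):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("sqli-holdout-split").getOrCreate()

# Hypothetical file name for the filtered SQLi dataset collected from Kaggle [7].
data = spark.read.csv("sqli_payloads.csv", header=True, inferSchema=True)

# Holdout method: roughly 80% of rows for training and 20% for testing.
train_df, test_df = data.randomSplit([0.8, 0.2], seed=42)
print("training rows:", train_df.count(), "testing rows:", test_df.count())
```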

2.4 Prediction approach

The fourth stage in developing an ML approach is selecting a classification approach for SQL requests sent to
web-based databases. This study uses a supervised ML approach. This method categorizes requests into two
categories (0 and 1), which represent harmless and harmful requests. In addition, this technique aims to create
a classification that accurately describes the relationship between dependent and independent variables.

The effectiveness of the LR method is determined by the linear regression strategy in the following
equation:
$$j = h_\theta(i) = \theta^T i. \qquad (1)$$

Equation (1) alone is not well suited to a binary outcome. By using equation (2), we may determine whether the transmitted request carries a harmful payload (class 1) or a harmless payload (class 0) [12].
$$p(j = 1 \mid i) = h_\theta(i) = \frac{1}{1 + \exp(-\theta^T i)} = \sigma(\theta^T i),$$
$$p(j = 0 \mid i) = 1 - p(j = 1 \mid i) = 1 - h_\theta(i). \qquad (2)$$

Equation (3), often referred to as the sigmoid function, maps the value of $\theta^T i$ into the range [0, 1]. We then look for parameters $\theta$ such that $p(j = 0 \mid i) = 1 - h_\theta(i)$ is large when $i$ belongs to the "0" class and small when $i$ belongs to the "1" class [13–15].
$$\sigma(t) = \frac{1}{1 + e^{-t}}. \qquad (3)$$
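A small worked example of equations (2) and (3), with made-up weights $\theta$ and a made-up binary feature vector $i$, illustrates how the sigmoid output is thresholded into the two classes:

```python
import numpy as np

def sigmoid(t):
    # Equation (3): maps any real value into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-t))

theta = np.array([0.8, -1.2, 2.5, 0.3])  # illustrative weights only
i = np.array([1, 0, 1, 1])               # illustrative binary features only

p_malicious = sigmoid(theta @ i)          # p(j = 1 | i) from equation (2)
p_benign = 1.0 - p_malicious              # p(j = 0 | i)
label = 1 if p_malicious >= 0.5 else 0    # decision threshold at 0.5

print(round(p_malicious, 3), round(p_benign, 3), label)  # 0.973 0.027 1
```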

The LR algorithm was chosen to train and test the model because of its highly accurate results and the short time it takes to classify benign and harmful payloads.
The maxIteration variable was used as the stopping criterion for model training; maxIteration = 100 was chosen because it gave the best accuracy and the shortest time.
Two variables were used to build the model. The first variable, "features," represents the condition feature, which contains both harmful and benign payloads. We pre-process these features using CountVectorizer to extract the desired features after removing the least and most frequent words, which do not affect the model training results.
The “Data pre-processing” subsection in the Methodology provided an example of how to capture the desired features.
The features obtained from the pre-processing results will be used as features in model training.
The second variable “label_col” represents the decision feature.
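A minimal sketch of the corresponding Spark ML configuration is shown below. The column names ("payload", "label_col") and the binary CountVectorizer setting are assumptions carried over from the earlier sketches; maxIter = 100 follows the description above.

```python
from pyspark.ml import Pipeline
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.feature import Tokenizer, CountVectorizer

# Feature extraction stages produce the "features" column from the raw payload text.
tokenizer = Tokenizer(inputCol="payload", outputCol="words")
vectorizer = CountVectorizer(inputCol="words", outputCol="features", binary=True)

# LR estimator: "features" as the condition feature, "label_col" as the decision
# feature, and maxIter=100 as the stopping criterion.
lr = LogisticRegression(featuresCol="features", labelCol="label_col", maxIter=100)

pipeline = Pipeline(stages=[tokenizer, vectorizer, lr])
# model = pipeline.fit(train_df)          # train_df from the 80/20 holdout split
# predictions = model.transform(test_df)  # prediction: 0 = benign, 1 = malicious
```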

2.5 Performance evaluation measures of prediction approach

During the final stage of developing the prediction approach, various metrics such as accuracy, time, precision,
and recall were used to evaluate the approach and determine the outcomes.
A confusion matrix was used to calculate these measurements. Table 3 shows the widely used confusion matrix, which consists of four outcomes, namely, false positive (FP), false negative (FN), true negative (TN), and true positive (TP).

Table 3: Confusion matrix

| | Predicted Class X | Predicted Class Y |
|---|---|---|
| True Class X | TN | FP |
| True Class Y | FN | TP |

TP: This term describes malicious payloads that the model correctly predicted as malicious.
FN: This term refers to instances in which the prediction approach categorized a harmful case as benign.
FP: This term refers to instances in which the prediction approach categorized a benign case as harmful.
TN: This term refers to instances that were identified as benign by the prediction approach and are, in fact, benign [15–17].

The following equations represent the metrics used to assess the approach and determine its performance
efficacy.
Accuracy: It represents the proportion of correct predictions, both TP and TN, among all instances. It is mathematically expressed as follows:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \times 100. \qquad (4)$$
Precision: It displays the proportion of TP to the sum of TP and FP. It is mathematically expressed as follows:
$$\text{Precision} = \frac{TP}{TP + FP} \times 100. \qquad (5)$$
Recall: It displays the ratio of TP to the total of TP and FN. It is mathematically expressed as follows:
$$\text{Recall} = \frac{TP}{TP + FN} \times 100. \qquad (6)$$

F1-score: This is the harmonic mean of precision and recall. It is mathematically expressed as follows [18]:
$$F1\text{-score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}. \qquad (7)$$
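The following small helper computes equations (4)–(7) directly from confusion-matrix counts; the counts in the example call are illustrative only, not the paper's results.

```python
def evaluate(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute accuracy, precision, recall, and F1-score (equations (4)-(7)) as percentages."""
    accuracy = (tp + tn) / (tp + tn + fp + fn) * 100
    precision = tp / (tp + fp) * 100
    recall = tp / (tp + fn) * 100
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Illustrative counts only.
print(evaluate(tp=9950, tn=7250, fp=120, fn=50))
```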

3 Results and discussions


This section presents the results of using the LR approach in a big data environment to determine whether the payload sent by the user is malicious or benign.
Building an approach using ML or any other system requires providing a set of basic hardware and
software requirements. Tables 4 and 5 describe the basic requirements used in this study.

Table 4: Software requirement

Software requirement

System type 64-bit operating system, x64-based processor


Programming language Python (Spyder [Anaconda3])

Table 5: Hardware requirement

Hardware requirement

Processor Intel(R) Core (TM) i7-5500U CPU @ 2.40 GHz


Installed RAM 8 GB
Hard disk 500 GB
GPU AMD Radeon Graphics Processor HD (8500 M)

The results were obtained from two experiments for training and testing the model. The purpose was to achieve the best classification accuracy and the shortest time for classifying the type of payload.

3.1 First experiment

The first experiment was conducted using a dataset containing 85,974 malicious and benign payloads divided
into two sections. The first section consists of 45,051 payloads representing benign loads, and the second
section consists of 40,923 payloads representing malicious loads. As for the data division method, the holdout
method was used, where 70% of the dataset was chosen for training, and the remaining portion was used for
testing and evaluation.

3.2 Second experiment

The second experiment was conducted using a dataset containing 85,974 malicious and benign payloads
divided into two sections. The first section, which represents the benign payloads, consists of 45,051 samples,
while the second section, representing the malicious payloads, consists of 40,923 samples. As for the data
division method, the holdout method was used, where 80% of the dataset was chosen for training, and the
remaining portion was used for testing and evaluation.
Table 6 presents the results of the first experiment, and Table 7 presents those of the second experiment, in which the accuracy of the LR approach reached 99.04%.

Table 6: Result of first experiment

Seq Name of parameter Value

1. Time complexity 0.10 s


2. Accuracy 98.025
3. Precision 98.055
4. Recall 98.025
5. F-score 98.02
6. Training dataset 59,938
7. Test dataset 26,036

The results of the second experiment were chosen because they provided better accuracy and a shorter testing time.
The LR approach accurately classified the SQL queries sent to the databases used by web applications. The proportion of correct predictions (TP and TN), 99.04%, indicates that malicious and benign payloads can be discriminated with high accuracy. Detecting the query type took only 0.05 s (Table 7).

Table 7: Result of second experiment

Seq Name of parameter Value

1. Time complexity 0.05 s


2. Accuracy 99.04
3. Precision 98.18
4. Recall 99.89
5. F-score 99.04
6. Training dataset 68,604
7. Test dataset 17,370
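As a rough sketch of how the per-request classification time might be measured, the snippet below times a single transform on one payload; the saved-model path and column name are hypothetical, and measured times depend on the hardware listed in Table 5.

```python
import time

from pyspark.sql import SparkSession
from pyspark.ml import PipelineModel

spark = SparkSession.builder.appName("sqli-timing").getOrCreate()
model = PipelineModel.load("models/sqli_lr_pipeline")  # hypothetical saved pipeline

payload_df = spark.createDataFrame([("' OR '1'='1' --",)], ["payload"])

start = time.perf_counter()
prediction = model.transform(payload_df).select("prediction").first()[0]
elapsed = time.perf_counter() - start

print(f"prediction: {prediction}, elapsed: {elapsed:.3f} s")
```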

Table 8 shows the results of the comparison between previous studies and this study.

Table 8: Result of comparison between previous studies and this study

| Ref | Model | Accuracy | Time complexity | Dataset size |
|---|---|---|---|---|
| [19] | SVM | 98.6 | None | 181,303 |
| [20] | Neural network of direct signal propagation | 95 | None | 30,233 |
| [21] | Long short-term memory (LSTM) | 95.2 | 37.1494 s | 42,212 |
| [22] | Support vector machine | 94.92 | 3.98 s | 20,474 |
| | Gradient boosting | 94.27 | | |
| | Naive Bayes classifier | 70.79 | | |
| | REGEX classifier | 97.48 | | |
| [23] | Naive Bayes | 95 | None | |
| | LR | 92 | | |
| | CNN | 97 | | |
| | SVM | 79 | | |
| | Passive aggressive | 79 | | |
| [24] | CNN-BiLSTM | 98 | 45 s | 4,200 |
| | Proposed model | 99.04 | 0.05 s | 85,974 |

Standard methods for detecting and preventing these attacks can obtain good results when dealing with small data. However, these methods are not optimal for big data. The significant feature of the proposed approach, implemented with Spark ML and the Spark framework, is that it can process large-scale datasets efficiently and reliably. Scalability is achieved by distributing processing tasks, thus enabling the handling of larger, more complex datasets. High performance is achieved by using in-memory resources and executing operations in parallel. However, a limitation of this work is that the proposed approach has difficulty dealing with very large datasets when applying ML models because they require higher computational power. In addition, some of the datasets used contain instances that cannot be processed by ML algorithms, so the dataset must be filtered before it can be used by the proposed approach.

4 Conclusion
This work presented a method for detecting SQL attacks using the LR approach in a big data environment. The
dataset contained malicious and benign SQL payloads. The proposed approach then classified user queries as
containing either malicious or benign payloads. Several experiments were conducted, and the performances
were compared. The proposed method achieved the highest accuracy and the shortest running time when
handling large datasets in every experiment.
One of the main contributions of this work is that the proposed method prevents users from directly accessing the data and maintains the data's confidentiality, integrity, and availability. This protection is achieved by creating a separation layer, which applies a model trained on a large dataset to classify new payloads sent by the user, thus providing additional protection for the data layer before a request from the user layer reaches it.
The second contribution is that the time required to classify the query type submitted by the user is reduced by
using the Spark ML library. Spark ML works in the memory in a distributed manner, thereby reducing the time
required to classify the payload type. Reducing the time to classify the type of request is essential when dealing with
big data because it enables timely access to the data and ensures that the data are available to users and
organizations. This work provides high accuracy and takes a short time to classify requests, thereby achieving
high data protection and maintaining the confidentiality, integrity, and availability of data. However, the proposed approach can classify only SQL injection attacks. Future work will involve building a model that classifies more than one type of attack, such as cross-site scripting or DDoS attacks, using the LSTM algorithm.

Author contributions: Omar Salah F. Shareef conceived of the presented idea. Rehab Flaih Hasan and Ammar
Hatem Farhan designed and performed the experiments, derived the models, and analyzed the data. Omar
Salah F. Shareef supervised the project. Ammar Hatem Farhan wrote the manuscript in consultation with Omar Salah F. Shareef and Rehab Flaih Hasan. All authors discussed the results and contributed to the final
manuscript.

Conflict of interest: Authors state no conflict of interest.

Data availability statement: The data that support the findings of this study are openly available on the Kaggle website at https://www.kaggle.com/datasets/gambleryu/biggest-sql-injection-dataset?resource=download, reference number [7].

References
[1] Farhan AH, Hasan RF. Detection SQL injection attacks against web application by using K-nearest neighbors with principal
component analysis. In: Proceedings of Data Analytics and Management: ICDAM 2022. Springer; 2023. p. 631–42.
[2] Durai KN, Subha R, Haldorai A. A novel method to detect and prevent SQLIA using ontology to cloud web security. Wirel Pers
Commun. 2021;117(4):2995–3014. doi: 10.1007/s11277-020-07243-z.
[3] Haldorai A, Devi S, Joan R, Arulmurugan L. Big data in intelligent information systems. Mob Netw Appl. 2022;27:997–9. doi: 10.1007/s11036-021-01863-w.
[4] Awan MJ, Farooq U, Babar HM, Yasin A, Nobanee H, Hussain M, et al. Real-time ddos attack detection system using big data
approach. Sustain. 2021;13(19):1–19. doi: 10.3390/su131910743.
[5] Alghawazi M, Alghazzawi D, Alarifi S. Detection of SQL injection attack using machine learning techniques: A systematic literature
review. J Cybersecur Priv. 2022;2(4):764–77. doi: 10.3390/jcp2040039.
[6] Crespo-Martínez IS, Campazas-Vega A, Guerrero-Higueras ÁM, Riego-DelCastillo V, Álvarez-Aparicio C, Fernández-Llamas C. SQL
injection attack detection in network flow data. Comput Secur. 2023;127:103093. doi: 10.1016/j.cose.2023.103093.
[7] Kaggle. https://www.kaggle.com/datasets/gambleryu/biggest-sql-injection-dataset?resource=download.
[8] Alasadi SA, Bhaya WS. Review of data preprocessing techniques in data mining. J Eng Appl Sci. 2017;12(16):4102–7.
[9] El Rifai H, Al Qadi L, Elnagar A. Arabic text classification: the need for multi-labeling systems. Neural Comput App.
2022;34(2):1135–59. doi: 10.1007/s00521-021-06390-z.
[10] Yang JS, Zhao CY, Yu HT, Chen HY. Use GBDT to predict the stock market. Procedia Comput Sci. 2020;174:161–71. doi: 10.1016/j.procs.2020.06.071.
[11] Rafało M. Cross validation methods: Analysis based on diagnostics of thyroid cancer metastasis. ICT Express. 2022;8(2):183–8.
doi: 10.1016/j.icte.2021.05.001.
[12] Arif ZH, Cengiz K. Severity Classification for COVID-19 Infections based on Lasso-Logistic Regression Model. Int J Mathematics,
Statistics, Computer Sci. 2023;1:25–32. doi: 10.59543/ijmscs.v1i.7715.
[13] Yassine S, Stanulov A. A comparative analysis of machine learning algorithms for the purpose of predicting Norwegian air
passenger traffic. Int J Mathematics, Statistics, Computer Sci. 2023;2:28–43. doi: 10.59543/ijmscs.v2i.7851.
[14] Zhu C, Idemudia CU, Feng W. Improved logistic regression model for diabetes prediction by integrating PCA and K-means
techniques. Inform Med Unlocked. 2019;17:100179. doi: 10.1016/j.imu.2019.100179.
[15] Shah K, Patel H, Sanghvi D, Shah M. A comparative analysis of logistic regression, random forest and KNN models for the text
classification. Augmented Hum Res. 2020;5(1):1–16. doi: 10.1007/s41133-020-00032-0.
[16] Shaukat K, Luo S, Varadharajan V, Hameed IA, Xu M. A survey on machine learning techniques for cyber security in the last decade.
IEEE Access. 2020;8:222310–54. doi: 10.1109/ACCESS.2020.3041951.
[17] Abuhaiba ISI, Dawoud HM. Combining different approaches to improve Arabic text documents classification. Int J Intell Syst Appl.
2017;9(4):39–52. doi: 10.5815/ijisa.2017.04.05.
[18] Alarfaj FK, Khan NA. Enhancing the performance of SQL injection attack detection through probabilistic neural networks. Appl Sci.
2023 Mar 29;13(7):4365.
[19] Uwagbole SO, Buchanan WJ, Fan L. Applied machine learning predictive analytics to SQL injection attack detection and prevention.
Proc. IM 2017 - 2017 IFIP/IEEE Int. Symp. Integr. Netw. Serv. Manag; 2017. p. 1087–90. doi: 10.23919/INM.2017.7987433.
[20] Hubskyi O, Babenko T, Myrutenko L, Oksiiuk O. Detection of SQL injection attack using neural networks. Advances in Intelligent
Systems and Computing. Vol. 1265 AISC. 2021. p. 277–86. doi: 10.1007/978-3-030-58124-4_27.
[21] Tang P, Qiu W, Huang Z, Lian H, Liu G. Detection of SQL injection based on artificial neural network. Knowl-Based Syst. 2020;190:105528.
doi: 10.1016/j.knosys.2020.105528.
[22] Kranthikumar B, Velusamy RL. SQL injection detection using REGEX classifier. J Xi’an Univ Archit Technol. 2020;7(6):800–9.
[23] Joshi A, Geetha V. SQL Injection detection using machine learning. In: 2014 International Conference on Control, Instrumentation,
Communication and Computational Technologies, ICCICCT 2014; 2014. p. 1111–5. doi: 10.1109/ICCICCT.2014.6993127.
[24] Aggarwal P, Kumar A, Michael K, Nemade J, Sharma S. Random decision forest approach for mitigating SQL injection attacks.
In: 2021 IEEE International Conference on Electronics, Computing and Communication Technologies (CONECCT). 2021. p. 1–5.
