e-ISSN: 2582-5208

International Research Journal of Modernization in Engineering Technology and Science

( Peer-Reviewed, Open Access, Fully Refereed International Journal )
Volume:03/Issue:06/June-2021 Impact Factor- 5.354 www.irjmets.com

SQL INJECTION DETECTION USING HYBRID MODEL

Dr. Sandeep Kumar *1, Tanya Ahuja*2, Bhavya Choudhary*3
*1 Associate Professor, Department of Computer Science, Maharaja Surajmal Institute of Technology,
New Delhi, India.
*2 Student, Department of Computer Science, Maharaja Surajmal Institute of Technology,
New Delhi, India.
*3 Student, Department of Computer Science, Maharaja Surajmal Institute of Technology,
New Delhi, India.
ABSTRACT
Technology is now part of everyday life. Large volumes of information and millions of files are shared across the
Internet through web applications, and online payments and Internet banking have become common.
Web-based applications store crucial user information in databases. Because the backend database is
integrated with the web frontend, injection attacks become possible. SQL injection (SQLi) means inserting
harmful code into the original query by supplying malicious SQL statements as input. Testing for SQLi
vulnerabilities is therefore important, but it is practically impossible to check every input without a proper
algorithm. This paper attempts to detect SQLi attacks using basic machine learning algorithms. To improve
performance, a stacking technique was used in which Logistic Regression was chosen as the meta model and
different combinations of basic algorithms (logistic regression, k-nearest neighbor, and linear discriminant
analysis) formed the base models. These basic models were chosen to show that improving the performance
metrics does not necessarily require deep learning models, which need large datasets and high computational power.
Keywords- SQLi, KNN, Logistic regression, LDA, Neural Networks, Stacking.
I. INTRODUCTION
Web applications are very popular these days. Organizations make these applications available on the Internet
to increase their exposure and their gain, but being exposed to the Internet also increases the security
challenges. Most of the transactions performed today are done online, and the data of these websites is stored in
databases. One such type of database is the relational database, in which information is fetched through
Structured Query Language, i.e. SQL.
SQL injection is probably the easiest way of stealing data from a database that stores data on the basis
of web inputs. Through such attacks, hackers can gain access to the database and make changes. As said by the “Open
Web Application Security Project”, “injection attack is a technique used to access information or unauthorized
activity”. As a result, hackers rely on SQL injection for stealing information. Three responses are possible:
prevention, detection, or correction. Prevention is not an easy task because it requires a lot of
knowledge. SQL injection attacks are carried out mainly for two reasons: to gain benefit by
grabbing others' sensitive data, and to test and prove newly learned skills.
In the proposed method, three approaches are used to detect SQL injection attacks (SQLIA): logistic regression,
LDA, and KNN classification. A stacking technique is also used to obtain better
accuracy. Current work in this domain uses deep learning algorithms to improve accuracy, but we
want to improve accuracy using only machine learning algorithms, so we used a hybrid model, which is a
combination of various base models and a meta model.
II. METHODOLOGY
The datasets used are “SQL1” and “SQL2”. SQL1 contains 4200 valid/invalid SQL queries and
SQL2 contains 33761 valid/invalid SQL queries. Both datasets are labelled and were taken from the Kaggle
website. Each contains the following 2 fields:
1. label: 0 = valid SQL query, 1 = malicious SQL query
2. sentence: the text of the SQL query
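As a rough illustration of the two-field layout described above, the sketch below parses a few rows with Python's standard csv module. The CSV header names, the column order, and the sample queries are assumptions made for illustration. They are not rows from the SQL1 or SQL2 files.

```python
import csv
import io

# Hypothetical sample in a (sentence, label) layout; these are not real dataset rows.
SAMPLE_CSV = '''sentence,label
"SELECT name FROM users WHERE id = 5",0
"' OR '1'='1' --",1
"SELECT * FROM orders WHERE total > 100",0
"1; DROP TABLE users",1
'''

def load_queries(text):
    """Return (sentences, labels) parsed from CSV text with 'sentence' and 'label' columns."""
    sentences, labels = [], []
    for row in csv.DictReader(io.StringIO(text)):
        sentences.append(row["sentence"])
        labels.append(int(row["label"]))  # 0 = valid query, 1 = malicious query
    return sentences, labels

sentences, labels = load_queries(SAMPLE_CSV)
print(len(sentences), "queries,", sum(labels), "labelled malicious")
```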

www.irjmets.com @International Research Journal of Modernization in Engineering, Technology and Science

We have used various supervised learning algorithms, and stacking methods are used to improve the accuracy.
Supervised learning classifies data using a model trained on labelled data, i.e. data for which we
already know whether each query is malicious or not. After the
training is done through various algorithms, the model can be used on unlabeled data for classification of
queries. The following approaches are used -
(1) Linear discriminant analysis – It picks a new dimension (a projection axis) such that the separation between
the means of the projected classes is maximized and the variance within each projected class is minimized.
For multiple variables, the same properties are calculated over the multivariate Gaussian. The statistical
properties are estimated from the data.
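For the two-class case used here (valid versus malicious), the criterion described above is commonly written as Fisher's discriminant. The notation below is an assumption introduced for illustration and does not come from the paper: w is the projection direction, μ0 and μ1 are the class means, and S_W is the within-class scatter matrix.

```latex
J(\mathbf{w}) = \frac{\left(\mathbf{w}^{\top}(\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0)\right)^{2}}{\mathbf{w}^{\top} S_W \mathbf{w}},
\qquad
S_W = \sum_{c \in \{0,1\}} \; \sum_{\mathbf{x}_i \in \text{class } c} (\mathbf{x}_i - \boldsymbol{\mu}_c)(\mathbf{x}_i - \boldsymbol{\mu}_c)^{\top},
\qquad
\mathbf{w}^{*} \propto S_W^{-1}(\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0)
```

The numerator measures the separation between the projected class means and the denominator measures the variance within the projected classes, so maximizing J(w) matches the description above.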

(2) Logistic regression - Logistic regression is a classification algorithm that estimates the probability
of a class from a set of independent variables. Here there are only two possible scenarios: either the text
is plain text or it is malicious text.
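A minimal sketch of the two-outcome decision follows, assuming the query has already been turned into numeric features. The feature meaning, the weights, and the bias are hypothetical values chosen for illustration. They are not parameters learned in this work.

```python
import math

def malicious_probability(features, weights, bias):
    """Logistic model: squash a weighted sum of the features into a probability."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def predict_label(features, weights, bias, threshold=0.5):
    """Return 1 (malicious) if the probability reaches the threshold, else 0 (valid)."""
    return 1 if malicious_probability(features, weights, bias) >= threshold else 0

# Hypothetical features: (number of quote characters, number of SQL comment markers).
weights, bias = (1.2, 2.0), -1.5
print(predict_label((0, 0), weights, bias))  # query with no suspicious tokens
print(predict_label((2, 1), weights, bias))  # query with quotes and a comment marker
```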

(3) KNN classifier - It is a memory-based classification algorithm. The steps are as follows-
• First, the K instances most similar to the data being tested are identified.
• Then the labels of those instances are extracted.
• The label for the data being tested is predicted by combining the extracted labels.

Figure 1: KNN Illustration
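The three KNN steps can be sketched in a few lines of plain Python. Euclidean distance, a majority vote as the way of combining labels, and the toy 2-D feature vectors are assumptions made for illustration. The paper does not state which distance or feature representation was used.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Step 1: find the k training instances closest to the query (Euclidean distance).
    ranked = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    nearest = ranked[:k]
    # Step 2: extract the labels of those neighbours.
    neighbour_labels = [train_y[i] for i in nearest]
    # Step 3: combine the labels by majority vote.
    return Counter(neighbour_labels).most_common(1)[0][0]

# Hypothetical 2-D feature vectors, e.g. counts of two kinds of suspicious tokens per query.
train_x = [(0, 1), (0, 2), (1, 1), (3, 4), (4, 3), (4, 5)]
train_y = [0, 0, 0, 1, 1, 1]
print(knn_predict(train_x, train_y, (3, 3)))
print(knn_predict(train_x, train_y, (0, 0)))
```

Because the model only stores the training points, all of the work happens at prediction time, which is what "memory-based" refers to.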

(4) Hybrid model based on stacking algorithm – To improve the accuracy of the machine learning models,
three hybrid models are created, in which different models are used at different levels to get the best
results.
Initially, the training data (x) has m observations and n features. The training dataset was split into k
folds, as in k-fold cross-validation. The base model was fitted on k-1 parts and predictions were
made on the kth part; this process was repeated for each part. Finally, the base model was fitted on the
whole training data to calculate the performance on the test set. This procedure was followed for the
different base models KNN, LR, and LDA.
The predictions from the training data set were then used as features for the second-level model, and the
second-level model was used to make predictions on the test set.
Brute force analysis was followed – all combinations of base models were tried.
For the meta-model, only LR was chosen, because the complexity of such hybrid models increases when
more advanced models are used. We restricted ourselves to only two levels because beyond that the model
becomes highly overfitted. Three hybrid models are implemented-
a) Base model- KNN and Meta model- LR
b) Base models- LR, KNN and Meta model- LR
c) Base models- LR, KNN, LDA and Meta model- LR
After trying all the approaches, the highest accuracy was obtained when LR was used at level 1 and KNN,
LDA and LR were used at level 0.
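The two-level procedure described above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the implementation used in this work: the toy data are hypothetical, KNN and a nearest-centroid classifier stand in for the level-0 models (nearest-centroid is a simple substitute, not LDA), and the logistic regression meta model is trained with a basic gradient-descent loop.

```python
import math

def fit_knn(train_x, train_y, k=3):
    """Base model: k-nearest neighbours with a majority vote."""
    def predict(x):
        ranked = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], x))
        votes = [train_y[i] for i in ranked[:k]]
        return 1 if 2 * sum(votes) > len(votes) else 0
    return predict

def fit_centroid(train_x, train_y):
    """Base model: nearest class centroid (a simple stand-in, not LDA itself)."""
    centroids = {}
    for label in set(train_y):
        points = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = tuple(sum(col) / len(points) for col in zip(*points))
    def predict(x):
        return min(centroids, key=lambda label: math.dist(centroids[label], x))
    return predict

def fit_logreg(train_x, train_y, rate=0.1, epochs=500):
    """Meta model: logistic regression trained by stochastic gradient descent."""
    weights, bias = [0.0] * len(train_x[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(train_x, train_y):
            z = bias + sum(w * v for w, v in zip(weights, x))
            error = 1.0 / (1.0 + math.exp(-z)) - y
            weights = [w - rate * error * v for w, v in zip(weights, x)]
            bias -= rate * error
    def predict(x):
        return 1 if bias + sum(w * v for w, v in zip(weights, x)) >= 0 else 0
    return predict

def out_of_fold(fit, X, y, k):
    """Fit on k-1 folds, predict the held-out fold, and repeat for every fold."""
    predictions = [None] * len(X)
    for fold in range(k):
        train_idx = [i for i in range(len(X)) if i % k != fold]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        for i in range(fold, len(X), k):
            predictions[i] = model(X[i])
    return predictions

def fit_stack(base_fits, meta_fit, X, y, k=3):
    """Level 0 out-of-fold predictions become the features of the level 1 model."""
    columns = [out_of_fold(fit, X, y, k) for fit in base_fits]
    meta_x = [tuple(col[i] for col in columns) for i in range(len(X))]
    meta = meta_fit(meta_x, y)
    # Refit every base model on the whole training set for use on unseen data.
    bases = [fit(X, y) for fit in base_fits]
    return lambda x: meta(tuple(base(x) for base in bases))

# Hypothetical, well-separated toy data: label 0 near the origin, label 1 far from it.
X = [(0, 1), (6, 7), (1, 0), (7, 6), (1, 2), (7, 8),
     (2, 1), (8, 7), (0, 0), (6, 6), (2, 2), (8, 8)]
y = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
stacked = fit_stack([fit_knn, fit_centroid], fit_logreg, X, y, k=3)
print(stacked((1, 1)), stacked((7, 7)))
```

Using out-of-fold predictions as meta features means the level 1 model never sees a base prediction made on data that the base model was trained on, which limits leakage between the two levels.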
The machine learning detection method checks whether the parameters entered by the
user contain any malicious code that can be a threat to the security of the system. However, the real-time
performance of a machine learning algorithm is poor. Therefore, a stacking technique and a deep learning-based
approach are used to detect SQL injection. The stacking technique first finds suitable features, which are then
used to train the model, and the model is finally checked on unlabeled data.
III. RESULTS AND DISCUSSION
SQL injection attacks were successfully detected using the following methods, and the following
results were obtained-
(1) Linear discriminant analysis

Figure 2: Performance Metrics of LDA Model

(2) Logistic regression

Figure 3: Performance Metrics of Logistic Regression Model

(3) KNN classifier

Figure 4: Performance Metrics of KNN Model

(4) Neural Network

Figure 5: Performance Metrics of NN with LR Model

(5) Stacking Algorithm

Figure 6: Performance metric when Base model - KNN and Meta model - LR


Figure 7: Performance metric when Base Model - LR, KNN and Meta model – LR

Figure 8: Performance metric when Base Model - LR, KNN, LDA and Meta model – LR
This research paper implemented three machine learning approaches along with neural networks (deep
learning) to detect SQL injection attacks. The accuracy of each approach is as follows-
Table 1: Comparison of accuracy of models
ALGORITHM                          ACCURACY
1. LR                              93%
2. LDA                             73%
3. KNN                             71%
4. NN                              97%
5. STACKING (KNN + LR)             81%
   STACKING (KNN, LR + LR)         97.5%
   STACKING (KNN, LDA, LR + LR)    97.7%
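Accuracy, as reported in Table 1, is the fraction of queries that are classified correctly. The sketch below shows how it can be computed from true and predicted labels. Precision and recall are included as an assumption about what the performance-metric figures contain, and the label lists are hypothetical rather than results from this work.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision and recall for binary labels (1 = malicious)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # malicious, flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # valid, passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # valid, wrongly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # malicious, missed
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Hypothetical labels for eight queries.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
print(classification_metrics(y_true, y_pred))
```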

IV. CONCLUSION
SQL injection attacks are one of the main security issues in various sectors, mainly in the finance and defence
sectors, where the losses can be huge. In this paper, certain algorithms were tried along with neural networks to
improve accuracy in the detection of SQL injection attacks. The best accuracy was given by the model in which the
base models were Logistic Regression, K-Nearest Neighbour and Linear Discriminant Analysis and the meta model
was Logistic Regression.
Logistic regression was used as the meta model because it is one of the most basic models. Only up to
three models were used as base models because beyond that complexity increases and there is no further
improvement in accuracy.
In future, this project can be modified to detect other types of web attacks, such as denial-of-service (DoS) and
cross-site scripting (XSS) attacks, and a lot of work can be done to improve accuracy and performance. The project
can further be modified to enhance usability and efficiency. A larger dataset collected from multiple sources can
be used to improve the accuracy of the deep learning model. The machine learning model can also be improved
through better feature selection; currently a tokenization approach is used. Different methods can be tried for
training the model more effectively, and other validation techniques, such as cross-validation, can be used.
