SQL Injection Detection Using Hybrid Model
(2) Logistic regression - Despite its name, logistic regression is used for classification: it models the
probability of a class from a set of independent variables. Here there are only two possible outcomes:
the text is either plain text or malicious text.
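The binary decision described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the two hand-made features (count of single quotes, count of SQL keyword tokens), the keyword list, the toy samples and the training settings are all assumptions introduced for the example.

```python
import math

# Hypothetical keyword list, chosen only for this illustration.
SQL_KEYWORDS = {"or", "union", "select", "drop", "--"}

def featurize(text):
    """Map a string to [number of single quotes, number of SQL keyword tokens]."""
    tokens = text.lower().split()
    return [float(text.count("'")), float(sum(t in SQL_KEYWORDS for t in tokens))]

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train(X, y, lr=0.3, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(w, b, text):
    """Return 1 (malicious) if the estimated probability is at least 0.5, else 0."""
    x = featurize(text)
    return int(sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5)

# Toy training set (made up for illustration, not the paper's dataset).
samples = [
    ("hello world", 0),
    ("it's a nice day", 0),
    ("coffee or tea", 0),
    ("' or '1'='1", 1),
    ("1' union select password from users --", 1),
    ("'; drop table users; --", 1),
]
X = [featurize(text) for text, _ in samples]
y = [label for _, label in samples]
w, b = train(X, y)
print(predict(w, b, "hello world"), predict(w, b, "' or '1'='1"))
```

The model outputs a probability, and the 0.5 threshold turns it into one of the two classes. A real detector would need far richer features than these two counts.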
(3) KNN classifier - K-nearest neighbours (KNN) is a memory-based classification algorithm. The steps are as follows:
First, the K training instances closest to the data being tested are identified.
Then the labels of those instances are extracted.
Finally, the label for the data being tested is predicted by combining the extracted labels, typically by majority vote.
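The three KNN steps can be sketched in a few lines. This is an illustrative sketch only: the Euclidean distance, the majority vote, the value K = 3 and the two-dimensional toy feature vectors are assumptions, not details taken from the paper.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Predict the label of `query` from its k nearest training instances."""
    # Step 1: identify the k training instances closest to the query.
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Step 2: extract the labels of those instances.
    labels = [label for _, label in neighbours]
    # Step 3: combine the labels by majority vote.
    return Counter(labels).most_common(1)[0][0]

# Toy feature vectors (made up for illustration, not the paper's dataset).
train = [
    ([0.0, 0.0], "plain"),
    ([1.0, 0.0], "plain"),
    ([0.0, 1.0], "plain"),
    ([4.0, 1.0], "malicious"),
    ([1.0, 3.0], "malicious"),
    ([1.0, 2.0], "malicious"),
]

print(knn_predict(train, [0.5, 0.5]))
print(knn_predict(train, [1.0, 2.5]))
```

Because KNN stores the training instances and defers all work to prediction time, its cost grows with the size of the training set.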
Figure 6: Performance metric when Base model - KNN and Meta model - LR
Figure 7: Performance metric when Base model - LR, KNN and Meta model - LR
Figure 8: Performance metric when Base model - LR, KNN, LDA and Meta model - LR
This paper implemented three machine learning approaches, along with a neural network (deep learning),
to detect SQL injection attacks. The accuracy of each approach is listed in Table 1.
Table 1: Comparison of accuracy of models
No.  ALGORITHM                                 ACCURACY
1.   LR                                        93%
2.   LDA                                       73%
3.   KNN                                       71%
4.   NN                                        97%
5.   STACKING (base: KNN; meta: LR)            81%
     STACKING (base: KNN, LR; meta: LR)        97.5%
     STACKING (base: KNN, LDA, LR; meta: LR)   97.7%
IV. CONCLUSION
SQL injection is one of the main security issues across sectors, particularly in finance and defence,
where the losses can be huge. In this paper, several machine learning algorithms were tried along with a
neural network to improve accuracy in detecting SQL injection attacks. The best accuracy (97.7%) was given by
the stacked model whose base models were Logistic Regression, K-Nearest Neighbour and Linear Discriminant
Analysis and whose meta model was Logistic Regression.
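A stacked model of this shape can be sketched with scikit-learn's StackingClassifier. This is a sketch under stated assumptions rather than the paper's code: the synthetic data from make_classification stands in for tokenized query features, and the hyperparameters (n_neighbors=5, cv=5, max_iter=1000) are illustrative choices. The accuracy it prints bears no relation to the figures in Table 1.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for tokenized query features (NOT the paper's dataset).
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Base models: LR, KNN and LDA, as in the best-performing configuration.
base_models = [
    ("lr", LogisticRegression(max_iter=1000)),
    ("knn", KNeighborsClassifier(n_neighbors=5)),
    ("lda", LinearDiscriminantAnalysis()),
]

# Meta model: logistic regression trained on the base models'
# cross-validated predictions.
stack = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train, y_train)

accuracy = stack.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.3f}")
```

The meta model never sees the raw features here. It learns only how to weigh the base models' predictions, which is why a simple model such as logistic regression is a common choice for that role.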
Logistic regression was used as the meta model because it is one of the most basic models. Only up to
three base models were used because adding more increases complexity without any further
improvement in accuracy.
In future, this project can be extended to detect other types of web attacks, such as denial-of-service (DoS)
and cross-site scripting (XSS) attacks, and considerable work remains to improve accuracy and performance. The
project can further be modified to enhance usability and efficiency. A larger dataset collected from multiple
sources could improve the accuracy of the deep learning model. Feature selection for the machine learning
models can also be improved: currently a tokenization approach is used, and other methods can be tried to train
the models more effectively. Other validation techniques, such as cross-validation, can also be used.
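As an illustration of the cross-validation idea, the sketch below splits a dataset into k folds and scores a model on each held-out fold in turn. The 1-nearest-neighbour placeholder model, the helper names and the toy data are assumptions made for this example and are not part of the paper's pipeline.

```python
import math
import random

def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs that partition range(n_samples) into k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, folds[i]

def nearest_neighbour_predict(train, query):
    """Placeholder model: return the label of the single closest training point."""
    return min(train, key=lambda item: math.dist(item[0], query))[1]

def cross_validate(data, k=3):
    """Return the accuracy measured on each of the k held-out folds."""
    scores = []
    for train_idx, test_idx in k_fold_splits(len(data), k):
        train = [data[i] for i in train_idx]
        correct = sum(
            nearest_neighbour_predict(train, data[i][0]) == data[i][1]
            for i in test_idx
        )
        scores.append(correct / len(test_idx))
    return scores

# Toy, well-separated data (made up for illustration, not the paper's dataset).
data = [
    ([0.0, 0.0], 0), ([1.0, 0.0], 0), ([0.0, 1.0], 0),
    ([5.0, 5.0], 1), ([6.0, 5.0], 1), ([5.0, 6.0], 1),
]
scores = cross_validate(data, k=3)
print("mean accuracy:", sum(scores) / len(scores))
```

Averaging over folds gives a less optimistic estimate than a single train/test split, because every sample is used for testing exactly once.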