Analysis of Support Vector Machine-Based Intrusion Detection Techniques
https://doi.org/10.1007/s13369-019-03970-z
Received: 31 December 2018 / Accepted: 10 June 2019 / Published online: 2 July 2019
© King Fahd University of Petroleum & Minerals 2019
Abstract
Over the last few decades, people have carried out various transactions over the internet, such as air ticket reservation, online banking, distance learning and group discussion. Owing to the explosive growth of information exchange and electronic commerce in the recent decade, security mechanisms are needed to protect sensitive information. Detecting intrusive behavior is one of the most important activities for protecting data and assets, and various intrusion detection systems are deployed in networks for this purpose. In this paper, an analytical study of support vector machine (SVM)-based intrusion detection techniques is presented. The methodology involves four major steps: data collection, preprocessing, training and testing with the SVM technique, and decision. The simulated results are analyzed based on overall detection accuracy, the Receiver Operating Characteristic (ROC) and the Confusion Matrix. The NSL-KDD dataset, a benchmark for intrusion detection techniques that contains a huge number of network records, is used to analyze the performance of the SVM techniques. The results show that Linear SVM, Quadratic SVM, Fine Gaussian SVM and Medium Gaussian SVM give 96.1%, 98.6%, 98.7% and 98.5% overall detection accuracy, respectively.
2372 Arabian Journal for Science and Engineering (2020) 45:2371–2383
work are presented. Methodology is explained in Sect. 4, Sect. 5 deals with empirical evaluation and results, analysis and discussion are presented in Sect. 6, and finally, this paper is concluded in Sect. 7.

2 Intrusion Detection Systems

2.1 IDS Components

Figure 1 shows the main phases of a generic IDS model. The first phase is the data accumulator, in which events are generated on the basis of log data. These log data are formed from data collected by the target system; the data can be network traffic, operating system logs or application logs. The data stored in the configuration repository are then used in the attack detection phase.

In attack detection, multiple analyzers work simultaneously on the same data. They apply multiple match scripts to match text strings that are unique to various intrusions. This phase is similar to antivirus software, in which malicious code is matched against the signatures present in a database. There are other ways to perform detection, such as detecting anomalous behavior in the network traffic, but such techniques can lead to a high false alarm rate. The detection protocol repository contains the rules and algorithms for detecting and preventing intrusions. The main information about the anomaly detection that needs to be sent to the response unit is stored in the detection protocol.

The reply (response) unit receives the data that are intrusive in nature and need to be handled carefully. The reply for these data is generated with the help of reply statements present in the reply protocol repository [3]. The decision on how to respond to multiple events is taken by the main response unit and the reply protocol database. In a distributed environment, the reply unit may get multiple inputs from different analyzers, and thus it can collaborate on all alarms coming from the main submission. To help the administrator, the reply unit can perform certain tasks after detecting an intrusion, such as raising an alarm to notify the controller, shutting down the system or cutting off the connection from which the malicious traffic is coming.
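The signature-matching step of the attack detection phase, followed by the reply unit raising an alarm, can be illustrated with a minimal Python sketch. Everything named here is hypothetical and chosen only for illustration: the two signature strings, the sample log lines, and the functions `analyze` and `respond` do not come from the IDS model described above.

```python
# Hypothetical signature repository: name -> text string assumed unique to an intrusion.
SIGNATURES = {
    "path_traversal": "../../",
    "sql_injection": "' OR '1'='1",
}


def analyze(event, signatures=SIGNATURES):
    """Analyzer: return the names of all signatures whose text occurs in the event."""
    return [name for name, pattern in signatures.items() if pattern in event]


def respond(event, matches):
    """Reply-unit stand-in: build an alarm record for an intrusive event, else None."""
    if not matches:
        return None
    return {"event": event, "matched": matches, "action": "raise_alarm"}


if __name__ == "__main__":
    # Made-up log lines standing in for events from the data accumulator.
    for line in ["GET /index.html", "GET /../../etc/passwd"]:
        print(line, "->", respond(line, analyze(line)))
```

Plain substring matching keeps the sketch short. A real analyzer would more likely use richer rules, and an anomaly-based detector would replace `analyze` entirely.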
3.2 Optimal Hyperplane

To obtain an optimal hyperplane, the width of the margin must be maximized, because in SVM a larger margin gives better results. The optimization method using the Karush–Kuhn–Tucker (KKT) conditions [6] is used to achieve the optimal hyperplane. An example of an optimal hyperplane is given in Fig. 3. For linearly separable binary classification, the optimal hyperplane can be defined mathematically as follows. Let the dataset consist of pairs $(x_i, y_i)$, $i = 1, 2, \ldots, m$, where $x_i \in \mathbb{R}^n$ and $y_i \in \{+1, -1\}$. For correct classification of the dataset [7],

$$(w \cdot x_i) + b \ge +1 \quad \text{for } y_i = +1 \tag{1}$$

$$(w \cdot x_i) + b \le -1 \quad \text{for } y_i = -1 \tag{2}$$

where $y_i$ is the class label, which takes one of two values, $w$ is a weight vector, $x_i$ is an input vector and $b$ is a threshold value. Multiplying each inequality by its label $y_i$ (which reverses the direction of Eq. (2), since $y_i = -1$ there) combines Eqs. (1) and (2) into

$$y_i\,((w \cdot x_i) + b) - 1 \ge 0 \quad \text{for } i = 1, 2, \ldots, m \tag{3}$$

$$K(y_s, y_t) = y_s y_t \tag{5}$$

b. Quadratic SVM: In the training phase, memory usage is medium for binary classification and large for multiclass classification. The prediction speed is fast for binary classification and slow for multiclass classification. Interpretability is hard and model flexibility is medium. The kernel of Quadratic SVM may be defined as:

$$K(y_s, y_t) = (1 + y_s y_t)^2 \tag{6}$$

c. Fine Gaussian SVM: In the training phase, memory usage is medium for binary classification and large for multiclass classification. The prediction speed is fast for binary classification and slow for multiclass classification. Interpretability is hard and model flexibility is high; flexibility decreases as the kernel scale increases, and here the kernel scale is set to sqrt(P)/4, where P is the number of predictors. This SVM makes finely detailed distinctions between classes.

d. Medium Gaussian SVM: Memory usage is medium for binary classification and large for multiclass classification. The prediction speed is fast for binary classification and slow for multiclass classification. Interpretability is hard and model flexibility is medium.
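The kernels discussed above can be written out in a few lines of Python. This is a sketch under stated assumptions, not the implementation used in the paper: the product in Eqs. (5) and (6) is interpreted as a dot product of feature vectors, and the Gaussian kernel form exp(−‖ys − yt‖² / s²), with s the kernel scale, is assumed because the text gives only the kernel scale settings (sqrt(P)/4 for Fine Gaussian, sqrt(P) for Medium Gaussian) and not the formula. All function names are illustrative.

```python
import math


def linear_kernel(ys, yt):
    """Eq. (5), reading the product as a dot product (assumption)."""
    return sum(a * b for a, b in zip(ys, yt))


def quadratic_kernel(ys, yt):
    """Eq. (6): (1 + ys . yt)^2."""
    return (1.0 + linear_kernel(ys, yt)) ** 2


def gaussian_kernel(ys, yt, kernel_scale):
    """Assumed Gaussian form exp(-||ys - yt||^2 / kernel_scale^2); not given in the text."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(ys, yt))
    return math.exp(-squared_distance / kernel_scale ** 2)


def fine_kernel_scale(num_predictors):
    """Kernel scale sqrt(P)/4 stated for Fine Gaussian SVM."""
    return math.sqrt(num_predictors) / 4.0


def medium_kernel_scale(num_predictors):
    """Kernel scale sqrt(P) stated for Medium Gaussian SVM."""
    return math.sqrt(num_predictors)


if __name__ == "__main__":
    a, b = [0.0, 0.0], [1.0, 1.0]
    p = 16  # illustrative predictor count, chosen so both scales are round numbers
    print(gaussian_kernel(a, b, fine_kernel_scale(p)))    # small scale: similarity decays quickly
    print(gaussian_kernel(a, b, medium_kernel_scale(p)))  # larger scale: similarity decays slowly
```

Under the assumed form, a smaller kernel scale makes the similarity fall off faster with distance. This is consistent with the description of Fine Gaussian SVM as the more flexible of the two.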
Fig. 8 ROC of U2R
Fig. 10 ROC of DOS
ful. Wang et al. [13] used the concept of “one-class SVM”, which is trained on a single set of examples belonging to a particular class. The training time can be reduced with subsampling techniques, in which the classifier is trained faster after some randomly chosen training points are removed. If the probability distributions of the training and testing data are not the same, however, this subsampling technique is not useful for the training process. Zhou and Shrestha [14] proposed an Improved Harmony Search Algorithm to improve SVM efficiency for network intrusion detection. The authors used a dataset from a certain city in China covering 2006–2012. Li et al. [15] combined the fuzzy SVM and the multiclass SVM based on a binary tree. In this research work, the authors used the KDDcup99 dataset and showed the superiority of their method. Cuong et al. [16] gave a modified support vector machine and a modified backpropagation scheme for intrusion detection under covariate shift, using the KDD1999 dataset. The authors addressed the negative effects of dataset shift in IDS and discussed how to overcome them. Cuong et al. obtained higher accuracy with the modified technique than with the original technique. In this research work, the authors deployed the Kernel Mean Matching (KMM) and Unconstrained Least-Squares Importance Fitting (ULSIF) techniques to modify the support vector machine and to make this technique work under covariate
Fig. 13 ROC of R2L
Fig. 14 ROC of U2R
Fig. 15 Overall Confusion Matrix

shift. M-SVM gives a higher result than the best of nine binary classifications.

In the paper by Parwekar et al. [17], different issues related to network attacks are discussed. The researchers suggested using hybrid approaches for intrusion detection in both wired and wireless networks. According to the authors, these approaches show many advantages in the experimental results. Horng et al. [18] proposed a scheme for intrusion detection using hierarchical clustering and support vector machine. Horng et al. combined Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH) hierarchical clustering with the SVM technique in order to propose an SVM-based intrusion detection system. BIRCH hierarchical clustering provides the SVM training with an abstracted, highly qualified and reduced dataset instead of the original large dataset. To evaluate the proposed system, they used the KDDcup99 dataset without sampling and reached an accuracy of 95.72% with a false positive rate of 0.7%. The authors report the best performance in the detection of DOS and Probe attacks when compared to other intrusion detection systems on the same dataset. Azad et al. [19] proposed an IDS
with the help of Decision Tree and Genetic Algorithm. The KDDcup99 dataset was used to test the proposed system. Tiwari et al. [20] proposed an IDS using Neural Network, SVM and Neuro-Fuzzy techniques. In this research work, the authors analyzed the results with the help of ROC and achieved 97.71% detection accuracy. Li [21] proposed an SVM-based intrusion detection technique combined with a nonlinear dimensionality reduction algorithm; isometric mapping was used for dimension reduction in this research work. Li et al. [22] proposed a scheme for intrusion detection based on support vector machine and the gradually feature removal method. In this research work, the authors used the KDDcup99 dataset and a pipeline in IDS. The authors compact the data by clustering redundant data; the Ant Colony Algorithm is used to select a small training set that captures the key features of the network, and the classifier is obtained with SVM. The IDS pipeline has an accuracy of 98.62% with tenfold cross-validation, and the Matthews Correlation Coefficient achieved is 0.861161. Li et al. proposed the gradually feature removal method as a precise wrapper-based feature reduction method. In [23], Parwekar et al. proposed an idea to implement robotics in military applications for intrusion detection. Bhavsar and Waghmare [24] proposed a scheme for IDS using SVM. The NSL-KDDcup99 dataset has been used to conduct some experiments for the proposed sys-
Fig. 25 ROC of R2L
Fig. 26 ROC of U2R
Fig. 27 Overall Confusion Matrix
benchmark for intrusion detection techniques and contains a huge number of network records. The NSL-KDD dataset has 41 features: some are basic features, some are content-related features, a few are time-related features and the remaining are host-based traffic features.

The NSL-KDD dataset is a mixture of attack and normal records. The attacks fall under four categories, i.e., Denial of Service (DOS), Remote to User (R2L), User to Root (U2R) and Probing. So, the NSL-KDD dataset has five classes: DOS, R2L, U2R, Probe and Normal. The details of the attack categories are given below [26]:

Denial of Service (DOS): A DOS or Distributed Denial-of-Service (DDOS) attack occurs when an attacker makes memory resources or other network resources unavailable or too busy to respond to valid requests, for example Smurf, Ping of death, Neptune, etc.

Remote to User (R2L): R2L is an attack in which an attacker who is able to send packets to a machine over the network gains unauthorized local access as a user of that machine, e.g., Guess_password, Xlock, etc.
User to Root (U2R): This attack occurs when an attacker who has access to a normal user account tries to gain root (system-level) access, for example Perl, Xterm, etc.

Probing: Probing is an attack in which an unauthorized user scans a system to gain information about the network. This information can be used for future attacks or for bypassing the security controls, for example mscan, Satan, etc.

Preprocessing means refining the collected datasets. Preprocessing is very important because this phase offers options to select the features used in the implementation and to eliminate the rest, which reduces the computation time.

After preprocessing, the SVM technique is applied for training and testing the model. A cross-validation scheme, chosen before the model is trained, is used to examine the predictive accuracy of the model. In k-fold cross-validation, the original sample is divided into k subsamples, of which (k − 1) subsamples are used for training and the remaining subsample is used for testing the model. In this implementation, fivefold cross-validation is used. Finally, each record is decided to be either normal or malicious.

5 Empirical Evaluation and Results

5.1 Linear SVM

5.2 Quadratic SVM

11, 12, 13 and 14 show the ROC diagram using Quadratic SVM for DOS, normal, Probe, R2L and U2R, respectively. Figure 15 shows the overall Confusion Matrix for Quadratic SVM.

5.3 Fine Gaussian SVM

This SVM makes finely detailed distinctions between classes, using the Gaussian kernel with the kernel scale set to sqrt(P)/4, where P is the number of predictors. Fine Gaussian SVM gives 98.7% overall accuracy and 1.3% overall error. The model building time using Fine Gaussian SVM is 100 s. Figures 16, 17, 18, 19 and 20 show the ROC diagram using Fine Gaussian SVM for DOS, normal, Probe, R2L and U2R, respectively. Figure 21 shows the overall Confusion Matrix for this type of SVM.

5.4 Medium Gaussian SVM

This SVM makes fewer distinctions than Fine Gaussian SVM, using the Gaussian kernel with the kernel scale set to sqrt(P), where P is the number of predictors. Medium Gaussian SVM gives 98.5% overall accuracy and 1.5% overall error. The model building time using Medium Gaussian SVM is 50 s. Figures 22, 23, 24, 25 and 26 show the ROC diagram using Medium Gaussian SVM for DOS, normal, Probe, R2L and U2R, respectively. Figure 27 shows the overall Confusion Matrix for Medium Gaussian SVM.
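The accuracy and error pairs reported in this section are obtained under the fivefold cross-validation scheme described in the methodology. A minimal Python sketch of that procedure follows: the records are split into k folds, each fold is held out once for testing, the TP, FP, TN and FN counts are pooled, and the rates of Eqs. (7)–(9) are computed from them. Several parts are assumptions made only for illustration: the contiguous, unshuffled fold assignment, the one-feature threshold classifier `fit_threshold` (a stand-in for SVM training, not an SVM), and the made-up toy data. None of them reflects how the MATLAB simulations were actually run.

```python
def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_threshold(xs, labels):
    """Hypothetical stand-in for SVM training: midpoint between the two class means."""
    attack = [x for x, y in zip(xs, labels) if y == 1]
    normal = [x for x, y in zip(xs, labels) if y == 0]
    return (sum(attack) / len(attack) + sum(normal) / len(normal)) / 2.0


def cross_validate(xs, labels, k=5):
    """Pool TP, FP, TN, FN over k folds; every record is tested exactly once."""
    tp = fp = tn = fn = 0
    for test_idx in k_fold_indices(len(xs), k):
        held_out = set(test_idx)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(labels) if i not in held_out]
        threshold = fit_threshold(train_x, train_y)
        for i in test_idx:
            predicted = 1 if xs[i] > threshold else 0  # 1 = attack, 0 = normal
            if predicted == 1 and labels[i] == 1:
                tp += 1
            elif predicted == 1 and labels[i] == 0:
                fp += 1
            elif predicted == 0 and labels[i] == 0:
                tn += 1
            else:
                fn += 1
    return tp, fp, tn, fn


def detection_metrics(tp, fp, tn, fn):
    """True positive rate, false positive rate and detection accuracy, Eqs. (7)-(9)."""
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return tpr, fpr, accuracy


if __name__ == "__main__":
    # Toy one-feature data (made up): normal records near 1.0, attack records near 3.0.
    labels = [i % 2 for i in range(20)]
    xs = [(3.0 if y == 1 else 1.0) + 0.1 * (i % 3) for i, y in enumerate(labels)]
    counts = cross_validate(xs, labels, k=5)
    tpr, fpr, accuracy = detection_metrics(*counts)
    print("TP, FP, TN, FN:", counts)
    print("accuracy (%):", 100 * accuracy, "error (%):", 100 * (1 - accuracy))
```

The overall error is simply the complement of the overall accuracy, which matches the reported pairs such as 98.7% and 1.3%. With real data the records would normally be shuffled or stratified before the folds are formed, so that every training split contains all classes.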
its preferred class and the class with maximum votes wins. The number of classifiers increases superlinearly as k grows.

The simulated results have been analyzed based on the overall detection accuracy (in %), ROC and Confusion Matrix. ROC is also called the performance curve. It shows the performance of the selected SVM technique as the relationship between the true positive rate and the false positive rate. In the ROC plot, the x-axis shows the false positive rate and the y-axis shows the true positive rate, i.e., the detection rate [29]. In the ideal case, the ROC curve forms a 90° angle at the upper left corner. The Confusion Matrix graphically displays the performance of the selected SVM technique for each attack class; it shows the relationship between well-classified records and misclassified records. The terms used in the analysis stage are defined as follows:

True Positive (TP): Number of attack records correctly detected as attack class.
False Positive (FP): Number of normal records incorrectly detected as attack class.
True Negative (TN): Number of normal records correctly detected as normal class.
False Negative (FN): Number of attack records incorrectly detected as normal class.

$$\text{True positive rate (TPR)} = \frac{TP}{TP + FN} \tag{7}$$

$$\text{False positive rate (FPR)} = \frac{FP}{FP + TN} \tag{8}$$

$$\text{Detection accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \tag{9}$$

The overall detection accuracy and overall error for all SVM techniques are given in Table 2. Table 3 shows the time (in seconds) taken by every technique for model building. A comparison based on detection accuracy for each individual attack class is presented in Table 4.

Table 2 Comparison based on overall detection accuracy and overall error

SVM technique          Accuracy (%)   Error (%)
Linear SVM             96.1           3.9
Quadratic SVM          98.6           1.4
Fine Gaussian SVM      98.7           1.3
Medium Gaussian SVM    98.5           1.5

Table 3 Comparison based on model building time

SVM technique          Time in seconds
Linear SVM             74
Quadratic SVM          53
Fine Gaussian SVM      100
Medium Gaussian SVM    50

Table 4 Comparison based on detection accuracy for individual type of attack class

Type of attack class   Linear SVM   Quadratic SVM   Fine Gaussian SVM   Medium Gaussian SVM
DOS                    99.7398      99.7945         99.9644             99.8601
Normal                 98.6537      99.5751         99.8343             99.5752
Probe                  98.5939      99.4321         99.7743             99.6813
R2L                    98.2637      97.9137         99.3901             99.5728
U2R                    96.3245      91.8876         87.4841             89.4713

7 Conclusion

In this paper, a detailed analysis of intrusion detection based on support vector machine (SVM) techniques has been presented. Different SVM techniques have been applied to the NSL-KDD dataset, and MATLAB has been used for simulations. The results have been analyzed with the help of ROC and Confusion Matrix. The model building time for every SVM technique has been observed, and the detection accuracy for each individual attack class has been identified and analyzed. The results show that Linear SVM, Quadratic SVM, Fine Gaussian SVM and Medium Gaussian SVM give 96.1%, 98.6%, 98.7% and 98.5% overall detection accuracy, respectively, with overall errors of 3.9%, 1.4%, 1.3% and 1.5%, respectively. This analysis concludes that Fine Gaussian SVM provides the best accuracy and the least error for intrusion detection.

In future research, real-time datasets can be used to analyze these techniques, and SVM optimization techniques may be involved. SVM-based intrusion detection techniques can also be used for IoT in the smart world.

References

1. Tsai, C.F.; Hsu, Y.F.; Lin, C.Y.; Lin, W.Y.: Intrusion detection by machine learning: a review. Expert Syst. Appl. 36(10), 11994–12000 (2009). https://doi.org/10.1016/j.eswa.2009.05.029
2. Bhati, B.S.; Rai, C.S.: Intrusion detection systems and techniques: a review. Int. J. Crit. Comput.-Based Syst. 6(3), 173–190 (2016). https://doi.org/10.1504/IJCCBS.2016.079077
3. Lundin, E.; Jonsson, E.: Survey of intrusion detection research. Chalmers University of Technology, Gothenburg (2002)
4. Joachims, T.: Text categorization with support vector machines: learning with many relevant features. In: European Conference on Machine Learning, pp. 137–142. Springer, Berlin (1998). https://doi.org/10.1007/bfb0026683
5. www.kdnuggets.com. Accessed 07 Sept 2018
6. Huang, C.L.; Wang, C.J.: A GA-based feature selection and parameters optimization for support vector machines. Expert Syst. Appl. 31(2), 231–240 (2006). https://doi.org/10.1016/j.eswa.2005.09.024
7. Burges, C.J.: A tutorial on support vector machines for pattern recognition. Data Min. Knowl. Discov. 2(2), 121–167 (1998). https://doi.org/10.1023/A:1009715923555
8. Smola, A.J.; Ovari, Z.L.; Williamson, R.C.: Regularization with dot-product kernels. In: Advances in Neural Information Processing Systems, pp. 308–314 (2001)
9. https://nlp.stanford.edu/IR-book/html/htmledition/nonlinear-svms-1.html. Accessed 09 Aug 2018
10. Fischetti, M.: Fast training of support vector machines with Gaussian kernel. Discrete Optim. 22, 183–194 (2016). https://doi.org/10.1016/j.disopt.2015.03.002
11. Xue-qin, Z.; Chun-hua, G.; Jia-jun, L.: Intrusion detection system based on feature selection and support vector machine. In: 2006 First International Conference on Communications and Networking in China, pp. 1–5. IEEE (2006). https://doi.org/10.1109/chinacom.2006.344739
12. Peddabachigari, S.; Abraham, A.; Grosan, C.; Thomas, J.: Modeling intrusion detection system using hybrid intelligent systems. J. Netw. Comput. Appl. 30(1), 114–132 (2007). https://doi.org/10.1016/j.jnca.2005.06.003
13. Wang, K.; Stolfo, S.J.: One-class training for masquerade detection. In: Workshop on Data Mining for Computer Security, Melbourne, Florida, Nov 19, pp. 10–19 (2003)
14. Zhou, G.; Shrestha, A.: Efficient intrusion detection scheme based on SVM. J. Netw. 8(9), 2128–2134 (2013). https://doi.org/10.4304/jnw.8.9.2128-2134
15. Li, L.; Gao, Z.P.; Ding, W.Y.: Fuzzy multi-class support vector machine based on binary tree in network intrusion detection. In: 2010 International Conference on Electrical and Control Engineering, pp. 1043–1046. IEEE (2010). https://doi.org/10.1108/ics-04-2013-0031
16. Cuong, T.D.; Giang, N.L.: Intrusion detection under covariate shift using modified support vector machine and modified backpropagation. In: Proceedings of the Third Symposium on Information and Communication Technology, pp. 266–271. ACM (2012). https://doi.org/10.1145/2350716.2350756
17. Parwekar, P.; Satapathy, S.C.: Leveraging Bigdata Towards Enabling Analytics Based Intrusion Detection Systems in Wireless Sensor Networks. CSI Communications, 12 (2012)
18. Horng, S.J.; Su, M.Y.; Chen, Y.H.; Kao, T.W.; Chen, R.J.; Lai, J.L.; Perkasa, C.D.: A novel intrusion detection system based on hierarchical clustering and support vector machines. Expert Syst. Appl. 38(1), 306–313 (2011). https://doi.org/10.1016/j.eswa.2010.06.066
19. Azad, C.; Jha, V.K.: Decision tree and genetic algorithm based intrusion detection system. In: Proceeding of the Second International Conference on Microelectronics, Computing & Communication Systems (MCCS 2017), pp. 141–152. Springer, Singapore (2019)
20. Tiwari, A.; Ojha, S.K.: Design and analysis of intrusion detection system via neural network, svm, and neuro-fuzzy. In: Abraham, A., Dutta, P., Mandal, J., Bhattacharya, A., Dutta, S. (eds) Emerging Technologies in Data Mining and Information Security, Advances in Intelligent Systems and Computing, vol. 755, pp. 49–63. Springer, Singapore (2019)
21. Li, X.: Support vector machine based intrusion detection method combined with nonlinear dimensionality reduction algorithm. Sens. Transducers 159(11), 226 (2013)
22. Li, Y.; Xia, J.; Zhang, S.; Yan, J.; Ai, X.; Dai, K.: An efficient intrusion detection system based on support vector machines and gradually feature removal method. Expert Syst. Appl. 39(1), 424–430 (2012). https://doi.org/10.1016/j.eswa.2011.07.032
23. Parwekar, P.; Singhal, R.: Robot assisted emergency intrusion detection and avoidance with a wireless sensor network. In: Proceedings of the International Conference on Frontiers of Intelligent Computing: Theory and Applications (FICTA) 2013, pp. 417–422. Springer, Cham (2014)
24. Bhavsar, Y.B.; Waghmare, K.C.: Intrusion detection system using data mining technique: support vector machine. Int. J. Emerg. Technol. Adv. Eng. 3(3), 581–586 (2013). https://doi.org/10.17485/ijst/2017/v10i14/93690
25. http://nsl.cs.unb.ca/NSL-KDD/. Accessed 09 Jan 2018
26. Tavallaee, M.; Bagheri, E.; Lu, W.; Ghorbani, A.A.: A detailed analysis of the KDD CUP 99 data set. In: 2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications, pp. 1–6. IEEE (2009). https://doi.org/10.1109/cisda.2009.5356528
27. Hsu, C.W.; Lin, C.J.: A comparison of methods for multiclass support vector machines. IEEE Trans. Neural Netw. 13(2), 415–425 (2002). https://doi.org/10.1109/72.991427
28. Friedman, J.H.: Another approach to polychotomous classification. Technical Report, Statistics Department, Stanford University (1996)
29. Kumar, P.A.R.; Selvakumar, S.: Distributed denial of service attack detection using an ensemble of neural classifier. Comput. Commun. 34(11), 1328–1341 (2011). https://doi.org/10.1016/j.comcom.2011.01.012