
Conference Paper · May 2014
DOI: 10.1109/ICRAIE.2014.6909184


IEEE International Conference on Recent Advances and Innovations in Engineering (ICRAIE-2014), May 09-11, 2014, Jaipur, India

Comparison of Classification Techniques for Intrusion Detection Dataset Using WEKA

Tanya Garg (M.Tech Student)
Centre for Computer Science & Technology
Central University of Punjab, Bathinda, India
[email protected]

Surinder Singh Khurana (Assistant Professor)
Centre for Computer Science & Technology
Central University of Punjab, Bathinda, India
[email protected]

Abstract--As network based applications are growing rapidly, network security mechanisms require more attention to improve speed and precision. The ever evolving new intrusion types pose a serious threat to network security. Although numerous network security tools have been developed, the fast growth of intrusive activities is still a serious issue. Intrusion detection systems (IDSs) are used to detect intrusive activities on the network. Machine learning and classification algorithms help to design "Intrusion Detection Models" which can classify the network traffic into intrusive or normal traffic. In this paper we present the comparative performance of NSL-KDD compatible classification algorithms. These classifiers have been evaluated in the WEKA (Waikato Environment for Knowledge Analysis) environment using 41 attributes. Around 94,000 instances from the complete KDD dataset have been included in the training data set and over 48,000 instances have been included in the testing data set. Garrett's Ranking Technique has been applied to rank the different classifiers according to their performance. The Rotation Forest classification approach outperformed the rest.

Keywords--Machine Learning; Classification Techniques; NSL-KDD Dataset; Data Mining; WEKA; Network Intrusion Detection Dataset; Garret's Ranking Technique.

I. INTRODUCTION

Due to the huge volume of data on networks, the data content is vulnerable to various kinds of attacks; the number of intrusions is growing day by day. Intrusion detection is very important to prevent intruders from breaking into or misusing a system. Many methods have been developed to defend against network attacks and computer viruses. Among these methods, NID (Network Intrusion Detection) has been considered the most promising way to protect against complex and dynamic intrusion behaviors [1]. Intrusion Detection Systems classify data into various categories, for example normal and intrusive. Various classification algorithms have been proposed to design an effective Intrusion Detection Model. The performance of the classifier is an important factor affecting the performance of the Intrusion Detection Model; hence the selection of an accurate classifier helps to improve the performance of the intrusion detection system. There are several performance metrics to evaluate the performance of classifiers. In this work, NSL-KDD compatible classification algorithms have been evaluated using the WEKA tool. The performance of the classifiers has been measured by considering Accuracy, ROC value, Kappa, Training time, Mean absolute error, FPR and Recall value. Ranks have also been assigned to these algorithms by applying Garret's ranking technique [9].

In this paper, initially, the WEKA tool and various classification algorithms are discussed in sections II and III respectively. The chosen dataset is introduced in section IV. In section V, the parameters considered to evaluate the performance of the classifiers are discussed. Results are reported in Section VI and conclusions in Section VII.

II. WEKA (WAIKATO ENVIRONMENT FOR KNOWLEDGE ANALYSIS)

WEKA is a Data Mining and Machine Learning tool which was first implemented at the University of Waikato, New Zealand in 1997 [1]. It is a collection of Machine Learning and Data Mining algorithms. The software is written in Java and provides a GUI to interact with data files. It contains 49 data pre-processing tools, 76 classification algorithms, 15 attribute evaluators and ten search algorithms for feature selection [2]. It contains three algorithms to find association rules. It has three Graphical User Interfaces: "The Explorer", "The Experimenter" and "The Knowledge Flow". WEKA supports data stored in the ARFF (Attribute Relation File Format) file format. It also includes tools for visualization, and has a set of panels that can be used to perform specific tasks. WEKA provides the capability to develop and include new Machine Learning algorithms, and the algorithms can be applied directly to a dataset.

III. CLASSIFICATION ALGORITHMS

Classification algorithms, also known as classifiers, are used to classify the network traffic as normal or an intrusion. There are basically eight categories of classifiers and each category contains different machine learning algorithms. In this section these categories are briefly introduced.
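As a quick illustration of the ARFF format that WEKA consumes, the snippet below writes a minimal, hypothetical ARFF file; the relation and attribute names are ours for illustration only, not the NSL-KDD schema.

```python
# Sketch of a minimal ARFF file as accepted by WEKA. The relation and
# attribute names below are illustrative, not the NSL-KDD attributes.
arff_text = """@relation traffic

@attribute duration numeric
@attribute protocol {tcp, udp, icmp}
@attribute class {normal, anomaly}

@data
0,tcp,normal
12,udp,anomaly
"""

# Write the file so it can be opened from the WEKA Explorer.
with open("traffic.arff", "w") as f:
    f.write(arff_text)
```

A file like this can then be loaded in the WEKA Explorer or passed to WEKA's command-line tools.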

[978-1-4799-4040-0/14/$31.00 ©2014 IEEE]



Bayes Classifier: Also known as Belief Networks, these belong to the family of probabilistic Graphical Models (GMs) [3]. These graphical models are used to represent knowledge about uncertain domains. Random variables are denoted by nodes in the graph, and probabilistic dependencies are assigned as weights to the edges connecting the corresponding random variable nodes. These classifiers are based on the idea of predicting the class from the values of the features. This category has 13 classifiers, of which 3 (Bayes Net, Naive Bayes and Naive Bayes Updateable) are compatible with the chosen dataset.

Function Classifier: Functional classifiers use the concepts of neural networks and regression [1]. They map input data to output. There are eighteen classifiers under this category, of which only RBF Network and SMO are compatible with our dataset. RBF classifiers can model any nonlinear function easily and do not use raw input data. The processing of RBF Networks is like that of neural networks, i.e. iterative in nature. The problem with RBF is the tendency to overtrain the model [4].

Lazy Classifier: To construct the classification model, lazy classifiers demand to store the complete training data, i.e. such classifiers do not support the inclusion of new samples in the training set while building the model. These classifiers are simple and effective, and are mainly used for classification on data streams [5]. There are five classifiers under this category, of which two are compatible with our dataset: IB1 and IBK.

Meta Classifier: These classifiers find the optimal set of attributes to train the base classifier with these parameters [6]. The trained base classifier is then used for further predictions. There are 26 classifiers under this category, of which 21 are compatible with our dataset: AdaBoost M1, LogitBoost, Attribute Selection Classifier, Bagging, Dagging, Classification via Clustering, Classification via Regression, END, Multiclass, MultiScheme, Grading, Vote, Ordinal Class Classifier, Rotation Forest, Random SubSpace, CV Parameter Selection, Raced Incremental LogitBoost, Random Committee, Stacking, Stacking C.

MI Classifier: MI stands for Multi-Instance Classifiers. This category consists of 12 classifiers, of which none is compatible with our dataset. MI classifiers are a variant of supervised learning: an example contains multiple instances but only one class can be observed [7]. These classifiers were originally made available through a separate software package.

Misc Classifier: There are three classifiers under this category, of which two are compatible with our dataset: Hyperpipes and VFI.

Rules Classifier: In this category of classifier, association rules are used for correct prediction of the class among all the attributes; those correct predictions are called coverage and are expressed in terms of percentage of accuracy. They may predict more than one conclusion. Rules are mutually exclusive and are learned one at a time [1]. There are 11 classifiers under this category, of which 8 are compatible with our dataset: Conjunctive Rule, Decision Table, DTNB, JRip, OneR, ZeroR, PART, Ridor.

Trees: These are popular classification techniques in which a flow-chart-like tree structure is produced, in which each node denotes a test on an attribute value and each branch represents an outcome of the test. They are also known as Decision Trees; the tree leaves represent the predicted classes. They design a model that is both predictive and descriptive. There are 16 classifiers under this category, of which 10 are compatible with our chosen dataset: Decision Stump, J48, J48 Graft, LAD Tree, NB Tree, REP Tree, Random Forest, Simple Cart, Random Tree, User Classifier.

IV. DATA SET DESCRIPTION

The dataset used for experimental evaluation consists of randomly selected instances from the NSL-KDD Dataset. It classifies network traffic into five categories. The number of instances of each class included in the training and testing datasets is given in Table I.

TABLE I. DESCRIPTION OF DATASET

Class Type | Instances in Training Dataset | Instances in Testing Dataset
Normal     | 48522 | 1478
DoS        | 41699 | 37490
Probe      | 3608  | 8301
U2R        | 21    | 28
R2L        | 50    | 1075

V. PERFORMANCE METRICS

The performance metrics used to evaluate the classification techniques are:

Confusion Matrix: It contains information about actual and predicted classifications done by a classification system.

TABLE II. CONFUSION MATRIX

              | Predicted Normal | Predicted Attack
Actual Normal | TP               | FN
Actual Attack | FP               | TN

False positive (FP): the number of detected attacks which are actually normal.

False negative (FN): wrong predictions, i.e. instances detected as normal which are actually attacks.

True positive (TP): instances that are correctly predicted as normal.

True negative (TN): instances that are correctly classified, i.e. detected, as attack.

Accuracy: the percentage of correct predictions. On the basis of the confusion matrix it is calculated using the formula:

Accuracy = (TP + TN) / n

where n is the total number of instances.

Mean Absolute Error: the mean of the overall error made by the classification algorithm. The lower the error, the better the classifier.

TPR: the True Positive Rate is the same as accuracy, so we have not considered this metric separately.

FPR: the False Positive Rate is calculated using the formula:

FPR = FP / (TN + FP)

Recall: the proportion of instances belonging to the positive class that are correctly predicted as positive:

Recall = TP / (TP + FN)

Precision: a measure which estimates the probability that a positive prediction is correct:

Precision = TP / (TP + FP)

Training time: the time taken by the classifier to build the model on the dataset; it is usually measured in seconds.

Kappa: its value ranges from 0 to 1, where 0 means total disagreement and 1 means full agreement. It checks the reliability of the classifying algorithm on the dataset.

ROC (Receiver Operating Characteristic): used to draw the curve between TPR and FPR; the area under the curve (AUC) gives the ROC value. The larger the area under the curve, the higher the ROC value.

Fig. 1. Training time results for the top five classifiers (bar chart not reproduced).

Fig. 2. False Positive Rate for the top five classifiers: Rotation Forest, Random Forest, NB Tree, SMO, Decision Table (bar chart not reproduced).
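The metric definitions above can be collected into a small Python helper. This is a sketch we added for illustration (not code from the paper); it follows the binary normal/attack convention of Table II, and Cohen's kappa is computed from the chance-corrected agreement.

```python
def metrics(tp, fn, fp, tn):
    """Section V metrics computed from the Table II confusion counts
    (illustrative helper, not code from the paper)."""
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n            # fraction of correct predictions
    fpr = fp / (fp + tn)                # FPR = FP / (TN + FP)
    recall = tp / (tp + fn)             # Recall = TP / (TP + FN)
    precision = tp / (tp + fp)          # Precision = TP / (TP + FP)
    # Cohen's kappa: observed agreement corrected for chance agreement
    p_e = ((tp + fn) / n) * ((tp + fp) / n) + ((fp + tn) / n) * ((fn + tn) / n)
    kappa = (accuracy - p_e) / (1 - p_e)
    return {"accuracy": accuracy, "fpr": fpr, "recall": recall,
            "precision": precision, "kappa": kappa}

# Hypothetical counts: 40 normal kept, 10 missed, 5 false alarms, 45 attacks caught
m = metrics(40, 10, 5, 45)  # accuracy 0.85, fpr 0.1, recall 0.8, kappa 0.7
```

Note that accuracy here is a fraction; the paper reports it as a percentage.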

VI. DISCUSSIONS AND RESULTS

We have performed experiments to evaluate the performance of all the compatible classifiers on our chosen NSL-KDD Dataset. These classifiers have been evaluated in the WEKA (Waikato Environment for Knowledge Analysis) environment using 41 attributes. Around 94,000 instances from the complete KDD dataset have been included in the training data set and over 48,000 instances have been included in the testing data set. Garrett's Ranking Technique has been applied to rank the different classifiers according to their performance. Figures 1 to 3 summarize the performance of the top 5 classifiers according to training time, FPR and accuracy.

Fig. 3. Accuracy of the top five classifiers (bar chart not reproduced).
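Garrett's ranking technique, as cited in [9], first converts each rank into a percent position, 100 * (R - 0.5) / N, and then maps that position to a score using Garrett's published conversion table; the scores are averaged across metrics to obtain the final ranking. The sketch below illustrates the procedure; note that we approximate the printed conversion table with a scaled normal quantile, which is our own assumption, not part of the original technique.

```python
from statistics import NormalDist, mean

def percent_position(rank, n):
    """Garrett's percent position for the item ranked `rank` out of `n`."""
    return 100.0 * (rank - 0.5) / n

def garrett_score(rank, n):
    # Garrett's published table maps percent positions to scores via the
    # normal curve; here it is approximated with the normal quantile
    # (an assumption -- the original technique uses Garrett's printed table).
    p = percent_position(rank, n) / 100.0
    z = NormalDist().inv_cdf(1.0 - p)  # low percent position -> high score
    return 50.0 + 20.0 * z             # scaled to roughly the 0-100 range

# Hypothetical ranks of three classifiers under two metrics (illustrative only)
ranks = {"A": [1, 2], "B": [2, 1], "C": [3, 3]}
n = 3
avg = {clf: mean(garrett_score(r, n) for r in rs) for clf, rs in ranks.items()}
final = sorted(avg, key=avg.get, reverse=True)  # best classifier first
```

Assigning different weights to the metrics, as suggested in the conclusion, would amount to replacing the plain mean with a weighted mean.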

TABLE III. RANKING OF CLASSIFIERS

Name of Classifier            | ROC Area | FPR   | Accuracy | Kappa  | Mean Absolute Error | Recall | Precision | Training Time | Rank
Rotation Forest               | 0.998 | 0.001 | 96.4   | 0.9053 | 0.0173 | 0.964 | 0.983  | 342.81  | 1
Random Tree                   | 0.980 | 0.002 | 96.14  | 0.8993 | 0.0154 | 0.961 | 0.979  | 0.57    | 2
Random Committee              | 0.996 | 0.002 | 96.11  | 0.8982 | 0.0181 | 0.961 | 0.982  | 6.53    | 3
Random Forest                 | 0.966 | 0.002 | 96.12  | 0.8932 | 0.0203 | 0.961 | 0.979  | 6.29    | 4
IBK                           | 0.993 | 0.002 | 96.08  | 0.8973 | 0.157  | 0.961 | 0.977  | 0.04    | 5
Random SubSpace               | 0.999 | 0.002 | 96.07  | 0.8967 | 0.0227 | 0.961 | 0.971  | 18.71   | 6
IB1                           | 0.979 | 0.002 | 96.08  | 0.8973 | 0.157  | 0.961 | 0.977  | 0.07    | 7
PART                          | 0.976 | 0.001 | 95.41  | 0.8818 | 0.0188 | 0.954 | 0.978  | 20.16   | 8
JRip                          | 0.994 | 0.002 | 95.24  | 0.8778 | 0.0193 | 0.952 | 0.978  | 69.25   | 9
NB Tree                       | 0.997 | 0.003 | 95.26  | 0.8774 | 0.018  | 0.953 | 0.956  | 195.06  | 10
J48                           | 0.975 | 0.002 | 94.81  | 0.8672 | 0.0212 | 0.948 | 0.972  | 12.17   | 11
VFI                           | 0.992 | 0.006 | 95.15  | 0.8765 | 0.2496 | 0.952 | 0.962  | 0.31    | 12
Simple Cart Classifier        | 0.977 | 0.002 | 94.76  | 0.8661 | 0.0218 | 0.948 | 0.972  | 146.45  | 13
Ridor                         | 0.972 | 0.002 | 94.7   | 0.8646 | 0.0212 | 0.947 | 0.977  | 116.29  | 14
REP Tree                      | 0.993 | 0.003 | 94     | 0.8493 | 0.0249 | 0.94  | 0.957  | 2.89    | 15
END                           | 0.991 | 0.002 | 94.73  | 0.8653 | 0.1415 | 0.947 | 0.977  | 161.77  | 16
J48 Graft                     | 0.974 | 0.002 | 94.34  | 0.8562 | 0.023  | 0.943 | 0.974  | 19.12   | 17
Classification via Regression | 0.997 | 0.004 | 94.07  | 0.85   | 0.0278 | 0.941 | 0.973  | 94.95   | 18
Ordinal Class Classifier      | 0.973 | 0.003 | 94.46  | 0.8589 | 0.0233 | 0.945 | 0.976  | 43.16   | 19
Bagging                       | 0.997 | 0.003 | 93.86  | 0.8453 | 0.0235 | 0.939 | 0.952  | 24.28   | 20
Logistic                      | 0.981 | 0.002 | 94.48  | 0.8616 | 0.0233 | 0.945 | 0.966  | 356.05  | 21
Dagging                       | 0.996 | 0.004 | 91.92  | 0.8032 | 0.2447 | 0.919 | 0.971  | 42.6    | 22
LogitBoost                    | 0.997 | 0.004 | 90.359 | 0.7706 | 0.0633 | 0.904 | 0.948  | 56.85   | 23
Raced Incremental LogitBoost  | 0.985 | 0.006 | 91.61  | 0.7962 | 0.0353 | 0.916 | 0.943  | 10.47   | 24
SMO                           | 0.995 | 0.004 | 93.01  | 0.8266 | 0.243  | 0.93  | 0.966  | 222.87  | 25
Attribute Selection Classifier| 0.977 | 0.012 | 84.36  | 0.6613 | 0.0693 | 0.844 | 0.94   | 6.14    | 26
RBF Network                   | 0.988 | 0.011 | 89.7   | 0.7557 | 0.1451 | 0.897 | 0.929  | 104.18  | 27
Naive Bayes Updateable        | 0.959 | 0.017 | 78.624 | 0.5725 | 0.0815 | 0.786 | 0.922  | 0.72    | 28
Naive Bayes                   | 0.959 | 0.017 | 78.624 | 0.5725 | 0.0815 | 0.786 | 0.922  | 0.75    | 29
Decision Table                | 0.977 | 0.005 | 85.45  | 0.6838 | 0.0881 | 0.855 | 0.971  | 59.86   | 30
LAD Tree                      | 0.989 | 0.016 | 87.45  | 0.7112 | 0.0561 | 0.874 | 0.907  | 302.28  | 31
Multiclass                    | 0.982 | 0.006 | 86.39  | 0.6991 | 0.2873 | 0.864 | 0.949  | 220.5   | 31
Hyperpipes                    | 0.912 | 0.016 | 47.73  | 0.1797 | 0.3132 | 0.477 | 0.819  | 0.06    | 32
OneR                          | 0.793 | 0.019 | 60.4   | 0.3827 | 0.1584 | 0.604 | 0.924  | 0.66    | 33
Vote                          | 0.5   | 0.031 | 3.055  | 0      | 0.3513 | 0.031 | 0.001  | 0.03    | 34
ZeroR                         | 0.5   | 0.031 | 3.055  | 0      | 0.3513 | 0.031 | 0.001  | 0.03    | 34
CV Parameter Selection        | 0.5   | 0.031 | 3.055  | 0      | 0.3513 | 0.031 | 0.001  | 0.04    | 35
DTNB                          | 0.973 | 0.009 | 73.84  | 0.5214 | 0.1143 | 0.738 | 0.948  | 1567.11 | 36
MultiScheme                   | 0.5   | 0.031 | 3.05   | 0      | 0.3513 | 0.031 | 0.001  | 0.13    | 37
Decision Stump                | 0.713 | 0.078 | 20.21  | 0.1162 | 0.3089 | 0.202 | 0.064  | 1.08    | 38
Stacking                      | 0.5   | 0.031 | 3.055  | 0      | 0.3513 | 0.031 | 0.001  | 0.32    | 39
Grading                       | 0.5   | 0.031 | 3.055  | 0      | 0.3878 | 0.031 | 0.001  | 0.52    | 40
AdaBoost M1                   | 0.951 | 0.078 | 20.214 | 0.1162 | 0.2883 | 0.202 | 0.0604 | 10.56   | 41
Classification via Clustering | 0.567 | 0.067 | 20.12  | 0.1235 | 0.3195 | 0.201 | 0.073  | 9.03    | 42
Stacking C                    | 0.5   | 0.031 | 3.055  | 0      | 0.3513 | 0.031 | 0.001  | 1.82    | 43
Conjunctive Rule              | 0.713 | 0.078 | 20.21  | 0.1163 | 0.3085 | 0.202 | 0.064  | 27.26   | 44
User Classifier               | 0.5   | 0.031 | 3.055  | 0      | 0.3513 | 0.031 | 0.001  | 15.32   | 45

VII. CONCLUSION

In this work, the performance of the NSL-KDD dataset compatible classification algorithms has been evaluated. According to their performance, ranks have been assigned to these classification algorithms by applying Garret's ranking technique. Equal importance has been given to all the performance metrics considered for evaluation. The Rotation Forest classification approach outperformed the rest. One can choose a classifier based upon this ranking, or modify the ranking by assigning different weightage to the performance metrics according to the requirement.

REFERENCES

[1] R. Dash, "Selection of the Best Classifier from Different Datasets Using WEKA," IJERT, Vol. 2, Issue 3, March 2013.
[2] H. Nguyen and D. Choi, "Application of Data Mining to Network Intrusion Detection: Classifier Selection Model," Springer-Verlag Berlin Heidelberg, 2008.
[3] F. Ruggeri, F. Faltin and R. Kennet, "Bayesian Networks," Encyclopedia of Statistics in Quality & Reliability, Wiley & Sons, 2007.
[4] M. Panda and M. Patra, "A Comparative Study of Data Mining Algorithms for Network Intrusion Detection," First International Conference on Emerging Trends in Engineering and Technology, IEEE, 2008.
[5] M. Panda and M. Patra, "Ensembling Rule Based Classifiers for Detecting Network Intrusions," IEEE Conference on Advances in Recent Technologies in Communication and Computing, 2009.
[6] B. Neethu, "Classification of Intrusion Detection Dataset using Machine Learning Approaches," IJECSE, 2013.
[7] Multi-instance classifiers, http://weka.wikispaces.com/Multi-instance+classification.
[8] C. Elkan, "Evaluating Classifiers," 2012.
[9] Garret's Ranking Technique, http://shodhganga.inflibnet.ac.in/bitstream/10603/3455/7/07_chapter%203.pdf.
[10] S. Garcia and F. Herrera, "An Extension on 'Statistical Comparisons of Classifiers over Multiple Data Sets' for all Pairwise Comparisons," Journal of Machine Learning Research 9, 2008.
[11] J. Demsar, "Statistical Comparisons of Classifiers over Multiple Data Sets," Journal of Machine Learning Research 7, 2006.
[12] M. Othman and T. Yau, "Comparison of Different Classification Techniques using WEKA for Breast Cancer," 2012.
[13] KDD Cup 99 intrusion detection data set. Online. Available: http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html.
[14] L. Kuncheva, "Combining Pattern Classifiers: Methods and Algorithms [book review]," IEEE Transactions on Neural Networks 18.3 (2007): 964-964.
[15] P. Sangkatsanee, N. Wattanapongsakorn and C. Charnsripinyo, "Practical real-time intrusion detection using machine learning approaches," Computer Communications (2011): 2227-2235.
[16] S. Yang, K. Chi Chang, H. Wei and C. Lin, "Feature weighting and selection for a real-time network intrusion detection system based on GA with KNN," Intelligence and Security Informatics (2008): 195-204.
[17] S. Mukherjee and N. Sharma, "Intrusion Detection using Naive Bayes Classifier with Feature Reduction," Procedia Technology 4 (2012): 119-128.
[18] Sajja, A. A., Knowledge Based Systems. Jones and Bartlett, 2012.
[19] Rich, E., Artificial Intelligence (3rd Ed.). Tata McGraw Hill, 2013.
[20] Pang-Ning, T., Introduction to Data Mining. Pearson, 2013.
[21] Alpaydin, E., Introduction to Machine Learning (2nd Ed.). PHI, 2010.
[22] Flach, P., Machine Learning: The Art and Science of Algorithms that Make Sense of Data. Cambridge, 2012.
[23] Jiawei Han, M. K., Data Mining: Concepts and Techniques (3rd Ed.). Morgan Kaufmann, 2013.
[24] Ian H. Witten, F. E., Practical Machine Learning Tools and Techniques (2nd Ed.), 2012.
[25] V. Labatut and H. Cherifi, "Evaluation of Performance Measures for Classifiers Comparison."
