2006MONAM
is the lowest one (52 instances over 494,021). Therefore, it is difficult to learn this category using neural networks. Our first goal does not consist in outperforming previous work done over the KDD 99 intrusion detection contest. However, we want to understand why all these algorithms fail to detect the last two attack classes, namely U2R and R2L. We also note that the two attacks U2R and R2L are often detected as normal traffic (86.84% for U2R and 73.20% for R2L) in almost all the techniques that are used for this purpose. There are many R2L and U2R instances that are new (see Table II) in the test data sets since their corresponding attack type is not present in the learning data set.

In order to detect these new attacks, we improve the classification process of the neural networks as follows. A threshold θ is defined. If the value of the highest output neuron is below this threshold, the corresponding connection is considered momentarily anomalous; however, a diagnosis should be performed for further investigation. The diagnosis is not a goal here. Figure 1 presents this algorithm. Figure 2 shows the variation of the percentage of successful prediction (PSP) versus the variation of the a priori fixed threshold θ.

[Fig. 2. PSP variation according to the considered threshold value. Axes: PSP% versus threshold value (0.4 to 0.9).]

The results shown in Figure 2 are obtained over the same neural network using the best parameters. We mention that classes are not predicted in their corresponding actual class when considering the threshold equal to 0.70 for R2L and 0.90 for the Probing attack class. This means that the instances of the test corresponding to these two classes are not well learned or are not close to their corresponding instances in the learning data set.

[Fig. 3. Different classes PSP variation according to the considered threshold value.]

[Figure panels: Normal(New%), Probing(New%), DoS(New%), U2R(New%), R2L(New%), each versus threshold value (0.4 to 0.9).]
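The threshold rule described for the neural network (a connection is flagged when the highest output neuron stays below θ) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `classify_with_threshold`, the order of the output neurons in `CLASSES`, and the label "anomalous" are all assumptions made for the example.

```python
from typing import Sequence

# Assumed order of the output neurons; the paper does not specify it.
CLASSES = ["normal", "probing", "dos", "u2r", "r2l"]


def classify_with_threshold(outputs: Sequence[float], theta: float,
                            classes: Sequence[str] = CLASSES) -> str:
    """Return the winning class, or "anomalous" when the winner is below theta."""
    if len(outputs) != len(classes):
        raise ValueError("one output value is expected per class")
    # Index of the highest output neuron.
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    # Below the threshold, the connection is only flagged; diagnosis is left
    # to a later step, as in the text.
    if outputs[best] < theta:
        return "anomalous"
    return classes[best]


if __name__ == "__main__":
    print(classify_with_threshold([0.05, 0.02, 0.91, 0.01, 0.01], theta=0.7))
    print(classify_with_threshold([0.45, 0.30, 0.15, 0.05, 0.05], theta=0.7))
```

Raising θ flags more connections as anomalous and lowering it accepts more of the network's weak decisions, which is the trade-off the threshold sweep from 0.4 to 0.9 explores.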
CLASSIFICATION USING THE POST PRUNED RULES. STANDARD C4.5 RULES ALGORITHM.
Using this principle, a default class from the learning data set is assigned to any observed instance, which may be a normal connection or a known or unknown attack. This classification is useful only if it is exclusive. Since we are interested in detecting novel attacks, this classification would not be able to detect new attacks, which normally are not covered by any rule from the tree built during the learning step.

To overcome this problem, instances that do not have a corresponding class in the training data set are assigned to a default class denoted new class. Therefore, if a new instance does not match any of the rules generated by the decision tree, it is classified in the new class instead of being assigned to a default class. Let us call this algorithm enhanced C4.5.

To illustrate the effectiveness of this new classification, we conduct, in Section IV-C, our experiments on the KDD 99 database since it contains many new attacks in the test data set that are not present in the training data set, as shown in Table II. We also applied this technique to real traffic in our laboratory network. This traffic contains some new attacks that were not available when DARPA98 was built, such as the slammer worm and the different DDoS attacks. These experiments showed the effectiveness of our algorithm. We do not present them here because of space limitation (for more details, see [2], Chapter 5).

This proposal may be generalized to any problem similar to the KDD 99 contest that seeks to find new instances in the test data set, where some classes should be detected as new ones and not as one of the categories listed in the training data set. The fact that new attacks are not considered is one of the reasons why the different methods applied to the KDD 99 contest cannot predict any new attack.

C. Experimental Analysis of KDD 99

We first present the different experiments and results obtained when using the different rules generated from the standard C4.5 algorithm. Applying this algorithm, a default [...]

From Table V, the two classes R2L and U2R are badly predicted. On the other hand, many probing and DoS instances are misclassified within the normal category. Most misclassified instances are predicted as normal. This is due to the supervised C4.5rules algorithm, which assigns a default class among known classes as explained in Section IV-B. We note that the class that has the highest number of uncovered instances according to the different pruned rules in the learning data set is the normal class, corresponding to the normal traffic.

Hence, if a new instance is presented that is different (see Definition 4.1 below) from all other known normal or abnormal instances in the learning step, it is automatically classified as the default class normal.

Definition 4.1: An instance A is different from all other instances present in the training data set, according to the different generated rules, if none of the rules matches this instance.

The confusion matrix obtained when we use the enhanced C4.5rules algorithm, which considers the default class as a new instance, is presented in Table VI.

Predicted           %Normal  %Probing   %DoS   %U2R   %R2L   %New
Actual
Normal (60,593)       99.43      0.40   0.12   0.01   0.00   0.04
Probing (4,166)        8.19     72.73   2.45   0.00   6.58  10.06
DoS (229,853)          2.26      0.06  97.14   0.00   0.18   0.36
U2R (228)             21.93      4.39   0.44   7.02   5.26  60.96
R2L (16,189)          79.41     14.85   0.00   0.70   2.85   2.20
PSP = (92.30 + 0.57)%, CPT = 0.2228
TABLE VI
CONFUSION MATRIX WHEN USING THE GENERATED RULES FROM THE ENHANCED C4.5 ALGORITHM.

By using the enhanced C4.5 algorithm, the detection rate of the U2R class is increased by 60.96% (corresponding to the httptunnel attack), which decreases the false negative rate of this class from 82.89% (189/228) to 21.93% (50/228). The detection rate of the Probing class is also enhanced by 10.06%
corresponding to 413 instances which are not classified as normal traffic but as a new class. We note that the different ratios presented in Table VI are the same as those in Table V except for the normal column, where the corresponding ratios have decreased from Table V to VI. This is expected since, in the first experiment, the normal class is the default class, whereas in the second experiment all the instances that are classified using the default class are classified in the new class.

We should mention that the highest ratio for the U2R class has never exceeded 14% according to the different results available in the literature. Using our approach, this attack class is detected as abnormal traffic with a detection rate of 67.98%. The false positive rate is increased by a small ratio corresponding to 24 instances (0.04%). However, the false negative rate of the R2L class remains stable.

We also performed two different tests to check the coherence of the learning and test databases of KDD 99.

In the first test, we use the default training data set of KDD 99 as the training data set, and in the second test we use the test data set as the training set. In each test, we examine the percentage of successful prediction (PSP) using the learning data set of each test as a test set. The objective of this analysis is to help us discover whether the two data sets (learning and test data sets) are incoherent. The different prediction ratios of the different data sets may therefore help us to find out whether the enhanced C4.5 algorithm we proposed is inefficient or whether the different KDD 99 data sets present some anomalies such as incoherence.

Definition 4.2: A database is said to be coherent if all the training instances characterized by the same attributes’ values belong to the same class. It is said to be incoherent if there are at least two instances having the same attributes’ values but different classes.

Table VII presents the confusion matrix obtained from testing the enhanced C4.5 algorithm over the training data set as both a learning and a testing data set.

Predicted           %Normal  %Probing   %DoS   %U2R   %R2L   %New
Actual
Normal (97,278)       99.94      0.01   0.00   0.00   0.00   0.05
Probing (4,107)        0.17     99.78   0.00   0.00   0.00   0.05
DoS (391,458)          0.00      0.00  99.99   0.00   0.00   0.01
U2R (52)               1.92      1.92   0.00  90.39   0.00   5.77
R2L (1,126)            0.62      0.00   0.00   0.09  98.93   0.36
PSP = 99.99%
TABLE VII
CONFUSION MATRIX OBTAINED USING THE ENHANCED C4.5 ALGORITHM ON THE INITIAL KDD 99 LEARNING DATABASE.

We notice that the different classes are predicted with high rates when the learning database is used to construct the tree and to generate the different rules. The successful prediction ratio is PSP = 99.99%.

In the field of supervised machine learning techniques, a method is said to be powerful if it learns and predicts the different instances of the training set with a low detection error and then generalizes its knowledge to predict the class of new instances. Unfortunately, the C4.5 induction algorithm has efficiently learned the different instances of the training set, according to Table VII, but could not, for the moment, classify new instances into their appropriate category, according to the bad results reported in Table V.

We also examined in detail the classification of the new instances belonging to the R2L class presented in Table II, namely {named, sendmail, snmpgetattack, snmpguess, worm, xlock, xsnoop}. Table VIII presents the confusion matrix corresponding to these new R2L attacks in the test data set.

Predicted               %Normal  %Probing   %DoS   %U2R   %R2L   %New
Actual
named (17)                70.59      0.00   0.00   0.00   0.00  29.41
sendmail (17)            100.00      0.00   0.00   0.00   0.00   0.00
snmpgetattack (7,741)    100.00      0.00   0.00   0.00   0.00   0.00
snmpguess (2,406)         99.88      0.04   0.00   0.00   0.00   0.08
worm (2)                 100.00      0.00   0.00   0.00   0.00   0.00
xlock (9)                100.00      0.00   0.00   0.00   0.00   0.00
xsnoop (4)                50.00      0.00   0.00  25.00  25.00   0.00
PSP ≈ 0.00% (PSP ≈ 0.00%)
TABLE VIII
CONFUSION MATRIX RELATIVE TO NEW R2L ATTACKS USING THE ENHANCED C4.5 ALGORITHM.

From Table VIII, only one instance of type xsnoop is classified properly as an R2L attack, another is classified in the U2R class, and one instance of type snmpguess is classified as a probing attack. These results are common to the two algorithms, standard C4.5 and enhanced C4.5. In addition, with the enhanced algorithm only two instances of type snmpguess and five of type named are classified as new attacks.

All the remaining instances concerning the new R2L attacks are predicted as normal connections, i.e. 10,186 (resp. 10,193) using the enhanced C4.5 algorithm (resp. the standard C4.5 algorithm).

The false negative rate of the new R2L attacks present in the test data set is about 99.90% (resp. 99.97%) for the enhanced C4.5 algorithm (resp. the standard C4.5 algorithm).

These results show that these new R2L connections are not distinct from the normal connections issued after the transformation done by MADAM/ID.

In the second test, we invert the two databases. Using the standard and the enhanced C4.5 algorithms, we obtained the confusion matrix presented in Table IX.

Predicted           %Normal  %Probing   %DoS   %U2R   %R2L   %New
Actual
Normal (60,593)       98.34      0.02   0.03   0.01   1.50   0.11
Probing (4,166)        0.19     99.35   0.07   0.00   0.00   0.38
DoS (229,853)          0.01      0.00  99.99   0.00   0.00   0.00
U2R (228)              2.19      0.00   0.00  96.93   0.00   0.88
R2L (16,189)          36.40      0.02   0.01   0.05  63.33   0.19
PSP = 97.70%
TABLE IX
CONFUSION MATRIX RELATIVE TO FIVE CLASSES USING THE RULES GENERATED BY THE ENHANCED C4.5 ALGORITHM OVER THE LEARNING DATABASE OF THE SECOND TEST.
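Definition 4.2 can be checked mechanically by grouping records on their attribute values and looking for groups that carry more than one class label. The sketch below is only an illustration of that definition, not the procedure used in the paper. The names `find_incoherent` and `is_coherent` are hypothetical, and the sample records are shortened to six attributes instead of the 41 used by KDD 99.

```python
from collections import defaultdict
from typing import Dict, Iterable, Sequence, Set, Tuple

# A record is (attribute values, class label).
Record = Tuple[Sequence[str], str]


def find_incoherent(records: Iterable[Record]) -> Dict[Tuple[str, ...], Set[str]]:
    """Map each attribute vector that carries several labels to those labels."""
    labels_by_attrs: Dict[Tuple[str, ...], Set[str]] = defaultdict(set)
    for attrs, label in records:
        labels_by_attrs[tuple(attrs)].add(label)
    # Keep only the attribute vectors seen with at least two different classes.
    return {a: ls for a, ls in labels_by_attrs.items() if len(ls) > 1}


def is_coherent(records: Iterable[Record]) -> bool:
    """A database is coherent when no attribute vector has two classes."""
    return not find_incoherent(records)


if __name__ == "__main__":
    # Illustrative records, truncated to six attributes for brevity.
    sample = [
        (("0", "udp", "snmp", "SF", "105", "146"), "snmpgetattack"),
        (("0", "udp", "snmp", "SF", "105", "146"), "normal"),
        (("0", "tcp", "http", "SF", "200", "4000"), "normal"),
    ]
    print(is_coherent(sample))
    print(find_incoherent(sample))
```

Any classifier trained on an incoherent database necessarily mislabels at least one record of each conflicting group, so a PSP below 100% on the training set itself can be caused by the data rather than by the learner.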
0,udp,snmp,SF,105,146,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0.00,0.00,0.00,0.00,1.00,0.00,0.00,255,254,1.00,0.01,0.00,0.00,0.00,0.00,0.00,0.00,snmpgetattack.
0,udp,snmp,SF,105,146,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0.00,0.00,0.00,0.00,1.00,0.00,0.00,255,254,1.00,0.01,0.00,0.00,0.00,0.00,0.00,0.00,normal.
TABLE X
SNMPGETATTACK ATTACK AND NORMAL CONNECTION SIMILARITY.

Although the percentage of successful prediction obtained from the confusion matrix in Table IX is PSP = 97.70%, it is considered very low since it consists in classifying the known labeled instances of the learning data set. This rate is considered very low in the machine learning domain because the classifier could not learn instances whose classes are known a priori. This means that the C4.5 algorithm failed to learn instances with their appropriate labels. In particular, the R2L class is highly misclassified: the classifier has learned only 63.33% of all the R2L labeled instances.

Most misclassified R2L instances are predicted as normal connections. This result justifies our observation stated in the first test, i.e. after transformation, the new R2L attacks are not distinct from the normal connections.

Since there are similarities between many attack connections and many normal connections, the question one has to ask is why different attacks have the same attributes as those of the normal connections. Is the corresponding tcpdump traffic of the different attacks similar to that of normal connections, or is the transformation done over these data sets incorrect?

All instances of snmpgetattack are predicted as normal (within the R2L class in Table VIII). Indeed, the snmpgetattack traffic is recognized as normal because the attacker logs in as if he were a non-malicious user, since he has guessed the password. Table X shows that the connections corresponding to the snmpgetattack are the same as those of the normal traffic. However, the snmpguess category should be recognized as a new attack or as a dictionary attack. Unfortunately, none of the 41 attributes tests the SNMP community password in the SNMP request, unlike the attributes that verify whether a root password or a guest password is used. This is considered only in the case of services such as telnet and rlogin. The connections of the snmpguess category are the same as those of the normal traffic after transformation using the MADAM/ID programs [10]. Hence, some interesting information, with which we might have distinguished the traffic generated by the snmpguess attack from the normal traffic, is lost after transformation. We set necessary conditions that should be satisfied by a rich transformation function to prevent these similarities (for more details, see [2]).

V. CONCLUSION

In this paper, we investigated two different techniques for anomaly intrusion detection, namely neural networks and decision trees. These two techniques fail to detect new attacks that are not present in the training data set. We improve them for anomaly intrusion detection and test them over the KDD 99 data sets and over real network traffic in real time. While the neural networks are very interesting for generalization and very poor for new attack detection, the decision trees have proven their efficiency in both generalization and new attack detection. The results obtained with these two techniques outperform the winning entry of the KDD 99 intrusion detection contest. Another interesting point is the introduction of the new class to which new instances should be assigned for anomaly intrusion detection using supervised machine learning techniques. Since the different MADAM/ID programs [10] are not available and present many shortcomings, we have written the different programs that transform tcpdump traffic into connection records. The objective of our contribution in this paper is twofold. The first is to extend the notion of anomaly intrusion detection by considering both normal and known intrusions during the learning step. The second is the necessity to improve machine learning methods by adding a new class into which novel instances should be classified, since they should not be classified as any of the known classes present in the learning data set. As future work, we are investigating the use of this technique with explicit or semi-explicit alert correlation tools. Since these tools do not deal with unknown attacks, we are currently investigating their extension to handle the new attacks reported by the new anomaly detection, in order to integrate them in the ongoing correlation attack scenarios.

ACKNOWLEDGMENT

This work was funded by the RNRT OSCAR and ACI DADDi projects.

REFERENCES

[1] J. P. Anderson. Computer Security Threat Monitoring and Surveillance. Technical report, James P. Anderson Co., Fort Washington, Pennsylvania, 1980.
[2] Y. Bouzida. Principal Component Analysis for Intrusion Detection and Supervised Learning for New Attack Detection. PhD Thesis, March 2006.
[3] L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone. Classification and Regression Trees. 1984.
[4] J. Cannady. Artificial Neural Networks for Misuse Detection. In Proceedings of the 1998 National Information Systems Security Conference (NISSC’98), Arlington, VA, USA, October 5-8 1998.
[5] H. Debar, M. Becker, and D. Siboni. A neural network component for an intrusion detection system. In Proceedings of the 1992 IEEE Symposium On Research in Computer Security and Privacy, Oakland, CA, May 1992.
[6] D. Denning. An Intrusion Detection Model. IEEE Transactions on Software Engineering, 13(2):222–232, 1987.
[7] C. Elkan. Results of the KDD’99 Classifier Learning. ACM SIGKDD, 1:63–64, 2000.
[8] E. B. Hunt. Concept Learning: An Information Processing Problem. Wiley, 1962.
[9] KDD 99 Task. Available at: http://kdd.ics.uci.edu/databases/kddcup99/task.html, 1999.
[10] W. Lee. A Data Mining Framework for Constructing Features and Models for Intrusion Detection Systems. PhD Thesis, June 1999.
[11] T. M. Mitchell. Machine Learning. McGraw Hill, 1997.
[12] J. R. Quinlan. Induction of decision trees. Machine Learning, 1:1–106, 1986.
[13] J. R. Quinlan. C4.5: Programs for machine learning. Morgan Kaufmann Publishers, 1993.
[14] D. E. Rumelhart, G. E. Hinton, and R. J. Williams. Learning representations by back-propagating errors. Nature, 323:533–536, 1986.