
2010 17th International Conference on Telecommunications

Identifying Important Characteristics in the KDD99 Intrusion Detection Dataset by Feature Selection using a Hybrid Approach

Nelcileno Araújo (1), Ruy de Oliveira (2), Ailton Akira Shinoda (3), EdWilson Ferreira (4), Bharat Bhargava (5)

(1) Institute of Computing, Federal University of Mato Grosso, Cuiabá, MT, Brazil ([email protected])
(2), (4) Department of Informatics, Federal Institute of Mato Grosso, Cuiabá, MT, Brazil ([email protected], [email protected])
(3) Department of Electrical Engineering, State University Júlio de Mesquita Filho, Ilha Solteira, SP, Brazil ([email protected])
(5) Department of Computer Science, Purdue University, West Lafayette, IN, USA ([email protected])

Abstract: Intrusion detection datasets play a key role in fine-tuning Intrusion Detection Systems (IDSs). Using such datasets one can distinguish between regular and anomalous behavior of a given node in the network. Building such a dataset is not straightforward, though, as only the most significant features of the collected data for detecting the node's behavior should be considered. We propose in this paper a technique for selecting relevant features out of KDD99 using a hybrid approach toward an optimal subset of features. Unlike existing work that only detects attack or no-attack conditions, our approach efficiently identifies which sort of attack each register in the dataset refers to. The evaluation results show that the optimized subset of features can improve the performance of typical IDSs.

Keywords: KDD99, Feature Selection, Hybrid Approach, K-Means, Information Gain Ratio

I. INTRODUCTION

Over the past ten years, the number of security related incidents registered at CERT.br (Center for Studies, Response and Handling of Security Incidents in Brazil) has increased about 100-fold [1]. This demonstrates the inherent vulnerability of the Internet, which calls for permanent development of efficient security mechanisms. As a result, various security tools, such as firewalls, cryptography, and Intrusion Detection Systems (IDSs), have been developed, rendering computing systems more reliable.

In particular, IDSs have received great attention from researchers all over the globe because of their ability to keep track of the network behavior, so that abnormal behavior can be detected quickly. The detection can occur in two distinct ways. One technique uses previously known attack patterns to infer intrusions. This technique is normally called misuse detection. Another way of detection is called anomaly detection, in which there are no known attack patterns, but only regular patterns. Everything that is not regular is taken as anomalous and consequently may be linked to an intrusion [2].

Comparing the two IDS approaches, one can say that misuse detection provides accurate results in recognizing patterns, but it is limited to known attacks. This means new attacks that are not included in the signature database cannot be detected. On the other hand, the anomaly detection based approach provides good performance in detecting new forms of attacks, but gives high false positive rates (false alarms), due to the difficulty of characterizing a practical normal behavior pattern for the nodes in the network. In fact, regardless of the IDS approach in place, for the sake of reliability, it is necessary to choose appropriate detection metrics to either represent the attack pattern efficiently or define the regular behavior expected for the network.

In order to choose proper intrusion detection metrics, several training datasets for IDSs have been created. One of the most popular such datasets is the Knowledge Discovery and Data Mining KDD99 dataset [3], which was developed by the Massachusetts Institute of Technology (MIT) for the international competition on data mining in 1999. In this dataset each connection (TCP connection) is represented by 41 features, but experiments have shown that using all these features does not guarantee efficiency for attacks based on the packet contents [4].

With that in mind, this paper proposes optimizing the existing metrics in the KDD99 training dataset through a feature selection technique using a hybrid approach, which generates an optimal subset of features. Differently from existing work, our approach takes into account all the categories of connections in KDD99 (attack or no attack), i.e., Normal, DoS, Probe, U2R, R2L. It is also a purpose of this paper to check the impact of using such a dataset on the IDS accuracy.

The remainder of this paper is organized as follows. In Section 2, we describe the KDD99 intrusion detection dataset and discuss the procedures used to generate it. Section 3 addresses the selection of the most relevant features in KDD99 through a hybrid approach that combines the information gain ratio and the k-means classifier toward the optimized dataset. This section also shows comparative results on detection accuracy for ten distinct datasets generated from the so-called "10%KDD99" training dataset. In Section 4, we conclude the work and outline suggestions for future work.

II. KDD99 INTRUSION DETECTION DATASET

This dataset is composed of various training and test data for IDSs. It was developed from a project at MIT Lincoln Labs, in 1999, where comparative evaluations among several distinct methodologies for intrusion detection were conducted. Fig. 1 illustrates the simulated network topology used for KDD99. It is a fictitious military network with three target machines running various operating systems and services. Moreover, there are three additional machines for generating traffic from different sources. A sniffer captures the network flow in TCP dump format. The simulation ran for seven weeks.

Figure 1. Topology of the simulated network for KDD99

The logs from the sniffer were divided into five categories [2], [3], [4]:

Normal: connections that fit the expected profile in the military network.

Denial of Service (DoS): connections trying to prevent legitimate users from accessing the service in the target machine.

Scanning (Probe): connections scanning a target machine for information about potential vulnerabilities.

Remote to Local (R2L): connections in which the attacker attempts to obtain non-authorized access into a machine or network.

User to Root (U2R): connections in which a target machine is already invaded and the attacker attempts to gain access with superuser privileges.

The files generated during the data collection were put in a standard format that contains 41 features for each registered connection. A connection here refers to a sequence of TCP packets with a well defined time duration, transmitted over a well defined protocol between a source machine and a destination machine [3]. Each connection is labeled as either normal or under a specific sort of attack. Each connection register is about 100 bytes long.

The combination of the 41 features of each connection determines to which of the five connection categories mentioned above the audited connection belongs. We call this procedure categorization of the connections. Accordingly, to better understand the contribution of each feature within the dataset to this categorization, the features were gathered into four groups [2], [3], [4], as follows:

Basic features: identify the properties in the packet header, which represent critical metrics in a connection.

Content features: information extracted from the packets that is only useful for experts who are able to associate it with known forms of attacks. An example of such a metric is the number of non-authorized access attempts into a given machine.

Time based traffic features: features of a traffic profile computed over a time interval of two seconds. Crucial information related to some sorts of attacks can only be obtained if the time duration is taken into consideration. A good example here is the number of connections to a single machine within a time interval of two seconds.

Host based traffic features: in this case, the metrics that describe the traffic profile are calculated from historical data estimated from the last hundred used connections. A metric employed in this group is the number of connections to the same destination machine.

KDD99 is actually composed of three datasets. The largest one is called "Whole KDD", which contains about 4 million registers. This is the original dataset created out of the data collected by the sniffer.

Since the amount of data to be processed is very high, it is interesting to reduce the computational costs involved as much as possible. Thus, a subset containing only 10% of the training data, taken randomly from the original dataset, was created. This resulted in the "10% KDD" dataset used to train the IDS.

In addition to the "10% KDD" and "Whole KDD", there is a testing dataset known as "Corrected KDD". This dataset does not have the same distribution of probability of attacks as is the case in the other bases. This happens because the "Corrected KDD" includes 14 new types of attacks aiming at
checking the IDS performance against unknown forms of attacks. Note that in the complete dataset (Whole KDD) and in the training dataset (10% KDD) there are 22 types of attacks in total [4].

It is also important to mention that the KDD training dataset contains a large number of connections for the categories normal, probe and DoS. Together they represent approximately 99.76% of the whole dataset.

III. OPTIMIZING THE KDD99 INTRUSION DETECTION DATASET USING A HYBRID APPROACH FOR FEATURE SELECTION

In general, it is not a good idea to feed the IDS learning mechanisms with the originally collected dataset. It needs to be optimized, since there are features that are either irrelevant or redundant for the learning algorithm. Without a proper treatment of the dataset, the detector accuracy is degraded and the test and training procedures may get really slow [5]. Hence, it is important to determine an optimal set of features that accurately represents the characteristics of the traffic being evaluated. Experiments have shown that a proper set of features results in up to 50% of time reduction for the IDS test and training phases [6].

A. Feature Selection

Feature selection is crucial for designing intrusion detection models. In this process, only the most relevant features are extracted from the whole dataset. This prevents the irrelevant features from causing noise in the categorization of the connections.

Currently, there exist two main approaches to carry out feature selection: filter and wrapper. In the former, an independent metric, such as correlation or PCA [6], is used to compute the relevance of a set of features, resulting in an optimal subset that contains the important features classified in accordance with the measured values of the used metric. The latter uses machine learning algorithms to rate the importance of one or more features in order to build an optimal subset with the most representative features. Wrapper is more complex in terms of computing than the filter approach, but gives better results [6], [7], [8], [9].

These approaches have some drawbacks. For instance, feeding the classifier with random features can result in biased outcomes, and the search for the optimal set of features can result in thousands of combinations in the classifier, which leads to very high computational costs. For example, the KDD99 dataset encompasses 41 features; considering all possible combinations in the classifier to verify which set best contributes to the detection models, we would have hundreds of billions of feature combinations, which can render the use of the dataset unviable.

Different techniques have been employed to mitigate the feature selection problem. In [10], the authors used classification algorithms to reduce the set of features of the KDD99 dataset (originally with 41 parameters) to an optimal subset having only 6 features. They used the Support Vector Machines (SVM), Multivariate Adaptive Regression Splines (MARS) and Linear Genetic Programming (LGP) algorithms to associate a weight with each feature. The Sequential Backward Search technique was employed in [11] and [12] to identify the subset of relevant features. In their approach the whole dataset is initially used and after each iteration a feature is removed from the dataset, until the desirable precision for the classifier is reached. Another popular approach, known as the hybrid approach, combines both techniques, filter and wrapper. The work in [6] shows the efficiency of the hybrid approach with large datasets, in which the computational demand for finding the optimal subset of features is similar to that of the filter approach. In [5], the authors use the hybrid approach over a dataset obtained in an infrastructure wireless network based on the IEEE 802.11 model. They applied the Information Gain Ratio metric to classify the original set of features on the basis of the obtained grade, and a so-called k-means classifier to build an optimal subset of features that increases the detector accuracy and at the same time reduces the learning time.

B. Proposed Model

Our proposed scheme for feature selection is based on the hybrid approach published in [5]. Nevertheless, while the work in [5] evaluates the quality of the optimal subset of features considering only whether the connection is normal or under attack, our evaluation takes into account all the categories of connections in KDD99, i.e., Normal, DoS, Probe, U2R, R2L. Besides, the captured data in our evaluations were not collected in an infrastructure wireless network but in a wired military-like network. We also used two metrics to evaluate the detection capability of the IDS: the detection ratio over the whole dataset and the recognition accuracy ratio of each connection category.

The feature selection algorithm proposed here is shown in Fig. 2. Initially, the information gain ratio of each of the 41 features of KDD99 is computed, and the features are then ranked in accordance with their values. In the sequence, at each iteration the k-means classifier extracts the feature with the highest IGR from the dataset and assesses the detection rate of the optimal subset of features. Additionally, the accuracy level in detecting the right category for the connection in the optimal subset is verified. The selection process stops when either the classifier accuracy is above the adjusted threshold or the accuracy value is below the previously calculated value.

The IGR metric was used here mainly because of its good results in the filter approach, as well as its low computational cost [4], [5], [13]. This metric is computed as shown in (1) [14]:

IGR(D, A) = Gain(D, A) / SplitInformation(D, A)    (1)

where
D is the training data with N features, and
A is a set of features in the dataset.
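To make (1) concrete, the following Python sketch computes the information gain ratio of a single categorical feature against the connection categories, using the entropy, gain and split-information quantities defined in (2), (3) and (4) in the next subsection. The toy arrays and helper names are illustrative only; they are not part of the original experiments, which were carried out with WEKA [17].

    # Minimal sketch of (1): information gain ratio of one categorical feature.
    # The toy arrays below are illustrative, not taken from KDD99.
    import math
    from collections import Counter

    def entropy(labels):
        """Shannon entropy of a label sequence, as in (2)."""
        total = len(labels)
        return -sum((c / total) * math.log2(c / total)
                    for c in Counter(labels).values())

    def information_gain_ratio(feature_values, labels):
        """Gain(D, A) / SplitInformation(D, A), as in (1), (3) and (4)."""
        total = len(labels)
        gain = entropy(labels)
        split_info = 0.0
        for value, count in Counter(feature_values).items():
            subset = [lab for f, lab in zip(feature_values, labels) if f == value]
            weight = count / total
            gain -= weight * entropy(subset)          # Gain(D, A), eq. (3)
            split_info -= weight * math.log2(weight)  # SplitInformation, eq. (4)
        return gain / split_info if split_info > 0 else 0.0

    # Toy usage: a protocol_type-like feature against connection categories.
    feature = ["tcp", "tcp", "udp", "icmp", "icmp", "tcp"]
    category = ["normal", "dos", "normal", "dos", "dos", "normal"]
    print(information_gain_ratio(feature, category))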

Algorithm: Feature Selection based on IGR/K-means
Input:
  D          Training data with N features
  IGR        Information Gain Ratio
  C          k-means classifier
  AC         Current accuracy
  AP         Previous accuracy
  Threshold  Accuracy gain threshold
Output:
  Soptimum   Optimal subset of features
Begin
  // Filter approach
  For each feature f compute IGR(f)
  Rank the features in D based on IGR(f)
  // Wrapper approach
  Initialize Soptimum = EMPTY and AC = 0
  Repeat
    AP = AC
    f = getNext(D)
    Soptimum = Soptimum U {f}
    D = D - {f}
    AC = ACCURACY(C, Soptimum)
  Until (AC - AP) < Threshold or AC < AP
End

Figure 2. Feature selection algorithm based on IGR/k-means

The information gain ratio is a quantitative measure used to grade the relevance of the features based on the values such features take in the dataset [15]. Nonetheless, before computing the information gain ratio, it is necessary to check the noise (misclassification) inserted in the training set. This check is called entropy and is computed using (2):

Entropy(P) = - \sum_{i=1}^{n} p_i \log_2 p_i    (2)

where
p_i is the probability of a given feature (or attribute) value being in the sampled set of the dataset, and
n is the maximum value assigned to a feature.

After computing the entropy of D, the gain formula in (3) is used to determine the best feature to be used as root:

Gain(D, A) = Entropy(D) - \sum_{v \in Attributes(A)} (|D_v| / |D|) Entropy(D_v)    (3)

where
D_v is the amount of samples of the dataset that contain repetitions of the evaluated feature, and
D is the total number of samples of the training dataset.

The entropy gives us information about the probability of a given feature value being in the dataset (p_i). The split information represents the potential information generated by dividing the base D into m subsets, as defined in (4):

SplitInformation(D, A) = - \sum_{v \in Attributes(A)} (|D_v| / |D|) \log_2 (|D_v| / |D|)    (4)

The K-Means algorithm [16] is one of the oldest and most important algorithms available in the literature for performing grouping. Although it was published over forty years ago, it is still largely used these days. The main reasons for this popularity include its simplicity and high performance. K-Means complexity is O(nK), where n is the cardinality of the original dataset and K is the number of groups [9]. Besides, K-Means is easy to implement and has been evaluated extensively in recent years, which has leveraged the development of various novelties in the way it works. Because of these characteristics, and noting that K-Means performs much better than similar tools, we have adopted it in our scheme.

C. Experimental Evaluations

In order to evaluate the efficiency of the hybrid approach, using IGR/k-means, in optimizing KDD99, we used for the experiments the parameter setup shown in Table I. The subset "10% KDD99" was chosen because it was created exactly to be used in training IDS learning modules [3]. This subset is composed of approximately 490,000 samples including all kinds of connection categories defined in KDD99 (Normal, DoS, Probe, U2R, R2L).

The feature selection was carried out with the data mining tool called WEKA [17]. This tool performed efficiently in related work such as [5], [6], [10], [11], [12], [13], and so we adopted it here as well.

TABLE I. PARAMETER SETUP FOR THE EXPERIMENTS

Components        | Configuration
Dataset           | 10% KDD99
Programming tools | WEKA, MS-Excel 2007
Computer          | Notebook, Intel Celeron M 440 processor, 1.86 GHz, 2 GB RAM, 250 GB hard disk
Operating system  | Microsoft Windows XP Professional (SP2)

Regarding the evaluated scenarios, two distinct scenarios were considered. The first one was used to optimize the KDD dataset and the second one to check the optimization's effects on the performance of an IDS, as follows:

Scenario 1: the IGR was applied to the dataset "10%KDD99" to measure the relevance of the 41 features, resulting in a sorted classification. Then, the k-means classifier is used to compute the optimal subset of features.

Scenario 2: the dataset "10%KDD99" was divided into ten subsets, containing about 49,000 connection registers each. Subsequently, each subset was processed by an IDS based on the decision tree algorithm, using the optimal subset of features.
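For illustration, the wrapper step of Fig. 2, as exercised in Scenario 1, could be sketched in Python as follows, assuming the features have already been ranked by decreasing IGR (the filter step) and the connection categories are integer-encoded (0 to 4 for Normal, DoS, Probe, U2R, R2L). The use of scikit-learn's KMeans with majority-label scoring, as well as all helper names, is our own illustrative choice and not the exact setup used with WEKA in the experiments.

    # Sketch of the Fig. 2 selection loop, under the assumptions stated above.
    import numpy as np
    from sklearn.cluster import KMeans

    def kmeans_accuracy(X_subset, y, n_clusters):
        """Cluster the feature subset and score each cluster by its majority category."""
        y = np.asarray(y)
        clusters = KMeans(n_clusters=n_clusters, n_init=10,
                          random_state=0).fit_predict(X_subset)
        correct = 0
        for c in np.unique(clusters):
            members = y[clusters == c]
            # every member of the cluster is predicted as the cluster's majority category
            correct += np.bincount(members).max()
        return correct / len(y)

    def select_features(X, y, ranked, threshold=0.01, n_clusters=5):
        """Wrapper step of the hybrid approach: add features by IGR rank until the
        accuracy gain falls below the threshold or the accuracy drops."""
        selected, ac = [], 0.0
        for f in ranked:
            ap = ac
            selected.append(f)
            ac = kmeans_accuracy(X[:, selected], y, n_clusters)
            if len(selected) > 1 and ((ac - ap) < threshold or ac < ap):
                selected.pop()  # the last feature did not help; discard it and stop
                break
        return selected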

Figure 3. Descending classification of the IGR for the features in the dataset "10%KDD99".

In both scenarios, the validation of the results was conducted with the so-called 10-fold cross-validation technique [18]. The idea here was to obtain low error rates and find out the intrusion detection rate.

1) Results for Scenario 1

Fig. 3 shows the classification of the 41 features of the dataset "10%KDD99" sorted in descending order of the information gain ratio. Most of the features have an IGR under the average of the dataset (IGR average = 0.22). In fact, only 18 features are above the average. This shows that the original database has its data concentrated in a small group of values. Features that result in a convergence of connection categories within a small group of values contribute little to describing a node's behavior. This indicates that the original dataset may contain irrelevant data for the IDS and so needs to be optimized.

After obtaining the ranked set of features through the IGR, the optimal subset of features was determined by the k-means classifier. At each iteration of the classifier, the most relevant feature, in accordance with the IGR, was added to the optimal subset of features. The classifier keeps track of the accuracy rate of the connection categories in the new subset, and once the accuracy either reaches 90% or falls below the value calculated in the previous iteration, the classification process ends and the optimal subset of features is determined.

In Fig. 4(a) one can notice that the best results are obtained when the optimal subset contains the 14 most important features of the evaluated dataset. With fewer features than that, the U2R class has accuracy close to zero, which means that despite the high detection rate depicted in Fig. 4(b), the algorithm does not provide enough accuracy in recognizing U2R connections. Hence, whenever the evaluated dataset contains connection categories with a large percentage of the samples, the detection rate is not a good criterion for evaluating the quality of such a subset. For this evaluated dataset, the DoS category accounts for 80% of the whole sampled connections. The optimal subset of features for "10%KDD99" comprises the following features: dst host diff srv rate, logged in, dst host srv diff host rate, diff srv rate, destination bytes, root shell, is guest login, urgent, service, dst host count, srv diff host rate, source bytes and protocol type.

2) Results for Scenario 2

The purpose of the second scenario is to provide us with good insights into the effects of an optimized dataset on the performance of an IDS. The dataset "10%KDD99" was divided into 10 subsets, as depicted in Fig. 5. Each subset has its own distribution of categories of connections, except for subsets 5 and 6. It is possible to distinguish a pattern in most subsets, since there are far more DoS connection registers than registers of the other connections. This is interesting for evaluating our previous statement that subsets with a strong prevalence of a single connection category might render the detection rate an unsuitable criterion.

Subsequently, we used the features inside each generated optimal subset of features to feed an IDS based on a decision tree algorithm. The outcome is shown in Fig. 6 through three parameters: detection rate, accuracy rate and true positive rate. The false positive parameter was ignored because its values are too close to zero, which does not contribute to the evaluation of the quality of the optimal subset of features.

(a) Accuracy rate    (b) Detection rate

Figure 4. Performance of alternate subsets (optimal subsets) of "10%KDD99" by two stop criteria.
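As a concrete illustration of this Scenario 2 style of evaluation, the sketch below trains a decision tree on a hypothetical matrix X_sel restricted to the selected features and derives, via 10-fold cross-validation [18] and a confusion matrix, both an overall rate and the per-category accuracy that exposes weak classes such as U2R. Function and variable names are illustrative and not taken from the paper's tooling.

    # Sketch of a Scenario 2 style evaluation with a decision-tree classifier.
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.model_selection import cross_val_predict
    from sklearn.metrics import confusion_matrix

    def evaluate_subset(X_sel, y, class_names):
        # 10-fold (stratified, by scikit-learn default) cross-validated predictions
        pred = cross_val_predict(DecisionTreeClassifier(random_state=0),
                                 X_sel, y, cv=10)
        cm = confusion_matrix(y, pred)
        overall = np.trace(cm) / cm.sum()           # overall rate over all connections
        per_class = cm.diagonal() / cm.sum(axis=1)  # accuracy (recall) per category
        for name, acc in zip(class_names, per_class):
            print(f"{name}: {acc:.3f}")
        print(f"overall: {overall:.3f}")
        return overall, per_class

    # e.g. evaluate_subset(X_sel, y, ["Normal", "DoS", "Probe", "U2R", "R2L"])

Because a dominant category such as DoS inflates the overall rate, the per-category figures are what reveal the weaknesses that the paper attributes to minority classes.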

Figure 5. Composition of datasets generated from the 10%KDD99.

From the results we learn more about the quality of the optimal subset of features in terms of connection category detection. As shown in Fig. 6, the detection rate for all subsets surpasses 99%. To ensure reliability in our evaluations, we also included the accuracy rate. Fig. 6 shows that all connection categories yielded high accuracy (over 90%) except the U2R category, which in the best case reached 60% accuracy. This is a result of the low relevance of this category in the sample space of each subset, which corroborates our finding that the detection rate parameter alone is not a reliable indicator of the quality of the optimal subset of features.

Finally, the high values of the true positive rate strengthen the viability of our proposal, as they indicate that the IDS is capable of recognizing the connection categories efficiently. It is important to note that in some cases the rate is below 80%, which occurred again due to the low impact of the evaluated category on the sample space of the dataset. As an example, subset 7 has a total of 11 connections of the category Normal, but none of them is recognized by the IDS.

IV. CONCLUSIONS AND OUTLOOK

We have proposed the use of a hybrid approach to select the best features from the KDD99 training dataset toward a reduced dataset that improves IDS efficiency. The hybrid approach combines the information gain ratio (IGR) and the k-means classifier. The former is responsible for classifying the features on the basis of the IGR measure. The latter generates an optimal subset of features by evaluating the features' accuracy from the ranked data provided by the IGR.

The evaluation results suggest that the detection rate on its own does not provide reliability in detecting intrusions. The main reason lies in the differences in the weight that each category of connection in the training dataset has on the proposed mechanism. Categories with low weight face detection problems even though the detection rate remains high, which occurs due to the "giant" categories of connections in place.

To address this problem, we propose using the detection rate and the accuracy rate jointly. By using the dataset features in a fairer way, without favoring any category, the accuracy rate corrects the distortions caused by the "giant" categories of connections.

Since the computational costs for large datasets are non-negligible, and the results here showed that the optimized dataset provided outcomes similar to the original dataset (with 41 features), we can say that our proposal is indeed worthwhile. By using it, an IDS will be trained much faster than it would be with the original dataset.

The following tasks are left for future work: applying the feature selection technique used here to datasets collected from other network environments, such as sensor, mesh, and WiMAX wireless networks; using alternate programming tools, such as C and FORTRAN, for conducting the feature selection, as WEKA [17], which was used here, is based on Java and so demanded too much of both the memory and the processing capabilities of the machine used in the experiments; and, finally, using metaheuristics (genetic algorithms, tabu search, and simulated annealing) to perform feature selection through the computation of the optimal subset of features.

Figure 6. Results obtained by the decision tree based IDS on the 10 datasets generated from the "10% KDD99".

ACKNOWLEDGMENT

This material is based on a research project funded by the Foundation for Research Support of Mato Grosso (FAPEMAT) under the supervision of the Network and Security Research Group (GPRS). GPRS is managed by the Federal Institute of Mato Grosso (IFMT) in conjunction with the Federal University of Mato Grosso (UFMT), State University Júlio de Mesquita Filho (UNESP) and Federal University of Uberlândia (UFU). The authors acknowledge the facilities and equipment provided by IFMT for the development of this work.

REFERENCES

[1] CERT.br, Computer Emergency Response Team Brazil. http://www.cert.br/stats/incidentes/, Last access: August 2009.
[2] P. Souza, "Study about anomaly based intrusion detection systems: an approach using neural networks," M.Sc. Thesis, Salvador University, Salvador, 2008.
[3] R. Lippmann, J. W. Haines, D. J. Fried, J. Korba and K. Das, "The 1999 DARPA off-line intrusion detection evaluation," Computer Networks, vol. 34, no. 4, pp. 579-595, 2000.
[4] H. G. Kayacik, A. N. Zincir-Heywood and M. I. Heywood, "Selecting features for intrusion detection: a feature relevance analysis on KDD 99," in Proceedings of the Third Annual Conference on Privacy, Security and Trust, 2005.
[5] M. Guennoun, A. Lbekkouri and K. El-Khatib, "Optimizing the feature set of wireless intrusion detection systems," International Journal of Computer Science and Network Security, vol. 8, no. 10, pp. 127-131, 2008.
[6] Y. Chen, Y. Li, X. Cheng and L. Guo, "Survey and taxonomy of feature selection algorithms in intrusion detection system," Lecture Notes in Computer Science, vol. 4318, pp. 153-167, 2006.
[7] H. Liu and H. Motoda, Feature Selection for Knowledge Discovery and Data Mining, Kluwer Academic, 1998.
[8] R. A. M. Horta and F. J. dos S. Alves, "Data mining techniques in feature selection for prediction of insolvency: implementation and evaluation using a recent Brazilian dataset," in Proceedings of the XXXII Meeting of ANPAD, 2008, pp. 1-15.
[9] J. de A. Soares, "Preprocessing data in data mining: a comparative study on imputation," D.Sc. Thesis, Federal University of Rio de Janeiro, Rio de Janeiro, 2007.
[10] H. Sung and S. Mukkamala, "The feature selection and intrusion detection problems," in Proceedings of the 9th Asian Computing Science Conference, Lecture Notes in Computer Science, 2004, vol. 3321, pp. 468-482.
[11] H. Sung and S. Mukkamala, "Identifying important features for intrusion detection using support vector machines and neural networks," in Proceedings of the 2003 Symposium on Applications and the Internet, 2003, pp. 209-217.
[12] G. Stein, B. Chen, A. S. Wu and K. A. Hua, "Decision tree classifier for network intrusion detection with GA-based feature selection," in Proceedings of the 43rd Annual Southeast Regional Conference, 2005, vol. 2, pp. 136-141.
[13] Bsila, S. Gombault and A. Belghith, "Improving traffic transformation to detect novel attacks," in Proceedings of the 4th International Conference: Sciences of Electronic, Technologies of Information and Telecommunications, 2007.
[14] O. Maimon and L. Rokach, Decomposition Methodology for Knowledge Discovery and Data Mining: Theory and Applications, World Scientific Publishing Co, 2005.
[15] T. M. Mitchell, Machine Learning, McGraw Hill, 1997.
[16] J. MacQueen, "Some methods for classification and analysis of multivariate observations," in Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, 1967, pp. 281-297.
[17] R. R. Bouckaert et al., WEKA Manual for Version 3-7-0. http://www.cs.waikato.ac.nz/ml/weka/, Last access: August 2009.
[18] Y. Bengio and Y. Grandvalet, "No unbiased estimator of the variance of k-fold cross-validation," Journal of Machine Learning Research, vol. 5, pp. 1089-1105, 2004.
