ARTICLE IN PRESS

Journal of Network and Computer Applications 30 (2007) 414–428
www.elsevier.com/locate/jnca

Intrusion detection using a fuzzy genetics-based learning algorithm

M. Saniee Abadeh a,*, J. Habibi a, C. Lucas b
a Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
b Department of Electrical Engineering, University of Tehran, Tehran, Iran
Received 21 September 2004; received in revised form 15 May 2005; accepted 27 May 2005

Abstract

Fuzzy systems have demonstrated their ability to solve different kinds of problems in
various application domains. Currently, there is increasing interest in augmenting fuzzy
systems with learning and adaptation capabilities. Two of the most successful approaches to
hybridizing fuzzy systems with learning and adaptation methods have been made in the realm of
soft computing: neural fuzzy systems and genetic fuzzy systems hybridize the approximate
reasoning method of fuzzy systems with the learning capabilities of neural networks and
evolutionary algorithms, respectively. The objective of this paper is to describe a fuzzy
genetics-based learning algorithm and discuss its use in detecting intrusion in a computer network.
Experiments were performed with the DARPA data sets [KDD-cup data set.
http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html], which contain information on computer
networks during normal behaviour and intrusive behaviour. This paper presents some results and
reports the performance of the generated fuzzy rules in detecting intrusion in a computer network.
© 2005 Elsevier Ltd. All rights reserved.

Keywords: Intrusion detection; Fuzzy logic; Genetic algorithm; Rule learning

* Corresponding author. Tel.: +98 9133117532; fax: +98 216019246.


E-mail addresses: [email protected] (M.S. Abadeh), [email protected] (J. Habibi),
[email protected] (C. Lucas).

1084-8045/$ - see front matter © 2005 Elsevier Ltd. All rights reserved.
doi:10.1016/j.jnca.2005.05.002
M.S. Abadeh et al. / Journal of Network and Computer Applications 30 (2007) 414–428 415

1. Introduction

The number of intrusions into computer systems is growing. The reason
is that new automated hacking tools appear every day, and these
tools, along with information on various system vulnerabilities, are easily available
on the web. The problem of intrusion detection has been studied extensively
in computer security (Heady et al.; Amoroso, 1999; Allen et al., 1999;
Axelsson, 2000), and has received a lot of attention in machine learning and
data mining (Sundar et al., 1998; Crosbie, 1995; Lee et al., 1998). Basically,
there are two models of intrusion detection (Axelsson, 2000):

Anomaly detection: This model first builds a normal profile that contains metrics
derived from the system operation. While the system is monitored, the current
observation is compared with the normal profile in order to detect changes
in the patterns of utilization or behaviour of the system.

Signature or misuse detection: This technique relies on patterns of known intrusions
to match and identify intrusions. In this case, the intrusion detection problem is a
classification problem.
The technique which we have used to detect intrusion in a computer
network is based on fuzzy genetic learning. Fuzzy systems based on fuzzy
if–then rules have been successfully used in many application areas (Sugeno, 1985;
Lee, 1990). Fuzzy if–then rules were traditionally obtained from human
experts. Recently, various methods have been suggested for automatically
generating and adjusting fuzzy if–then rules without the aid of human
experts (Wang and Mendel, 1992; Ishibuchi et al., 1992; Abe and Lan, 1995;
Mitra and Pal, 1994). Genetic algorithms (Holland, 1975; Goldberg, 1989)
have been used as rule generation and optimization tools in the design of
fuzzy rule-based systems (Ishibuchi et al., 1999; Herrera and Verdegay, 1995;
Carse et al., 1996; Valenzuela-Rendon, 1991; Ishibuchi et al., 1995; Ishibuchi
and Nakashima, 1999). Those GA-based studies on the design of fuzzy
rule-based systems are usually referred to as fuzzy genetics-based machine
learning methods (fuzzy GBML methods), each of which can be classified
into the Pittsburgh or the Michigan approach, as with non-fuzzy GBML methods.
Many fuzzy GBML methods (Ishibuchi et al., 1999; Herrera and Verdegay, 1995;
Carse et al., 1996) are categorized as the Pittsburgh approach (Smith, 1980),
where a set of fuzzy if–then rules is coded as an individual. Some studies
(Valenzuela-Rendon, 1991; Ishibuchi et al., 1995; Ishibuchi and Nakashima, 1999)
are categorized as the Michigan approach (i.e., classifier systems; Holland, 1975;
Goldberg, 1989; Booker et al., 1989), where a single fuzzy if–then rule is coded as an
individual. In this paper we have used the Michigan approach (Fig. 1) to detect
intrusion in a computer network.
This paper is organized as follows: First we discuss intrusion detection and the
data set which we have used to test the presented learning algorithm. In the next
section we propose the fuzzy genetics-based learning algorithm. The following
section will discuss the experimental results which we have obtained. In the last
section of the paper we derive some conclusions.

[Figure: block diagram of a classifier system. A rule generation mechanism and a credit apportionment system act on a rule-based system (rule base, inference engine, facts, input interface and output interface), which receives perceptions and payoff from the environment and returns actions.]

Fig. 1. Learning with the Michigan approach (Cordon et al., 2004).

2. Related work

Detecting unauthorized use, misuse and attacks on information systems is defined
as intrusion detection (Denning, 1987; Kumar and Spafford, 1994). The most well-
known method of detecting intrusions is to use audit data generated by operating
systems and by networks. Since almost all activities are logged on a system, a
manual inspection of these logs may allow intrusions to be detected.
It is important to analyze the audit data even after an attack has occurred, in order
to determine the extent of the damage; this analysis also helps in tracing the attack
back and in recording the attack patterns for future prevention of such attacks.
An intrusion detection system (IDS) can be used to analyze audit data for such insights.
This makes an IDS a valuable real-time detection and prevention tool as well as a
forensic analysis tool.
Soft computing techniques are being widely used by the IDS community due to
their generalization capabilities that help in detecting known and unknown
intrusions or the attacks that have no previously described patterns. Earlier studies
have utilized a rule-based approach for intrusion detection, but had a difficulty in
detecting new attacks or attacks that had no previously described patterns
(Anderson et al., 1995; Ilgun, 1993; Lunt et al., 1992; Mukkamala et al., 2002).
Lately, the emphasis is being shifted to learning by examples and data mining
paradigms. Neural networks have been extensively used to detect both misuse and
anomalous patterns (Cannady, 1998; Debar and Dorizzi, 1992; Debar et al., 1992;
Mukkamala and Sung, 2003; Riedmiller and Braun, 1993). Recently, kernel-based
methods, SVMs and their variants are being used to detect intrusions. Many
researchers used data mining techniques to identify key patterns that help in
detecting intrusions (Jianxiong and Bridges, 2000; Moller, 1993; Steinberg et al.,
1999).
Distributed agent technology is being suggested by a few researchers to overcome
the inherent limitations of the client–server paradigm and to detect intrusions in real
time (Crosbie and Spafford, 1995; Dasgupta, 1999; Helmer et al., 2003; Porras and
Neumann, 1997).

3. Intrusion dataset

In the 1998 DARPA intrusion detection evaluation programme (KDD-cup data set),
an environment was set up to acquire raw TCP/IP dump data for a
network by simulating a typical US Air Force LAN. The LAN was operated like a
real environment, but was subjected to several attacks. For each TCP/IP connection,
41 quantitative and qualitative features were extracted. Of this database, a
subset of 494,021 records was used, of which 20% represent normal patterns.
It is important to mention that in this paper we have demonstrated the capability of the
suggested learning method to distinguish abnormal behaviour from normal behaviour.
Applying the presented method to the detection of the intrusion type is left as future
work. The four different categories of attack patterns are as follows (Srinivas et al., 2004).

3.1. Probing

Probing is a class of attacks where an attacker scans a network to gather
information or find known vulnerabilities. An attacker with a map of the machines and
services that are available on a network can use the information to look for exploits.
There are different types of probes: some of them abuse the computer’s legitimate
features; some of them use social engineering techniques. This class of attacks is the
most commonly encountered and requires very little technical expertise.

3.2. Denial of service (DoS) attacks

DoS is a class of attacks where an attacker makes some computing or memory
resource too busy or too full to handle legitimate requests, thus denying legitimate
users access to a machine. There are different ways to launch DoS attacks: by abusing
the computer’s legitimate features; by targeting implementation bugs; or by
exploiting the system’s misconfigurations. DoS attacks are classified based on the
services that an attacker renders unavailable to legitimate users.

3.3. User to root attacks

User to root exploits are a class of attacks where an attacker starts out with access
to a normal user account on the system and is able to exploit a vulnerability to gain
root access to the system. The most common exploits in this class of attacks are regular
buffer overflows, which are caused by common programming mistakes and
environment assumptions.

3.4. Remote to user attacks

A remote to user (R2L) attack is a class of attacks where an attacker sends packets
to a machine over a network, then exploits the machine’s vulnerability to illegally gain
local access as a user. There are different types of R2L attacks; the most common
attack in this class is carried out using social engineering.

4. Fuzzy genetics-based learning

In this section, we will discuss the fuzzy genetics-based learning method. Note
that this learning method has been used for classification problems
(Ishibuchi et al., 1995; Ishibuchi and Nakashima, 1999; Smith, 1980; Booker et al.,
1989; Cordon et al., 2004; Ishibuchi and Murata, 1999). In this paper, we have used
this method to develop our intrusion detection system.
First, let us explain the method of coding fuzzy rules. Each fuzzy if–then rule is
coded as a string. The symbols 1, 2, 3, 4, 5 and # are used to denote the five linguistic
values and ‘‘don’t care’’ (Fig. 2).
For example, the following fuzzy if–then rule is coded as ‘‘1#4#’’: If $x_1$ is small and
$x_2$ is don’t care and $x_3$ is medium large and $x_4$ is don’t care, then Class $C_j$ with
$CF = CF_j$.
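The coding can be illustrated with a minimal sketch. It assumes that the five linguistic values are evenly spaced triangular fuzzy sets on [0, 1] with peaks at 0, 0.25, 0.5, 0.75 and 1, which is what Fig. 2 appears to depict; the exact shapes are an assumption, and the names `membership` and `describe_rule` are hypothetical helpers introduced only for illustration.

```python
# Assumed peaks of the five triangular fuzzy sets on the unit interval.
PEAKS = {"1": 0.0, "2": 0.25, "3": 0.5, "4": 0.75, "5": 1.0}
NAMES = {"1": "small", "2": "medium small", "3": "medium",
         "4": "medium large", "5": "large", "#": "don't care"}
WIDTH = 0.25  # assumed distance from a peak to the point where membership reaches 0


def membership(symbol, x):
    """Membership grade of attribute value x (in [0, 1]) in the coded fuzzy set."""
    if symbol == "#":  # "don't care" is fully compatible with every value
        return 1.0
    return max(0.0, 1.0 - abs(x - PEAKS[symbol]) / WIDTH)


def describe_rule(code):
    """Spell out the antecedent of a coded rule such as '1#4#'."""
    parts = [f"x{i} is {NAMES[s]}" for i, s in enumerate(code, start=1)]
    return "If " + " and ".join(parts)


print(describe_rule("1#4#"))
print(membership("4", 0.75), membership("3", 0.625), membership("#", 0.3))
```

Under these assumptions a "don't care" position never lowers the compatibility of a pattern with the rule, which is why rules with many # symbols cover more patterns.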
The outline of the learning system can be written as follows:

(1) Generation of an initial population of fuzzy if–then rules.
(2) Evaluation of each fuzzy if–then rule in the current population.
(3) Genetic operations to generate new fuzzy if–then rules for the next population.
(4) Replacing a pre-specified number of fuzzy if–then rules in the current population
with the newly generated rules.
(5) Termination test. If the algorithm is not terminated, (2)–(5) are repeated.

Let us explain each of the above steps briefly in the following subsections. The
learning process follows Ishibuchi and Murata (1999).
[Figure: two plots of membership (0.0 to 1.0) against attribute value $x_i$ (0.0 to 1.0); the first shows the fuzzy sets S, MS, M, ML and L, the second shows DC.]

Fig. 2. The antecedent fuzzy sets used in this paper (Ishibuchi and Murata, 1999). 1: Small, 2: medium
small, 3: medium, 4: medium large, 5: large, #: don’t care.

The main idea of this paper concerns the fitness function, which will be
discussed in this section. (In the following subsections, we assume that we
are solving a classification problem with $c$ classes, $n$ continuous attributes and $m$
training patterns.)

4.1. Generation of an initial population

Let us denote the number of fuzzy if–then rules in each population by
$N_{pop}$ (i.e., $N_{pop}$ is the population size). To construct an initial population,
$N_{pop}$ fuzzy if–then rules are generated randomly. This means that we assign
one of the 6 linguistic value symbols to each of the 41 places in the IF part of each
fuzzy rule.
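A minimal sketch of this step follows. It assumes that each of the six symbols is drawn with equal probability, which the text does not state explicitly; `random_rule` and `initial_population` are hypothetical names.

```python
import random

SYMBOLS = "12345#"   # five linguistic values plus "don't care"
N_ATTRIBUTES = 41    # number of connection features per record (Section 3)


def random_rule(rng, n_attributes=N_ATTRIBUTES):
    """Draw one symbol per attribute, uniformly at random (assumption)."""
    return "".join(rng.choice(SYMBOLS) for _ in range(n_attributes))


def initial_population(n_pop, rng):
    """Generate n_pop random antecedent strings."""
    return [random_rule(rng) for _ in range(n_pop)]


rng = random.Random(0)  # fixed seed only so that the sketch is repeatable
population = initial_population(50, rng)
print(len(population), len(population[0]))
```

Only the antecedent is generated here; the consequent class and certainty grade of each rule are assigned by the evaluation procedure of Section 4.2.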

4.2. Evaluation of fuzzy if– then rules

The evaluation procedure for fuzzy if–then rules that classify input patterns
into the two classes, normal and abnormal, consists of two steps. The first step
assigns a class and a certainty grade to each fuzzy rule, based on a method
introduced in Ishibuchi and Murata (1999). The second step assigns a fitness
value to each fuzzy rule. The first step proceeds as follows:

1. Calculate the compatibility grade of each training pattern $x_p$ with the fuzzy if–then
rule $R_j$ by the product operation as
$$\mu_{R_j}(x_p) = \mu_{A_{j1}}(x_{p1}) \times \cdots \times \mu_{A_{jn}}(x_{pn}), \quad p = 1, 2, \ldots, m, \tag{1}$$
where $\mu_{A_{ji}}(\cdot)$ is the membership function of $A_{ji}$.
2. Calculate the sum of compatibility grades for each class as follows:
$$\beta_{\mathrm{Class}\,h}(R_j) = \sum_{x_p \in \mathrm{Class}\,h} \mu_{R_j}(x_p), \quad h = 1, 2, \ldots, c. \tag{2}$$
3. Find the consequent class $C_j$ that has the maximum value of $\beta_{\mathrm{Class}\,h}(R_j)$ among the
$c$ classes:
$$\beta_{\mathrm{Class}\,C_j}(R_j) = \max\{\beta_{\mathrm{Class}\,1}(R_j), \ldots, \beta_{\mathrm{Class}\,c}(R_j)\}. \tag{3}$$
When a single class has the maximum value in (3), that class is
used as the consequent class of the fuzzy if–then rule $R_j$. That is, the consequent
class $C_j$ is determined by (3). If two or more classes have the same
maximum value, the consequent class cannot be uniquely specified. In this
case, we assign an empty class to $C_j$ (i.e., $C_j = \phi$) and a zero certainty
grade to $CF_j$ (i.e., $CF_j = 0$). Such a fuzzy if–then rule with an empty
consequent class and a zero certainty grade is referred to as a dummy rule in
this paper.
4. When the consequent class $C_j$ is determined by (3), the certainty grade $CF_j$ is
specified as
$$CF_j = \frac{\beta_{\mathrm{Class}\,C_j}(R_j) - \bar{\beta}}{\sum_{h=1}^{c} \beta_{\mathrm{Class}\,h}(R_j)}, \tag{4}$$
where
$$\bar{\beta} = \frac{\sum_{h \neq C_j} \beta_{\mathrm{Class}\,h}(R_j)}{c - 1}. \tag{5}$$
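Steps 1 to 4 can be sketched as follows. The triangular membership functions are the same assumption as before (evenly spaced peaks on [0, 1]), classes are represented as 0-based indices rather than the labels 1 and 2 used in the experiments, and the function names are hypothetical.

```python
from math import prod

PEAKS = {"1": 0.0, "2": 0.25, "3": 0.5, "4": 0.75, "5": 1.0}  # assumed shapes


def membership(symbol, x):
    """Assumed triangular membership; '#' (don't care) always returns 1."""
    if symbol == "#":
        return 1.0
    return max(0.0, 1.0 - abs(x - PEAKS[symbol]) / 0.25)


def compatibility(rule, pattern):
    """Eq. (1): product of the antecedent membership grades."""
    return prod(membership(s, x) for s, x in zip(rule, pattern))


def consequent(rule, patterns, labels, n_classes):
    """Eqs. (2)-(5): return (class index or None, certainty grade)."""
    beta = [0.0] * n_classes
    for x, h in zip(patterns, labels):
        beta[h] += compatibility(rule, x)            # Eq. (2)
    best = max(beta)
    if beta.count(best) != 1:                        # tie: dummy rule
        return None, 0.0
    c_j = beta.index(best)                           # Eq. (3)
    beta_bar = (sum(beta) - best) / (n_classes - 1)  # Eq. (5)
    return c_j, (best - beta_bar) / sum(beta)        # Eq. (4)


patterns = [(0.0, 0.9), (0.125, 0.2), (0.125, 0.5)]
labels = [0, 0, 1]
print(consequent("1#", patterns, labels, n_classes=2))  # (0, 0.5)
```

In this toy example the class sums are 1.5 and 0.5, so the rule is assigned class 0 with certainty grade (1.5 - 0.5) / 2.0 = 0.5. A rule that is compatible with no training pattern has all sums equal to zero, is treated as a tie, and therefore becomes a dummy rule.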

To evaluate each of the fuzzy rules in the population, we need a fitness function.
The main idea of this paper is a new fitness function, which is defined as follows:
$$\mathrm{fitness}(R_j) = \mathrm{SRPP} = \sum_{p \in \mathrm{Class}\,C_j} \mathrm{PPF}_p^{R_j}. \tag{6}$$

In (6), SRPP is the name of the fitness function suggested in this paper. SRPP
stands for Single Rule Positive Power, which means the power of a single rule
to classify a pattern correctly when we do not consider the existence of other
rules. This number is calculated by summing PPF over the training
patterns, as in (6). PPF is the Positive Power Factor and is calculated according to
$$\mathrm{PPF}_p^{R_j} = \begin{cases} 1, & \mu_{R_j}(x_p) > 0, \\ 0, & \text{otherwise.} \end{cases} \tag{7}$$

We have also used NCP as the fitness function (Ishibuchi and Murata, 1999), to
compare its performance to that of SRPP.
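Read literally, (6) and (7) count the training patterns of the rule's own consequent class that have a non-zero compatibility grade with the rule. A minimal sketch under that reading is given below; the function name `srpp` is hypothetical, and giving a dummy rule a fitness of zero is an assumption that the text does not spell out.

```python
def srpp(grades, labels, consequent_class):
    """Eqs. (6)-(7): count patterns of the consequent class covered by the rule.

    grades[p] is the compatibility grade of pattern p with the rule, Eq. (1);
    labels[p] is the true class of pattern p.
    """
    if consequent_class is None:  # dummy rule: assumed to score zero
        return 0
    return sum(1 for g, h in zip(grades, labels)
               if h == consequent_class and g > 0.0)


grades = [1.0, 0.5, 0.0, 0.2]
labels = [0, 0, 0, 1]
print(srpp(grades, labels, consequent_class=0))  # 2
```

Note that the size of the grade does not matter here: a pattern contributes 1 as soon as its grade is positive, so the measure rewards coverage of the rule's own class rather than strength of match.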

4.3. GA operators

A pair of fuzzy if–then rules is selected from the current population to generate
new fuzzy if–then rules for the next population. Each fuzzy if–then rule in the current
population is selected by the following selection probability:
$$P(R) = \frac{\mathrm{fitness}(R) - \mathrm{fitness}_{\min}(S)}{\sum_{R_k \in S} \{\mathrm{fitness}(R_k) - \mathrm{fitness}_{\min}(S)\}},$$

where $\mathrm{fitness}_{\min}(S)$ is the minimum fitness value of the fuzzy if–then rules in the rule
set S. This procedure is iterated until a pre-specified number of pairs of fuzzy if–then
rules are selected (note that the selection method is the proportionate selection or the
roulette wheel method).
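The selection probability and the roulette wheel draw can be sketched as follows. Two details are assumptions rather than statements of the paper: when all rules have the same fitness the sketch falls back to a uniform distribution (the formula is undefined in that case), and the two parents are drawn with replacement, so the same rule may be picked twice.

```python
import random


def selection_probabilities(fitness):
    """Selection probability of each rule, shifted by the minimum fitness."""
    f_min = min(fitness)
    shifted = [f - f_min for f in fitness]
    total = sum(shifted)
    if total == 0:  # all rules equally fit: uniform fallback (assumption)
        return [1.0 / len(fitness)] * len(fitness)
    return [s / total for s in shifted]


def select_pair(population, fitness, rng):
    """Roulette wheel selection of two parents (drawn with replacement)."""
    probs = selection_probabilities(fitness)
    return rng.choices(population, weights=probs, k=2)


print(selection_probabilities([2, 5, 3]))  # [0.0, 0.75, 0.25]
print(select_pair(["1#4#", "23##", "#5#1"], [2, 5, 3], random.Random(0)))
```

A consequence of subtracting the minimum is that the least fit rule in the population has zero probability of being selected as a parent.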
From each of the selected pairs of fuzzy if–then rules, two fuzzy if–then rules are
generated by uniform crossover of the antecedent fuzzy sets. The consequent
class of each of the generated fuzzy rules is determined by the method
described in Section 4.2. Each antecedent fuzzy set of the fuzzy if–then rules generated by
the crossover operation is randomly replaced with a different fuzzy set with the
mutation probability. The uniform crossover and the random mutation method are
illustrated in Fig. 3.
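A minimal sketch of the two operators on coded rule strings is shown below. The per-position swap probability of 0.5 is the usual choice for uniform crossover and is an assumption here, as are the function names; applying crossover to a pair only with the crossover rate is left out of the sketch.

```python
import random

SYMBOLS = "12345#"


def uniform_crossover(parent1, parent2, rng):
    """Swap each position between the parents with probability 0.5 (assumed)."""
    child1, child2 = [], []
    for a, b in zip(parent1, parent2):
        if rng.random() < 0.5:
            a, b = b, a
        child1.append(a)
        child2.append(b)
    return "".join(child1), "".join(child2)


def mutate(rule, p_mutation, rng):
    """Replace each symbol, with probability p_mutation, by a different one."""
    out = []
    for s in rule:
        if rng.random() < p_mutation:
            s = rng.choice([t for t in SYMBOLS if t != s])
        out.append(s)
    return "".join(out)


rng = random.Random(1)
child1, child2 = uniform_crossover("43#15##2", "2###3#1#", rng)
print(child1, child2)
print(mutate(child1, 0.1, rng))
```

Because mutation always picks from the other five symbols, a mutated position is guaranteed to change, which matches the requirement that a fuzzy set be replaced with a different fuzzy set.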

4.4. Replacement

A certain percentage of fuzzy if–then rules (say, $P_{rep}$) in the current population are
replaced with new fuzzy if–then rules generated by the crossover and mutation
operations. In our fuzzy classifier system, all the fuzzy if–then rules in the current
population are arranged in decreasing order of fitness value, then the last $P_{rep} \times N_{pop}$
rules are replaced with the new fuzzy if–then rules. That is, the worst $P_{rep} \times N_{pop}$
rules with the smallest fitness values are removed from the current population and
the new fuzzy if–then rules are added.
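The replacement step can be sketched as below. The sketch assumes that $P_{rep}$ is given as a fraction (0.2 for 20%), that $P_{rep} \times N_{pop}$ is rounded to the nearest integer, and that at least that many offspring are available; `replace_worst` is a hypothetical name.

```python
def replace_worst(population, fitness, offspring, p_rep):
    """Keep the fittest rules and replace the worst p_rep fraction with offspring."""
    n_replace = round(p_rep * len(population))
    # Indices of the current rules in decreasing order of fitness.
    order = sorted(range(len(population)), key=lambda i: fitness[i], reverse=True)
    survivors = [population[i] for i in order[:len(population) - n_replace]]
    return survivors + list(offspring[:n_replace])


population = ["a", "b", "c", "d", "e"]
fitness = [3, 1, 4, 1, 5]
print(replace_worst(population, fitness, ["x", "y"], p_rep=0.4))
# ['e', 'c', 'a', 'x', 'y']
```

With $N_{pop} = 50$, the replacement percentages of 20, 50 and 80% used in the experiments correspond to replacing 10, 25 and 40 rules per generation.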

4.5. Termination test

We can use various stopping criteria for terminating the execution of our fuzzy
classifier system. In computer simulations shown in the next section, we used the
total number of generations as a stopping criterion (Fig. 3).

Parent 1:      4 3 # 1 5 # # 2
Parent 2:      2 # # # 3 # 1 #

Crossover

Offspring 1:   2 3 # # 5 # 1 #
Offspring 2:   4 # # 1 3 # # 2

Mutation

Offspring 1':  2 3 1 # 5 # 1 #
Offspring 2':  4 # # # 3 # 4 2

Fig. 3. Illustration of genetic operations used in the fuzzy classifier system (Ishibuchi and Murata, 1999).

5. Experiments

In our experiments, we perform two-class classification. The training data set
contains 988 randomly selected points from the two classes, with the number of
data from each class proportional to its size. The normal data belong to class 1 and
the abnormal data belong to class 2. A different randomly selected set of 9880 points from
the total data set (494,021 records) is used for testing the different fuzzy genetic learning
techniques.
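Drawing a sample in which each class is represented in proportion to its size can be sketched as follows. The paper does not describe its sampling procedure in more detail, so this is only one plausible reading; the function name is hypothetical, and because each class quota is rounded separately, the total may differ slightly from the requested size.

```python
import random
from collections import defaultdict


def stratified_sample(labels, n_samples, rng):
    """Return indices sampled so that each class keeps its share of the data."""
    by_class = defaultdict(list)
    for idx, h in enumerate(labels):
        by_class[h].append(idx)
    chosen = []
    for idxs in by_class.values():
        quota = round(n_samples * len(idxs) / len(labels))
        chosen.extend(rng.sample(idxs, quota))  # sampling without replacement
    return chosen


# Toy data: 20% class 1 (normal), 80% class 2 (abnormal).
labels = [1] * 20 + [2] * 80
train_idx = stratified_sample(labels, 10, random.Random(0))
print(sorted(labels[i] for i in train_idx))  # [1, 1, 2, 2, 2, 2, 2, 2, 2, 2]
```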
In this section, we will compare the performance of the fuzzy genetics-based
learning algorithm using two different fitness functions. The first fitness function is
SRPP, which is suggested in this paper, and the second one is NCP (Ishibuchi and
Murata, 1999). This section consists of three subsections. In the first and second
subsections we discuss the performance of the learning algorithms for
different $P_{rep}$ values. In the last subsection we compare the performance of the genetic
learning algorithm using the two fitness functions. We also compare
the performance of the two fuzzy genetics-based learning algorithms
(SRPP and NCP) with that of other similar methods for intrusion detection. In this section,
we use the following parameter settings:
$N_{pop}$ = 50,
Maximum generation (termination condition) = 100,
$P_{rep}$ (replacement percentage) = 20, 50 and 80%,
Crossover rate = 90%,
Mutation rate = 10%.

5.1. The Learning Algorithm and SRPP method

If we use SRPP as the fitness function in the genetic algorithm which learns fuzzy
rules, we focus on single rules which are capable of classifying normal or
abnormal behaviour. Fig. 4 shows the performance of the learning algorithm
with SRPP as the fitness function.
As we can see in Fig. 4, the detection rate of the learning algorithm for both training
and test data is higher with a replacement percentage of 20% than with 50% or 80%.

5.2. The Learning Algorithm and NCP Method

Using NCP as the fitness function in the genetic algorithm which learns fuzzy
rules leads to a classifier which derives fuzzy rules capable of detecting
intrusion while considering the whole classifier. Fig. 5 shows the performance of the
learning algorithm with NCP as the fitness function.
As we can see in Fig. 5, the performance of the learning algorithm for both training
and test data is higher with a replacement percentage of 20% than with 50% or 80%.
It is important to note that Figs. 4 and 5 show the average performance of the two
methods; that is, the reported performance of the learning algorithm is averaged over
5 runs of the algorithm.
Figs. 4 and 5 show that the final classifier performance decreases as the
replacement percentage increases.
[Figure: three bar charts over replacement percentage (20, 50 and 80%) for train and test data.]

Fig. 4. Performance of the learning algorithm using SRPP as the fitness function. (a) Detection rate. (b)
False alarm rate. (c) Classification rate.

According to the above discussion, in the next subsection, we will use 20% as the
replacement percentage for both SRPP and NCP methods.

5.3. SRPP and NCP methods comparison

In this subsection, we compare the performance of the final classifier using
different fitness functions. The comparison is shown in Fig. 6. As we can see in this
figure, the SRPP method outperforms the NCP method for the following
reasons:

(1) The detection rate of the SRPP method is higher than that of the NCP method.
(2) In the SRPP method the detection rate of the classifier increases as the genetic
algorithm proceeds. However, if we use the NCP method, the improvement of
classifier performance stops very soon (Fig. 6, part a).
(3) The searching capability of the SRPP method is much greater than that of the NCP
method, because the fluctuation in the classifier’s performance curve is much
larger for the SRPP method than for the NCP method (Fig. 6, parts c and d).
This means that in the SRPP method, the classifier tries to find more, and more varied,
kinds of classification rules.

For these reasons we can conclude that the overall performance of the
SRPP method is higher than that of the NCP method.
[Figure: three bar charts over replacement percentage (20, 50 and 80%) for train and test data.]

Fig. 5. Performance of the learning algorithm using NCP as the fitness function. (a) Detection rate. (b)
False alarm rate. (c) Classification rate.

Table 1 compares the performance of the different algorithms. According to this table
we can conclude that the overall performance of the fuzzy genetics-based learning
algorithms in detecting intrusion in a computer network is higher than that of the other
algorithms. Table 1 also shows that if we use the SRPP method as the classifier’s fitness
function instead of the NCP method, the false alarm rate increases from 0.66% to 3.85%,
while the detection rate rises from 98.78% to 99.08%.

6. Conclusions and future work

In this paper, fuzzy genetics-based learning methods were applied
to the intrusion detection problem. Computer simulations demonstrated the high
performance of these algorithms.
Moreover, the paper suggested a new fitness function called SRPP. The
characteristic features of the proposed fitness function are as follows:

(1) The algorithm is capable of producing fuzzy rules which are more effective for
detecting intrusion in a computer network (Table 1).
(2) The improvement of classifier performance continues as the genetic algorithm
proceeds through its generations (Fig. 6, part a).
[Figure: (a) SRPP and NCP at replacement percentages of 80%, 50% and 20%, plotted against generation (0 to 50); (b) classification rate and detection rate of SRPP and NCP on a 90.00 to 100.00 scale; (c) and (d) detection rate, false alarm rate and classification rate plotted against generation (0 to 100).]

Fig. 6. SRPP and NCP methods comparison graphs: (a) Best population occurrence graph for different
replacement percentages and fitness functions. (b) Performance comparison for SRPP and NCP methods.
(c) Changes of the classifier performance during the operation of the GA for the SRPP method. (d) Changes of
the classifier performance during the operation of the GA for the NCP method.

Table 1
Comparison of the performance of different algorithms

Algorithm                                        False alarm rate (%)   Detection rate (%)   Complexity
SRPP                                             3.85                   99.08                O(n)
NCP (Ishibuchi and Murata, 1999)                 0.66                   98.78                O(n)
EFRID (Gomez and Dasgupta, 2001)                 7                      98.95                O(n)
RIPPER-Artificial Anomalies (Fan et al., 2001)   2.02                   94.26                O(n log^2 n)
SMARTSIFTER (Yamanishi et al., 2000)             0                      82                   O(n^2)

(3) The searching capability of the algorithm improves. This result is due to the fact
that the fluctuation in the classifier’s performance curve decreases
significantly (Fig. 6, parts c and d).

It is necessary to mention that although using SRPP fitness function increases the
detection rate, it increases the rate of false alarm as well. However, if we combine the
two fitness function methods (NCP and SRPP) in a single classifier, we can use the
advantages of both fitness functions concurrently.

Our future work will address detection of the intrusion type, which will change our
classification problem to a multi-class problem. We will also focus on producing
meaningful fuzzy rules from the genetic algorithm, that is, fuzzy rules which are more
interpretable for a human expert. Our final goal is to produce
fuzzy if–then rules which are capable of distinguishing each kind of attack
clearly. To achieve this, the following criteria should be met (Cordon et al., 2004):

(1) The number of fuzzy rules should be decreased as much as possible.
(2) The IF part of the fuzzy rules should be short.

By meeting the above criteria, we will obtain a compact and interpretable
classification system. This system will be more useful for a human expert.

References

Abe S, Lan M-S. A method for fuzzy rules extraction directly from numerical data and its application to
pattern classification. IEEE Transactions on Fuzzy Systems 1995;3(1):18–28.
Allen J, Christie A, Fithen W, McHugh J, Pickel J, Stoner E. State of the practice of intrusion detection
technologies. Technical report CMU/SEI99-TR-028, ESC-99-028, Carnegie Mellon, Software
Engineering Institute, Pittsburgh, Pennsylvania; 1999.
Amoroso E. Intrusion detection. Intrusion.net Books, January 1999.
Anderson D, Lunt TF, Javitz H, Tamaru A, Valdes A. Detecting unusual program behavior using the
statistical component of the next-generation intrusion detection expert system (NIDES). SRI-CSL-95-06.
Menlo Park, CA: SRI International; 1995.
Axelsson S. Intrusion detection systems: a survey and taxonomy. Technical report no. 99-15, Department
of Computer Engineering, Chalmers University of Technology, Sweden. March 2000.
Booker LB, Goldberg DE, Holland JH. Classifier systems and genetic algorithms. Artificial Intelligence
1989;40(1–3):235–82.
Cannady J. Artificial neural networks for misuse detection. In: National information systems security
conference, 1998. p. 368–81.
Carse B, Fogarty TC, Munro A. Evolving fuzzy rule based controllers using genetic algorithms. Fuzzy
Sets and Systems 1996;80(3):273–93.
Cordon O, Gomide F, Herrera F, Hoffmann F, Magdalena L. Ten years of genetic fuzzy systems: current
framework and new trends. Fuzzy Sets and Systems 2004;141:5–31.
Crosbie M. Applying genetic programming to intrusion detection. In: Proceedings of the AAAI 1995 fall
symposium series, November 1995.
Crosbie M, Spafford EH. Defending a computer system using autonomous agents. Technical report CSD-
TR-95-022, 1995.
Dasgupta D. Immunity-based intrusion detection system: a general framework. In: Proceedings of the
22nd national information systems security conference (NISSC), 1999. p. 147–60.
Debar H, Becke B, Siboni D. A neural network component for an intrusion detection system. In:
Proceedings of the IEEE Computer Society symposium on research in security and privacy, 1992.
p. 240–50.
Debar H, Dorizzi B. An application of a recurrent network to an intrusion detection system. In:
Proceedings of the international joint conference on neural networks, 1992. p. 78–83.
Denning D. An intrusion-detection model. IEEE Transactions on Software Engineering 1987;
SE-13(2):222–32.
Fan W, Lee W, Miller M, Stolfo SJ, Chan PK. Using artificial anomalies to detect unknown and known
network intrusions. In: Proceedings of the first IEEE international conference on data mining, 2001.
Goldberg DE. Genetic algorithms in search, optimization, and machine learning. Reading, MA: Addison-
Wesley; 1989.
Gomez J, Dasgupta D. Evolving fuzzy classifiers for intrusion detection. In: Proceedings of the 2002
IEEE workshop on information assurance, United States Military Academy, West Point, NY,
June 2001.
Heady R, Luger G, Maccabe A, Sevilla M. The architecture of a network-level intrusion detection system,
Technical report, CS90-20, Department of Computer Science, University of New Mexico,
Albuquerque, NM 87131.
Helmer G, Wong J, Honavar V, Miller L. Lightweight agents for intrusion detection. Journal of Systems
and Software 2003:109–22.
Herrera ML, Verdegay JL. Tuning fuzzy logic controllers by genetic algorithms. International Journal of
Approximate Reasoning 1995;12(3/4):299–315.
Holland JH. Adaptation in natural and artificial systems. Ann Arbor, MI: University of Michigan Press;
1975.
Ilgun K. USTAT: a real-time intrusion detection system for UNIX. In: Proceedings of the 1993 Computer
Society symposium on research in security and privacy, Oakland, CA, May 24–26, Los Alamitos, CA.
IEEE Computer Society Press; 1993. p. 16–29.
Ishibuchi H, Murata T. Techniques and applications of genetic algorithms-based methods for designing
compact fuzzy classification systems. Fuzzy theory systems techniques & applications, V.3, section 40.
New York: Academic Press; 1999. p. 1081–109.
Ishibuchi H, Nakashima T, Kuroda T. A hybrid fuzzy genetics-based machine learning algorithm:
hybridization of Michigan approach and Pittsburgh approach. In: Proceedings of 1999 IEEE
international conference on systems, man, and cybernetics, vol. I, October 12–15, Tokyo, Japan, 1999.
p. 296–301.
Ishibuchi H, Nakashima T, Murata T. A fuzzy classifier system that generates fuzzy if–then rules for
pattern classification problems. In: Proceedings of second IEEE international conference on
evolutionary computation, Perth, Australia, November, 1995. p. 759–64.
Ishibuchi H, Nakashima T, Murata T. Performance evaluation of fuzzy classifier systems for multidimensional
pattern classification problems. IEEE Transactions on Systems, Man, and Cybernetics
1999.
Ishibuchi H, Nozaki K, Tanaka H. Distributed representation of fuzzy rules and its application to pattern
classification. Fuzzy Sets and Systems 1992;52(1):21–32.
Jianxiong L, Bridges SM. Mining fuzzy association rules and fuzzy frequency episodes for intrusion
detection. International Journal of Intelligent Systems 2000;15(8):687–704.
KDD-cup data set. http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html
Kumar S, Spafford EH. An application of pattern matching in intrusion detection. Technical report CSD-
TR-94-013, Purdue University; 1994.
Lee CC. Fuzzy logic in control systems: fuzzy logic controller, Part I and Part II. IEEE Transactions on
Systems, Man, and Cybernetics 1990;20(2):404–35.
Lee W, Stolfo SJ, Mok KW. Mining audit data to build intrusion detection models. In: Proceedings of
international conference on knowledge discovery and data mining (KDD’98), 1998. p. 66–72.
Lunt T, Tamaru A, Gilham F, Jagannathan R, Jalali C, Neumann PG, Javitz HS, Valdes A, Garvey TD.
A real time intrusion detection expert system (IDES): final report. Menlo Park, CA: SRI International; 1992.
Mitra S, Pal SK. Self-organizing neural network as a fuzzy classifier. IEEE Transactions on Systems, Man,
and Cybernetics 1994;24(3):385–99.
Moller MF. A scaled conjugate gradient algorithm for fast supervised learning. Neural Networks
1993;6:525–33.
Mukkamala S, Janoski G, Sung AH. Intrusion detection using neural networks and support vector
machines. In: Proceedings of IEEE international joint conference on neural networks, 2002. p. 1702–7.
Mukkamala S, Sung AH. Feature selection for intrusion detection using neural networks and support
vector machines. Journal of the Transport Research Board National Academy, Transport Research
Record No. 1822 2003:33–9.
Porras A, Neumann PG. EMERALD: event monitoring enabling responses to anomalous live disturbances.
In: Proceedings of the national information systems security conference, 1997. p. 353–65.
Riedmiller M, Braun H. A direct adaptive method for faster back propagation learning: the RPROP
algorithm. In: Proceedings of the IEEE international conference on neural networks, San Francisco, 1993.
Smith SF. A learning system based on genetic algorithms. Ph.D. dissertation, University of Pittsburgh,
Pittsburgh, PA, 1980.
Srinivas M, Andrew HS, Ajith A. Intrusion detection using an ensemble of intelligent paradigms. Journal
of Network and Computer Applications, 2004, in press, corrected proof, available online 28 February
2004.
Steinberg D, Colla PL, Kerry M. MARS user guide. San Diego, CA: Salford Systems; 1999.
Sugeno M. An introductory survey of fuzzy control. Information Sciences 1985;36(1/2):59–83.
Sundar J, Garcia-Fernandez J, Isaco D, Spafford E, Zamboni D. An architecture for intrusion detection
using autonomous agents, Technical report 98/05, Purdue University, 1998.
Valenzuela-Rendon M. The fuzzy classifier system: a classifier system for continuously varying variables.
In: Proceedings of fourth international conference on Genetic algorithms, University of California, San
Diego, CA, July, 1991. p. 346–53.
Wang LX, Mendel JM. Generating fuzzy rules by learning from examples. IEEE Transactions on
Systems, Man, and Cybernetics 1992;22(6):1414–27.
Yamanishi K, Takeuchi J, Williams G. On-line unsupervised outlier detection using finite mixtures
with discounting learning algorithms. In: Proceedings of the sixth ACM SIGKDD international
conference on knowledge discovery and data mining, 2000. p. 320–4.
