An Evaluation of Negative Selection in An Artificial Immune System For Network Intrusion Detection
Figure 1 Gene Expression Process

However, this mechanism introduces a critical problem. The new
antibody can bind not only to harmful antigens but also to essential
self cells. To help prevent such serious damage, the human immune
system employs negative selection. This process eliminates immature
antibodies that bind to self cells passing by the thymus and the bone
marrow. From newly generated antibodies, only those which do not bind
to any self cell are released from the thymus and the bone marrow and
distributed throughout the whole human body to monitor other living
cells. Therefore, the negative selection stage of the human immune
system is important to assure that the generated antibodies do not
attack self cells.

Figure 2 Detector Set Generation of a Negative Selection Algorithm
(Forrest et al, 1995)
[Flowchart labels: Generate Random Strings; Match; no: Detector Set;
yes: Reject. New Strings; Detector Set; Match; yes: Detected Non-self]
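The generate-and-reject loop labelled in Figure 2 can be sketched in Python as follows. This is a minimal illustration, not the implementation used in the experiments: it assumes the r-contiguous matching rule that the paper names later (two strings match when they agree in at least r contiguous positions), and the function names, the `max_trials` cap and the toy self set are hypothetical.

```python
import random


def r_contiguous_match(a, b, r):
    """Return True if strings a and b agree in at least r contiguous positions."""
    run = 0
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0
        if run >= r:
            return True
    return False


def generate_detectors(self_set, n_detectors, length, alphabet, r, rng,
                       max_trials=100000):
    """Draw random strings and keep only those that match no self string.

    Returns the accepted detectors and the number of candidates tried.
    max_trials is an illustrative safety cap, not part of the algorithm.
    """
    detectors = []
    trials = 0
    while len(detectors) < n_detectors and trials < max_trials:
        trials += 1
        candidate = "".join(rng.choice(alphabet) for _ in range(length))
        # Negative selection: reject any candidate that matches self.
        if not any(r_contiguous_match(candidate, s, r) for s in self_set):
            detectors.append(candidate)
    return detectors, trials


def is_nonself(string, detectors, r):
    """A new string is flagged as non-self if any detector matches it."""
    return any(r_contiguous_match(string, d, r) for d in detectors)


# Toy usage: a binary alphabet and two self strings (hypothetical data).
toy_self = ["0000", "1111"]
toy_detectors, toy_trials = generate_detectors(
    toy_self, n_detectors=5, length=4, alphabet="01", r=3,
    rng=random.Random(0))
```

Because every accepted detector has been checked against the whole self set, no string in that set can later be flagged by `is_nonself`; the cost is that many candidates may be rejected before enough detectors are found.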
1 Available at http://iris.cs.uml.edu:8080/network.html.
for one specific connection can describe the normal network activity
of that connection.

Thus, in total, the self profile had 33 different fields for the data
set. Even though the network profile fields were extracted to describe
the activity of a single connection, the data used in this research
was too limited to apply this initial profile. The limitation was that
the data was collected for quite a short time, around 15~20 minutes.
During this brief period, most distinct connections were established
only once, so an insufficient quantity of data was collected to build
separate connection profiles. Therefore, it was necessary to group
different connections into several meaningful categories until each
category had a sufficient number of connections to build a profile.
Consequently, the total number of connections for each potential
profile category was counted.

First of all, the data was categorised into two different groups:
‘inter-connection’ and ‘intra-connection’. Inter-connection was the
group of connections that were established between internal hosts and
external hosts, and intra-connection was the group of connections that
were established between internal hosts. Furthermore, to preserve
anonymity, all internal hosts had a single fake address ‘2’ and any
extra information about external hosts and network topology was not
provided. Therefore, the data was insufficient to build profiles of
specific hosts. Instead, in this research, only the profiles of
specific ports on any hosts were considered.

According to various possible categories, the number of established
connections of each profile was counted. In each case, profile classes
that had more than 100 connections were kept, and the other profile
classes were grouped again into different classes until each class had
more than 100 connections. Finally, 13 different self profiles were
built. Their class names and the number of established connections are
shown in table 1.

In table 1, the class column of inter-connection is shown as
{(a,b),(c,d)}, where ‘a’ is an internal host address, ‘b’ is an
internal port number, ‘c’ is an external host address and ‘d’ is an
external port number. Hence, the connection is established between
(a,b) and (c,d). For the class column of intra-connection, ‘a’ is an
internal host address, ‘b’ is an internal port number, ‘c’ is an
internal host address and ‘d’ is an internal port number. * indicates
‘any’ host address and ‘any’ port number. In addition, “well-known”
denotes the ports in the range 0 to 1023, which are trusted ports.
These ports are restricted to the superuser: a program must be running
as root to listen for a connection on them. The port numbers of
commonly used IP services, such as ftp, telnet and http, are fixed and
belong to this range. However, many common network services employ an
authentication procedure and intruders often use them to sniff
passwords. It is worthwhile to monitor these ports separately from the
other ports. Therefore, if the number of connections for any profile
category, which is based on a specific port on any hosts, is not
sufficient, these categories are regrouped into two new classes, a
“well-known” port and a “not well-known” port.

Table 1: Self Profiles

Class                              Number of Connection
Inter-connection
{(2, *), (*, 80)}                  5292
{(2, *), (*, 53)}                  919
{(2, *), (*, 113)}                 255
{(2, *), (*, 25)}                  192
{(2, *), (*, well-known)}          187
{(2, *), (*, not well-known)}      756
{(2, 53), (*, *)}                  940
{(2, 25), (*, *)}                  352
{(2, 113), (*, *)}                 145
{(2, well-known), (*, *)}          114
{(2, not well-known), (*, *)}      6050
Intra-connection
{(2, *), (2, well-known)}          190
{(2, *), (2, not well-known)}      189

5 EXPERIMENT OBJECTIVE
Although previous work using a negative selection algorithm for
anomaly detection (Forrest et al, 1994; Dasgupta, 1998; Hofmeyr, 1999)
showed promising results, there had been little effort to apply this
algorithm to vast amounts of data. One distinctive feature of a
network intrusion detection problem is that the size of data, which
defines “self” and “non-self”, is enormous. In order for this
algorithm to be adopted in a network-based IDS, it is important to
understand whether this algorithm is capable of generating detectors
in a reasonable computing time. In addition, it is essential to
examine whether its tuning method, which derives an appropriate number
of detectors to gain a good non-self detection rate, works when it is
used on the huge volume of real network data. Therefore, a series of
experiments was performed to investigate these two significant
features of the negative selection algorithm.

6 DATA AND PARAMETER SETTING

6.1 SETTING
As presented in section 4, the data used in this work produced
thirteen different self profiles. From the 13 different self sets, one
self set, {(2, *), (*, 25)} in table 1, which has a relatively small
number of examples, 192, was selected for the following experiments.
From the total of 192 examples of the selected self profile, 154
examples were used for generating detectors and 38 examples were
applied for testing generated detectors. In addition, the detectors
were tested on five different test sets. The first four sets were
collected when four different intrusions were simulated (as explained
in section 4) and the last set was created by generating random
strings. These five sets have 273, 190, 1151, 273 and 500 examples
respectively.

As described in section 3, the negative selection algorithm used in
this paper employed the r-contiguous matching function. For the
following experiments, its matching threshold should be defined. In
order to define this number, the formulas to approximate the
appropriate number of detectors when a false negative error is fixed
(D’haeseleer, 1997; Forrest et al, 1994) were used. These formulas are
as follows (Forrest et al, 1994):

    $P_m \approx m^{-r} \left[ (l - r)(m - 1)/m + 1 \right]$    (1)

    $P_m \approx 1 / N_S$    (2)

where,
$P_m$ = the matching probability between a detector string and a
randomly chosen self string,
$N_S$ = the number of self strings,
$m$ = the detector genotype alphabet cardinality,
$l$ = the detector genotype string length and
$r$ = the threshold of the r-contiguous matching function.

Since $N_S$, $m$ and $l$ are already known, $r$ can be calculated by
using equations (1) and (2). The calculated $r$ was used in the
following equations in order to derive an appropriate number of
detectors, $N_r$, and a total number of trials to generate these
detectors, $N_{r0}$, when the false negative error, $P_f$, is fixed
(Forrest et al, 1994).

    $N_r = \dfrac{-\ln P_f}{P_m}$    (3) and

    $N_{r0} = \dfrac{-\ln P_f}{P_m (1 - P_m)^{N_S}}$    (4)

The selected self set, {(2, *), (*, 25)} in table 1, was used for
calculating $N_r$ and $N_{r0}$ when $P_f$ is fixed. Table 2 shows
calculated $N_r$ and $N_{r0}$ using (3) and (4) when $P_f$ and $r$
have various values.

avoid the matching a self profile. $N_r$ and $N_{r0}$ in table 2
follow the same tendency.

Table 2 Number of required detectors, $N_r$, and number of trials to
generate the required number of detectors, $N_{r0}$, when the false
negative error, $P_f$, and the threshold $r$ of the r-contiguous
matching function are given. These numbers are calculated for a self
string length $l = 33$, an alphabet cardinality $m = 10$ and a number
of self strings $N_S = 192$.

P_f      r = 3               r = 4
         N_r     N_r0        N_r     N_r0
0.2      51      21953       535     955
0.1      73      31382       766     1366
0.05     95      40829       997     1777
0.01     146     62765       1532    2733

Even though this formula is clearly useful to predict the appropriate
number of detectors and the number of generation trials, its
predictions showed how infeasible this approach is when it is applied
to a more complicated but more realistic search space. For instance,
when the expected false negative error rate is fixed at 20% and the
matching threshold is 3, the predicted number of detectors is 51 and
the predicted number of detector generation trials is 21935.
Similarly, when we define the matching threshold as 4, it predicted
535 for the former and 955 for the latter. In addition, it was
observed that when we fixed the matching threshold as four and ran the
system, the system could not manage to generate a single valid
detector after one day. None of these cases seems to provide a
feasible test case in terms of computing time. These results certainly
did not follow the predicted number of detector generation trials.

Thus, for the following experiments, we generated valid detectors by
setting a matching threshold that allowed the system to generate a
valid detector in a reasonable time. It was observed that a single
successful detector generation took about 70 sec of CPU time on
average and the average number of trials to generate a valid detector
was 2~3 when the matching threshold was nine. These results were
gained after running the negative selection algorithm for preliminary
experiments. This number is used as the matching threshold for the
following experiments. The details of these experiment results are
described in the next section.
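The approximations of section 6.1 for the matching probability, the number of detectors and the number of generation trials are easy to evaluate directly. The sketch below is an illustration only: the function names are hypothetical, and it uses the textbook form of the Forrest et al (1994) formulas, so the printed values depend on that exact form and on rounding and need not reproduce Table 2 digit for digit.

```python
import math


def match_probability(l, m, r):
    """Equation (1): approximate probability that two random strings of
    length l over an alphabet of size m match under the r-contiguous rule."""
    return m ** (-r) * ((l - r) * (m - 1) / m + 1)


def detectors_required(p_f, p_m):
    """Equation (3): number of detectors for a false negative rate p_f."""
    return -math.log(p_f) / p_m


def trials_required(p_f, p_m, n_s):
    """Equation (4): number of random candidates that must be tried when
    each candidate has to avoid all n_s self strings."""
    return -math.log(p_f) / (p_m * (1 - p_m) ** n_s)


# Parameters quoted in the Table 2 caption.
l, m, n_s = 33, 10, 192
for r in (3, 4):
    p_m = match_probability(l, m, r)
    for p_f in (0.2, 0.1, 0.05, 0.01):
        n_r = detectors_required(p_f, p_m)
        n_r0 = trials_required(p_f, p_m, n_s)
        print(f"r={r} Pf={p_f}: Nr={n_r:.1f} Nr0={n_r0:.1f}")
```

Equation (4) differs from equation (3) only by the factor (1 - P_m)^{N_S}, the probability that a random candidate avoids every self string, which is why the number of trials can never be smaller than the number of detectors.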
detection rate. After taking into account a practically reasonable
time to generate a whole data set, up to 1000 valid detectors were
generated per run. All experiments were run on a PC with an AMD K6-2
400MHz processor and 128MB RAM.

Table 3 Time is the average time of a single detector generation and
Trial is the average number of trials to generate a single detector.
The average values are followed by the standard deviations in
parentheses.

System Run     Time (Sec)       Detector Generation Trial
1              58.71 (26.85)    2.80 (2.16)
2              67.29 (28.88)    2.21 (1.65)
3              73.75 (33.72)    2.81 (2.22)
4              78.48 (39.86)    3.12 (2.69)
5              69.64 (26.62)    2.72 (2.07)
Average        71.81 (32.75)    2.63 (2.14)

Table 3 shows the average time of a single successful detector
generation and the average number of trials to generate a valid
detector. Compared to the result when the matching threshold is four,
which did not generate a single detector after 24 hours, these results
certainly look more applicable. We monitored five different non-self
sets and one previously unseen self set after every 100 detectors were
generated, and the monitoring results of five different runs are shown
in table 4. The overall non-self detection rate was very poor: less
than 16%. In particular, the non-self detection rate for the last
intrusion set, which was artificially generated from random strings,
is extremely low and its maximum average non-self detection rate
reaches only 2.28%. In addition, the average false positive detection
rate, which is the self detection rate of a detector set, is 12.63%
and this rate is not hugely different from the other four average
non-self detection rates except intrusion 5. This implies that the
collected self and non-self sets perhaps have some overlapping
patterns because they showed quite similar detection rates. Thus the
generated detector sets completely failed to distinguish the hidden
self and non-self patterns.

These poor results were anticipated. This is because the matching
threshold was set in order to obtain a reasonable detector generation
time. If, for example, we wanted a more usable 80% non-self detection
rate, 643775165 detectors would be required (this number is also
obtained from equation 3). The largest size of a generated detector
set, 1000, was much smaller than this number and this caused such poor
results. In addition, each run already took about 20 hours² to
generate 1000 detectors. If we wished to generate 643775165 detectors,
it would require 12517850.4 hours, or about 1,429 years on the same
computer. According to Moore's Law, the processing speed of computers
doubles every 18 months. We would have to wait around 35 years before
the average processing speed of computers became fast enough to
generate these detectors in an hour - and this is for just 15~20
minutes of a tiny subset of the network traffic data.

8 ANALYSIS
In contrast to the promising results shown in Hofmeyr’s negative
selection algorithm for network intrusion detection (Hofmeyr, 1999;
Hofmeyr and Forrest, 2000), the results of these experiments raise
doubt whether this algorithm should be used for network intrusion
detection. In order to answer this question, the negative selection
algorithm for network intrusion detection is analysed in detail.

The main problem of the negative selection algorithm is a severe
scaling problem. Unlike previous work using the negative selection
algorithm for anomaly detection, here we apply a much larger “self”
set to the negative selection algorithm. The definition of a larger
“self” set was essential to cover diverse types of network intrusions.
For instance, Hofmeyr (1999) and Hofmeyr and Forrest (2000) define
“self” as a set of normal pairwise connections between computers.
These include connections between two computers in the LAN and between
one computer in the LAN and external computers. The connection between
computers is defined by a “data-path-triple”: (the source IP address,
the destination IP address, the port called for this connection). This
self definition is chosen based on the work by Heberlein et al (1990).
However, as other IDS literature pointed out (Lee, 1999), this self
definition is too limited to detect various types of network
intrusions and it will certainly be impossible to detect some
intrusions that occur within a single normal connection, such as
unauthorised access from a remote machine.

However, as observed in section 4, when the self definition widens,
the binary string that encodes a detector lengthens. As a result of
the long binary detectors, the number of detectors needed to gain an
acceptable false negative error becomes huge and thus requires an
unacceptably long computation time. Our previous experiment results
clearly show this problem.

It should be noted that Hofmeyr (1999) developed a refined theory and
multiple secondary representations and these help to reduce the number
of trials to generate detectors on structured self by as much as three
orders of magnitude. These methods made the distribution of a self set
clump and it resulted in the reduction of the number of detector
generation trials. However, the refined theory and secondary
representations add extra space and computing time. More importantly,
all of the suggested secondary representations, such as pure
permutation, imperfect hashing and substring hashing, are matching
rules which check matching only on genotypes. Unfortunately, matching
rules that operate only at the genotype level are poorly suited to a
network intrusion detection problem. This deficiency can be explained
by unravelling the problem of the r-contiguous matching function.

We used the r-contiguous rule to check the match between a given
detector and antigen. The main purpose of using it was to employ the
formula to approximate an appropriate number of detectors to gain a
certain non-self detection rate. However, the r-contiguous matching
rule is too simple to determine the matching between rather
complicated and high-dimensional patterns. It is already known that
most rules to represent intrusion signatures describe correlation
among significant network connection events and temporal
co-occurrences of events (Lee, 1999; Porras, 1998). Since the
r-contiguous bit matching only measures the contiguous bits of
genotypes of two given strings, it is hard to guarantee that the
r-contiguous bit matching can catch this kind of correlation from
given self and non-self patterns. The wider range of self definition
shown in section 4 is also suggested in order to extract this type of
correlation from given self and non-self network traffic examples.
But, if any new matching function is employed, D’haeseleer’s (1997)
formula is no longer valid. There is then no way to tune the right
number of detectors for negative selection. Therefore, this difficulty
may force the negative selection algorithm to adopt an arbitrary
number of detectors and this may cause an unexpectedly low detection
accuracy or inefficient computation by generating more than a
sufficient number of detectors. In addition, D’haeseleer’s (1997) new
detector generation algorithms, a linear-time algorithm and a greedy
algorithm that guarantee linear-time detector generation, are also not
applicable when a different matching function is used.

In summary, it is necessary to use a more sophisticated matching
function to determine the degree of correlation among significant
network connection events and temporal co-occurrences of events. This
requires deriving a new way to tune an appropriate number of
detectors, which can be used for a more sophisticated matching
function.

These drawbacks of the negative selection algorithm made the AIS
struggle to monitor the vast amount of a network self set despite its
other important features³. Consequently, the initial results of our
experiments motivated us to re-define the role of the negative
selection stage within an overall network-based IDS and design a more
applicable negative selection algorithm, which follows the newly
defined role. As much of the other immunology literature (Tizard,
1995) addresses, the antigen detection powers of human antibodies
arise from the evolution of antibodies via a clonal selection stage.
While the negative selection algorithm allows the AIS to be an
invaluable anomaly detector, its infeasibility on a real network
environment is caused by allocating a rather overambitious task to it.
To be more precise, the job of a negative selection stage should be
restricted to a more modest task that is closer to the role of
negative selection in the human immune system, that is, simply
filtering out the harmful antibodies rather than generating competent
ones. This view has been corroborated by further work (Kim and
Bentley, 2001) which has recently shown how successful the use of
clonal selection with a negative selection operator can be for this
type of problem.

2 Since it took, on average, 72 seconds to generate each detector,
72000 seconds were needed to produce 1000 detectors. 72000 seconds are
20 hours.

3 Hofmeyr and Forrest (2000)’s final system employs some other
extensions to support the operation of AIS under a real network
environment. Among them, affinity maturation and memory cell
generation follow the clonal selection concept and these provide a
kind of evolution of a detector set distributed on monitored hosts.
However, it still uses only the negative selection algorithm to
generate an initial detector set. Even though it may conform to human
immune systems more closely, this approach could require excessive
computation time to generate the initial detector set, if a broader
definition of self is used. In addition, the usefulness of initial
detectors is not proven before they are distributed to other hosts.
This may also cause a waste of other computing resources.
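The scaling estimate behind the scaling problem discussed above (total generation time, and the wait implied by Moore's Law) can be checked with a few lines of arithmetic. The sketch assumes the roughly 70 seconds per detector reported for the preliminary experiments, which is the figure that reproduces the stated 12517850.4 hours, and a 365-day year; the constant and function names are illustrative.

```python
import math

DETECTORS_NEEDED = 643_775_165   # detector count quoted in the text
SECONDS_PER_DETECTOR = 70        # assumed average CPU time per detector
HOURS_PER_YEAR = 24 * 365        # assumes a 365-day year
DOUBLING_PERIOD_YEARS = 1.5      # Moore's Law: speed doubles every 18 months


def total_hours(n_detectors, seconds_each):
    """Sequential generation time in hours."""
    return n_detectors * seconds_each / 3600


def years_until_feasible(hours_now, target_hours=1.0):
    """Years of hardware doubling needed to shrink hours_now to target_hours."""
    doublings = math.log2(hours_now / target_hours)
    return doublings * DOUBLING_PERIOD_YEARS


hours = total_hours(DETECTORS_NEEDED, SECONDS_PER_DETECTOR)
years_on_same_pc = hours / HOURS_PER_YEAR
wait_years = years_until_feasible(hours)
print(f"{hours:.1f} hours, about {years_on_same_pc:.0f} years on the same PC")
print(f"about {wait_years:.1f} years of doubling to reach one hour")
```

The wait grows only logarithmically with the amount of work, but the work itself grows with the detector count, which equation (3) ties inversely to the matching probability.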
9 CONCLUSIONS
This paper has investigated the role of negative selection in an
artificial immune system (AIS) for network intrusion detection. The
negative selection stage within our AIS was implemented following the
algorithm created by Forrest et al (1994; 1997) and applied to real
network data. The experiments showed the infeasibility of this
algorithm for this application: the computation time needed to
generate a sufficient number of detectors is completely impractical.

This result directs this research to re-define the role of the
negative selection algorithm within our overall artificial immune
system framework. Current work is now investigating the intrusion
detection mechanism of the clonal selection stage. A new understanding
of the task of the clonal selection stage has now resulted in the
development of a more appropriate use for negative selection as an
operator within a novel clonal selection algorithm (Kim and Bentley,
2001).

References
D’haeseleer, P., (1997), “A Distributed Approach to Anomaly
Detection”, ACM Transactions on Information System Security.
http://www.cs.unm.edu/~patrik/

Dasgupta, D., (1998), “An Overview of Artificial Immune Systems and
Their Applications”, in Dasgupta, D. (Ed), Artificial Immune Systems
and Their Applications, Berlin: Springer-Verlag, pp.3-21.

Forrest, S., et al, (1994), “Self-Nonself Discrimination in a
Computer”, Proceeding of 1994 IEEE Symposium on Research in Security
and Privacy, Los Alamos, CA: IEEE Computer Society Press.

Forrest, S., et al, (1997), “Computer Immunology”, Communications of
the ACM, 40(10), 88-96.

Forrest, S. and Hofmeyr, S., (2000), “Immunology as Information
Processing”, in Design Principles for Immune Systems and Other
Distributed Autonomous Systems, Segal, L. A. and Cohen, I. R. (Eds),
Oxford University Press.

Heberlein, L. T., et al, (1990), “A Network Security Monitor”,
Proceeding of 1990 Symposium on Research in Security and Privacy,
Oakland, CA, pp.296-304, May, 1990.

Hofmeyr, S., (1999), An Immunological Model of Distributed Detection
and Its Application to Computer Security, PhD Thesis, Dept of Computer
Science, University of New Mexico.

Hofmeyr, S., and Forrest, S., (2000), “Architecture for an Artificial
Immune System”, Evolutionary Computation, vol.7, No.1, pp.45-68.

Kephart, J. O., et al, (1995), “Biologically Inspired Defenses Against
Computer Viruses”, the Proceeding of 14th Intl. Joint Conf. on
Artificial Intelligence, Montreal, August, pp.985-996.

Kim, J. and Bentley, P., (1999a), “The Human Immune System and Network
Intrusion Detection”, 7th European Conference on Intelligent
Techniques and Soft Computing (EUFIT ’99), Aachen, Germany.

Kim, J. and Bentley, P., (1999b), “The Artificial Immune Model for
Network Intrusion Detection”, 7th European Conference on Intelligent
Techniques and Soft Computing (EUFIT ’99), Aachen, Germany.

Kim, J. and Bentley, P., (2000), “Negative Selection within an
Artificial Immune System for Network Intrusion Detection”, the 14th
Annual Fall Symposium of the Korean Information Processing Society,
Seoul, Korea.

Kim, J. and Bentley, P., (2001), “The Artificial Immune System for
Network Intrusion Detection: An Investigation of Clonal Selection with
a Negative Selection Operator”, submitted to CEC2001, the Congress on
Evolutionary Computation, Seoul, Korea, May 27-30, 2001.

Lee, W., (2000), A Data Mining Framework for Constructing Features and
Models for Intrusion Detection Systems, PhD Thesis, Dept of Computer
Science, Columbia University.

Mukherjee, B., et al, (1994), “Network Intrusion Detection”, IEEE
Network, Vol.8, No.3, pp.26-41.

Paul, W. E., (1993), “The Immune System: An Introduction”, in
Fundamental Immunology 3rd Ed., W. E. Paul (Ed), Raven Press Ltd.

Porras, P. A., (1992), STAT: A State Transition Analysis Tool for
Intrusion Detection, MSc Thesis, Department of Computer Science,
University of California Santa Barbara.

Porras, P. A. and Valdes, A., (1998), “Live Traffic Analysis of TCP/IP
Gateways”, Proceeding of ISOC Symposium of Network and Distributed
System Security. http://www.csl.sri.com/emerald/downloads.html

Tizard, I. R., (1995), Immunology: Introduction, 4th Ed, Saunders
College Publishing.