Unit 1 Artificial Immune System
Introduction:
Abstract
• Optimization,
• Automatic Control,
• Bioinformatics,
• Image Processing.
Special attention is paid to analyzing the Shape-Space Model, which provides the necessary
mathematical formalism for the transition from the field of Biology to the field of Information
Technology. This chapter focuses on the development of alternative machine learning algorithms
based on Immune Network Theory, the Clonal Selection Principle and the Theory of Negative
Selection. The proposed machine learning algorithms relate specifically to the problems of:
• Data Clustering,
• One-Class Classification.
Artificial immune systems (AIS) form an interdisciplinary research area that aims to build
computational intelligence models by taking inspiration from the biological immune system (BIS).
This chapter first gives some background on BIS and briefly introduces the origin and development
of AIS. Then, several AIS models are described in detail. The chapter then summarizes the main
features and applications of AIS. Finally, an AIS-based anti-spam technique is presented in detail.
People have had a keen interest in the biosphere since ancient times and have continuously drawn
inspiration from the structures and functions of biological systems and their regulatory mechanisms.
Since the mid-twentieth century, researchers have focused on simulating biological systems,
especially the structures and functions of human beings. For example, artificial neural networks
simulate the structure of the nervous system of the human brain, fuzzy control resembles the
fuzzy thinking and inexact reasoning of human beings, and evolutionary computation
algorithms directly simulate the evolutionary processes of natural creatures. In recent years,
BIS has become an emerging bioinformatics research area. The immune system is a complex
system consisting of organs, cells, and molecules. It is able to recognize stimulation by self
and nonself, make a precise response, and retain the memory. Many studies show that the
immune system offers a variety of functions such as pattern recognition, learning, memory
acquisition, diversity, fault tolerance, distributed detection, and so on. These
attractive properties of the BIS have drawn extensive attention from engineering researchers, who
have proposed many novel algorithms and techniques based on the principles of immunology.
Since the concept of immunity was introduced, engineering research has obtained increasingly
promising results in areas such as computer network security, intelligent robots, intelligent
control, pattern recognition, and fault diagnosis.
This research and these applications not only help us to understand the immune system itself
further, but also help us to reexamine and solve practical engineering problems from the
perspective of how information is processed in the BIS. Building a computer security system on
the principles of the immune system opens a new research field in information security. Many
structures, functions, and mechanisms of the immune system, such as antibody diversity, dynamic
coverage, and distribution, are helpful references for research on computer security. We
believe that the excellent features of the immune system are the foundation and source of
inspiration for building robust computer security systems.
Definition
The field of artificial immune systems (AIS) is concerned with abstracting the structure and
function of the immune system to computational systems, and investigating the application of these
systems towards solving computational problems from mathematics, engineering, and information
technology. AIS is a sub-field of biologically inspired computing, and natural computation, with
interests in machine learning and belonging to the broader field of artificial intelligence.
Artificial immune systems (AIS) are adaptive systems, inspired by theoretical immunology and
observed immune functions, principles and models, which are applied to problem solving.
AIS is distinct from computational immunology and theoretical biology that are concerned with
simulating immunology using computational and mathematical models towards better
understanding the immune system, although such models initiated the field of AIS and continue to
provide a fertile ground for inspiration. Finally, the field of AIS is not concerned with the
investigation of the immune system as a substrate for computation, unlike other fields such as DNA
computing.
Abstract We analyze the biological background of Artificial Immune Systems, namely the
physiology of the immune system of vertebrate organisms. The relevant literature review outlines
the major components and the fundamental principles governing the operation of the adaptive
immune system, with emphasis on those characteristics of the adaptive immune system that are of
particular interest from a computation point of view. The fundamental principles of the adaptive
immune
Techniques
The common techniques are inspired by specific immunological theories that explain the function
and behavior of the mammalian adaptive immune system.
Clonal selection algorithm: A class of algorithms inspired by the clonal selection theory of
acquired immunity, which explains how B and T lymphocytes improve their response
to antigens over time, a process called affinity maturation. These algorithms focus on
the Darwinian attributes of the theory, where selection is inspired by the affinity of
antigen-antibody interactions, reproduction is inspired by cell division, and variation is
inspired by somatic hypermutation. Clonal selection algorithms are most commonly applied
to optimization and pattern recognition domains; some of them resemble parallel hill
climbing and the genetic algorithm without the recombination operator.
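The selection, cloning, and mutation loop described above can be illustrated with a minimal sketch. It is not a reference implementation of any published algorithm: the function name clonal_selection, its parameters (pop_size, n_select, clone_factor), the search range [-5, 5], and the sphere test function are all assumptions chosen for illustration. Affinity is taken to be a low value of the function being minimised.

```python
import random

def clonal_selection(fitness, dim=2, pop_size=20, n_select=5,
                     clone_factor=4, generations=50, seed=0):
    """Minimise `fitness` with a toy clonal selection loop (no recombination)."""
    rng = random.Random(seed)
    # Random initial antibody repertoire in [-5, 5]^dim (assumed search range).
    pop = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)                 # highest affinity (lowest cost) first
        selected = pop[:n_select]
        clones = []
        for rank, antibody in enumerate(selected):
            # Better-ranked antibodies get more clones and smaller mutations,
            # a simple stand-in for affinity-dependent hypermutation.
            n_clones = max(1, clone_factor * (n_select - rank))
            step = 0.1 * (rank + 1)
            for _ in range(n_clones):
                clones.append([x + rng.gauss(0, step) for x in antibody])
        # Keep the best of parents and clones, then add random newcomers
        # to maintain diversity in the repertoire.
        survivors = sorted(selected + clones, key=fitness)[:pop_size - n_select]
        newcomers = [[rng.uniform(-5, 5) for _ in range(dim)]
                     for _ in range(n_select)]
        pop = survivors + newcomers
    return min(pop, key=fitness)

def sphere(v):
    """Simple test function with its minimum at the origin."""
    return sum(x * x for x in v)

best = clonal_selection(sphere)
print(best, sphere(best))
```

Because the selected parents compete with their own clones, the best antibody found so far is never lost, which is what gives the loop its hill-climbing character.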
Negative selection algorithm: Inspired by the positive and negative selection processes that
occur during the maturation of T cells in the thymus, called T cell tolerance. Negative selection
refers to the identification and deletion (apoptosis) of self-reacting cells, that is, T cells that
may select for and attack self tissues. This class of algorithms is typically used for classification
and pattern recognition problem domains where the problem space is modeled in the
complement of available knowledge. For example, in an anomaly detection domain
the algorithm prepares a set of exemplar pattern detectors trained on normal (non-anomalous)
patterns that model and detect unseen or anomalous patterns.
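As a rough illustration of modeling the complement of the available knowledge, the sketch below uses a real-valued variant: random detectors are generated in the unit square and any detector that lies close to a normal sample is deleted. The function names, the radii self_radius and detect_radius, and the synthetic data are assumptions made for this example, not part of the source.

```python
import math
import random

def train_detectors(self_samples, n_detectors=500, self_radius=0.15,
                    seed=1, max_tries=100_000):
    """Generate random detectors in [0, 1]^2 and discard any that match self."""
    rng = random.Random(seed)
    detectors = []
    tries = 0
    while len(detectors) < n_detectors and tries < max_tries:
        tries += 1
        candidate = (rng.random(), rng.random())
        # Negative selection: a candidate lying near any self sample is deleted.
        if all(math.dist(candidate, s) > self_radius for s in self_samples):
            detectors.append(candidate)
    return detectors

def is_anomalous(point, detectors, detect_radius=0.1):
    """A point is flagged when at least one surviving detector covers it."""
    return any(math.dist(point, d) <= detect_radius for d in detectors)

# Synthetic "normal" data clustered around the centre of the unit square.
rng = random.Random(0)
normal = [(0.5 + rng.uniform(-0.1, 0.1), 0.5 + rng.uniform(-0.1, 0.1))
          for _ in range(100)]
detectors = train_detectors(normal)

print(is_anomalous(normal[0], detectors))   # a training sample
print(is_anomalous((0.9, 0.9), detectors))  # a point far from the normal data
```

Choosing detect_radius smaller than self_radius guarantees that no training sample is ever flagged; the price is a thin uncovered band around the normal region.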
Immune network algorithms: Algorithms inspired by the idiotypic network theory proposed
by Niels Kaj Jerne, which describes the regulation of the immune system by anti-idiotypic
antibodies (antibodies that select for other antibodies). This class of algorithms focuses on the
network graph structures involved, where antibodies (or antibody-producing cells) represent the
nodes and the training algorithm involves growing or pruning edges between the nodes based on
affinity (similarity in the problem's representation space). Immune network algorithms have
been used in clustering, data visualization, control, and optimization domains, and share
properties with artificial neural networks.
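The idea of antibodies as nodes with edges grown or pruned by affinity can be sketched for a small clustering task. This is a heavily simplified illustration, not a specific published immune network algorithm: the names build_network and count_clusters, the thresholds suppress and link, and the use of Euclidean closeness as affinity are all assumptions.

```python
import math
import random

def build_network(data, n_iter=5, n_clones=5, sigma=0.05,
                  suppress=0.05, link=0.5, seed=0):
    """Grow and prune a toy antibody network over 2-D data points (antigens)."""
    rng = random.Random(seed)
    network = []  # antibodies are the graph nodes
    for _ in range(n_iter):
        for antigen in data:
            # Clone and mutate around the antigen; keep the highest-affinity clone.
            clones = [tuple(x + rng.gauss(0, sigma) for x in antigen)
                      for _ in range(n_clones)]
            network.append(min(clones, key=lambda c: math.dist(c, antigen)))
        # Network suppression: prune antibodies too similar to one already kept.
        kept = []
        for antibody in network:
            if all(math.dist(antibody, k) > suppress for k in kept):
                kept.append(antibody)
        network = kept
    # Edges connect antibodies whose mutual affinity (closeness) is high enough.
    edges = [(i, j)
             for i in range(len(network))
             for j in range(i + 1, len(network))
             if math.dist(network[i], network[j]) <= link]
    return network, edges

def count_clusters(n_nodes, edges):
    """Count connected components of the network graph with union-find."""
    parent = list(range(n_nodes))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in edges:
        parent[find(i)] = find(j)
    return len({find(i) for i in range(n_nodes)})

# Two well-separated synthetic groups of antigens.
rng = random.Random(42)
blob_a = [(rng.uniform(-0.1, 0.1), rng.uniform(-0.1, 0.1)) for _ in range(20)]
blob_b = [(1 + rng.uniform(-0.1, 0.1), 1 + rng.uniform(-0.1, 0.1))
          for _ in range(20)]
network, edges = build_network(blob_a + blob_b)
n_clusters = count_clusters(len(network), edges)
print(len(network), n_clusters)
```

Each connected component of the resulting graph is read as one cluster, which is one common way such networks are used for clustering and data visualization.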
Dendritic cell algorithms: The dendritic cell algorithm (DCA) is an example of an immune
inspired algorithm developed using a multi-scale approach. This algorithm is based on an
abstract model of dendritic cells (DCs). The DCA is abstracted and implemented through a
process of examining and modeling various aspects of DC function, from the molecular
networks present within the cell to the behavior exhibited by a population of cells as a whole.
Within the DCA information is granulated at different layers, achieved through multi-scale
processing.
Both the innate and the adaptive systems depend upon the activity of a large number of immune
cells [13, 15, 25] such as white blood cells. In particular, innate immunity is mediated mainly
by granulocytes and macrophages, while adaptive immunity is mediated by lymphocytes. The
cells of the innate immune system are immediately available for combatting a wide range of
antigens, without requiring previous exposure to them. This reaction will occur in the same
way in all normal individuals.
1.2.1 Overview
BIS is a highly complex, distributed, and parallel natural system with multiple levels, which
can identify the self and exclude the nonself in order to maintain security and stability in the
biological environment. It makes use of innate immunity and adaptive immunity to
generate an accurate immune response against invading antigens. BIS is robust to
noise, distributed, and self-organized, has no central control, and has enhanced memory [37].
A substance that originally belongs to an organism, such as a normal cell, is called self. A
substance that does not originally belong to the organism, such as an invading antigen, is
called nonself.
BIS consists of the innate immune system (also known as the nonspecific immune system) and
the adaptive immune system (also known as the specific immune system). The two systems
cooperate to resist the invasion of external antigens. Specifically, the innate immune response
starts the adaptive immune response, influences the type of adaptive immune response, and
assists adaptive immunity in its work. The adaptive immune response provides a more intense,
specific immune response. The innate immune system is an inherent defense system that results
from a long-term evolutionary process. It is the first line of defense against antigens and
provides the innate immune function of the body. Usually, the innate immune system makes use
of innate immune cells to recognize the common patterns formed by a variety of nonself.
It can therefore identify a variety of antigens and effectively prevent the invasion of most of
them. If an antigen breaks through the body's innate immune defense barrier, the adaptive
immune system is invoked and becomes responsible for the immune response
to that specific antigen.
Specific memory cells are able to remember the corresponding antigens. When the same
antigen invades the body again, the memory cells will propagate and divide rapidly, providing
a more intense immune response to it.
The lymphocytes that react with self will undergo apoptosis, and the remaining lymphocytes
will go to lymphoid organs and tissues, circulating through the organism with the lymph and
blood. This process in the BIS is called the negative selection process [175]. Based on the
negative selection mechanism, the BIS is able to successfully identify self and nonself without
needing any nonself information.
The ability to present resistance against pathogens is shared by all living organisms. The nature
of this resistance, however, differs according to the type of organism. Traditionally, research
in immunology has studied almost exclusively the defense reactions of vertebrates (i.e. animals
with a backbone) and, in particular, the immune system of mammals. Vertebrates have
developed a preventive defense mechanism, whose main characteristic is to prevent infection
by the many kinds of antigens that can be encountered, both natural and artificially
synthesized.
The most important cells of the adaptive immune system are the lymphocytes. They are present
only in vertebrates, which have evolved a system to provide a more efficient and versatile
defense mechanism against future infections. The mechanisms of the innate immune system, on
the other hand, are less evolved. However, the cells of the innate immune system have a
crucial role in initiating and regulating the adaptive immune response.
Each naive lymphocyte (i.e. each lymphocyte that has not been involved in an immune
response) that enters the blood stream carries antigen receptors of a single specificity. The
specificity of these receptors is determined by a special mechanism of gene rearrangement
that acts during lymphocyte development in the bone marrow and thymus. It can generate
millions of different variants of the encoded genes. Thus, even though an individual lymphocyte
carries receptors of a single specificity, the specificity of each lymphocyte is different. There
are millions of lymphocytes circulating throughout our bodies and therefore they present
millions of different specificities.
In 1959, Burnet formalized the selective theory, which was finally accepted as the most
plausible explanation for the behavior of an adaptive immune response. This theory revealed
the reasons why antibodies can be induced in response to virtually any antigen and are
produced in each individual only against those antigens to which the individual was exposed.
Selective theory suggested the existence of many cells that could potentially produce different
antibodies. Each of these cells had the capability of synthesizing an antibody of distinct
specificity which, by binding on the cell surface, acted as an antigen receptor. After binding
with the antigen, the cell is activated to proliferate and produce a large clone. A clone can be
understood as a cell or set of cells that are the progeny of a single parent cell. In this context,
the clone size refers to the number of offspring generated by the parent cell. These cells
secrete antibodies of the same specificity as the parent cell's receptor. This principle was
named clonal selection theory or clonal expansion theory and constitutes the core of the
adaptive immune response. Based upon this clonal selection theory, the lymphocytes can
therefore be considered to undergo a process similar to natural selection within an individual
organism, as it was originally formulated by Darwin in [5]. Only those lymphocytes whose
receptors can interact with an antigen are activated to proliferate and differentiate into
effector cells.
• Distributivity:
• Multi-layered:
The biological immune system has a multi-layer structure. A single layer of the
biological immune system cannot protect the organism from all invasions, but the
cooperation of multiple layers is able to achieve the security protection of the system.
Although this feature is not unique to the biological immune system, it is a very
important feature of it. Studying and implementing the multi-layered feature in
artificial immune systems for computer systems can greatly enhance the security of
computer systems.
• Diversity:
In nature, although the bodies protected by the biological immune system are the same
on the whole, each body has its own differences. The diversity of different bodies is
also very helpful for protection against invasions. Diversity comes from two aspects:
one is the body's own diversity, the other is the diversity of the biological immune
system. The combination of the two aspects greatly increases the overall diversity and is
very important for the protection of the body. In the field of computer system security,
diversity can likewise be achieved in two aspects: the diversity of computer operating
systems and the diversity of the artificial immune system.
• Disposability:
No immune cell in the biological immune system is indispensable. Every
immune cell has a lifecycle. In the study of artificial immune systems, we can
borrow this mechanism to implement a lifecycle for immune antibodies.
• Autonomy:
The biological immune system does not require a central control node. Immune cells
can automatically recognize and destroy invading antigens and utilize the
illness and death of immune cells to renew themselves, achieving the
immunologic function on their own.
• Adaptability:
The biological immune system is able to learn newly found invading
pathogens and form a memory of them. The response to a repeated invasion by the
same pathogen is then accelerated. The learning mechanisms of the biological immune
system are very important to the artificial immune system. The artificial immune
system should not only remember the abnormal immune information found in the
past, but also dynamically learn the immune rules to handle emerging unknown
anomalies.
• No secure layer:
• Anomaly detection:
The biological immune system is able to recognize
pathogens that it has never seen before. This phenomenon is called anomaly detection.
This feature helps the artificial immune system to
detect unknown anomalies or to find new viruses in the field of
computer security.
• Incomplete detection:
• Numbers game:
The numbers game mainly refers to the timing of the invasion and of the protective response.
The immune response must be faster than the invasion, otherwise the immune protection
will be overwhelmed by the invasion. Researchers of artificial immune systems indicate that
more attention should be paid to keeping the system lightweight.
models achieved great success. In the artificial immune algorithm, the antigen corresponds
to the objective function for solving problems and constraints, the antibody
suspended, it was the best match with the antigen-antibody, which has been optimized
to the solution that solves the problem successfully.
Step 4 Check the lifecycle of each antibody and update the antibody,
Step 5 If the abort condition is met, go to Step 6; otherwise return to Step 3,
Step 3 Abort condition judgment; if the variant does not contain a sufficient
concentration detector, return to Step 2, otherwise abort.
and achievements, attracting more and more researchers. Nevertheless, these artificial
immune systems and malware detection methods are not perfect. Most of them have
some deficiencies and shortcomings, which stimulates researchers to explore
more efficient models and algorithms.
1.5 ARTIFICIAL IMMUNE SYSTEM MODELS AND
ALGORITHMS
1.5.1 Negative Selection Algorithm
Inspired by the generation process of T cells in immune systems, Forrest
and associates proposed a negative selection algorithm, which has become
one of the most famous AIS algorithms. Biological immune systems have
the ability to distinguish between self cells and nonself cells, which enables
them to recognize invading antigens. T cells play a key role in this process.
The generation of T cells includes two stages: the initial generation stage
and the negative selection stage.
First, T-cell receptors are generated by a random combination of genes. To
avoid erroneous recognition of self, T cells are filtered in the
thymus (i.e., the negative selection process). The T cells that recognize
self cells are removed, while the others, which pass this filter, are
able to participate in the immune response. Forrest and associates applied
the same principle to the distinction of self and nonself in computer systems.
They generated the detector set by a negative selection process to recognize
the nonself that invaded the computers.
The process of the negative selection algorithm is shown in Algorithm 3. The
negative selection algorithm includes the detector set generation stage and
the nonself detection stage. In the detector set generation stage, the self
gene library is constructed from the self files. Then the detector set is
randomly generated. The detectors that match the self gene library are
removed from the detector set according to the negative selection principle.
The main role of the detector set is to cover the nonself data space as fully
as possible.
Therefore, the number of detectors tends to be large. In the nonself detection
stage, the algorithm applies r-contiguous bits matching between the sample
and each detector in the detector set, one by one. Once a
match occurs, the sample is labeled as "nonself."
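The two stages described above can be sketched on short bit strings. This is an illustrative sketch, not the Algorithm 3 referred to in the text: the function names, the 8-bit string length, the value r = 4, and the tiny self set are assumptions. Two strings are taken to match when they agree in at least r contiguous positions.

```python
import random

def r_contiguous_match(a, b, r):
    """True if bit strings a and b agree in at least r contiguous positions."""
    run = 0
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0
        if run >= r:
            return True
    return False

def generate_detectors(self_set, n_detectors, length, r,
                       seed=0, max_tries=100_000):
    """Detector generation stage: keep random strings that match no self string."""
    rng = random.Random(seed)
    detectors = set()
    for _ in range(max_tries):
        if len(detectors) >= n_detectors:
            break
        candidate = "".join(rng.choice("01") for _ in range(length))
        # Negative selection: candidates that match self are discarded.
        if not any(r_contiguous_match(candidate, s, r) for s in self_set):
            detectors.add(candidate)
    return detectors

def classify(sample, detectors, r):
    """Nonself detection stage: compare the sample with each detector in turn."""
    if any(r_contiguous_match(sample, d, r) for d in detectors):
        return "nonself"
    return "self"

self_set = {"00000000", "00001111"}  # hypothetical self gene library
detectors = generate_detectors(self_set, n_detectors=10, length=8, r=4)

for s in sorted(self_set):
    print(s, classify(s, detectors, 4))
```

Because the matching rule is symmetric, a string from the self set can never be labeled "nonself" by detectors that survived the generation stage. Coverage of the nonself space, on the other hand, depends on how many detectors are generated, which is why the detector set tends to be large.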
8. Continue.
9. End if
12. D ← D ∪ N.
4. P ← Pr ∪ M.
5. Select n antibodies with best affinities from P, denote the set as Pn.
6. C ← ∅
8. Clone a to get new antibodies, denote the set as A, the size of the set A is
9. C ← C ∪ A
13. C∗ ← C∗ ∪ a
15. Select the best antibodies from C∗; replace M with the best set.
17. Replace the d antibodies with lowest affinities in Pr and M with antibodies
in Nd.
18. End while.
In accordance with Danger theory, Bretscher and Cohn proposed a two-signal model.
According to this model:
Signal 1: used for antigen recognition, that is, to determine whether the cell is a foreign cell.
Signal 2: used for co-stimulation; it indicates that the cell is really dangerous.
In accordance with the two-signal model, danger theory operates in three steps:
Step 1: Become activated if you receive signals one and two together. Die if you receive signal
one in the absence of signal two. Ignore signal two without signal one.
Step 2: Accept signal two from antigen-presenting cells only. Signal one may come from any cell.
Step 3: After activation, revert to the resting state after a short time.
The challenge is clearly to define a suitable danger signal. The danger signal helps to identify
which subset of feature vectors is of interest. A suitably defined danger signal overcomes many
of the limitations of self-nonself selection. It restricts the domain of nonself to a manageable
size, removes the need to screen against all self, and deals adaptively with scenarios where
self (or nonself) changes over time.
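The three steps can be written down as a small state machine. This is only a sketch of the rules as stated above: the State enum, the respond function, the Lymphocyte class, and the active_ticks parameter (standing in for "a short time") are hypothetical names introduced for illustration.

```python
from enum import Enum

class State(Enum):
    RESTING = "resting"
    ACTIVATED = "activated"
    DEAD = "dead"

def respond(signal_one: bool, signal_two: bool,
            signal_two_from_apc: bool = True) -> State:
    """Apply Steps 1 and 2 to a single encounter."""
    # Step 2: signal two only counts when it comes from an antigen-presenting cell.
    costimulated = signal_two and signal_two_from_apc
    if signal_one and costimulated:
        return State.ACTIVATED   # Step 1: both signals together
    if signal_one:
        return State.DEAD        # Step 1: signal one without signal two
    return State.RESTING         # Step 1: signal two alone is ignored

class Lymphocyte:
    """Tracks one cell over time, including the Step 3 timeout."""

    def __init__(self, active_ticks: int = 3):
        self.state = State.RESTING
        self.active_ticks = active_ticks  # assumed length of "a short time"
        self._remaining = 0

    def receive(self, signal_one, signal_two, signal_two_from_apc=True):
        if self.state is State.DEAD:
            return self.state
        outcome = respond(signal_one, signal_two, signal_two_from_apc)
        if outcome is State.ACTIVATED:
            self.state = State.ACTIVATED
            self._remaining = self.active_ticks
        elif outcome is State.DEAD:
            self.state = State.DEAD
        return self.state

    def tick(self):
        # Step 3: an activated cell reverts to resting after a short time.
        if self.state is State.ACTIVATED:
            self._remaining -= 1
            if self._remaining <= 0:
                self.state = State.RESTING
        return self.state

cell = Lymphocyte(active_ticks=2)
print(cell.receive(True, True))   # both signals
print(cell.tick(), cell.tick())   # time passes
```

In an anomaly detection setting, signal one would correspond to a detector matching a feature vector, and signal two to whatever danger signal the application defines.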
Immune Concentration:
Conventional radio-immunoassays (RIA) and conventional enzyme immunoassays (EIA) are
highly sensitive but may require several hours to obtain results. In these conventional assays, the
solid phase is often very large, e.g. a plastic bead, and much of the time required to perform the
assay is a result of inefficient or slow diffusion of the antigen to the solid phase. The current trend
in clinical laboratory and physician office testing is towards highly sensitive assays that are quick,
easy to perform and require minimal instrumentation. The use of membrane or glass fibre filters
as the assay solid phase has recently given rise to a new family of immunoassays that meet these
criteria.
The ICON® (immunoenzymatic assay, Hybritech) device combines a large solid phase surface
with directional flow of sample. The sample is passed through a membrane (the assay solid phase),
thus concentrating and accelerating the binding of antigen. This technology has provided an
answer to the need for simple, highly sensitive assays that give easily interpreted results in minutes
and has also evolved to include such advances as internal references, calibrators and control
The high affinity and specificity of antibodies allow immunological reactions to be used for the
detection of numerous analytes. Antigen-antibody tests are classified according to whether the
test is dependent upon a primary interaction between the antibody and antigen or is dependent
upon a secondary manifestation such as precipitation or agglutination.
The primary interaction is the specific recognition and binding of an antigenic determinant with
the binding sites of its corresponding antibody (1). Quantitative tests dependent entirely on the
primary interaction between antigen and antibody include fluoroimmunoassay,
radioimmunoassay, and immunoenzymatic assays.
The primary tests are more sensitive than the secondary tests and are not dependent upon the
variables which control secondary reactions. Primary tests require a purified antigen or antibody
in solution and a technique to quantitate the antigen or antibody. Labelling of antibodies with
radioisotopes, fluorophores, or enzymes has been a useful technique in detecting small quantities
of biological substances.
While studying the behaviour of radiolabelled insulin, Berson and Yalow made several
observations that led to the development of a radioimmunoassay for plasma insulin (2). Since then
a wide range of highly sensitive and specific radioimmunoassays have been developed for a variety
of compounds.
A typical radioimmunoassay requires an antibody specific for the antigen, radiolabelled antigen,
antigen standards, and a method to separate free antigen from that which is bound to antibody. The
binding displacement of radiolabelled antigen by the unknown concentration of antigen in the
sample to be tested is measured by comparing the binding level to that of a standard curve. It is
critical to have a good method for the separation of bound from free labelled antigen.
The analytical sensitivity and specificity of an immunoassay depend on the affinity of the
antibody used and the magnitude of its cross-reactivity with other related compounds. Early
radioimmunoassay strengths included fast reaction kinetics and capability for a wide range of
measurement. A major weakness is the complexity of separating bound from free antigen. Often,
this requires the addition of another reagent such as a 'second antibody' (antibodies directed against
the species from which the immunoassay antibodies were derived) as well as a centrifugation step.
Problems with reproducibility and non-specific binding have been a significant cause of assay
variability and a limitation to the utility of this procedure.
Solid phase radio-labelled immunoassays: Catt and Tregear (3) noted that antibodies adsorbed to
polystyrene or polypropylene tubes could be used in a solid-phase radioimmunoassay. In
sequential solid phase immunometric immunoassays, a wash step is required to remove unbound
antigen before the addition of a labelled antibody. Simultaneous assays utilise antibodies binding
to two distinct sites on the antigen, and therefore no wash step is required between the addition of
antigen and labelled antibody. However, both procedures require the separation of bound from free
radiolabelled antibody.
Characteristics of AIS:
The BIS has been evolving for hundreds of millions of years and plays a very important role in
protecting the body from bacterial invasion. Although the immune system may sometimes
encounter problems, generally speaking its protective effect is unique. The working
principles of the BIS can inspire and inform research on security protection technologies for
computer systems, providing a brand new way of thinking about computer
security, if the computer systems are seen as "human bodies" and the external intrusions as
"harmful viruses."
Immunity refers to the ability of the body to distinguish self from nonself and to exclude
nonself. The BIS is the body's natural system for resisting disease and preventing
invasion by harmful bacteria. This system has many characteristics, some of which
are significant for research on computer system security.
1. Distributed detection: The immune system works by distributed detection, in which
the "detectors" that detect bacterial invasion are very small but highly efficient, and
neither a centralized control center nor collaboration is required. Computer security systems
are not equipped with distributed detection, and the use of a control center actually
reduces the safety margin of the system.
2. Detection of abnormality: The immune system is able to identify invading bacteria that it
has never seen before and take corresponding measures. The specific targets of current
computer security protection systems are generally decided by the protective strategies or the
protection system itself, without automatic detection of the latest forms of intrusion
[85,124,218].
3. Learning and memory: The immune system is able to automatically learn the structure of
invading bacteria and memorize this information in order to respond to the same type of bacteria
faster and in a more timely manner later on. Current computer security systems do not have the
ability to learn by themselves.
4. Diversity: Different biological bodies have different immune systems. A weakness of
one immune system is not necessarily a weakness of another. A virus might be able to break
through one protective immune system, but the possibility of breaking through other immune
systems is very small. Thus, immune systems have a strong ability to protect the overall
population. For computer systems, in contrast, the security systems are always the same. Once a
loophole is found, any computer system using this kind of security system is exposed to the
threat of invasion through this loophole.
5. Incomplete detection: The immune system does not need to perform a nonself test on every
invading cell. It has great flexibility and may sacrifice a portion of the body's functions or
resources in order to ensure the normal functioning of the body in general. Computer security
systems generally cannot analyze the system as a whole, and their functions are generally
specific and fixed.
Control engineering: AIS can readily be used as a feedback controller based on the
principles of fast response and rapid determination of foreign intrusions. It has been applied to
a car's rear-collision prevention system, comprehensively processing the signals transmitted from
sensors and controlling each actuator so that it executes the corresponding operation quickly and
accurately.
Fault diagnosis: A distributed diagnosis system that combines an immune network with learning
vector quantization (LVQ) can be used to accurately detect the sensors in which a failure occurs
in the controlled object. This system has two modes: training mode and diagnosis mode. In the
training mode, data from normally working sensors are learned through LVQ; in the diagnosis
mode, the immune network determines the faulty sensors based on the knowledge acquired by LVQ.
Experiments show that the system can automatically identify the failed sensors in a group of
working sensors, whereas in the past this was implemented by checking the output of each sensor
independently. The self-learning ability of the immune system is also used in monitoring
systems for computer hardware, in which the system marks the area where the fault occurs and
takes appropriate recovery actions once the computer hardware system goes wrong.
4. Optimized design: For nonlinear optimization problems with multiple local
minima, general optimization methods have difficulty finding the global optimal
solution, while a genetic mechanism based on the diversity of the immune system can be used
for optimal search. It can avoid premature convergence, improving on the genetic
algorithm, and can deal with multi-criteria problems. It is currently used for function testing,
the traveling salesman problem, VLSI layout, structure design, parameter correction of permanent
magnet synchronous motors, etc.
5. Data analysis: AIS has the ability of data analysis and classification by combining the
advantages of classifiers, neural networks, and machine Therefore, it has been used in the fields of
data mining and information processing. J. Timmis discusses how to implement an unsupervised
and self-learning AIS specifically.
6. Virus detection: Based on the immune system's ability to distinguish self from nonself,
Forrest proposed principles and laws of BIS that AIS can take inspiration from and has
done a lot of research work. Taking inspiration from the mechanism by which the BIS resists and
destroys unknown biological viruses, T. Okamoto proposed a distributed agent-based antivirus
system. It consists of two parts: the immune system and the recovery system. The function of the
immune system is to identify nonself information (computer viruses) by grasping the self
information; the recovery system copies files over the network from a noninfected computer to
the infected computer to overwrite the files on it. Based on the same principles,
AIS is also used for hacking prevention, network security maintenance, and system maintenance.