independent study report # 1
8/4/2019 Independent Study Report # 1
Independent Study Report
Artificial Immune Systems
1. Introduction:
The biological immune system is a complex, adaptive system that defends the body against attack by antigens or pathogens. It is able to differentiate between self cells and non-self cells. This is possible because the system acts as a distributed, parallel force with the intelligence to take appropriate action from both a local and a global view, using its network of chemical messengers for communication.
There are two major branches of the immune system:
1. The innate immune system, a static system that identifies and destroys antigens;
2. The adaptive immune system, which reacts to unknown antigen patterns and develops a response to the encountered antigens that can remain within the body for a long time.
This remarkable information-processing capability of the biological immune system has caught the attention of computer engineers around the world for its application in computer security, anomaly detection, fault tolerance, pattern recognition, etc.
The field has also found application in robotics and, in some cases, in optimization tasks.
2. Overview of Biological Immune Systems:
The biological immune system has evolved over millions of years into an elaborate defense system. It employs multilevel, overlapping defenses that operate in a parallel and distributed way, although its mechanisms (innate and adaptive) and processes (humoral and cellular) are not yet completely understood.
The biological immune system responds to an attack either by neutralizing the antigenic effect or by destroying the antigen. The response depends on the type of antigen and the way it enters the body.
The crucial features of the biological immune system are:
a. Affinity (matching)
b. Diversity
c. Distributed operation (no central mechanism)
Affinity, or matching degree, refers to the strength of binding between an antibody and an antigen.
Diversity means that there must be a large number of different antibody types that can act as keys to antigen locks.
Distributed control means that no central mechanism governs the immune response when an antigen attacks. Instead, there are local interactions between immune cells and antigens.
Two types of immune cells play an important role in the immune response:
1. B-cells (bone marrow),
2. T-cells (thymus).
Both types of immune cells originate in the bone marrow, but T-cells migrate to the thymus to mature and then circulate through the body in the blood. There are three types of T-cells, listed below:
a. Helper T-cells
These cells are important for the activation of B-cells.
b. Killer T-cells
These cells attach to foreign invaders and inject destructive chemicals into the antigens, thereby destroying them.
c. Suppressor T-cells
This type of T-cell suppresses autoimmune interactions between cells and thereby contributes to the stabilization of the network.
On the other hand, the B-cells are responsible for producing the antibodies that bind to antigens and cause them to be eliminated. Each B-cell produces only one type of antibody (of which there are millions).
In the figure below, I-II show the invader entering the body and activating the T-cells, which then in IV activate the B-cells; V is the antigen matching, VI the antibody production and VII the destruction of the antigens.
Figure (1) Immune system Cells [6]
From the above description one can say that the innate immune system is responsible for the primary response and the adaptive immune system for the secondary response.
Hence, the human body is protected against foreign invaders by a multilevel system.
The defenses of the biological immune system include the skin, the respiratory system, destructive enzymes and stomach acids. The immune system is divided into two parts:
1. Innate immunity (non-specific);
2. Adaptive immunity (specific).
The two systems are linked and influence each other.
Adaptive immunity is in turn of two types:
a. Humoral immunity,
b. Cell-mediated immunity.
1. Innate immunity:
This immunity is congenital. pH, temperature and chemical mediators create unfavorable living conditions for foreign organisms. Extracellular molecules are ingested by macrophages, and this ingestion process is influenced by chemical messengers called lymphokines. Foreign surfaces typically lack sialic acid, which allows the complement component C3b to remain bound to them for a longer time. As a result, the membrane attack complex (MAC) is formed, which penetrates the cell surface and kills the foreign cell.
2. Adaptive Immunity:
It is crucial for learning and memory.
a. Humoral Immunity
This kind of immunity is mediated by antibody molecules contained in the body fluids, known as humors. It involves the interaction between B-cells and antigens and the subsequent proliferation of B-cells and formation of memory cells. When an antibody binds to an antigen, the antigen can be destroyed in several ways. For instance, antibodies can cross-link antigens, forming clusters that are more readily ingested by macrophages.
b. Cell Immunity
As the name indicates, this immunity is cell-mediated, and T-cells are responsible for it. Cytotoxic T-cells participate in cell-mediated immune reactions by killing altered self cells. Cytokines secreted by TDH cells can also mediate this kind of cellular immunity.
3. Artificial Immune Systems Basic Concepts
3.1 Initialization and Encoding:
In order to implement an Artificial Immune System, four elements need to be considered:
1. Encoding
2. Similarity Measure
3. Selection
4. Mutation
Once an encoding is chosen, a similarity measure is defined in order to calculate the degree of matching. Selection and mutation are then performed repeatedly until the stopping criterion is reached.
The choice of encoding scheme is very important for the algorithm's success. As in genetic algorithms, there is a close relationship between the encoding and the fitness function. In artificial immune systems the fitness function is simply the matching, or affinity.
Two terms need to be considered here, namely antigen and antibody. An antigen is the target or solution for a given problem, for example the data item to be checked or an intrusion into a system. The antibodies are the remaining data, e.g., the other users in the data set or the network traffic.
Antigens and antibodies are encoded in the same way. The most common way is a string representation, where the length is the number of variables, the position is the variable identifier and the entry is the corresponding value of the variable.
For data mining and intrusion detection, a five variable binary problem can be shown as: (10010)
Example:
Data Mining: The problem of recommending movies.
The encoding has to represent a user's profile with respect to the movies seen and how much each was liked or disliked. A list of numbers representing the votes can serve as the encoding. The votes can be binary, or they can be integers in a range such as [0,5], where 0 indicates that the movie was not liked and 1 to 5 rate how much the movie was appreciated.
A possible encoding scheme for movie recommendation:
$User = \{\{id_1, score_1\}, \{id_2, score_2\}, \ldots, \{id_n, score_n\}\}$ (1)
id = identifier of a movie
score = the user's score for that movie.
Intrusion Detection:
The encoding looks like:
[ ], example: [
which represents an incoming data packet sent to port 25. In these scenarios, wildcards such as "any port" are also often used [2,4].
3.2 Similarity or Affinity Measure
The matching degree is one of the most important design choices when developing an Artificial Immune System algorithm. Two matching algorithms for the binary representation are described below.
Consider the two strings:
(0 0 0 0 0) and (0 0 0 1 1)
A bit-by-bit comparison shows that the strings differ in the last two bits and agree in the first three, so the matching score is 3. This way of scoring is the opposite of the Hamming distance, which counts the bits that differ, i.e. the bits that would have to be changed to make the strings identical.
Now consider the strings (00000) and (01010). Once again the score is 3, even though the matching bits are distributed quite differently, which can be a problem. To avoid this anomaly, we can look for runs of consecutive matching bits and take the length of the longest run as the similarity measure. For the first example the score is then 3, and for the second example it is 1. If we do not want to use a binary representation, a real-valued representation is available, in which case we can use the Euclidean distance between the two vectors.
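The two binary matching rules described above can be sketched in a few lines of Python. This is only an illustration; the function names `matching_bits` and `longest_contiguous_match` are chosen here and do not come from the cited references.

```python
def matching_bits(a, b):
    """Count the positions at which two equal-length bit strings agree."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(x == y for x, y in zip(a, b))


def longest_contiguous_match(a, b):
    """Length of the longest run of consecutive positions that agree."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    best = run = 0
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0  # extend the run or reset it
        best = max(best, run)
    return best


# The two examples from the text.
print(matching_bits("00000", "00011"), longest_contiguous_match("00000", "00011"))
print(matching_bits("00000", "01010"), longest_contiguous_match("00000", "01010"))
```

Both pairs have a plain matching score of 3, but the contiguous rule separates them (3 versus 1), which is exactly the anomaly the second rule is meant to resolve.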
For data mining, the matching degree is referred to as correlation. In the movie recommendation example, suppose we are looking for the users in the data set that are similar to the main user's profile. What we are trying to measure is their similarity, and for this we can use the Pearson correlation coefficient between two users.
Let u and v be two users:
$r = \frac{\sum_{i=1}^{n} (u_i - \bar{u})(v_i - \bar{v})}{\sqrt{\sum_{i=1}^{n} (u_i - \bar{u})^2 \, \sum_{i=1}^{n} (v_i - \bar{v})^2}}$ (2)
Here n is the number of movies for which both u and v have voted, $u_i$ is the vote of user u for movie i and $\bar{u}$ is the average vote of user u over all movies ($v_i$ and $\bar{v}$ are defined likewise for user v). The measure is amended so that it defaults to a value of 0 if the two users have no movies in common.
The output ranges from -1 to 1, indicating strong disagreement and strong agreement respectively; 0 means no correlation. For data mining, values close to 1 and -1 are the most informative.
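As a rough sketch, the Pearson correlation between two user profiles could be computed as follows. The dictionary-based profile format, the function name `pearson` and the choice to return 0 when the denominator vanishes are assumptions made for this illustration; the averages are taken over all of a user's votes, as the text describes.

```python
import math


def pearson(votes_u, votes_v):
    """Pearson correlation between two users.

    votes_u, votes_v: dicts mapping a movie id to that user's vote
    (hypothetical profile format). Returns 0.0 when the users share no movies.
    """
    common = sorted(set(votes_u) & set(votes_v))
    if not common:
        return 0.0  # default when there are no movies in common
    mean_u = sum(votes_u.values()) / len(votes_u)  # average over all votes of u
    mean_v = sum(votes_v.values()) / len(votes_v)
    num = sum((votes_u[m] - mean_u) * (votes_v[m] - mean_v) for m in common)
    den = math.sqrt(sum((votes_u[m] - mean_u) ** 2 for m in common)
                    * sum((votes_v[m] - mean_v) ** 2 for m in common))
    # Assumption: treat a zero denominator (constant votes) as "no correlation".
    return num / den if den else 0.0


alice = {"A": 5, "B": 3, "C": 1}
bob = {"A": 4, "B": 3, "C": 2}
carol = {"A": 1, "B": 3, "C": 5}
print(pearson(alice, bob), pearson(alice, carol))
```

Users with the same ordering of preferences score close to 1 and users with opposite preferences score close to -1, which is why both extremes are useful for recommendation.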
In the negative selection algorithm, the elements that match are eliminated. This reflects the fact that B-cell maturation tolerates no matching with self molecules or cells.
The question now is where negative selection is applied in the implementation of artificial immune systems.
Consider intrusion detection.
One approach to intrusion detection problems is to define a self set S and then randomly initialize a set of detectors. The detectors are passed through a matching algorithm that compares them with the self set. Any detector that matches self is rejected, so that only the elements that do not match self remain. These non-matching elements make up the resulting detector set.
This detector set is then used to monitor the network continually. If a detector matches, this is a sign of danger and an alert is raised.
Artificial Immune Systems, a branch of Computational Intelligence that emerged in the 1990s, are used in computer security, pattern recognition, etc. [2,4,6].
4. Biological Immune System Models
4.1 Negative Selection Principle
It is well established that the thymus is responsible for the maturation of T-cells and is shielded by a blood barrier that excludes non-self antigens from it. Hence, the majority of the cells present in the thymic environment are self rather than non-self. As a result, T-cells carrying receptors that recognize self cells are eliminated from the repertoire in the thymus through a biological process termed negative selection. All the mature cells that leave the thymic environment are self-tolerant, i.e. they do not respond to self cells.
From an information-processing point of view, negative selection performs pattern recognition by storing information about the complement (non-self) of the patterns to be identified. Taking inspiration from biology, a negative selection algorithm has been put forward for anomaly detection and fault tolerance.
Define the set that has to be protected and call it the self set (P). Generate a set of detectors (M) that detect all the elements not belonging to P. The negative selection algorithm goes as follows:
1. Produce random candidate elements (C);
2. Compare the elements of C with those of P. If an element of C matches an element of P, discard it; otherwise store it in M.
Once the set M has been created, the next step is to monitor the system for non-self patterns. Consider a set P* to be monitored. P* consists of the elements of P plus some new patterns, or it can be an entirely new set. For every element of M, each of which corresponds to a non-self pattern, check whether it matches an element of P*. If it does, a non-self pattern has been recognized and an action is taken. [12]
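The censoring and monitoring phases described above can be sketched as follows. This is a minimal illustration under stated assumptions: patterns are bit strings, two strings "match" when they agree in at least `threshold` positions, and the names `generate_detectors`, `monitor` and `max_tries` are hypothetical.

```python
import random


def matches(detector, pattern, threshold):
    """Assumed matching rule: agreement in at least `threshold` positions."""
    return sum(d == p for d, p in zip(detector, pattern)) >= threshold


def generate_detectors(self_set, n_detectors, length, threshold, rng, max_tries=10000):
    """Censoring phase: keep random candidates (C) that match no element of self (P)."""
    detectors = []
    for _ in range(max_tries):
        if len(detectors) == n_detectors:
            break
        candidate = "".join(rng.choice("01") for _ in range(length))
        if not any(matches(candidate, s, threshold) for s in self_set):
            detectors.append(candidate)  # stored in the detector set M
    return detectors


def monitor(detectors, observed, threshold):
    """Monitoring phase: return the observed patterns flagged as non-self."""
    return [p for p in observed
            if any(matches(d, p, threshold) for d in detectors)]


self_set = ["00000000", "00001111"]
detectors = generate_detectors(self_set, 5, 8, 6, random.Random(0))
print(monitor(detectors, self_set + ["11111111", "10101010"], 6))
```

By construction no detector matches a self pattern, so elements of P are never flagged; whether a particular new pattern is flagged depends on how well the random detectors happen to cover the non-self space.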
Figure (2) Negative Selection Principle [12]
4.2 Clonal Selection
Clonal selection is the theory used to describe how an immune response is mounted when a non-self pattern is recognized by a B-cell; it is complementary to negative selection. The figure shows clonal selection, proliferation and affinity maturation. When a B-cell recognizes an antigen with a certain degree of affinity, it is selected to proliferate and produce a high volume of antibodies, which bind to the antigens and lead to their elimination with the aid of other immune cells. The proliferation is asexual: it is a mitotic process in which cells divide themselves. The B-cell clones undergo hypermutation, which yields B-cells with a higher affinity for the antigen. Some B-cells also become memory cells.
From the computational point of view:
1. An antigen selects immune cells to proliferate. The rate of proliferation is directly proportional to the affinity: the higher the affinity, the higher the proliferation.
2. The mutation rate is inversely proportional to the affinity.
Figure (3) Clonal selection [12]
Clonal selection resembles a genetic algorithm without the crossover operator. However, the genetic algorithm has no affinity-proportional reproduction and mutation. The CLONALG algorithm has therefore been proposed to include these properties. It was first proposed for pattern recognition and later modified for optimization tasks.
Let P be the set of patterns to be recognized. The steps of the CLONALG algorithm are then as follows:
1. Generate a population (M) of individuals randomly.
2. Present each pattern of P to the population M and determine its affinity with every element of M.
3. Identify the individuals of M that have the best affinity. Produce copies of these elements in proportion to their affinity with the antigen: the higher the affinity, the larger the number of copies.
4. Mutate all the copies at a rate inversely proportional to their affinity with the input pattern: the higher the affinity, the lower the mutation rate.
5. Add the mutated elements to the set M and select the best matured elements. These are the memories of the system.
6. Iterate steps 2 to 5 until a certain criterion is met, such as a minimum pattern recognition or classification error.
This algorithm makes Artificial Immune Systems good at pattern recognition: CLONALG learns to recognize patterns through evolutionary-like behavior. [12]
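The six steps above can be condensed into a small sketch. It is not the reference CLONALG implementation: bit-string patterns, the clone count `1 + affinity`, the mutation schedule `max_rate * (1 - affinity / length)`, a fixed number of generations as the stopping criterion, and all parameter names are assumptions made for illustration.

```python
import random


def affinity(a, b):
    """Number of positions at which two equal-length bit strings agree."""
    return sum(x == y for x, y in zip(a, b))


def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return "".join(("1" if c == "0" else "0") if rng.random() < rate else c
                   for c in bits)


def clonalg(patterns, pop_size=20, n_best=5, generations=30, max_rate=0.5, seed=0):
    rng = random.Random(seed)
    length = len(patterns[0])
    # Step 1: random initial population M.
    population = ["".join(rng.choice("01") for _ in range(length))
                  for _ in range(pop_size)]
    memory = {}
    for _ in range(generations):          # Step 6: iterate (fixed budget here)
        for antigen in patterns:          # Step 2: present each pattern of P
            ranked = sorted(population, key=lambda ab: affinity(ab, antigen),
                            reverse=True)
            best = ranked[:n_best]        # Step 3: highest-affinity individuals
            clones = []
            for ab in best:
                aff = affinity(ab, antigen)
                rate = max_rate * (1 - aff / length)   # Step 4: less mutation at high affinity
                clones += [mutate(ab, rate, rng) for _ in range(1 + aff)]
            # Step 5: the best matured cell becomes the memory for this antigen.
            champion = max(best + clones, key=lambda ab: affinity(ab, antigen))
            if (antigen not in memory
                    or affinity(champion, antigen) > affinity(memory[antigen], antigen)):
                memory[antigen] = champion
            # Replace the lowest-affinity members with the best clones.
            clones.sort(key=lambda ab: affinity(ab, antigen), reverse=True)
            population = ranked[:pop_size - n_best] + clones[:n_best]
    return memory


memory = clonalg(["11110000", "00001111"])
for antigen, cell in memory.items():
    print(antigen, cell, affinity(cell, antigen))
```

The memory cell for each pattern never gets worse from one generation to the next, because it is only replaced by a candidate with strictly higher affinity.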
4.3 Immune Network
The immune network theory states that the immune system exhibits dynamic behavior even when no antigen is present. How does this happen? It is proposed that the cells and molecules of the immune system are able to recognize each other. The theory has been criticized by many immunologists, but the computational features of the immune network are very important in robotics.
According to this theory, the molecules on the surface of antibodies that are recognized by other antibodies are called idiotopes.
To explain the theory, assume that antibody Ab1 recognizes antigen Ag. Now imagine that Ab1 also recognizes an idiotope of antibody Ab2. Ab1 thus recognizes both Ab2 and Ag, and we say that Ab2 is the internal image of Ag. Such recognition of idiotopes between molecules gives rise to a network of connected cells, i.e. a network of affinities. Within this network, antibody-antibody recognition leads to network suppression, while antibody-antigen recognition leads to network activation and cell proliferation.
The network suppression that results from one antibody recognizing another is modeled by eliminating all but one of the self-recognizing cells.
Figure (4) Immune Network [12]
Let the set P contain the patterns to be recognized.
1. Generate the network population randomly.
2. For every element of P, run CLONALG, which returns the memory cells M* and their coordinates for the current antigen.
3. Calculate the affinity between the elements of M*.
4. Eliminate all but one of the elements of M* whose mutual affinity is greater than a prescribed threshold. The intent is to eliminate redundancy in the network by suppressing self-recognizing elements.
5. Combine the elements remaining after step 4 with the elements retained for each of the antigens presented so far. This gives the set M.
6. Calculate the matching degree between all pairs of elements of M and suppress all but one of the self-recognizing elements.
7. Iterate steps 2 to 6 until the desired result is attained. [12]
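The suppression used in steps 4 and 6 can be illustrated with a short sketch. It assumes bit-string cells, affinity measured as the number of agreeing positions, and a greedy pass that keeps the first cell of each group of mutually recognizing cells; the function name `suppress` is hypothetical.

```python
def affinity(a, b):
    """Number of positions at which two equal-length bit strings agree."""
    return sum(x == y for x, y in zip(a, b))


def suppress(cells, threshold):
    """Keep one representative of every group of cells whose mutual
    affinity is greater than `threshold`."""
    kept = []
    for cell in cells:
        # A cell survives only if no already-kept cell recognizes it.
        if all(affinity(cell, other) <= threshold for other in kept):
            kept.append(cell)
    return kept


print(suppress(["1111", "1110", "0000", "0001"], threshold=2))
```

In this example "1110" is suppressed by "1111" and "0001" by "0000", so two representatives remain out of four cells. The result of a greedy pass depends on the order in which the cells are visited.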
5. Modeling the Biological Immune Systems
5.1 Shape-Space Model:
The interaction between antibody and antigen is of central importance in the immune system. The concept of shape-space was introduced by Perelson and Oster in 1979 to describe the interactions between immune cell molecules and antigens quantitatively.
According to this concept, an antigen can be recognized if it lies within a region around an antibody known as the recognition region. The binding between an antibody and an attacking antigen usually involves short-range non-covalent interactions based on electrostatic charge, hydrogen bonding, van der Waals forces, etc. The molecules must come into contact with each other over a sufficient portion of their surfaces, so there must be extensive regions of complementarity.
The chemical groups present, as well as the shape and charge distributions, are characteristic properties of antigens and antibodies that are crucial in determining the interactions between these molecules. This set of features is called the generalized shape of a molecule [1].
Imagine that the generalized shape of an antibody combining site can be described by L parameters: the length, height and width of any bump or groove in the combining site, its charge, etc. The precise number of parameters and their values are not important here. A point in an L-dimensional space, called shape-space, then specifies the generalized shape of an antigen-binding region with regard to its antigen-binding properties.
If an organism has a repertoire of size N, the shape-space contains N points. These points lie in a finite volume V of the space, because there is only a limited range of lengths, widths, charges, etc. that an antibody combining site can assume. Since Ag-Ab interactions are measured via regions of complementarity, antigenic determinants (epitopes) are also characterized by generalized shapes, whose complements lie within V.
Antigen and antibody do not have to match exactly; they may also bind with lower affinity. Each paratope therefore interacts with all the epitopes whose complements lie within a small surrounding region of volume Ve, characterized by the radius e.
Since each antibody can recognize all the epitopes within its recognition region of volume Ve, and an antigen can present different types of epitopes, a finite number of antibodies can recognize an almost infinite number of points within the volume Ve. This is related to the cross-reactivity phenomenon in biological immune systems. In the shape-space model, similar patterns therefore occupy adjacent regions of the shape-space and may be recognized by the same antibody shape, provided that e is large enough [6].
Figure (5) Shape-Space Model [6]
5.2 Ag-Ab Representations and Affinities:
The Ag-Ab representation determines the distance measure that can be used to calculate the degree of interaction between these molecules.
Mathematically, there are three ways to represent antibody-antigen pairs and to determine their matching strength:
1. Euclidean shape-space
2. Manhattan shape-space
3. Hamming shape-space [4]
The generalized shape of a molecule (m), either antibody or antigen, can be represented by a set of real-valued coordinates $m = \langle m_1, m_2, \ldots, m_L \rangle$, so that m is a point in an L-dimensional real-valued shape-space.
The affinity between antibody and antigen is related to the distance between the two strings or vectors, measured for example by the Euclidean or the Manhattan distance. If the coordinates of an antibody are given by $\langle ab_1, ab_2, \ldots, ab_L \rangle$ and the coordinates of an antigen by $\langle ag_1, ag_2, \ldots, ag_L \rangle$, then the distance (D) between them is:
$D = \sqrt{\sum_{i=1}^{L} (ab_i - ag_i)^2}$ (3)
$D = \sum_{i=1}^{L} |ab_i - ag_i|$ (4)
Eqn (3) gives the Euclidean distance and Eqn (4) the Manhattan distance.
Shape-spaces that use real-valued coordinates and measure distance in the form of eq (3) are called Euclidean shape-spaces, and those that use the form of eq (4) are called Manhattan shape-spaces.
Another shape-space is the Hamming shape-space, in which antigens and antibodies are represented as sequences of symbols over an alphabet of size k. Such sequences can be interpreted as peptides, and the different symbols as characteristic properties of amino acids. In the context of artificial immune systems, sequence and shape are treated as equivalent.
$D = \sum_{i=1}^{L} \delta_i$, where $\delta_i = 1$ if $ab_i \neq ag_i$ and $\delta_i = 0$ otherwise (5)
Equation (5) gives the Hamming distance measure.
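The three distance measures of equations (3) to (5) translate directly into code. The sketch below assumes that antibody and antigen are given as equal-length sequences; the function names are chosen for this illustration.

```python
import math


def euclidean(ab, ag):
    """Equation (3): Euclidean distance between real-valued coordinates."""
    return math.sqrt(sum((a - g) ** 2 for a, g in zip(ab, ag)))


def manhattan(ab, ag):
    """Equation (4): Manhattan distance between real-valued coordinates."""
    return sum(abs(a - g) for a, g in zip(ab, ag))


def hamming(ab, ag):
    """Equation (5): number of positions at which the symbols differ."""
    return sum(a != g for a, g in zip(ab, ag))


print(euclidean([0, 0], [3, 4]))   # real-valued shape-space
print(manhattan([0, 0], [3, 4]))
print(hamming("10010", "01010"))   # symbols over an alphabet of size k = 2
```

The Hamming version works for any alphabet, since it only tests symbols for equality.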
Equations (3) to (5) show how to determine the affinity between molecules in Euclidean, Manhattan and Hamming shape-spaces, respectively. In order to study cross-reactivity, it is important to establish the relation between the distance D, the recognition region and the matching threshold.
When the distance between two sequences is maximal, the molecules are exact complements of each other and their affinity is also maximal. When the match is not perfect, real-valued spaces have to be treated differently from Hamming spaces in measuring Ag-Ab interactions.
In Euclidean and Manhattan shape-spaces, a limit on the magnitude of each shape-space parameter can be imposed. Moreover, the distance can be normalized, for example over the interval [0, 1], so that the matching strength also lies in the same range.
If we assume a binary representation of the Ag-Ab interaction, the graphical interpretation is clear in Hamming shape-space. In the universe of bitstring representations, perfect molecular binding takes place when the bitstrings are complementary to each other. For example,
ab =
ag =
Figure (6) Antigen- Antibody perfect matching using bit-string representation [6]
The affinity between antibody and antigen is the number of complementary bits in the two strings, which can be computed with the XOR operator. The expected matching strength between two randomly chosen bitstrings equals half their length (if they are of the same length).
A binding value indicates whether two molecules are bound or not, in other words whether the antigen is recognized by the antibody. Several activation functions can be used to compute the binding value from the distance between the Ab and Ag molecules.
With a threshold function, a bond is established only when the match score is at least (L - e).
In the continuous case a sigmoid function is a good choice, with e determining the inflection point of the curve.
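A small sketch may help to connect the XOR affinity with the two kinds of binding function. The bit strings are represented as lists of 0/1 integers, and the exact sigmoid form (unit slope, inflection at a match score of L - e) is an assumption made here; the text only states that e fixes the inflection point.

```python
import math


def xor_affinity(ab, ag):
    """Number of complementary bits (XOR equal to 1) between two bit lists."""
    return sum(a ^ g for a, g in zip(ab, ag))


def threshold_binding(match_score, length, e):
    """Binding value 1 when the match score reaches L - e, else 0."""
    return 1 if match_score >= length - e else 0


def sigmoid_binding(match_score, length, e):
    """Continuous binding value; assumed form with inflection at L - e."""
    return 1.0 / (1.0 + math.exp(-(match_score - (length - e))))


ab = [1, 0, 1, 0, 1, 1]
ag = [0, 1, 0, 1, 0, 1]          # complementary to ab except in the last bit
score = xor_affinity(ab, ag)
print(score, threshold_binding(score, len(ab), 0), threshold_binding(score, len(ab), 1))
print(sigmoid_binding(score, len(ab), 1))
```

With e = 0 only a perfect complement binds, while e = 1 tolerates one non-complementary bit, which is how the threshold introduces cross-reactivity.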
In the Hamming shape-space, the set of all possible antigens is considered as a space of points, where antigenic molecules with similar shapes occupy adjacent points in the space. The total number of unique antibodies and antigens is $k^L$, where k is the size of the alphabet and L is the length of the string.
A given antibody recognizes some set of antigens and therefore covers some portion of the shape-space. The matching threshold e determines the coverage provided by a single antibody. When e = 0, a perfect match is necessary, which means that antibody and antigen must be exact complements of each other.
The number of antigens covered within a region of radius e is given by:
$C = \sum_{i=0}^{e} \binom{L}{i} = \sum_{i=0}^{e} \frac{L!}{i!\,(L-i)!}$ (6.1)
C = coverage of the antibody,
L = length of the bitstring,
e = matching threshold.
On the basis of eqn (6.1), for bitstrings of length L and a matching threshold e, the minimum number of antibody molecules (N) necessary to cover the shape-space completely can be defined as
$N = \mathrm{ceil}\left(\frac{k^L}{C}\right)$ (6.2)
where k = 2 for bitstrings and ceil is the operator that rounds the value in parentheses up to the nearest integer [2,4,6].
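The coverage of equation (6.1) and the minimum repertoire size of equation (6.2) are easy to evaluate numerically. The sketch below assumes the binary case (k = 2 by default) and uses integer arithmetic for the ceiling; the function names are illustrative.

```python
from math import comb


def coverage(length, e):
    """Equation (6.1): number of bit strings within Hamming radius e."""
    return sum(comb(length, i) for i in range(e + 1))


def min_antibodies(length, e, k=2):
    """Equation (6.2): ceil(k**length / coverage), computed with integers."""
    c = coverage(length, e)
    return -(-(k ** length) // c)   # integer ceiling division


for e in range(3):
    print(e, coverage(8, e), min_antibodies(8, e))
```

For L = 5 and e = 1, a single antibody covers 1 + 5 = 6 of the 32 possible strings, so at least 6 antibodies are needed. The value is a lower bound, since it assumes that the recognition regions do not overlap.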
6. The AIS Model
The artificial immune system model proposed by J.D. Farmer and N.H. Packard is simple enough to simulate on a computer, yet contains enough realism to embody the characteristic properties of the network. The model leaves out many important features, such as T-cells and macrophages, but retains the essence of the idiotypic network.
The sequences of amino acids that specify the chemical properties of the epitope and paratope are represented as binary strings. In effect, the antibodies are viewed as being composed of just two amino acids, 0 and 1. Alternatively, a sequence of five binary digits can correspond to one amino acid, so that all twenty amino acids can be represented. A further simplification is that each antigen and antibody has only one epitope, whereas in reality an antigen or antibody has many different epitopes [5].
Thus, an antibody is represented as (p, e), where p is the paratope string and e is the epitope string. The allowed reactions between different antibodies, and between antibodies and antigens, are found by searching for complementary matches between strings.
Exact string matching is not required. The strings are allowed to match in any possible alignment, in order to model the fact that two molecules may react in more than one way. Let $l_e$ be the length of the epitope string and $l_p$ the length of the paratope string. A matching threshold s, with $s \le \min(l_e, l_p)$, is defined, below which two antibodies do not react at all. Let $e_i(n)$ denote the value of the n-th bit of the i-th epitope string and $p_j(n)$ the value of the n-th bit of the j-th paratope string [1,2].
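The matching specificity is then given by:
$m_{ij} = \sum_{k} G\left( \sum_{n} e_i(n+k) \oplus p_j(n) - s + 1 \right)$ (7)
In equation (7), $m_{ij}$ is the match specificity between the i-th epitope $e_i$ and the j-th paratope $p_j$, and $\oplus$ denotes the exclusive-or operation used for complementary matching.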
6.1 Procedure Used for Computing Partial Matches:
Figure (7) Epitope and Paratope string matching [5]
In this example, $l_e = l_p = 8$ and s = 6, so alignments with $-2 \le k \le 2$ are possible. The figure shows k = -1, so that $e_i(n-1)$ is compared with $p_j(n)$. For this example, G = 1 for k = -1 and G = 0 for all other values of k, hence $m_{ij} = 1$.
Here G(x) = x for x > 0 and G(x) = 0 otherwise. The sum over n ranges over all possible positions on the epitope and paratope; the sum over k allows the epitope to be shifted with respect to the paratope. G measures the strength of a possible reaction between the epitope and the paratope. For a given alignment, i.e. value of k, G is 0 if fewer than s bits are complementary and G = 1 + $\delta$ when s or more bits are complementary, where $\delta$ is the number of complementary bits in excess of the threshold s. If matches occur at more than one alignment, we sum their strengths, on the grounds that molecules able to interact in more than one way react more strongly, because they spend more time together than molecules that can interact in only one alignment [5].
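The partial-matching procedure can be sketched as a double loop over alignments k and positions n. The sketch assumes that epitope and paratope are lists of 0/1 integers, that only overlapping positions contribute to the inner sum, and that every shift with some overlap is tried (shifts whose overlap is shorter than s simply contribute 0). The function names are chosen for this illustration.

```python
def G(x):
    """G(x) = x for x > 0 and 0 otherwise."""
    return x if x > 0 else 0


def match_specificity(epitope, paratope, s):
    """Sum over alignments k of G(number of complementary bits - s + 1)."""
    le, lp = len(epitope), len(paratope)
    total = 0
    for k in range(-(lp - 1), le):            # every shift with some overlap
        complementary = sum(epitope[n + k] ^ paratope[n]
                            for n in range(lp) if 0 <= n + k < le)
        total += G(complementary - s + 1)
    return total


zeros, ones = [0] * 8, [1] * 8
print(match_specificity(zeros, ones, 6))   # complementary at every alignment
print(match_specificity(zeros, zeros, 6))  # never complementary
```

For two fully complementary strings of length 8 with s = 6, the alignments k = 0, ±1 and ±2 contribute 3, 2 and 1 respectively, giving a total of 9. This illustrates how matches at several alignments add up to a stronger reaction.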
In this model, free antibodies and antibodies attached to cells are lumped together, and only the total number of antibodies of a given type i is tracked, in terms of the concentration variable $x_i$.
What happens when two different antibodies interact? Farmer and Packard assume that the paratope on one antibody recognizes the epitope on the other. They further assume that, as a result of such an interaction, the antibody with the paratope reproduces some fixed number of times, while the antibody with the epitope is eliminated with some fixed probability. The degree to which one antibody reproduces and the other dies is controlled by the degree of complementarity between the paratope and the epitope. Since every antibody carries both a paratope and an epitope, the model is symmetric with regard to antibody interaction.
Suppose there are N antibody types with concentrations $\{x_1, x_2, \ldots, x_N\}$ and n antigens with concentrations $\{y_1, y_2, \ldots, y_n\}$. Simulating the microscopic dynamics can be avoided by writing differential equations for the concentrations. This is possible only when the system is well mixed and sufficiently large that the number of interactions needed to produce a significant change in the concentration of any particular type of antibody is huge.
On the basis of these assumptions:
$\frac{dx_i}{dt} = c \left[ \sum_{j=1}^{N} m_{ji} x_i x_j - k_1 \sum_{j=1}^{N} m_{ij} x_i x_j + \sum_{j=1}^{n} m_{ji} x_i y_j \right] - k_2 x_i$ (8)
In equation (8), the first term represents the stimulation of the paratope of the i-th antibody by the epitope of the j-th antibody. The second term represents the suppression of the i-th antibody when its epitope is recognized by the paratope of the j-th antibody. The probability of a collision between an antibody of type i and an antibody of type j is given by the product $x_i x_j$, and the parameter c is a rate constant that depends on the number of collisions per unit time and the rate of antibody production stimulated by a collision.
The match specificities $m_{ij}$ determine which reactions occur and how strongly. The constant $k_1$ represents a possible inequality between stimulation and suppression. When $m_{ij} = m_{ji}$, the interactions between paratopes and epitopes are symmetrical and the model is similar to the one proposed by Hoffman.
In order to model the entire immune response, the concentrations of antigens must also be included (third term); these may change as the number of antigens increases or decreases. The last term models the death rate. The best way to choose $k_2$ is to adjust it so that the total concentration of the system is held at a fixed value [5].
The list of antibody and antigen types is dynamic: it changes as new types are added or removed. The values of N and n therefore change with time, but on a time scale that is slow compared with the changes in $x_i$. Eqn. (8) is integrated over a period of time, after which the composition of the system is examined and updated as needed. For the update, a minimum threshold is placed on all concentrations, so that a variable and all of its reactions are eliminated when its concentration falls below the threshold.
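To make the integrate-then-prune cycle concrete, here is a rough sketch of one forward-Euler step. It assumes that the dynamics take the form dx_i/dt = c [ sum_j m_ji x_i x_j - k1 sum_j m_ij x_i x_j + sum_j m_ji x_i y_j ] - k2 x_i, with m[i][j] the match specificity between epitope i and paratope j. The names `m_ag`, `dt` and `threshold`, the Euler scheme, and zeroing a concentration instead of deleting the variable are all simplifications made for this illustration.

```python
def concentration_rates(x, y, m, m_ag, c, k1, k2):
    """Right-hand side of the assumed concentration dynamics.

    x: antibody concentrations, y: antigen concentrations,
    m[i][j]: match of antibody i's epitope with antibody j's paratope,
    m_ag[j][i]: match of antigen j's epitope with antibody i's paratope.
    """
    n_ab = len(x)
    rates = []
    for i in range(n_ab):
        stimulation = sum(m[j][i] * x[i] * x[j] for j in range(n_ab))
        suppression = sum(m[i][j] * x[i] * x[j] for j in range(n_ab))
        antigen_drive = sum(m_ag[j][i] * x[i] * y[j] for j in range(len(y)))
        rates.append(c * (stimulation - k1 * suppression + antigen_drive)
                     - k2 * x[i])
    return rates


def euler_step(x, y, m, m_ag, c, k1, k2, dt, threshold=0.0):
    """One Euler step, then zero every concentration below the threshold."""
    rates = concentration_rates(x, y, m, m_ag, c, k1, k2)
    new_x = [xi + dt * r for xi, r in zip(x, rates)]
    return [xi if xi >= threshold else 0.0 for xi in new_x]


# Two antibody types: the epitope of type 0 is recognized by the paratope of type 1.
m = [[0, 1], [0, 0]]
print(concentration_rates([1.0, 2.0], [], m, [], c=1.0, k1=0.5, k2=0.1))
```

In this toy setting type 1 is stimulated and grows, while type 0 is suppressed and would be removed once it drops below the threshold.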
New antibody types are generated by applying genetic operators such as crossover, inversion and point mutation to the paratope and epitope strings. In crossover, two antibody types are selected at random, a position within the two strings is chosen at random, and the pieces on one side of the chosen position are interchanged to produce two new types. Epitopes and paratopes are crossed over separately. Point mutation is implemented by randomly changing one of the bits in a given string, and inversion by inverting a randomly chosen segment of the string.
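The three operators can be sketched on bit strings as below. To keep the example deterministic, the positions are passed in explicitly; in a simulation they would be drawn at random. Inversion is interpreted here as reversing the order of the chosen segment, which is the usual genetic-algorithm meaning but is an assumption about the original model.

```python
def crossover(a, b, point):
    """Swap the tails of two equal-length strings after `point`."""
    return a[:point] + b[point:], b[:point] + a[point:]


def inversion(s, start, end):
    """Reverse the order of the segment s[start:end] (assumed meaning)."""
    return s[:start] + s[start:end][::-1] + s[end:]


def point_mutation(s, index):
    """Flip the bit at position `index`."""
    flipped = "1" if s[index] == "0" else "0"
    return s[:index] + flipped + s[index + 1:]


print(crossover("11111", "00000", 2))
print(inversion("10110", 1, 4))
print(point_mutation("10110", 0))
```

Since epitopes and paratopes are crossed over separately, `crossover` would be applied once to the two epitope strings and once to the two paratope strings of the selected antibody types.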
Antigens can be generated by a variety of mechanisms, either randomly or by design. The same antigen type can be presented to the system repeatedly to see whether the system can eliminate it. Once the system has learned to eliminate it, a number of other antigens can be presented to see whether the system forgets or remembers how to eliminate the original antigen. The number of antigens presented to the system can be varied [5].
The antibodies whose paratopes match epitopes are amplified at the expense of other antibodies.
If = 1 (equal suppression and stimulation) and > 0 then every antibody type eventuallydies due to the damping term. Letting
< 1 favors the formation of loops of reaction, since all
the numbers of reaction loop gain concentration and can neutralize the damping term. When N
increases, the number of loops and respective lengths also increases.
Even when the system is disturbed by the introduction of new types, it can remember certain states because of the robustness of the reaction loops. Antibodies that recognize other molecules, whether internal or external, are retained in the system and their concentration is increased. Antibodies that do not recognize other molecules are eliminated. Hence, together with immunological memory, the system possesses immunological forgetting [5].
In the biological immune system, the memory of an antigen is sometimes retained for a long time, comparable to the lifespan of the organism. The exact reason for this is not known. One theory states that the antigen remains in degraded form in the lymph nodes and that its periodic exposure to the immune system retains the memory. But as antigens are potentially dangerous, such a mechanism would be highly risky. Another theory is that the B-cells that have reacted to an antigen enter a dormant state and resurface when the same or a similar antigen occurs again. Such a dormant state can last for weeks or maybe months [1].
Another hypothesis, proposed by Farmer and Packard, explains memory by means of the idiotypic network.
6.2 Hypothesis:
Let ab1 be the concentration of the antibodies that recognize the antigen, and let ab2 be the concentration of the antibodies that recognize the epitopes of the ab1 antibodies. Continuing this way, let abn be the concentration of the antibodies that recognize the epitopes of the ab(n-1) antibodies. If abn resembles the original antigen, a loop is formed, because ab1 will also recognize abn [3].
Figure (8) The formation of a cycle allows the antigen with epitope e0 to be remembered.[5]
Arrows denote recognition through the string matching algorithm. Paratope p(i) recognizes epitope e(i-1) for i = 1, 2, ..., n. To form a cycle, we assume that by chance p(1) recognizes e(n) in addition to e0. Thus, e(n) must resemble the antigen epitope e0. If the antigen is eliminated, the existence of the cycle can maintain the concentration of ab1, an antibody that specifically recognizes the antigen [5].
If the paratopes are assumed to also function as epitopes, then for certain values of n the antibody abn resembles the antigen [5].
7. String Matching Rules
A matching rule, which defines matching or recognition, and the distance measure on which it is based are the cornerstones of any detection, classification or recognition algorithm. If you are dealing with categorical data, a string representation may be more suitable, and a matching rule such as rcb (r-contiguous bits) is useful [7].
Several string-matching rules are described below:
7.1 Hamming Distance:
It is defined as the number of positions at which two strings differ. The Hamming distance between strings x and y is expressed as:
$$D(x, y) = \sum_{i=1}^{N} (x_i \oplus y_i) \qquad (9)$$
where N is the length of the strings, $x_i$ and $y_i$ represent the i-th bits of the respective strings, and the operation within the brackets is the XOR operation [7].
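As a small illustration of eqn. (9), the Hamming distance can be computed by comparing the two strings position by position. The function name is a hypothetical choice, and the sketch assumes strings of equal length.

```python
def hamming_distance(x, y):
    """Count the positions at which two equal-length strings differ."""
    if len(x) != len(y):
        raise ValueError("Hamming distance needs strings of equal length")
    # For bit strings, xi != yi is exactly the XOR of the two bits.
    return sum(xi != yi for xi, yi in zip(x, y))


print(hamming_distance("10110", "11100"))
```

The same function works for any alphabet, not only bits, because it only tests characters for inequality.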
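7.2 Binary Distance:
Based on the number of bits that match or differ, extensions of the Hamming distance have been proposed. They are built from four counts over the bit positions of strings x and y:
$$a = \sum_{i=1}^{N} x_i \wedge y_i, \quad b = \sum_{i=1}^{N} x_i \wedge \bar{y}_i, \quad c = \sum_{i=1}^{N} \bar{x}_i \wedge y_i, \quad d = \sum_{i=1}^{N} \bar{x}_i \wedge \bar{y}_i \qquad (10)\text{-}(13)$$
a counts the number of 1s that match at the same position in both strings; d counts the number of 0s that match at the same position in both strings; b counts the number of 1s in string x that do not match string y; and c counts the number of 0s in string x that do not match string y [7].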
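Different similarity measures have been developed from the counts a, b, c and d:
1. Russell and Rao: $$f = \frac{a}{a+b+c+d} \qquad (13)$$
2. Jaccard and Needham: $$f = \frac{a}{a+b+c} \qquad (14)$$
3. Kulzinski: $$f = \frac{a}{b+c+1} \qquad (15)$$
4. Sokal and Michener: $$f = \frac{a+d}{a+b+c+d} \qquad (16)$$
5. Rogers and Tanimoto: $$f = \frac{a+d}{a+d+2(b+c)} \qquad (17)$$
6. Yule: $$f = \frac{ad-bc}{ad+bc} \qquad (18)$$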
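The counts a, b, c and d and the measures built on them can be sketched as below. The formulas in the code are the forms commonly given in the literature for these six measures and are an assumption here, since the report's own equations did not survive extraction. The function names and the zero-denominator convention (return 0.0) are also illustrative choices.

```python
def match_counts(x, y):
    """Return (a, b, c, d) for two equal-length bit strings."""
    a = sum(xi == "1" and yi == "1" for xi, yi in zip(x, y))  # 1s that match
    b = sum(xi == "1" and yi == "0" for xi, yi in zip(x, y))  # 1s in x unmatched
    c = sum(xi == "0" and yi == "1" for xi, yi in zip(x, y))  # 0s in x unmatched
    d = sum(xi == "0" and yi == "0" for xi, yi in zip(x, y))  # 0s that match
    return a, b, c, d


def ratio(numerator, denominator):
    """Safe division; 0.0 for an empty denominator is an assumed convention."""
    return numerator / denominator if denominator else 0.0


def similarity_measures(x, y):
    a, b, c, d = match_counts(x, y)
    return {
        "russell_rao": ratio(a, a + b + c + d),
        "jaccard_needham": ratio(a, a + b + c),
        "kulzinski": ratio(a, b + c + 1),
        "sokal_michener": ratio(a + d, a + b + c + d),
        "rogers_tanimoto": ratio(a + d, a + d + 2 * (b + c)),
        "yule": ratio(a * d - b * c, a * d + b * c),
    }


print(similarity_measures("1100", "1010"))
```

Note that b + c equals the Hamming distance between the two strings, which is why these measures are described as extensions of it.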
7.3 Edit Distance:
It is defined as the minimum number of string transformations required to change string s1 into string s2, where the possible transformations are (i) changing a character, (ii) inserting a character and (iii) deleting a character.
It is also called the Levenshtein distance and is a generalization of the Hamming distance [7].
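The definition above can be computed with the standard dynamic-programming recurrence. The sketch below is the textbook approach, not code from [7], and the function name is a hypothetical choice.

```python
def edit_distance(s1, s2):
    """Minimum number of changes, insertions and deletions turning s1 into s2."""
    # previous[j] holds the distance between the processed prefix of s1 and s2[:j].
    previous = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, start=1):
        current = [i]  # turning s1[:i] into the empty string takes i deletions
        for j, c2 in enumerate(s2, start=1):
            current.append(min(
                previous[j] + 1,               # delete c1
                current[j - 1] + 1,            # insert c2
                previous[j - 1] + (c1 != c2),  # change c1 into c2 (free if equal)
            ))
        previous = current
    return previous[-1]


print(edit_distance("kitten", "sitting"))
```

For strings of equal length the edit distance never exceeds the Hamming distance, since changing every differing character is always one valid sequence of transformations.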
Value Difference Metric:
(19)
where
( )
and
denotes the probability that xi equals the character c in the alphabet C [7].
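7.4 Landscape Affinity Matching:
This type of matching is used to capture the notion of matching biochemical and physical structures and to approximate matching in the immune system. The input string and the antibody string are converted to bytes and then into positive integers to create a landscape. The two strings are compared using a sliding window [7]. With $X_i$ and $Y_i$ denoting the integer values at position i of the two landscapes, three different similarity measures are defined:
Difference Matching Rule:
$$f_{difference} = \sum_{i=1}^{N} |X_i - Y_i| \qquad (20)$$
Slope-Matching Rule:
$$f_{slope} = \sum_{i=1}^{N-1} |(X_{i+1} - X_i) - (Y_{i+1} - Y_i)| \qquad (21)$$
Physical matching:
$$f_{physical} = \sum_{i=1}^{N} (X_i - Y_i) + 3\mu, \quad \mu = \min_i (X_i - Y_i) \qquad (22)$$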
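A minimal sketch of the difference and slope rules follows. It assumes the forms usually given in the literature for these two rules (sum of absolute differences, and sum of absolute differences between successive slopes), compares two already aligned strings of equal length rather than sliding a window, and uses hypothetical function names.

```python
def to_landscape(s):
    """Convert a string to a list of positive integers, one per byte."""
    return list(s.encode("ascii"))


def difference_match(x, y):
    """Sum of absolute differences between the two landscapes."""
    return sum(abs(xi - yi) for xi, yi in zip(to_landscape(x), to_landscape(y)))


def slope_match(x, y):
    """Sum of absolute differences between successive slopes of the landscapes."""
    lx, ly = to_landscape(x), to_landscape(y)
    return sum(
        abs((lx[i + 1] - lx[i]) - (ly[i + 1] - ly[i]))
        for i in range(len(lx) - 1)
    )


print(difference_match("ACB", "BBB"), slope_match("ACB", "BBB"))
```

The slope rule ignores a constant offset between the two landscapes, so two strings with the same shape but different byte values score 0 under it.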
7.5 R-Contiguous Bits Matching:
The rcb matching rule is defined as follows:
If x and y are strings of equal length, they are said to match if they agree in at least r contiguous locations, in which case match(x, y) is true.
Example:
If x = ABADCBAB and y = CAGDCBBA, then match(x, y) is true for r ≤ 3, because the two strings agree at three contiguous positions (the substring DCB at positions 4 to 6).
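The rcb rule can be checked by finding the longest run of positions at which the two strings agree. The sketch below uses hypothetical function names and the strings from the example; for them the longest run is the three characters DCB.

```python
def longest_matching_run(x, y):
    """Length of the longest run of contiguous positions where x and y agree."""
    best = run = 0
    for xi, yi in zip(x, y):
        run = run + 1 if xi == yi else 0  # extend the run or reset it
        best = max(best, run)
    return best


def rcb_match(x, y, r):
    """r-contiguous matching for two strings of equal length."""
    if len(x) != len(y):
        raise ValueError("rcb matching needs strings of equal length")
    return longest_matching_run(x, y) >= r


x, y = "ABADCBAB", "CAGDCBBA"
print(longest_matching_run(x, y), rcb_match(x, y, 3), rcb_match(x, y, 4))
```

A larger r makes a detector more specific, while a smaller r lets one detector cover more strings.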
8.1 The Bone-Marrow Object
The bone-marrow object decides where in the network an antigen is to be inserted and which B-cells die, thereby increasing the concentration of the cells that are beneficial to the network. It holds the main algorithm, which starts the immune response by inserting an antigen into the B-cell network. The algorithm is as follows:
Randomly initialize the B-cell population
Load the antigen population
Until the end is reached DO
    Select an antigen randomly from the antigen population
    Insert the selected antigen at a random point in the B-cell network
    Select a percentage of the B-cells around the insertion point
    For every B-cell selected DO
        Let the antigen interact with the B-cell to trigger an immune response
    Arrange these B-cells by their level of avidity
    Delete the worst 5% of the B-cell population
    Create n new B-cells (n = 25% of the B-cell population)
    Out of these n, select m cells to join the immune network (m = 5% of the population) [9]
B-cell Object
The B-cell object possesses a pattern matching element. It records the affinity level of the B-cell and maintains the links to the other B-cell objects it is connected to within the B-cell network.
Antibodies
When an antigen meets an antibody, an immune response is elicited and a match score is recorded. If this score is greater than or equal to a threshold, binding between the antibody and the antigen occurs.
Antigens
Each potential antigen is represented by an antigen object possessing one epitope. The antigens are defined in external ASCII files and are inserted into the AIS by the antigen population object. This object reads a series of lists from the files and instantiates them as antigen objects.
B-cell Stimulation
[ () () ] -
Above equation represents the stimulation of B-cell.
8.2 Applying AIS to Pattern Recognition Problem
1. B-cell Objects
The antibody's paratope is created from the mRNA list. The bit string is copied by the AIS in a complementary manner.
2. Antibodies
A bit string representation is used for the pattern recognition problem, so an antibody is represented by 0s and 1s.
3. Antigens
The AIS is tested with two different antigen populations, in which each antigen is a binary list of 20 elements.
The antigen population used to immunize the AIS consists of three pattern types, each forming 33% of the population. The population contains the original bit strings as well as modified bit strings that introduce noise into the data.
Antigen Population Representation:
11111111110000000000 33%
00000000001111111111 33%
00000111111111100000 33%
4. Antigen/Antibody
To determine the match between antigen and antibody, the starting point of the match on the antigen is not fixed; instead, a circular approach is followed. Hence, if the pattern described by the antibody starts halfway along the antigen, the antibody is shifted halfway along its length and an entire match is noted.
Bit Shifted Antibody:
Antibody 0 0 1 0 1 0 1 1 1 0
Antigen 1 0 0 0 1 1 1 0 1 0
Bit Shifted Antibody 0 1 1 1 0 0 0 1 0 1
8.3 The match algorithm:
Repeat
    For each region consisting of 2 or more 1s, note its length
    if match value > best match value so far
    then
        best match value so far = match value
    Shift Ab right 1 bit
Until Ab shift complete
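The loop above can be sketched as follows. The scoring rule is an assumption: each shift is scored as the number of 1s in the XOR of antigen and antibody plus 2^length for every run of two or more 1s, which is consistent with the total of 88 given for the 15-bit strings in this section. Function names are hypothetical.

```python
from itertools import groupby


def match_value(antigen, antibody):
    """Score one alignment: count of 1s in the XOR plus 2**length per long run."""
    xor = [g ^ b for g, b in zip(antigen, antibody)]
    score = sum(xor)  # complementary positions
    for bit, run in groupby(xor):
        length = len(list(run))
        if bit == 1 and length >= 2:
            score += 2 ** length  # bonus for a contiguous complementary region
    return score


def best_circular_match(antigen, antibody):
    """Try every right rotation of the antibody; return (best score, shift)."""
    n = len(antibody)
    best_score, best_shift = -1, 0
    for shift in range(n):
        shifted = antibody[n - shift:] + antibody[:n - shift]
        score = match_value(antigen, shifted)
        if score > best_score:
            best_score, best_shift = score, shift
    return best_score, best_shift


antigen = [0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0]
antibody = [1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1]
print(match_value(antigen, antibody))  # 88 for the unshifted pair
```

Applied to the 10-bit antibody and antigen of the bit-shift example in Section 8.2, `best_circular_match` selects a shift of 5, where every position of the shifted antibody is complementary to the antigen.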
Calculating Match Value:
Antigen: 0 1 1 0 0 0 0 1 1 1 1 0 1 1 0
Antibody: 1 0 0 1 1 1 0 0 0 1 0 1 1 0 1
XOR: 1 1 1 1 1 1 0 1 1 0 1 1 0 1 1 (number of 1s = 12)
Length: 6 2 2 2
MatchValue: 12 + 2^6 + 2^2 + 2^2 + 2^2 = 88
Hypermutation:
In multi-point mutation, each selected bit is flipped; in sub-string regeneration, all the elements between two chosen points are flipped.
8.4 Running the System
99 binary antigens were used to immunize the system. The test population was then presented to the AIS. Learning was turned off during the testing phase, so the system could only show its secondary immune response. In other words, the test shows whether the existing antibodies recognize the antigens or not.
50 iterations were performed for the immunization process, during which the antibody population increased from 10 to 28. The secondary response was then tested by presenting the antigens shown below.
1111111110000000000 TEST 1 *
0000111000110010001 TEST 2
1110010010010010010 TEST 3
0000000001111111111 TEST 4*
1010101000101001110 TEST 5
1111001010100110100 TEST 6
0000011111111110000 TEST 7*
TEST 1, 4 and 7 are the original antigens used in the primary response. TEST 2 and 3 are modified versions of TEST 1. Along the same lines, TEST 5 and 6 are noisy versions of TEST 4.
The AIS should be able to identify TEST 2, 3, 5 and 6 without any difficulty [9].
9. Dynamic Behavior Arbitration using AIS
Akio Ishiguro et al. proposed an inference-making system inspired by the immune system of living organisms and applied it to the behavior arbitration of an autonomous mobile robot, since conventional AI systems are brittle in dynamically changing environments. They try to evolve the affinities among antibodies using genetic operators.
Much attention has been focused on behavioral decomposition approaches because of the limitations of functional decomposition in conventional AI. However, arbitration among the competence modules remains a difficulty in behavior-based approaches.
To overcome such difficulties, Maes proposed the behavior network system, in which an action suitable for the current situation and the given goals emerges from the interaction between different competence modules. Akio Ishiguro et al. approached this problem from an immunological point of view, as shown in Figure (9).
Figure (9) Architecture of Algorithm [9]
As shown in the figure, the current situation (for example, the distance and direction to a detected obstacle) acts as the antigen, while the competence modules and the interactions between modules act as the antibodies and as the stimulation/suppression between antibodies, respectively. The basis of the approach is that the best possible antibody is selected for the antigen.
Figure (10) Immune Networks [8, 9]
In order to verify the ability of their proposed method, the authors simulated it. There are three kinds of objects in the simulated environment: a] predators, b] obstacles and c] foods. For quantitative evaluation, the following assumptions are made:
1. For movement, the immunobot consumes energy Em.
2. If the immunobot is captured by a predator, an amount of energy Ep is consumed.
3. If the immunobot collides with an obstacle, energy Eo is lost.
4. If the immunobot gets food, it gains energy Ef.
5. To avoid over-charging, the obtain-food behavior does not emerge once sufficient food has been obtained.
The predators attack the immunobot only if it is within a predefined range. So, to survive, the best possible antibody must be selected.
The figure below shows the structure of the immunobot used in the simulations. It is equipped with external and internal detectors. The external detectors are sensors in eight directions that detect predators, obstacles and food. Each detector also reports the distance as near, mid or far. The internal detector detects the energy level.
Figure (11) Structure of Robot [8]
9.1 Description of Antibodies
The prepared competence module is the antibody. The important thing for the immunobot is to select the best antibody for the antigen, and this depends on how the antibodies are described. The selection should be made in a bottom-up manner with proper communication between the modules. The structure of the paratope and epitope is crucial for the specificity, or identity, of any specific antibody.
The paratope is the desirable condition and the epitope is the disallowed condition. The paratope and idiotope are divided into three positions: obstacles, direction and distance. A typical inference/consensus system adopts a condition-action description, as in fuzzy inference, whereas the proposed system uses a condition-action-condition description. This provides decentralized dynamic inference in a bottom-up manner.
Figure (12) Antibody Description [9]
A prepared antibody can look like the one below. The antibody is activated if the immunobot detects food in the front direction at mid range, and it makes the immunobot move forward to pick the food up.
Figure (13) Prepared Antibody [9]
However, if a predator exists in front at near or mid range, or if food is at near range, the prepared antibody may hesitate to be activated.
The other antibodies are designed along similar lines.
9.2 Dynamics
In this model, the authors allow only one antibody to be activated, once it surpasses a prespecified threshold. One state variable, the concentration of each antibody, is introduced.
{ } (23)
The two quantities in eqn. (23) are the concentration of antibody i, which varies with time, and the matching ratio between antibodies i and j.
9.3 Basic mechanism of the proposed inference making network
Four antigens are listed in the figure shown, and the five listed antibodies mainly participate in the inference/consensus making. For instance, antibody 1 means that when the immunobot detects food in the far range in the front direction, it is allowed to move forward. If, however, the immunobot identifies food in the near range, a predator in front, or a high energy level, this antibody stimulates the other antibodies whose paratopes describe those conditions.
Figure (14) Antibody Selection [7,9]
Consider the case where the current energy level is high: antibodies 1, 2, 3 and 5 are stimulated by the antigens. The concentration of each of these antibodies is incremented in accordance with its antigen. The interaction among the antibodies of the immune network is important. In the end, antibody 5 is selected in figure 9.
In the case where the current energy level is low, antibody 3 is selected [9].
10. Latest Immune Models and Hybrid Approaches
10.1 Danger Theory based algorithms
In 2002, Aickelin and Cayzer listed the following aspects of danger theory (DT) to include in an AIS:
1. An appropriate number of APCs to display danger signals needs to be modeled.
2. A danger signal is either positive or negative, representing the presence or absence of the signal.
3. In biology the danger zone is spatial, but in a computational model other notions, such as temporal proximity, can be used.
4. Killer cells sometimes cause self-cell death; this should not generate further danger signals.
5. Priming of killer cells via APCs should be considered in AIS models.
6. An antibody migration rule should specify the concentration of antibodies receiving signal 1 and signal 2 from a given APC.
DT depends on the concentrations of different immune cells. These aspects are used to build a better AIS for anomaly detection, in which non-self does not trigger an immune response without a danger signal [7].
Figure 15 (a) One Signal Model [7]
Figure 15 (b) Two Signal Model [7] Figure 15 (c) APC controlling IR [7]
Figure 15 (d) INS with third signal [7] Figure 15 (e) danger in control through zoning[7]
Figure 15 (f) Control through INS and zoning [7]
In 2010, danger theory was applied to the online supervised two-class classification problem. The algorithms of the proposed method are as follows:
Algorithm 1
Danger theory based immune algorithm.
1. Initialize the antibody population and the memory
2. While stopping conditions are not met do
3.     For i = 0 to antigen population do
4.         Present the antigen to the system
5.         Danger is created by the presented antigen
6.         The general antibody population receives signal 0 from the presented antigen
7.         The general antibody population receives signal 1 from the danger zone
8.         Antibodies that receive both signal 0 and signal 1 are selected
9.         For all antibodies belonging to the stimulated antibodies
10.            Change the status of the antibody
11.            Calculate the interaction between the antibody and the antigen
12.        End for
13.        Suppress the antibody population
14.        Decrease the danger from the antigen that has already been considered
15.        For all antibodies belonging to the stimulated antibodies
16.            If the antibody's stimulation reaches a certain threshold value then
17.                Apply the clonal selection algorithm
18.            End if
19.        End for
20.    End for
21.    Check the stopping criteria
22. End while
23. Output is the memory of antibodies that were selected via clonal selection and met the threshold value
When the learning algorithm has ended, the output antibodies are used to classify unknown antigens. In a simple process, an unknown antigen is classified into the same class as the antibody with which it has the lowest affinity.
Learning Algorithm explained:
1. Initialization: The algorithm starts with a random antibody population whose members are assigned labels. Their status is set to zero and the memory is set to the empty set.
2. Two kinds of signals: The danger signals are co-stimulation signals, termed signal 1, while the others are termed signal 0. The antibody population is divided into two parts: a] general and b] memory. The memory antibodies do not take part in reactions with antigens. They are the fixed memory of antigens and are changed only when they are suppressed. The general antibodies get signal 0 when presented with an antigen, so an antibody can detect the stimulus of the current antigen, whereas signal 1 is perceived only when a danger zone is created. The antibodies receiving both signals are stimulated and can change their status.
Algorithm 2
1. Antibody stimulated = antibody stimulated + 1
2. If antibody label == antigen label then
3.     Antibody-antigen reaction = 1
4. Else
5.     Antibody-antigen reaction = -1
6. End if
7. Antibody relevance = antibody relevance + antibody-antigen reaction
8. Variable danger zone (var) = affinity between antigen and antibody
9. Antibody stimulation = antibody stimulation + antibody-antigen reaction * var
10. Var = stimulated antibody population
11. Antigen danger = Var * var * antibody stimulation
Algorithm 3
1. If antibody stimulation (as) < threshold value (t) then
2.     Delete the antibodies of the population that are below the threshold
3. Else if as
5.     Delete the antibody with high interactivity
6. End if
7. End for
8. Group the memory antibodies into pairs
9. For all pairs do
10.    Calculate probability p2
11.    If random < p2 then
12.        Remove the memory antibody with high affinity
13.    End if
14. End for
10.2 Combining Dendritic Cells and Danger Theory
In 2007, Yeom used an approach that mixes DT and dendritic cells (DCs) to form a model for signal pre-categorization. The principles are as follows:
1. Pathogen-associated molecular patterns (PAMPs) are expressed by bacteria and can be identified by DCs, causing a change in their behavior.
2. Danger signals are generated by the unplanned (necrotic) death of cells. The sudden, chaotic breakdown of the internal components of a cell causes danger signals to surface. DCs are sensitive to the concentration of danger signals. The presence of a danger signal may or may not produce a change, but the probability of change is higher than in normal situations.
3. Safe signals are due to the normal death of a cell for regulatory reasons; this tightly controlled process results in the release of various signals into the tissue. Such safe signals give rise to suppression signals.
4. Inflammatory cytokines can be released as a result of injury, although inflammation alone is not enough to stimulate DCs.
DCs can stimulate naive T cells and have a number of functional properties (Yeom, 2007). The first function of DCs is to inform the immune system to respond when there is an attack.
DCs perform different functions depending on their state of maturation. Modulation between these states is facilitated by the detection of signals within the tissue, namely danger signals, apoptotic signals and inflammatory signals.
In tissue, DCs collect antigen and experience danger signals from necrosing cells and safe signals from apoptotic cells. Maturation of DCs occurs in response to the receipt of these signals. According to Yeom (2007), if there is a concentration of danger signals in the tissue at the time of antigen pick-up, the DC becomes fully mature. Conversely, if there are safe signals, the DC matures differently [7].
10.3 Multilevel Immune Learning Algorithm (MILA)
Both T-level and B-level recognition mechanisms are used in this algorithm. It is inspired by the interactions and processes of the T-cell dependent humoral immune response. In the biological immune system, B-cells recognize antigens through immunoglobulin receptors on their surfaces, but they cannot proliferate and differentiate until they receive a signal from Th (helper T) cells.
For Th cells to allow B cells to proliferate and differentiate, the Th cells must themselves be stimulated, which happens only when they recognize antigens in the context of the major histocompatibility complex (MHC).
Suppression of B cells also occurs, due to suppressor T (Ts) cells. The activated B and T cells move to the lymph nodes, where proliferation, mutation, selection, differentiation and death of B cells take place in germinal centres (GCs).
In MILA, an abstraction of the above events is incorporated to develop a detection algorithm. The algorithm consists of initialization, recognition, evolutionary and response phases.
In the initialization phase, the detection system is trained to recognize the self. The result of initialization is used to produce detectors, analogous to the populations of Th, Ts and B cells that participate in the humoral immune response. There are three levels:
1. APC level, which corresponds to the highest one.
2. B-cell level, the intermediate one.
3. Th-cell level, the bit level for local patterns.
MILA uses the rcb matching rule for real-valued representations. A Th cell uses a sliding window to get w elements, whereas a B cell uses w randomly chosen elements. The concept of prematuration and crossover operators can be used.
Another feature of MILA is positive selection by Ts cells, which is based on self samples.
The evolutionary phase in MILA is a process of refining the detector set if the earlier detection rates can be evaluated. This phase involves cloning, mutation and selection; however, cloning in MILA is targeted: only those detectors that are activated in the recognition phase can be cloned [7].
10.4 Combining Negative Selection and Classification technique
In anomaly detection, only positive (self) samples are available at the training stage. However, most conventional classification algorithms need both self and non-self samples.
To allow conventional algorithms to be used when only self samples are available, Gonzalez (2002) proposed a hybrid algorithm that creates synthetic non-self samples from a set of self samples. The algorithm develops a detector set that covers the non-self space using negative selection (NS), and these points are then used to generate the samples for the non-self class, which makes the conventional algorithm usable.
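The idea of producing non-self training points from self samples alone can be sketched as below. This is a strongly simplified illustration, not Gonzalez's algorithm: it assumes real-valued samples in the unit hypercube, draws candidates uniformly at random, and keeps a candidate only if it lies farther than a hypothetical `radius` from every self sample. All names and parameter values are illustrative.

```python
import math
import random


def generate_nonself_samples(self_samples, n_samples, radius, seed=0):
    """Return n_samples random points that match no self sample.

    A candidate "matches" a self sample when their Euclidean distance is at
    most `radius`; matching candidates are discarded (negative selection).
    The loop does not terminate if the self samples cover the whole space.
    """
    rng = random.Random(seed)
    dim = len(self_samples[0])
    nonself = []
    while len(nonself) < n_samples:
        candidate = [rng.random() for _ in range(dim)]
        if all(math.dist(candidate, s) > radius for s in self_samples):
            nonself.append(candidate)
    return nonself


self_set = [[0.5, 0.5], [0.4, 0.6], [0.55, 0.45]]
synthetic_nonself = generate_nonself_samples(self_set, n_samples=20, radius=0.2)
print(len(synthetic_nonself))
```

The self samples (labeled as one class) and the generated points (labeled as the other) can then be passed together to any two-class learner.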
Figure (16) NS-SOM in generation classifier dataset [7]
In particular, negative samples are generated from the positive samples. Samples from both classes are then used to train a neural network, the self-organizing map (SOM). An SOM, composed of nodes or neurons (which are able to identify the input type), is a type of ANN that is trained to produce a low-dimensional representation, called a map, of the input space, here the self/non-self feature space of the training samples [7,8].
The three phases of NS-SOM are shown in figure below:
Figure (17) NS-SOM Model Structure [7,8]
11. Immune Networks and Negative Selection Based Algorithm
An algorithm combining negative selection with Ab-Ab communication was developed by Prashant Rao (2008) for the navigation control and path mapping of an autonomous mobile robot, the Khepera II.
The step-by-step formulation of the algorithm is as follows:
1. Initialization: First initialize a network of immune cells (there is a superset of 64 antibodies, numbered 0 to 63). The initial concentrations of the antibodies are set and the robot is reset. A subset of 20 antibodies is chosen randomly. The stimulation and suppression between antibodies are defined using a basic matching function. The first two sensors are not ON in their Khepera II robot.
2. Population Loop:
i) Antigenic Recognition: The information from the sensors is collected and an antigen is formed from it. The match between the antigen and the randomly selected antibodies is determined and affinities are allotted. Each antigen stimulates many antibodies, but only the one that matches perfectly is selected for processing.
ii) Self-Nonself Determination: The antigen is checked against the self set. If it matches, innate memory takes over, the system is allotted the standard solution and the loop executes again; otherwise the system moves on to the next step.
iii) Network Communications: The interactions between the randomly selected antibodies are calculated.
iv) Dynamics: Stimulation minus suppression, plus the affinity between antibodies, minus the natural death coefficient gives the overall stimulation of the system. The product of this stimulation and the concentration of an antibody gives the rate of change of its concentration with time. The antibody with the highest concentration is sent to a critic that rewards or penalizes it, and the affinities are modified accordingly.
3. Feedback: When a penalty is allotted, the helper T-cell is activated, and its contribution is calculated at each step. The adaptation function is determined by the interaction between the T-cell and the other cells in the network, and it modifies the affinities between antibodies using a suitable learning rate.
4. Steps 2 and 3 are repeated until the convergence criterion is met.
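The Dynamics step (2.iv) can be illustrated with a short sketch. It assumes one reading of the description: for antibody i, the net stimulation is the stimulation received from other antibodies, minus the suppression, plus an affinity term, minus a natural death coefficient, and the rate of change is that net value times the antibody's concentration. The index convention for the matrix `m`, the function name and all numbers are hypothetical.

```python
def concentration_rates(conc, m, affinity, death_rate):
    """Rate of change of each antibody concentration for one time step.

    conc[i]      concentration of antibody i
    m[j][i]      assumed strength with which antibody j stimulates antibody i
    affinity[i]  affinity term added for antibody i
    death_rate   natural death coefficient shared by all antibodies
    """
    n = len(conc)
    rates = []
    for i in range(n):
        stimulation = sum(m[j][i] * conc[j] for j in range(n))
        suppression = sum(m[i][k] * conc[k] for k in range(n))
        net = stimulation - suppression + affinity[i] - death_rate
        rates.append(net * conc[i])
    return rates


conc = [1.0, 2.0]
m = [[0.0, 0.5], [0.25, 0.0]]
rates = concentration_rates(conc, m, affinity=[1.0, 0.0], death_rate=0.1)
# One explicit Euler step with a small, arbitrarily chosen step size.
updated = [c + 0.01 * r for c, r in zip(conc, rates)]
print(rates, updated)
```

After each update, the antibody with the highest concentration would be the one handed to the critic.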
Figure (18) Algorithm based on Negative selection and Ab-Ab interaction [6]
-
8/4/2019 Independent Study Report # 1
57/75
57
Figure (19) Algorithm based on Negative selection and Ab-Ab interaction [6]
11.1 Latest Dendritic Cell Algorithm Inspired from Danger Theory
Danger theory states that danger signals are generated to activate APCs. APCs stimulate T-helper cells, which finally gives rise to the adaptive immune response. The danger signals are detected by dendritic cells, which act in three modes: immature, semi-mature and mature. If the detected signal is safe, the dendritic cell becomes semi-mature upon presenting the antigen to the T-cell. If a danger signal is found, the dendritic cell becomes mature and the T-cell becomes antigen-reactive.
The dendritic cell algorithm takes into consideration safe, danger and PAMP signals [11].
ALGORITHM:
input : S = set of data items to be labeled safe or dangerous
output : D = set of data items labeled as safe or dangerous
Start
    Generate initial population of dendritic cells (DCs), D
    Create a set to contain the migrated DCs, M
    forall items in set S do
        Select a set of DCs, P, by randomly sampling from D
        forall DCs in set P do
            Add data item to the DC's collected list
            Update safe, danger and PAMP concentrations
            Update cytokine concentration
            Move DC from D to M and generate a new DC in set D if the concentration is above threshold
        stop
    stop
    forall data items in S do
        Count the number of times the data item is presented by a mature and by a semi-mature DC
        Label the item as safe if it is presented by more semi-mature DCs than mature DCs
        Add data item to labeled set M
Stop [11]
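The listing above can be turned into a compact sketch under several assumptions that are not in [11]: each data item carries its own safe, danger and PAMP values, the cytokine concentration is approximated by the total signal a cell has accumulated, a migrating cell counts as mature when danger plus PAMP exceeds safe, and an item that is not presented by more semi-mature than mature cells is labeled dangerous. Function and parameter names are hypothetical.

```python
import random


def dendritic_cell_algorithm(items, n_cells=10, sample_size=3, threshold=1.0, seed=0):
    """Label each (name, safe, danger, pamp) item as 'safe' or 'dangerous'."""
    rng = random.Random(seed)

    def new_cell():
        return {"antigens": [], "safe": 0.0, "danger": 0.0, "pamp": 0.0}

    cells = [new_cell() for _ in range(n_cells)]
    presented = {}  # item name -> [semi-mature presentations, mature presentations]

    for name, safe, danger, pamp in items:
        for idx in rng.sample(range(n_cells), sample_size):
            cell = cells[idx]
            cell["antigens"].append(name)
            cell["safe"] += safe
            cell["danger"] += danger
            cell["pamp"] += pamp
            # Assumed stand-in for the cytokine concentration: total signal seen.
            if cell["safe"] + cell["danger"] + cell["pamp"] > threshold:
                mature = cell["danger"] + cell["pamp"] > cell["safe"]
                for antigen in cell["antigens"]:
                    presented.setdefault(antigen, [0, 0])[int(mature)] += 1
                cells[idx] = new_cell()  # the migrated cell is replaced

    labels = {}
    for name, *_ in items:
        semi, mature = presented.get(name, (0, 0))
        labels[name] = "safe" if semi > mature else "dangerous"
    return labels


stream = [("login", 2.0, 0.0, 0.0), ("port_scan", 0.0, 1.5, 1.0)]
print(dendritic_cell_algorithm(stream))
```

With a higher threshold each cell collects several items before migrating, so an item's label then depends on the signals of the items it was sampled together with.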
11.2 Latest TLR (toll-like receptor) Algorithm
The algorithmic steps of the TLR algorithm, as described by Aickelin and Greensmith (2007) and designed for anomaly detection in computer networks, are as follows:
1. Collect the set of system calls that are made in the training data.
2. Collect the corresponding signal values.
3. Determine the complements of the sets from step 1 and step 2.
Figure (20) Systematic Overview of TLR algorithm [7]
4. Generate a set of immature DCs (iDCs) with signal receptors selected randomly from the complement signal set and with antigen receptors selected randomly from the complement system call set.
5. Similarly, generate naive T-cells (nTCs) with antigen receptors drawn randomly from the complement system call set.
6. Immature DCs are exposed to the sample signals and antigens.
7. If an iDC matches a signal, it matures (mDC) and migrates.
8. If an iDC does not migrate during its lifetime, it becomes a semi-mature DC (smDC) and then migrates.
9. Migrated smDCs and mDCs present their antigens and try to match nTCs.
10. If an mDC presenting an antigen matches a naive T cell, the nTC is activated and an anomaly is reported.
11. If an smDC presenting an antigen matches an nTC, it kills the nTC to lower the number of false positives.
12. Migrated smDCs and mDCs and killed nTCs are replaced by new cells as per steps 4 and 5. [7]
12. Recent Developments and Real World Applications
Solving problems using Immunological Computation
In order to apply knowledge of the biological immune system to real world problems, one must first select an immune algorithm depending on the type of problem. The first step is to identify the elements involved in the problem and how they can be represented in terms of a particular AIS.
To encode such entities, a representation such as bit strings or real values can be chosen. Then the affinity measure is selected, in accordance with the matching rules employed. The next step is to decide which AIS is suitable for creating a set of entities that can provide a good solution to the problem at hand [7].
Figure (21) Problem Solving Using AIS [7]
12.1 Virus Detection
Kephart (1994) proposed an immunologically inspired approach to detecting viruses in computer systems. In this approach, known viruses are identified by their code sequences and unknown viruses are detected by their unusual behavior in the system. The virus detection software continuously scans the system to detect changes. These changes trigger the release of decoy programs whose sole purpose is to become infected by the virus [7].
Figure (22) Flow Diagram for Kephart approach for virus detection [7]
A diverse suite of decoy programs is kept at different locations in the system's memory to detect
viruses. If one or more decoy programs are modified, this indicates that a virus has entered the
system, and each modified decoy contains a sample of the virus. The infected decoy programs are
processed by a signature extractor to generate a recognizer for the respective virus.
The signature extractor also extracts the pattern by which the virus attaches to its host, so that
infected hosts can be repaired. The signature extractor must select the virus signature so that it
avoids false positives and false negatives: the signature must be found in each sample of the virus
and must be very unlikely to be found in uninfected programs on the computer system. Once the
best possible signature is found from the infected programs, it is compared with a half-gigabyte
corpus of legitimate programs to make sure that there is no false positive. The repair information
is checked by testing on samples of the virus and again by a human expert [7].
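The two requirements on a signature (present in every virus sample, absent from legitimate programs) can be illustrated with a brute-force search. This is a simplified sketch and not Kephart's actual extractor: the function name extract_signature, the fixed signature length and the exhaustive substring scan are all assumptions made for the example.

```python
def extract_signature(infected, clean_corpus, length):
    """Return a byte substring of the given length that occurs in every
    infected sample and in no clean program, or None if none exists."""
    first = infected[0]
    for i in range(len(first) - length + 1):
        candidate = first[i:i + length]
        in_all_samples = all(candidate in s for s in infected[1:])
        in_clean = any(candidate in c for c in clean_corpus)
        if in_all_samples and not in_clean:
            return candidate
    return None


infected_decoys = [b"AAAxyzVIRUSBBB", b"CCVIRUSxyzDD"]
legitimate = [b"hello xyz world"]
print(extract_signature(infected_decoys, legitimate, 3))
```

Here the substring "xyz" is rejected because it also appears in a legitimate program, which is the false-positive check described above.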
12.2 Immunogenetic Approaches in Intrusion detection
Gonzalez (2002) proposed negative selection with detector rules to detect attacks by monitoring
network traffic. A real-valued representation is used to evolve hyper-rectangular detectors,
interpreted as if-then rules, that give a high-level characterization of the self / non-self space.
The experiments were performed using data from the 1999 Defense Advanced Research Projects
Agency (DARPA) intrusion detection evaluation dataset. The AIS approach was able to produce
detectors that gave a good estimate of the amount of deviation from normal behavior [7].
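A hyper-rectangular detector read as an if-then rule can be sketched as below. The sketch covers only the matching side; how the detectors are evolved is not shown, and the function names and the example bounds are hypothetical.

```python
def covers(detector, x):
    """A detector is a pair (low, high) of per-feature bounds. It fires if
    every feature of x lies inside its interval: IF low_i <= x_i <= high_i
    for all i THEN non-self."""
    low, high = detector
    return all(lo <= xi <= hi for lo, xi, hi in zip(low, x, high))


def is_anomalous(x, detectors):
    """A sample is flagged when at least one detector rule fires."""
    return any(covers(d, x) for d in detectors)


# One detector over two normalized traffic features (hypothetical bounds).
detectors = [((0.8, 0.0), (1.0, 0.2))]
print(is_anomalous((0.9, 0.1), detectors))
print(is_anomalous((0.5, 0.5), detectors))
```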
12.3 Danger theory in Network Security
Aickelin (2002) first proposed applying danger theory to network security. The proposed system
behaves like DCs looking for danger signals, such as a sudden increase in network traffic or an
abnormally high rate of error messages. If such signals rise above a threshold, an alarm is
raised [7].
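The thresholding of danger signals can be sketched as follows. The signal names, the per-signal thresholds and the function danger_alarm are assumptions made for illustration; the source only states that an alarm is raised when signals exceed a threshold.

```python
def danger_alarm(readings, thresholds):
    """Return the names of the danger signals that exceed their threshold.
    A non-empty result means an alarm should be raised."""
    return sorted(name for name, value in readings.items()
                  if value > thresholds[name])


thresholds = {"traffic_rate": 500.0, "error_rate": 10.0}
readings = {"traffic_rate": 950.0, "error_rate": 2.0}
print(danger_alarm(readings, thresholds))
```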
12.4 Robotics and Control
The robot controllers of Ishiguro et al. (1996, 1998), Watanabe et al. (1998, 1999) and Lee et al.
(1999) focused on the development of a dynamic, decentralized consensus-making mechanism
based on immune network theory. In a dynamic environment, the immunoid is able to collect
garbage. In this metaphor, antibodies are the potential behaviors of the immunoid and antigens
are environmental inputs such as garbage, walls and the home base. To decide on the best
behavior, the immunoid matches antigens to antibodies [7].
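The antigen-to-antibody matching used for behavior selection can be illustrated with a deliberately simplified sketch. It picks the behavior whose conditions overlap most with the current environmental inputs and ignores the stimulation and suppression between antibodies that an immune network model would add. The behavior names, the antigen labels and the function select_behavior are all hypothetical.

```python
# Each antibody (candidate behavior) lists the antigens it responds to.
ANTIBODIES = {
    "collect_garbage": {"garbage_ahead"},
    "turn_away": {"wall_ahead"},
    "return_home": {"home_base_near", "energy_low"},
}


def select_behavior(antigens):
    """Choose the behavior whose antibody matches the most detected antigens.
    Ties are resolved in favor of the behavior listed first."""
    return max(ANTIBODIES, key=lambda name: len(ANTIBODIES[name] & antigens))


print(select_behavior({"wall_ahead"}))
print(select_behavior({"home_base_near", "energy_low", "garbage_ahead"}))
```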
Vertebrate immune systems have inspired computer scientists and engineers to create new
algorithms for solving real-world problems. The four main AIS algorithms are:
1. Negative selection algorithms
2. Artificial immune networks
3. Clonal selection algorithm
4. Danger theory and dendritic cell algorithm
Recent developments include applications of AIS in computer security, optimization, data mining,
fault detection, etc. Several authors have surveyed these developments: Garrett (2005) reviewed
the work done before 2005 and attempted to evaluate AIS against the criteria of distinctiveness
and effectiveness, while Hart and Timmis (2010) discussed applications of AIS and proposed a
set of problem features that make a problem well suited to AIS.
Some recently developed models and hybrid approaches are explained below:
12.5 Conserved Self Pattern Recognition Algorithm (CSPRA)
CSPRA is a recent AIS algorithm inspired by the Pattern Recognition Receptor (PRR) model.
According to the PRR model, self/nonself discrimination requires
stimulation from APCs. APCs, in turn, are not stimulated unless they are activated via PRRs
that identify molecular patterns on bacteria. The PRR model thus adds an additional layer of
molecular pattern recognition. CSPRA (2010) incorporates the negative selection algorithm, and
anomaly detection in CSPRA is performed by combining the results of APC self pattern
recognition and T-cell negative selection. Self pattern recognition by APCs is performed only
when the antigen has not been detected by T-cell negative selection. The generation of APC
detectors involves two major steps:
1. Depending on the relationship between the antigen and its feature space, define the
conserved self pattern, which can be pre-defined from the data. It may come from empirical
laboratory knowledge, or it can be computed automatically using Pearson's correlation
coefficient between the column of each attribute and the corresponding label.
2. By evaluating the minimum, maximum and mean of all the values in the feature space at
loc1, loc2, ..., generate the APC detector R = {(loc1, min, max, mean), (loc2, min, max,
mean), ...} within the conserved self pattern of features located at loc1, loc2, ...
Compared with the classical negative selection algorithm, the proposed and tested CSPRA
algorithm shows better and more promising results, reducing the number of false detections
without increasing the complexity [3, 4, 13].
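The two steps of APC detector generation can be sketched as follows. This is a minimal illustration under assumptions, not the published CSPRA code: a feature location is treated as conserved when the absolute Pearson coefficient between its column and the label reaches a hypothetical cut-off (min_abs_corr), and the function names are invented for the example.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def conserved_locations(rows, labels, min_abs_corr=0.8):
    """Step 1 (assumed rule): keep feature columns that correlate strongly
    with the label."""
    return [j for j in range(len(rows[0]))
            if abs(pearson([r[j] for r in rows], labels)) >= min_abs_corr]


def build_apc_detector(self_rows, locs):
    """Step 2: R = [(loc, min, max, mean), ...] over the self samples."""
    detector = []
    for loc in locs:
        column = [r[loc] for r in self_rows]
        detector.append((loc, min(column), max(column), mean(column)))
    return detector


rows = [[1.0, 5.0], [2.0, 3.0], [8.0, 4.0], [9.0, 4.0]]
labels = [0, 0, 1, 1]                      # 0 = self, 1 = non-self
locs = conserved_locations(rows, labels)
self_rows = [r for r, y in zip(rows, labels) if y == 0]
print(build_apc_detector(self_rows, locs))
```

In this toy data only the first feature tracks the label, so the detector holds a single (loc, min, max, mean) tuple for that location.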
12.6 Recent Complex Artificial Immune Systems (CAIS)
CAIS consists of five layers, namely the encounter layer, preprocessing layer, MHC layer,
competitive layer and stimulation layer. The antigen and the antibody are the input and the
output, respectively. When an antigen is encountered by the system, it can be recognized in
two ways: directly by B cells, or through the APC layer. In the second case the input is given to
the APC layer, and the molecular complex pattern that is formed is passed to the MHC layer for
processing. The information coming from the APC layer is transformed into MHC form and fed
to the Th layer. In the Th layer, the cells receive different responses from the MHC layer and
form a set of Th cells that respond best to the input antigens. B-cells
become activated by stimulation from the Th layer and also by the input pattern. An antibody is
the difference between an input and the weights associated with the B cells. Ts cells modulate
the weights associated with the immune cells located in the neighborhood set. Compared with
binary immune systems, CAIS can recognize patterns invariantly under translation, rotation and
scaling. It can be applied to handwriting pattern recognition problems [11, 13].
12.7 Hybrid Approaches
BAIS (Bayesian Artificial Immune System) is developed by replacing the mutation and cloning
operators with a probabilistic model, and is used for solving single-objective and multiobjective
optimization problems. BAIS is capable of capturing the most relevant interactions between the
problem variables. The algorithm adopts a population-based search strategy and uses a Bayesian
network to implement the probabilistic model.
Once the population is initialized, the algorithm enters a loop that runs until a stopping
condition is met. Each iteration performs the following steps:
a. Using a suitable selection technique, select the best antibodies from the population.
b. Build the Bayesian network that best fits the selected solutions.
c. Sample new antibodies from the network.
d. Remove the antibodies with lower fitness, as well as those that are too similar to others.
e. Insert randomly generated antibodies into the population to maintain diversity [13].
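The loop in steps a to e can be sketched as below. To keep the example short, the Bayesian network is replaced by a much simpler model of independent per-bit probabilities, so this is not BAIS itself but an illustration of the same select, model, sample, prune and diversify cycle. The function name, the parameter names and their default values are assumptions.

```python
import random


def bais_like_search(fitness, n_bits, pop_size=20, n_select=10,
                     n_random=2, iters=30, seed=0):
    rng = random.Random(seed)

    def random_antibody():
        return tuple(rng.randint(0, 1) for _ in range(n_bits))

    pop = [random_antibody() for _ in range(pop_size)]
    for _ in range(iters):
        # a. select the best antibodies
        best = sorted(pop, key=fitness, reverse=True)[:n_select]
        # b. fit the model (simplification: independent per-bit marginals
        #    instead of a Bayesian network)
        p = [sum(ab[j] for ab in best) / len(best) for j in range(n_bits)]
        # c. sample new antibodies from the model
        sampled = [tuple(int(rng.random() < pj) for pj in p)
                   for _ in range(pop_size)]
        # d. drop duplicates, then drop the antibodies with lowest fitness
        merged = sorted(set(best + sampled), key=fitness, reverse=True)
        pop = merged[:pop_size - n_random]
        # e. add random antibodies to maintain diversity
        pop += [random_antibody() for _ in range(n_random)]
    return max(pop, key=fitness)


# Toy objective: maximize the number of ones in an 8-bit string.
print(bais_like_search(sum, n_bits=8))
```

Because the best antibodies are carried over in every iteration, the best fitness in the population never decreases from one iteration to the next.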
BAIS can be applied to feature selection using a wrapper approach. It is able to handle the
building blocks in the optimization of the Trap-5 function, whether these building blocks are
non-overlapping or overlapping. The multiobjective knapsack problem can also be solved very
efficiently by BAIS. This variant is termed the Multiobjective Bayesian Artificial Immune
System (MOBAIS) and can also be applied to classification problems. It is capable of identifying
and preserving the building blocks effectively while searching for and finding diverse,
high-quality local optima. Practical applications show that it yields parsimonious and accurate
results. Furthermore, the learning of the Bayesian network was enhanced to avoid rebuilding the
network at each iteration; instead, only the two crucial sets of parameters, namely the conditional
and marginal probabilities, are updated at each iteration [13].
Unsupervised structural damage classification can be performed by an algorithm based on data
clustering and AIS pattern recognition. The technique uses data clustering to partition the
training data into a specified number of clusters and to generate the initial memory cell set. The
memory cells are then evolved by combining this with AIS pattern recognition algorithms.
AIS combined with SVM can be used for fault diagnosis of induction motors. Here AIS is used to
tune the kernel and penalty parameters in order to improve classification accuracy.
In the immune multiagent recognizer, each agent recognizer is an immune RBF neural network
model. In the immune RBF neural network model, the antigens are the input and the antibodies
are the compressed cluster mapping, that is, the hidden layer. The output weights can be
determined using the least-squares algorithm. Each level of the recognition system contains
recognizers, each of which can recognize one kind of antigen.
A multiple-valued immune network classifier (MVINC), based on immune network theory, was
applied to the classification of remote sensing images; it builds immune memory using logic
theory and immune theory.
EaiNET combines AIS with Particle Swarm Optimization (PSO). It uses the learning mechanism
of PSO, in which each individual is able to learn from the best member of the population, and as
a result the convergence rate increases.
Radial Basis Function (RBF) artificial neural networks and AIS are combined for compression of
the data set; the tool used is called aiNET. It can also be used to determine the number of radial
basis functions in an RBF neural network (RBFNN).
A fault diagnosis model based on the immune evolution algorithm was proposed. Because the
diversity evaluation in the design is very complex and fault detection is hard, a fault calculation
technique that integrates the inductive and static approaches was designed [13].
By combining agent-based modeling and UML, the computational properties of degenerate
recognition systems have been investigated. The results indicate that recognition with
degenerate receptors is possible and that, compared with a non-degenerate system, recognition
occurs more quickly.
In the resource-limited AIS, the Network Affinity Threshold (NAT) is not recalculated during
the network evolution process: the network granularity is determined by the NAT, and its initial
value is calculated from the distances between the antigens. The convergence and stability of
the population can be impaired by pure clonal selection and random mutation operations.
The gene immune detection algorithm with a complement operator effectively decreases the
false positives that arose in the previous gene immune detection algorithm. Vaccine and
complement operators are also introduced. The number of detectors is reduced and the efficiency
of detection is increased. The complement operator overcomes the defect of the gene immune
algorithm, and the detection time can be reduced drastically.
ICAIS, an incremental clustering algorithm based on the principles of AIS, was introduced. It
uses the primary immune response to identify data belonging to novel clusters and the
secondary immune response to identify data belonging to old patterns [13].
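The split between a first response to novel data and a secondary response to known patterns can be sketched with a simple distance-threshold rule. This is an illustration only, not the ICAIS algorithm from [13]: the use of Euclidean distance, a fixed threshold, running-mean cluster centers and the function name incremental_cluster are all assumptions.

```python
from math import dist


def incremental_cluster(points, threshold):
    """Assign each incoming point to a cluster, one point at a time."""
    centers, counts, labels = [], [], []
    for p in points:
        nearest, nearest_d = None, None
        for idx, c in enumerate(centers):
            d = dist(p, c)
            if nearest_d is None or d < nearest_d:
                nearest, nearest_d = idx, d
        if nearest is not None and nearest_d <= threshold:
            # Secondary response: the pattern is already known, so the
            # existing cluster center is refined with a running mean.
            counts[nearest] += 1
            n = counts[nearest]
            centers[nearest] = tuple(ci + (pi - ci) / n
                                     for ci, pi in zip(centers[nearest], p))
        else:
            # First response: the data is novel, so a new cluster is created.
            centers.append(tuple(p))
            counts.append(1)
            nearest = len(centers) - 1
        labels.append(nearest)
    return labels, centers


labels, centers = incremental_cluster(
    [(0.0, 0.0), (0.2, 0.0), (5.0, 5.0), (5.1, 5.0)], threshold=1.0)
print(labels)
```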
A model based on Learning Vector Quantization (LVQ) and the immune network [13], which is
an extension of Jerne's basic model, was proposed for pattern recognition. A new classification
method, the Hybrid Fuzzy Neuro-Immune Network, is based on the multi-epitope approach. The
proposed method shows promising results in terms of pattern recognition.
APPENDIX A
Pattern Recognition in the Immune System using a Growing SOM
[ The following project is taken from Ph. D Thesis of Leonardo De Castro ]
function [w,win,cwin,D] = abnet(ag,eps,comp,alfa,beta,pc,pm),
% Pattern Recognition in the Immune System using a Growing SOM
% Bipolar Splitting/Pruning Self-Organizing Feature Map (GSOM)
% with Evolutionary Phase
% Main features: bipolar weights, Hamming Distance, Winner takes all
% PHASE I: Growing followed by Pruning
% PHASE II: Supervised Evolution
%
% function [w,win,cwin,D] = hybrid(ag,eps,comp,alfa,beta,pc,pm),
% w -> weight matrix (Ab population)
% win -> winner for each Ag (v)
% cwin -> amount of winning of each individual (tau)
% D -> hamming distance of each Ag with relation to its mapped class
% ag -> antigen population to be recognized (n2xs2)
% eps -> ball of stimulation
% comp -> comparison: 1 for comparing complementary chains
%                     0 for comparing identical chains (Hamm. dist.)
% alfa -> amount of bits to be changed
% beta -> number of iterations for reducing the learning rate
%
% Auxiliar functions: COVER, UPDATE, SPLIT, PRUNE, MATCH, CADEIA, TESTGSOM
% The columns of w must be similar to each Ag

if nargin == 2,
    [n2,s2] = size(ag);
    comp = 0;
    alfa = 3;
    beta = 3;
    pc = 0.6;
    pm = 0.1;
end;

% Network parameters
ep = 0; alfa0 = alfa; TD = 1;
[np,ni] = size(ag); no = 1; vep = [0];
[C,maxno] = cover(ni,eps); vno = [1:1:no];
disp(sprintf('Coverage of each Ab: %d',C));
disp(sprintf('Initial number of classes: %d',no));
disp(sprintf('Possible number of classes: %d',maxno));
if maxno > np,
    maxno = np; disp(sprintf('Maximum number of classes (N): %d',np));
end;
% disp(sprintf('Affinity threshold: %d',eps));
disp(sprintf('Press any key to continue...'));
pause;
[w] = cadeia(ni,no,0,0,1);
max_ep = (beta + 1) * maxno;

% Network Definition
while (ep < max_ep & TD > 0)% & no < maxno),
    cwin = zeros(1,no); k = 0;
    vet = randperm(np); % Assincronous
    while k < np,
        k = k+1; i = vet(k); D = [];
        [D,mXOR] = match(w',ag(i,:),comp);
        [v(k),ind] = min(D);
        cwin(ind) = cwin(ind) + 1;
        win(i) = ind;
        w = update(w,ind,alfa,mXOR(ind,:)');
    end;
    TD = sum(v);
    ep = ep + 1;
    % Growing Phase
    if (rem(ep,beta)==0),
        [w,no,alfa] = split(cwin,win,w,ag,eps,alfa,alfa0);
        vno = [vno no]; vep = [vep ep];
    end;
    % Pruning Phase
    [aux,indmin] = min(cwin);
    if aux == 0,
        [w,no,alfa] = prune(w,indmin,alfa0);
        vno = [vno no];
    end;
    % Learning rate decreasing
    if (ep > 0.05*max_ep & rem(ep,0.05*max_ep)==0),
        if alfa > 1,
            alfa = alfa - 1;
        end;
    end;
    disp(sprintf('IT: %4.0d no: %d LR: %d TD: %d',ep,no,alfa,TD));
end;
[v,win,cwin,perc] = testgsom(w,ag,eps);
disp(sprintf('Percentage of misclassified Ag: %3.2f%%',perc));
disp('Minimal Antigenic Affinity (HD)'); disp(v);
disp('Concentration Level: '); disp(cwin);
disp(sprintf('Final Architecture: [%d,%d].',ni,no));
figure(1); plot(vep,vno); hold on; plot(vep,vno,'or'); axis([0 ep+1 0 no+1]);
title('Growing Evolution');
xlabel('Iteration'); hold off;

% --------------------------- %
% INTERNAL SUBFUNCTIONS       %
% --------------------------- %

% Function CADEIA
function [ab,ag] = cadeia(n1,s1,n2,s2,bip)
if nargin == 2,
    n2 = n1; s2 = s1; bip = 1;
elseif nargin == 4,
    bip = 1;
end;
% Antibody (Ab) chains
ab = 2 .* rand(n1,s1) - 1;
if bip == 1,
    ab = hardlims(ab);
else,
    ab = hardlim(ab);
end;
% Antigen (Ag) chains
ag = 2 .* rand(n2,s2) - 1;
if bip == 1,
    ag = hardlims(ag);
else,
    ag = hardlim(ag);
end;
% End Function CADEIA

% Function SPLIT
function [w,no,alfa] = split(cwin,win,w,ag,eps,alfa,alfa0)
[ni,no] = size(w);
[ind] = find(cwin > 1); % which outputs map more than one Ag
if ~isempty(ind),
    [val,out] = max(cwin);
    % out = ind(1);
    v = find(win==out);
    Mag = ag(v,:); % matrix of ag mapped in the same output
    D = match(Mag,w(:,out)',0);
    [aux,new] = max(D);
    if aux > eps,
        disp('** Growing **');
        if out == 1,
            w = [Mag(new,:)',w];
        elseif out == no,
            w = [w,Mag(new,:)'];
        else,
            w = [w(:,1:out),Mag(new,:)',w(:,out+1:end)];
        end;
        no = no + 1;
        alfa = alfa0;
    end;
end;
% End Function SPLIT

% Function TESTGSOM
function [v,win,cwin,k] = testgsom(w,ag,eps),
% disp('** Running the trained network **');
[np,ni] = size(ag); k = 0;
cwin = zeros(1,size(w,2));
for i=1:np,
    [D] = match(w',ag(i,:),0);
    [v(i),ind] = min(D);
    win(i) = ind;
    cwin(ind) = cwin(ind) + 1;
end;
k = 100 * (sum(v > eps) / np);
% End Function TESTGSOM
% Function PRUNE
function [w,no,alfa] = prune(w,ind,alfa0),
[ni,no] = size(w);
disp('** Pruning **');
if ind == 1,
    w = w(:,2:no);
elseif ind == no,
    w = w(:,1:no-1);
else,
    w = [w(:,1:ind-1) w(:,ind+1:no)];
end;
no = no - 1;
alfa = alfa0;
% End Function PRUNE

% Function COVER
function [C,no,eps] = cover(len,eps),
fat = fatorial(len);
C = 0;
while eps > len,
    disp(sprintf('Ball of stimulation bigger than chain length %d',len));
    eps = input('Enter a new ball of stimulation: ');
end;
for i=0:eps,
    C = C + (fat/(fatorial(i) * fatorial(len-i)));
end;
no = ceil((2^len)/C);
% End Function COVER

% Function FATORIAL
function fat = fatorial(m);
if m == 0,
    fat = 1;
elseif m < 0,
    disp('Negative value');
else,
    fat = prod(1:1:m);
end;
% End Function FATORIAL

% Function UPDATE
function [w] = update(w,ind,alfa,vXOR),
[ni,no] = size(w);
for j = 1:alfa,
    [val,pto] = max(vXOR);
    if val == 0,
        break; % exit loop if vectors are equal
    end;
    w