Lecture 6 - 2-Transcript
All right, um, let's start. Let me check the outline again before we continue, and I'll check the recording. So, basically, on the right-hand side, I said we started with a question. We covered benchmark functions, and we saw some experiments over those benchmark functions from a real publication; those were their experimental results. As you can tell, different versions of different algorithms lead to different performance, and designing the right meme for the problem in hand is important, basically. Now we will continue with another paper, on the Travelling Salesman Problem. Let me open up the outline again. In this section we will go through the travelling salesman problem. Then we will get into permutation-based operators: they are similar to our generic crossover and mutation operators, but they care about the permutation; they take the order into consideration, basically. Then we will get into multimeme memetic algorithms. Then we will revisit all those experiments that we have seen, because, as I said, there is no single recipe at the moment. Okay: the travelling salesman problem.
Another paper: A Brief Review of Memetic Algorithms for Solving the Euclidean Two-Dimensional Travelling Salesman Problem. This is also from the same author, and it comes from our school. So, a binary representation for encoding may not be suitable for us here. For this problem, the travelling salesman, I think you are familiar with it: we've got cities, and we've got one person, and that person should visit all the cities once and complete the cycle. But if you go for the binary representation, it may not be suitable, because it may cause some illegal tours. That means the person may not visit all the cities, or may go to some other cities, because we may end up with undefined city codes if we use binary encoding. Also, the same person may visit one city more than once, twice or even three times. And it may cause some loops, the same person going over the same cities again and again. That's why we would need some repair algorithm to deal with those kinds of illegal issues; when I say illegal issues, they are listed here, as you can see.

So let's see in an example why it is unsuitable. Let me get the pen. Basically, we've got one individual, and another individual as well. Imagine these are parents; these are our chromosomes. At the top you are seeing the city numbers, the second city, the fifth city, the first city, and each city is represented by three bits. From the previous lecture we've seen the general crossover operator: we've got one crossover segment, and what do we do in a normal crossover? We swap. Once we swap, we end up with these individual solutions. But in this case, as you can see, city number one is visited twice; this is the actual route of the salesman: it goes first to the second city and then back to the first city, again and again. So this is an illegal tour for the salesman. Also, we end up with another city which is not even in the list. Why? All because we simply picked one crossover point and swapped; we ended up with a real mess.

If we don't use bitstrings but the actual city numbers, it may still not work, as in the example here. One chromosome of city numbers is 2 5 3 1 4, one tour, and the other individual solution is 5 2 1 4 3. Once we pick one crossover point and swap, again we get another illegal tour, visiting the same city twice. That's why we cannot apply the generic crossover operator here: when the order matters, we cannot apply the generic crossover or mutation operators. That's why we will get into permutation-based genetic operators.

As you will not be surprised to hear, there are different strategies here too. Remember we had 1PTX, UX, k-PTX and a whole bunch of operators and strategies for standard crossover; for permutation-based crossover we also have different strategies, of which I will cover only three. Then I'll give you the list to explore if you want. So: partially mapped crossover, PMX. I'm going to skip the text and just give you the essence of it: it preserves order and position, so it is designed to protect the order and position of the visited cities. And it's not only for the travelling salesman problem; for any problem which cares about the order, the permutation, of the solution, you can apply PMX. Let's get into an actual example. Again we've got two individual solutions. The first solution is 1 to 9 in order; whatever it is doesn't really matter. And the second starts from the fourth city and ends with the third one. Yeah.
Then, in PMX, we've got two crossover points, and first we swap the segments in between them: the second parent's segment becomes the first offspring's segment, and vice versa. Then we define a map; in this case the map says 1 maps to 4, 8 to 5, 7 to 6, and 6 to 7. We will use this map to take the permutation into consideration. Okay, so we've swapped the segments and we've got our map. Next we fill in the cities from the other parent where there is no conflict. For instance, the 2 and the 3 come straight into the missing part here. However, there is a conflict: we cannot put a 1 here, on this x, because 1 is already visited, so we leave it blank. The same for the 8, while the 9 comes in fine; we leave the 8's slot blank for now and put in the 9. The same applies to the second offspring. Then what do we use? Our map, actually. We were supposed to put a 1 here, which we cannot; so, what is 1's equivalent in the map? It is 4, so the 4 fills the blank, and likewise for the other blanks. We end up with two offspring which take the order and position into consideration. That is the PMX strategy, one strategy for order and position.
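To make this concrete, here is a minimal Python sketch of the PMX variant walked through above: the segment between the cuts is swapped in from the other parent, and conflicts are resolved by following the map. The second parent below is my reconstruction from the slide's mapping (1↔4, 8↔5, 7↔6, 6↔7), so treat the concrete values as illustrative.

```python
def pmx(receiver, donor, cut1, cut2):
    """PMX as walked through above: the child takes the donor's segment
    between the cuts and fills the rest from the receiver, following the
    segment map whenever a city is already taken."""
    child = [None] * len(receiver)
    child[cut1:cut2] = donor[cut1:cut2]               # swapped-in segment
    segment = set(child[cut1:cut2])
    mapping = {donor[i]: receiver[i] for i in range(cut1, cut2)}
    for i in [*range(cut1), *range(cut2, len(receiver))]:
        city = receiver[i]
        while city in segment:                        # conflict: follow the map
            city = mapping[city]
        child[i] = city
    return child

p1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
p2 = [4, 5, 2, 1, 8, 7, 6, 9, 3]
o1 = pmx(p1, p2, 3, 7)   # [4, 2, 3, 1, 8, 7, 6, 5, 9]
o2 = pmx(p2, p1, 3, 7)   # [1, 8, 2, 4, 5, 6, 7, 9, 3]
```

Applying it both ways gives the two offspring, and each result is still a valid permutation of the nine cities, so no repair step is needed.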
The other one is OX, order crossover. Is there a question? No? Okay. I'm not going to read through all the text; the key point is that it preserves the relative order of cities. In this case the order of these two solutions counts as identical according to OX: the first contains the sequence 9, 3, 4, 5, 2, and the second also contains 4, 5, 2 in the same relative order. The order is the same, and preserving it is, basically, the strategy of the order crossover.
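Before the worked example, the idea can be sketched in Python as follows: keep the first parent's segment in place, then fill the remaining slots, wrapping around from the second cut point, with the other parent's cities in the order they appear from that cut, skipping any already present. The second parent here is my reconstruction of the slide's example, so the concrete values are illustrative.

```python
def ox(p1, p2, cut1, cut2):
    """Order crossover: keep p1's segment in place; fill the rest,
    wrapping from the second cut, with p2's cities in relative order."""
    n = len(p1)
    child = [None] * n
    child[cut1:cut2] = p1[cut1:cut2]
    seen = set(child[cut1:cut2])
    # p2's cities read from the second cut point, wrapping around
    sequence = [p2[(cut2 + i) % n] for i in range(n)]
    fill = [c for c in sequence if c not in seen]      # omit already-visited
    slots = [(cut2 + i) % n for i in range(n) if child[(cut2 + i) % n] is None]
    for pos, city in zip(slots, fill):
        child[pos] = city
    return child

p1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
p2 = [4, 5, 2, 1, 8, 7, 6, 9, 3]
o1 = ox(p1, p2, 3, 7)    # [2, 1, 8, 4, 5, 6, 7, 9, 3]
```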
Let's see how it works on the same example, P1 and P2, parent one and parent two. Again we've got two crossover points, and we copy the segments into the offspring as they are; we don't swap them this time. This is OX. Later on, starting from the second cut point, here, of the other parent, we simply take the sequence of cities from there. Let's look at the second parent: the sequence is 9, 3, 4, 5, 2, 1, 8 and so on, wrapping around. We will use this sequence to fill up the first offspring, which normally comes from the second parent. However, 4, 5, 6 and 7 have already been visited, so we omit them from the given sequence. And that's it: we fill the first offspring, starting from the second cut point, with 9, 3, 2, 1, 8. Now we've got our first offspring, and the same procedure is applied for the second offspring as well. So this is the so-called OX, order crossover, which also takes into consideration the permutation: the order of the visited cities, the order of the solution.

Another one, and I believe this is the last one, is something called cycle crossover, CX. This is a bit of an interesting one; we will see it in the example in a second, but basically it preserves the absolute position of elements, the absolute position of cities, or whatever the solution contains. Let's see how it works. First, we randomly select a starting point in parent one; in this case the starting point is the 1. Then we start drawing cycles. What do I mean by that? We put in the 1 first, the initial point. Then we start mapping between the parents: beneath the 1 sits a 4, and we find that 4 in parent one, which means we can add the 4 to our first offspring; the 4 is added. Then we have another mapping: beneath the 4 sits an 8, so we can include the 8 in our first offspring. One by one, iteratively, we are adding the elements by tracing out a cycle. What is next? You can guess: beneath the 8 sits a 3, which means we can add the 3 to our first offspring here. Next, beneath the 3 sits a 2, so the 2 is added.
And beneath the 2 sits a 1, which takes us back to the start. Do you see what I mean by a cycle? We start from the 1 and map one by one until we complete the whole cycle, and we end up with this partially completed offspring. For the rest, we simply copy the second parent's cities, its elements, one by one, and we end up with this offspring. The same procedure gives offspring two as well. Do you see what I mean? Previously, crossover was relatively easy, especially the one-point crossover: we selected one point, swapped, and were done. But in this case we have to be careful about the order, and sometimes the position as well. That's why people came up with different strategies to handle the permutation.

Now we have covered cycle, order and partially mapped crossover. But, as you may not realise, there are many more strategies, and even these are relatively simple ones; as you can see, cycle crossover was proposed by Oliver, Smith and Holland back in 1987, I believe, and there are much more advanced crossover strategies that consider the permutation as well. So that was crossover for permutations; there are also mutations for permutations. When you flip things, when you perturb values in your chromosome, you may still get illegal tours, an unwanted chromosome or solution. That's why some mutation strategies also care about the permutation, which I'm not going to cover in detail. However, I wanted to highlight these two, exchange mutation and insertion mutation; you can tell how they work just by looking at their names. In the paper that we will see, they use these mutation techniques.
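The two mutations are easy to sketch in Python; note that both keep the chromosome a valid permutation, so no repair step is needed.

```python
import random

def insertion_mutation(tour):
    """Remove a random city and re-insert it at a random position."""
    t = list(tour)
    city = t.pop(random.randrange(len(t)))
    t.insert(random.randrange(len(t) + 1), city)
    return t

def exchange_mutation(tour):
    """Swap the cities at two random positions."""
    t = list(tour)
    i, j = random.sample(range(len(t)), 2)
    t[i], t[j] = t[j], t[i]
    return t
```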
That's why I just wanted to point out. But there are a bunch of different mutation
strategies as well. Okay, so let's get into actual experiment. GA versus ma.
Solving the TSP problem and experiment settings is. Representation is the real
world this time. As you can tell, we are not using bitstring y because it caused
some illegal tools. We use the actual values actual numbers. And here the first is
starting from the first and finishing on the fourth day. Fitness function is the
path length, obviously crossover. They use three different crossover PMCs that we
just covered two TX. This is a generic one from the last lecture, but they've got
some repair patch up and. Strategies here and OCS also the one we just covered. So
they tried three different crossover technique and two different mutation operator
insertion and exchange mutation. Mate selection. They use rank and tour with
centres. In the last lecture we haven't covered the rank one, but I mention it just
the name replacement TJ transgenerational and steady state. Two different
replacement strategies as you can tell. Again, there are bunch of different
experiments are going on in this paper as well. Hill climbing they use random
mutation hill climbing by using insertion mutation. And again the same hill
climbing with the exchange mutation. Yeah. So that was the experimental settings.
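As an illustration of that local search (my sketch, not the paper's code): random-mutation hill climbing repeatedly applies a random insertion move and keeps it only when the closed tour's Euclidean path length, the fitness here, does not get worse.

```python
import math
import random

def tour_length(tour, coords):
    """Euclidean length of the closed tour over the given city coordinates."""
    return sum(math.dist(coords[tour[i]], coords[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def rmhc_insertion(tour, coords, iterations=2000):
    """Random-mutation hill climbing using the insertion move."""
    best = list(tour)
    best_len = tour_length(best, coords)
    for _ in range(iterations):
        cand = list(best)
        city = cand.pop(random.randrange(len(cand)))   # insertion mutation
        cand.insert(random.randrange(len(cand) + 1), city)
        cand_len = tour_length(cand, coords)
        if cand_len <= best_len:                       # keep if not worse
            best, best_len = cand, cand_len
    return best, best_len

# Four cities on a unit square: the optimal closed tour has length 4.
coords = [(0, 0), (0, 1), (1, 1), (1, 0)]
best, length = rmhc_insertion([0, 2, 1, 3], coords)
```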
And these are the synthetic data; each point represents one city. We throw in the salesman, or rather the candidate solutions, and see which one finds the shortest path. Some instances are easier, and the others, not shown here, are a bit more complicated. And there are also some cities from Turkey. So this is their experimental setting. I will only cover one of their tables, one set of results; this is just for you to get an idea of what sort of experiments people do in publications and what sort of discussion they get out of those results. So, to cover this: there is a steady-state genetic algorithm, a transgenerational genetic algorithm, versions with hill climbing, and so on; there are four different algorithms in the GA-type column. Then there is the data set used, as I said, and insertion mutation (IM) and exchange mutation (EM). Alpha represents the average fitness per generation at each run, and beta represents the best fitness across the 100 runs; they've got 100 runs. For example, if we focus on the first line here (is it visible? I guess so): they use C20, I believe 20 cities. The alpha, the average fitness, is 145, and the beta, the best fitness over the 100 runs, is 62 when they use insertion mutation. It's significantly different when they use exchange mutation: the alpha is 110, and the beta, the best fitness over the 100 runs, is 71. Which one do you think is better, IM or EM? Tricky question, isn't it? Because when we look at the overall performance, the alpha, EM seems to be better; but when we look at the beta, the best fitness, IM is better than EM. I would choose IM personally, because the aim here is to find the best solution, isn't it? A question? Yes. Okay, let's go through the results.
Their findings, basically. For crossover, as I said, they tried three different ones: OX seems to be the best overall, 2PTX seems to be the second best, and PMX is last. For mutation, EM overall seems to be better than IM. For mate selection, tournament selection is better than rank. For replacement, transgenerational (TRGN) replacement seems to be better than steady state (SS). And for the overall approach, hill climbing with EM seems to be better than the rest. So these are relatively limited experiments, let's say; but based on those experiments, on these particular settings and their synthetic data, if you try all these different strategies, then OX with EM in a memetic algorithm is the best one, at least better than the others.

Now we will get into a classic classification of memetic algorithms. The thing I want to point out here is that, again, there is no single recipe: you cannot simply say, for any problem I will choose meme X and go for it. No, you have to try different ones, and again the question is: which one? So let's get into a classification of memetic algorithms. As you can see in the paper at the bottom, there are different types of memetic algorithm: there are static ones, there are adaptive ones, and there are the so-called self-adaptive ones. That is what we are going to cover for the rest of the lecture; it is called the multimeme memetic algorithm, MMA if you like the abbreviation. Let's move on to multimeme memetic algorithms. As you can tell, we are using multiple memes in our algorithm.
Excellent. So, self-adaptation: what does it mean? It means deciding which operator to use on the fly. You have different options for memes, and on the fly the algorithm itself decides which one to use. As you can tell by now, there are many different memes; some work on some problems but not on others, and we don't know which one is which. That's why we simply dump a list of memes into our algorithm and let the algorithm decide: self-adaptation. There are different adaptation techniques. Some use varying probabilities of applying the different operators: in that case we've got a list of operators within the algorithm, and each operator has a different probability of being used. Other colleagues proposed dividing the population into subpopulations, with a different operator for each subpopulation, something like that. And Spears decided to use an additional bit, a bit that decides whether to apply 2PTX or UX. This is an interesting one, which we will extend a little: basically, an additional bit in the chromosome decides which operator to use, 2PTX or UX. Then there is a PhD thesis, here at the bottom, done at the University of the West of England, by Krasnogor. I guess he formalised this Spears-style technique, the strategy of additional material deciding which operator to use. Yeah.
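That additional-bit mechanism can be sketched like this (hypothetical code, only to illustrate the idea): each individual carries one extra bit alongside its genes, and that bit, rather than a global setting, selects the crossover operator.

```python
import random

def two_point(a, b):
    """2PTX: the child takes b's middle segment between two random cuts."""
    i, j = sorted(random.sample(range(1, len(a)), 2))
    return a[:i] + b[i:j] + a[j:]

def uniform(a, b):
    """UX: each gene comes from either parent with probability 0.5."""
    return [x if random.random() < 0.5 else y for x, y in zip(a, b)]

def self_adaptive_crossover(parent1, parent2):
    genes1, op_bit = parent1                     # individual = (genes, extra bit)
    genes2, _ = parent2
    op = two_point if op_bit == 0 else uniform   # the extra bit decides
    return op(genes1, genes2), op_bit            # the child inherits the bit

child_genes, child_bit = self_adaptive_crossover(([0] * 8, 0), ([1] * 8, 0))
```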
In this strategy, basically, a meme encodes how to apply an operator, when to apply it, where to apply it, and how frequently to apply it. It does all the work for you: that meme encodes all the information, as I said, which operator to apply, where, how, and so on. So in this case it is really self-adaptive; you don't have to try different memes by hand each time. And all this information can be combined into something called a memeplex, which we will see in a second. This is the grammar of the memeplex. Basically, let me get the pen: a memeplex covers memes, and the memes contain the information, such as the location of crossover, the location of mutation, when to apply, which general hill-climbing strategy to apply, or frequencies, always, let's say, or never, something like that. So the memeplex contains all the information, and the self-adaptive search will work within the memeplex to choose which one to use.

I've got a simple example, but before that let me talk a bit about the features of multimeme memetic algorithms. Basically, as I said, we define a set of rules, of strategies, and a meme represents the instructions for them. Why? Because we want the algorithm to improve itself. The interaction between memes and genes is not direct. Also, memes can evolve. What does that mean? This is an important bit: memes can evolve by using an evolutionary strategy. Remember, you've got many different things in a memeplex, and the meme will evolve to choose which one to go for. Here is an example implementation. This is the simple OneMax setting from the last lecture. Remember the good old times: we've got a really small chromosome of bits, which was the genetic material for us. But now we are attaching another material to our gene, our chromosome, a memetic material, to represent, whatever it is, a meme: here it's the hill-climbing meme.
And the option here is 2, which means, given that we defined three hill-climbing methods in our algorithm, that if the memetic material is 2 we will be using next-descent hill climbing. What does that mean when it comes to hill climbing? For this genetic material on the left-hand side, the 2 dictates that we will apply next-descent hill climbing. Remember from the experiments that there are many different hill climbers, which work on different problems; but now, in a way, we are trying all three, and the algorithm will decide which one to go for. It has the potential to use different memes. That was a simple one, but in the real world it is obviously going to be a bit more complicated; in a memeplex we can inject all the information: the mutation probability, the mutation operator, use insertion or use exchange, the crossover operator, the frequency. We can dump all the information, all the different options, into the memeplex, and that memetic material will evolve itself.

Okay, so let's get into how it evolves, with simple examples again. Basically the same person, Krasnogor, in what seems to be a conference publication, suggested a really simple mechanism. Let's say we ended up with these two offspring, one and two, and they've got different memetic material; remember, we now attach extra information to our chromosomes. The thesis says: whichever has the better fitness, copy its memetic material to the other offspring. Previously, the hill-climbing meme for this genetic material was 1, whatever that is, let's say steepest-descent hill climbing, and for this genetic material it was 0, let's say next-descent hill climbing. But now, for both, it becomes 0, because the second one has the better fitness; that's why we copy its whole memeplex over to the other one. So this is how we actually evolve the memetic material in the algorithm.

Mutation? Yes, we can mutate our memeplex as well: we mutate our genetic material, but we can also mutate the memetic material. And there is something called the innovation rate, IR, another parameter: the probability of mutation of memes. Basically, we perturb or flip the memetic material; here you see a simple example where the hill-climbing meme is 2 and we've got three options, so we simply change it to one of those, chosen randomly. For TSP, obviously, there would be some more complex meme mutations as well. But the important bit is our innovation rate.
Basically, if the innovation rate is zero, that means there is no innovation: if a meme option is not introduced in the initial generation, it will never be reintroduced. And if it is one, all the strategies implied by the available memes might be used equally. What does that mean? We will have a bunch of different memes, and if IR is one, all of those memes can be used equally. So, on to measuring meme performance. There is something called the concentration of a meme: the total number of individuals that carry meme i at a given generation t. I will show you this in a figure; it will be easier. And then there is something called the evolutionary activity of a meme, which is the accumulation of the meme's concentration up to a given generation. What do I mean by that? Let's say we ended up with two memes, two simple local searches: one is random-mutation hill climbing and the other is next-descent hill climbing. We have one generation, one population, and we check the concentration: how many individuals carry the random-mutation hill climbing meme, and how many carry next-descent. If 90% of them have random-mutation hill climbing, then the concentration of that meme is 90%. That's what I meant. And for the evolutionary activity, over the generations we simply accumulate how much that particular meme is used. This is important because we will see some results about it: basically, how we accumulate how many times a meme is used.
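Both measures are straightforward to compute; a small sketch, with the meme stored as a field on each individual and the concentration expressed as a fraction of the population:

```python
def concentration(population, meme):
    """Fraction of individuals carrying `meme` in this generation."""
    return sum(ind["meme"] == meme for ind in population) / len(population)

def evolutionary_activity(history, meme):
    """Accumulated concentration of `meme` up to the given generation,
    where `history` holds one population per generation."""
    return sum(concentration(pop, meme) for pop in history)

# 90% of this generation carries random-mutation hill climbing (RMHC)
gen0 = [{"meme": "RMHC"}] * 9 + [{"meme": "NDHC"}]
gen1 = [{"meme": "RMHC"}] * 8 + [{"meme": "NDHC"}] * 2
```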
So let's get into the experiments quickly. This is a memetic algorithm, as you will remember: we've got genes, we've got parents, crossover, mutation, and in this case one hill-climbing local search method. Easy peasy, right? But in the multimeme memetic algorithm we have four different memes for the local search. And what do we do? We attach extra memetic information, memeplexes, to our solutions. These are the genes, and this is the memetic information, which means this genome will be using H4, whatever that is, one of the memes. So we select parents here and then apply crossover; as I said, during crossover the memeplex can be copied to the offspring based on the fitness, as you remember. Then we can mutate our memeplexes as well: we mutate our genes, and we mutate our memeplexes too. At the end, H2 from the list here and H3 from the list here will be applied to the corresponding genomes: now that we are at the hill-climbing step, H2 will be applied to this one and H3 will be applied to this one. Do you see what I mean?
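Putting the pieces together, one offspring step of such a multimeme algorithm might look like this on OneMax (my sketch; the two toy local searchers stand in for the meme list): the child's genes come from crossover, its meme is inherited from the fitter parent and randomised with probability IR, and the inherited meme then chooses the local search.

```python
import random

def fitness(genes):
    return sum(genes)                    # OneMax: count the 1s

def bitflip_hill_climb(genes):
    """Next-descent style pass: accept each single-bit flip that improves."""
    g = list(genes)
    for i in range(len(g)):
        flipped = list(g)
        flipped[i] ^= 1
        if fitness(flipped) > fitness(g):
            g = flipped
    return g

def no_local_search(genes):
    """A deliberately weak meme: does nothing."""
    return list(genes)

MEMES = [bitflip_hill_climb, no_local_search]

def offspring(parent1, parent2, innovation_rate=0.2):
    (g1, m1), (g2, m2) = parent1, parent2
    cut = random.randrange(1, len(g1))
    child = g1[:cut] + g2[cut:]                      # one-point crossover
    meme = m1 if fitness(g1) >= fitness(g2) else m2  # fitter parent's meme
    if random.random() < innovation_rate:            # innovation rate (IR)
        meme = random.randrange(len(MEMES))          # random meme instead
    return MEMES[meme](child), meme                  # meme picks local search
```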
Previously we didn't know which meme to use, but now the algorithm itself evolves the memes and decides which one to use. All right, let's move on to the experiments, revisiting the benchmark functions. In the experiment settings from the paper, the innovation rate IR is 0.2, and two sets of experiments are done. In the first, a single good meme and two poorly performing memes are selected. This draws on the earlier results: for each benchmark function we knew which meme was the best and which were the worst. Using those results they set up an experiment: choose one of the good ones and two bad ones, and put those three in as the meme options. Okay? Is it clear? Yeah. In the second set of experiments, they chose four or five different algorithms, a GA and some memetic algorithms, and the objective there is to see whether there is some sort of relation, a synergy, between the different memes. Okay. So, this was the previous result, but now we are seeing new results from that paper. As I said, one of the criteria was the evolutionary activity, the accumulation of memes. The x axis is the generation number, and as we go along the generations, on the y axis we see the evolutionary activity. So, F1: as you remember, this is a fairly easy one, and we knew which meme was performing well on it. Into the memeplex they put those three memes and let the algorithm choose, observing which one is picked more. And as you can see, at some point the good meme becomes the lead meme in the pool. Previously we didn't know which one to choose, but now we throw three of them into the algorithm, the algorithm decides, and it goes ahead and chooses the correct one, because that meme was indeed performing well. So in this case we don't have to carry out all these extensive experiments.
We simply throw a few memes at the algorithm, and the algorithm will decide, thanks to evolution. Yeah. Another experiment: for the F2 benchmark function, M3, a bitwise hill-climbing meme, was the best one, and as you can tell, from the beginning the algorithm says: I'm going to choose M3, basically. As for the rest, again, different benchmark functions have been tested, and in most of the cases the best meme has been chosen by the algorithm itself. Self-adaptiveness. So, for the second experiment, as I said, they checked the synergy between different memes: they wanted to see whether the memes behave well with each other, and they threw five different algorithms at some benchmark functions. Yes, it worked well in the MMA context; for F1, for example, the result is 17, but if we choose MA1 it's 19. So sometimes the memes worked well together, but sometimes they didn't have enough synergy, actually. I'm going to leave these slides here for you to explore in your own time; it is the same principle, though: into the memeplex they throw five different memes and see which one is picked up. MA3 is the one it should have picked, and yes, it picked it. Well, yeah. All right, let's get into the results, and then I will have a
big summary to talk about. The average evolutionary activity plots show that the multimeme approach is successful: the multimeme algorithm picks the best meme for you. In all runs, full success is achieved: since it's multimeme, we've got different options, and in all runs we reach the global minimum. If you remember from the first part, some memes did not provide the best result, or did not provide anything to us; but here, since we've got different memes in our algorithm, every run reaches the global minimum. And yes, the MMA can identify the best meme for us. There are some comparisons as well: comparing the experimental results of the MMA and of the MA with the best meme for each benchmark function indicates that the MA with the best meme is better on the average number of evaluations, except for the benchmark functions given in the list here.

Okay, let's get to the overall summary. Basically, I just want to mention that these are empirical results: there are a lot of experiments going on in the literature, but only limited theoretical or mathematical analysis, so let's say this is a growing field of study. Obviously, this kind of algorithm requires a lot of computational hardware, and hence parallel and distributed processing, but I believe it's getting there. Okay, for the overall summary we can simply return to the questions we had, to make it quick. We started with this question: when does a genetic algorithm perform better than the others? The experiments showed that on a simpler benchmark function or a simpler problem, such as the sphere case, the plain GA might be better. Is the choice of meme important? Yes; as you can tell by now, it is. Which meme has the better performance in a memetic algorithm? Which one? We don't know; it depends, doesn't it? Can we somehow combine several memes to obtain better performance? The answer is yes, if we know a limited number of memes, three or four or five of them. I mean, you can't simply throw in all the different memes, like a hundred of them; you need to narrow it down a little and end up with, like, three or four potential best candidates for the meme. Then, yes, we can combine them in a multimeme algorithm and let the algorithm decide which one to use in which setting. So that was the overall summary. Obviously you have access to the slides, but I just want to say thank you.
That was all from me, actually. From now on my colleague will take over the module; however, I may see you in the lab anyway. Thank you all, it was great to meet you. Yeah, that's all.
UNKNOWN
Oh thank you. Thank you.