SC 4 Modul
Introduction to Optimization
Optimization is the process of making something better. In any process, we have a
set of inputs and a set of outputs as shown in the following figure.
Optimization refers to finding the values of inputs in such a way that we get the
“best” output values. The definition of “best” varies from problem to problem, but in
mathematical terms, it refers to maximizing or minimizing one or more objective
functions, by varying the input parameters.
The set of all possible solutions, or values which the inputs can take, makes up the search
space. In this search space lies a point, or a set of points, which gives the optimal
solution. The aim of optimization is to find that point or set of points in the search
space.
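As a toy illustration, finding the "best" input can be sketched as an exhaustive search over a small search space; the objective function and the range below are made up for illustration:

```python
# A made-up objective over a small integer search space.
def f(x):
    return -(x - 3) ** 2 + 9    # hypothetical objective, peaking at x = 3

search_space = range(-10, 11)    # every value the input can take
best = max(search_space, key=f)  # the point giving the optimal solution
print(best, f(best))             # 3 9
```

Real problems rarely allow exhaustive search, which is exactly why techniques like GAs are used.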
GA – Motivation
Genetic Algorithms have the ability to deliver a “good-enough” solution “fast-enough”.
This makes genetic algorithms attractive for use in solving optimization problems. The
reasons why GAs are needed are as follows −
Solving Difficult Problems
In computer science, there is a large set of problems which are NP-Hard. What this
essentially means is that even the most powerful computing systems take a very long
time (even years!) to solve such a problem exactly. In such a scenario, GAs prove to be an
efficient tool to provide usable near-optimal solutions in a short amount of time.
Failure of Gradient Based Methods
Traditional calculus-based methods work by starting at a random point and moving in
the direction of the gradient till we reach the top of the hill. This technique is efficient
and works very well for single-peaked objective functions like the cost function in linear
regression. But most real-world situations present very complex fitness landscapes, made
of many peaks and many valleys, which cause such methods to fail, as they suffer from an
inherent tendency of getting stuck at a local optimum, as shown in the following figure.
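This failure mode can be sketched with a hypothetical two-peak landscape; the function, the peak locations, and the integer neighborhood step below are all illustrative assumptions:

```python
def f(x):
    # Hypothetical landscape: a local peak at x = 2 (height 4)
    # and the global peak at x = 10 (height 8).
    return max(4 - abs(x - 2), 8 - abs(x - 10))

def hill_climb(x):
    # Greedy ascent: move to the better neighbor until no neighbor improves.
    while True:
        best = max([x - 1, x + 1], key=f)
        if f(best) <= f(x):
            return x
        x = best

print(hill_climb(0))    # gets stuck on the local peak at x = 2
print(hill_climb(6))    # happens to start in the global peak's basin
```

Which peak the climber reaches depends entirely on the starting point, whereas a GA maintains a whole population of starting points.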
Basic Terminology
Before beginning a discussion on Genetic Algorithms, it is essential to be familiar with
some basic terminology which will be used throughout this tutorial.
Population − It is a subset of all the possible (encoded) solutions to the given
problem. The population for a GA is analogous to a population of human beings,
except that instead of human beings we have candidate solutions.
Chromosomes − A chromosome is one such solution to the given problem.
Gene − A gene is one element position of a chromosome.
Allele − It is the value a gene takes for a particular chromosome.
Genotype and Phenotype − The genotype is the population in the computation space,
in which solutions are represented in an encoded form, while the phenotype is the
population in the actual real-world solution space. For some encodings, such as a binary
string in which each position indicates whether the corresponding item is picked, the
genotype and phenotype spaces are different.
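The terminology above can be pinned down with a minimal Python sketch, assuming a hypothetical 5-bit binary encoding:

```python
import random

# population: a list of chromosomes (hypothetical 5-bit binary encoding)
population = [[random.randint(0, 1) for _ in range(5)] for _ in range(4)]

chromosome = population[0]        # chromosome: one encoded candidate solution
gene_index = 2                    # gene: one element position...
allele = chromosome[gene_index]   # allele: ...and the value it takes (0 or 1)
print(population, allele)
```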
Basic Structure
The basic structure of a GA is as follows −
We start with an initial population (which may be generated at random or seeded by
other heuristics) and select parents from this population for mating. We apply crossover
and mutation operators on the parents to generate new offspring. Finally, these offspring
replace the existing individuals in the population and the process repeats. In this way,
genetic algorithms try to mimic natural evolution to some extent.
Each of the following steps are covered as a separate chapter later in this tutorial.
A generalized pseudo-code for a GA is explained in the following program −
GA()
   initialize population
   find fitness of population
   while (termination criteria is not reached) do
      parent selection
      crossover with probability pc
      mutation with probability pm
      decode and fitness calculation
      survivor selection
      find best
   return best
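The pseudo-code can be fleshed out into a runnable sketch. The OneMax objective (maximize the number of 1s in a bit string), the operator choices, and all parameter values below are illustrative assumptions, not part of the tutorial:

```python
import random

LENGTH, POP_SIZE, PC, PM, GENERATIONS = 20, 30, 0.9, 0.05, 100

def fitness(ch):
    # OneMax: count the 1s in the chromosome
    return sum(ch)

def select(pop):
    # parent selection via binary tournament
    return max(random.sample(pop, 2), key=fitness)

def crossover(a, b):
    # one-point crossover with probability PC
    if random.random() < PC:
        point = random.randrange(1, LENGTH)
        return a[:point] + b[point:]
    return a[:]

def mutate(ch):
    # bit flip mutation with per-gene probability PM
    return [1 - g if random.random() < PM else g for g in ch]

# initialize population
population = [[random.randint(0, 1) for _ in range(LENGTH)]
              for _ in range(POP_SIZE)]

# generational loop: the new offspring replace the whole population
for _ in range(GENERATIONS):
    population = [mutate(crossover(select(population), select(population)))
                  for _ in range(POP_SIZE)]

best = max(population, key=fitness)
print(fitness(best))
```

Each step (representation, selection, crossover, mutation, survivor selection) is discussed in its own section later.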
Binary Representation
This is one of the simplest and most widely used representations in GAs. In this type of
representation, the genotype consists of bit strings.
For some problems when the solution space consists of Boolean decision variables –
yes or no, the binary representation is natural. Take for example the 0/1 Knapsack
Problem. If there are n items, we can represent a solution by a binary string of n
elements, where the xth element tells whether item x is picked (1) or not (0).
For other problems, specifically those dealing with numbers, we can represent the
numbers with their binary representation. The problem with this kind of encoding is that
different bits have different significance and therefore mutation and crossover operators
can have undesired consequences. This can be resolved to some extent by using Gray
Coding, as a change in one bit does not have a massive effect on the solution.
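A small sketch of why Gray coding helps; `to_gray` is the standard binary-reflected Gray conversion, and 7 and 8 are just an illustrative pair of adjacent values:

```python
def to_gray(n):
    # standard binary-reflected Gray code conversion
    return n ^ (n >> 1)

def hamming(a, b):
    # number of differing bits between two encodings
    return bin(a ^ b).count("1")

# Plain binary: 7 (0111) -> 8 (1000) flips all four bits, so a tiny
# change in value needs a large, unlikely change in the genotype.
print(hamming(7, 8))                    # 4
print(hamming(to_gray(7), to_gray(8)))  # 1: Gray neighbors differ by one bit
```

Under Gray coding, every pair of adjacent integers differs in exactly one bit, so single-bit mutations correspond to small value changes.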
Permutation Representation
In many problems, the solution is represented by an order of elements. In such cases
permutation representation is the most suited.
A classic example of this representation is the travelling salesman problem (TSP). In
this the salesman has to take a tour of all the cities, visiting each city exactly once and
come back to the starting city. The total distance of the tour has to be minimized. The
solution to this TSP is naturally an ordering or permutation of all the cities and therefore
using a permutation representation makes sense for this problem.
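Evaluating a permutation-encoded TSP chromosome can be sketched as follows; the city coordinates are made up for illustration:

```python
import math
import random

cities = [(0, 0), (1, 5), (5, 2), (6, 6), (8, 3)]   # made-up coordinates

def tour_length(perm):
    # total distance of the tour, returning to the starting city
    return sum(math.dist(cities[perm[i]], cities[perm[(i + 1) % len(perm)]])
               for i in range(len(perm)))

tour = list(range(len(cities)))
random.shuffle(tour)    # a chromosome is any permutation of the city indices
print(tour, tour_length(tour))
```

The GA's job is then to search the space of permutations for a tour minimizing `tour_length`.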
Population Models
There are two population models widely in use −
Steady State
In a steady state GA, we generate one or two offspring in each iteration and they
replace one or two individuals from the population. A steady state GA is also known
as an Incremental GA.
Generational
In a generational model, we generate n offspring, where n is the population size, and
the entire population is replaced by the new one at the end of the iteration.
Roulette Wheel Selection
In roulette wheel selection, each individual is assigned a slice of a circular wheel
proportional to its fitness; a fixed point is chosen on the wheel's circumference and
the wheel is rotated. It is clear that a fitter individual has a greater slice of the
wheel and therefore a greater chance of landing in front of the fixed point when the
wheel is rotated. Therefore, the probability of choosing an individual depends directly
on its fitness.
Implementation wise, we use the following steps −
Calculate S = the sum of all fitnesses.
Generate a random number r between 0 and S.
Starting from the top of the population, keep adding the fitnesses to the partial
sum P, till P < r.
The individual for which P first equals or exceeds r is the chosen individual.
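The steps above can be sketched as follows; the population and fitness values are hypothetical:

```python
import random

def roulette_select(population, fitnesses):
    total = sum(fitnesses)                  # S = the sum of all fitnesses
    r = random.uniform(0, total)            # random point on the wheel
    partial = 0.0
    for individual, fit in zip(population, fitnesses):
        partial += fit                      # accumulate until we reach r
        if partial >= r:
            return individual
    return population[-1]                   # guard against float rounding

pop = ["A", "B", "C"]
fits = [1.0, 2.0, 7.0]   # "C" owns 70% of the wheel
picks = [roulette_select(pop, fits) for _ in range(10000)]
print(picks.count("C") / len(picks))   # roughly 0.7
```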
Stochastic Universal Sampling (SUS)
Stochastic Universal Sampling is quite similar to Roulette wheel selection, however
instead of having just one fixed point, we have multiple fixed points as shown in the
following image. Therefore, all the parents are chosen in just one spin of the wheel.
Also, such a setup encourages the highly fit individuals to be chosen at least once.
It is to be noted that fitness proportionate selection methods don’t work for cases where
the fitness can take a negative value.
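SUS can be sketched as follows under the description above, with hypothetical fitness values; the n pointers are spaced total/n apart and a single random spin places all of them:

```python
import random

def sus_select(population, fitnesses, n):
    total = sum(fitnesses)
    step = total / n                     # spacing between the fixed points
    start = random.uniform(0, step)      # one spin decides every pointer
    pointers = [start + i * step for i in range(n)]
    chosen, partial, i = [], fitnesses[0], 0
    for p in pointers:
        while partial < p:               # advance to the segment holding p
            i += 1
            partial += fitnesses[i]
        chosen.append(population[i])
    return chosen

pop = ["A", "B", "C", "D"]
fits = [1.0, 2.0, 3.0, 4.0]
print(sus_select(pop, fits, 4))
```

Any individual whose fitness is at least total/n is guaranteed at least one pointer, which is why highly fit individuals are chosen at least once.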
Tournament Selection
In K-Way tournament selection, we select K individuals from the population at random
and select the best out of these to become a parent. The same process is repeated for
selecting the next parent. Tournament Selection is also extremely popular in literature
as it can even work with negative fitness values.
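A minimal K-way tournament sketch; the population and the (deliberately negative) fitness function are illustrative:

```python
import random

def tournament_select(population, fitness, k=3):
    # pick k individuals at random, keep the fittest as a parent
    contestants = random.sample(population, k)
    return max(contestants, key=fitness)

pop = list(range(-10, 11))     # candidate solutions
fit = lambda x: -abs(x)        # fitness may be negative; the best is x = 0
parent = tournament_select(pop, fit, k=5)
print(parent)
```

Larger k means stronger selection pressure, since weak individuals rarely win a big tournament.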
Rank Selection
Rank Selection also works with negative fitness values and is mostly used when the
individuals in the population have very close fitness values (this usually happens at the
end of the run). With close fitness values, fitness proportionate selection gives each
individual an almost equal share of the pie, as shown in the following image, and hence
each individual, no matter how fit relative to the others, has approximately the same
probability of getting selected as a parent. This in turn leads to a loss in the
selection pressure towards fitter individuals, making the GA make poor parent
selections in such situations.
In rank selection, we remove the concept of a fitness value while selecting a parent.
Instead, every individual in the population is ranked according to its fitness. The
selection of the parents depends on the rank of each individual and not the fitness.
Higher-ranked individuals are preferred over lower-ranked ones.
Chromosome    Fitness Value    Rank
A             8.1              1
B             8.0              4
C             8.05             2
D             7.95             6
E             8.02             3
F             7.99             5
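Using the table above, rank selection can be sketched as follows. Assigning weight n to the best individual down to weight 1 for the worst is one common convention, chosen here purely for illustration:

```python
import random

fitness = {"A": 8.1, "B": 8.0, "C": 8.05, "D": 7.95, "E": 8.02, "F": 7.99}

# Sort worst-to-best and give the worst weight 1 up to weight n for the
# best, so selection probability depends only on rank, not on the very
# close raw fitness values.
ordered = sorted(fitness, key=fitness.get)
weights = {ind: rank for rank, ind in enumerate(ordered, start=1)}

def rank_select():
    inds = list(weights)
    return random.choices(inds, weights=[weights[i] for i in inds])[0]

print(rank_select())
```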
Random Selection
In this strategy we randomly select parents from the existing population. There is no
selection pressure towards fitter individuals and therefore this strategy is usually
avoided.
Introduction to Crossover
The crossover operator is analogous to reproduction and biological crossover. In this,
more than one parent is selected and one or more offspring are produced using the
genetic material of the parents. Crossover is usually applied in a GA with a high
probability – pc.
Crossover Operators
In this section we will discuss some of the most popularly used crossover operators. It is
to be noted that these crossover operators are very generic and the GA Designer might
choose to implement a problem-specific crossover operator as well.
One Point Crossover
In one-point crossover, a random crossover point is selected and the tails of the two
parents are swapped to get new offspring.
Uniform Crossover
In a uniform crossover, we don't divide the chromosome into segments; rather, we treat
each gene separately. In this, we essentially flip a coin for each gene to decide which
parent it is inherited from in the offspring. We can also bias the coin towards one
parent, to have more genetic material in the child from that parent.
There exist a lot of other crossovers like Partially Mapped Crossover (PMX), Order
based crossover (OX2), Shuffle Crossover, Ring Crossover, etc.
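The one-point and uniform operators described above can be sketched on bit-string parents; the parents and bias value are illustrative:

```python
import random

def one_point(p1, p2):
    point = random.randrange(1, len(p1))          # random crossover point
    # swap the tails of the two parents
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

def uniform(p1, p2, bias=0.5):
    # flip a (possibly biased) coin for every gene independently
    return [a if random.random() < bias else b for a, b in zip(p1, p2)]

a, b = [0] * 8, [1] * 8
print(one_point(a, b))
print(uniform(a, b, bias=0.7))   # child leans towards parent a
```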
Mutation Operators
In this section, we describe some of the most commonly used mutation operators. Like
the crossover operators, this is not an exhaustive list and the GA designer might find a
combination of these approaches or a problem-specific mutation operator more useful.
Bit Flip Mutation
In bit flip mutation, we select one or more random bits and flip them. This is used for
binary-encoded GAs.
Random Resetting
Random Resetting is an extension of the bit flip for the integer representation. In this, a
random value from the set of permissible values is assigned to a randomly chosen
gene.
Swap Mutation
In swap mutation, we select two positions on the chromosome at random, and
interchange the values. This is common in permutation based encodings.
Scramble Mutation
Scramble mutation is also popular with permutation representations. In this, from the
entire chromosome, a subset of genes is chosen and their values are scrambled or
shuffled randomly.
Inversion Mutation
In inversion mutation, we select a subset of genes as in scramble mutation, but instead
of shuffling the subset, we merely reverse the order of the genes in it.
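The mutation operators above can be sketched as follows; the mutation rate and chromosome values are illustrative:

```python
import random

def bit_flip(ch, pm=0.1):
    # flip each bit independently with probability pm (binary encodings)
    return [1 - g if random.random() < pm else g for g in ch]

def swap(ch):
    # interchange the values at two random positions (permutation encodings)
    ch = ch[:]
    i, j = random.sample(range(len(ch)), 2)
    ch[i], ch[j] = ch[j], ch[i]
    return ch

def scramble(ch):
    # shuffle the genes inside a random slice (permutation encodings)
    ch = ch[:]
    i, j = sorted(random.sample(range(len(ch) + 1), 2))
    segment = ch[i:j]
    random.shuffle(segment)
    ch[i:j] = segment
    return ch

def inversion(ch):
    # reverse the genes inside a random slice (permutation encodings)
    ch = ch[:]
    i, j = sorted(random.sample(range(len(ch) + 1), 2))
    ch[i:j] = ch[i:j][::-1]
    return ch

perm = list(range(8))
print(swap(perm), scramble(perm), inversion(perm))
```

Note that swap, scramble and inversion all preserve the multiset of genes, which is exactly what permutation encodings require.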
Lamarckian Model
The Lamarckian Model essentially says that the traits which an individual acquires in
its lifetime can be passed on to its offspring. It is named after the French biologist
Jean-Baptiste Lamarck.
Natural biology has completely disregarded Lamarckism, as we now know that only the
information in the genotype can be transmitted. However, from a computational
viewpoint, it has been shown that adopting the Lamarckian model gives good results
for some problems.
In the Lamarckian model, a local search operator examines the neighborhood (acquiring
new traits), and if a better chromosome is found, it becomes the offspring.
Baldwinian Model
The Baldwinian model is an intermediate idea named after James Mark Baldwin (1896).
In the Baldwin model, the chromosomes can encode a tendency of learning beneficial
behaviors. This means, that unlike the Lamarckian model, we don’t transmit the
acquired traits to the next generation, and neither do we completely ignore the acquired
traits like in the Darwinian Model.
The Baldwin Model is in the middle of these two extremes, wherein the tendency of an
individual to acquire certain traits is encoded rather than the traits themselves.
In the Baldwinian model, a local search operator examines the neighborhood (acquiring
new traits), and if a better chromosome is found, it only assigns the improved fitness to
the chromosome and does not modify the chromosome itself. The change in fitness
signifies the chromosome's capability to “acquire the trait”, even though the trait itself
is not passed directly to future generations.
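The contrast between the two models can be sketched with a one-step local search on a hypothetical bit-string chromosome; the OneMax fitness and single-bit-flip neighborhood are illustrative assumptions:

```python
def fitness(ch):
    return sum(ch)   # OneMax-style fitness, purely for illustration

def local_search(ch):
    # best single-bit-flip neighbor of the chromosome
    neighbors = [ch[:i] + [1 - ch[i]] + ch[i + 1:] for i in range(len(ch))]
    return max(neighbors, key=fitness)

ch = [0, 1, 0, 1]
improved = local_search(ch)

# Lamarckian: the improved chromosome itself becomes the offspring.
lamarck_child, lamarck_fit = improved, fitness(improved)

# Baldwinian: the genes are unchanged, but the chromosome is credited
# with the improved fitness (its ability to "acquire the trait").
baldwin_child, baldwin_fit = ch, fitness(improved)

print(lamarck_child, lamarck_fit)   # [1, 1, 0, 1] 3
print(baldwin_child, baldwin_fit)   # [0, 1, 0, 1] 3
```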
Effective Implementation
GAs are very general in nature, and just applying them to any optimization problem
wouldn’t give good results. In this section, we describe a few points which would help
and assist a GA designer or GA implementer in their work.
Introduce problem-specific domain knowledge
It has been observed that the more problem-specific domain knowledge we incorporate
into the GA, the better the objective values we get. Problem-specific information can
be added by using problem-specific crossover or mutation operators, custom
representations, etc.
The following image shows Michalewicz’s (1990) view of the EA −
Reduce Crowding
Crowding happens when a highly fit chromosome gets to reproduce a lot, and in a few
generations, the entire population is filled with similar solutions having similar fitness.
This reduces diversity which is a very crucial element to ensure the success of a GA.
There are numerous ways to limit crowding. Some of them are −
Mutation to introduce diversity.
Switching to rank selection and tournament selection which have more
selection pressure than fitness proportionate selection for individuals with similar
fitness.
Fitness Sharing − In this an individual’s fitness is reduced if the population
already contains similar individuals.
Randomization Helps!
It has been experimentally observed that the best solutions are driven by randomized
chromosomes, as they impart diversity to the population. The GA implementer should be
careful to keep a sufficient amount of randomization and diversity in the population for
the best results.
In constrained optimization problems, solutions must also satisfy certain constraints.
In such a scenario, crossover and mutation operators might give us solutions which are
infeasible. Therefore, additional mechanisms have to be employed in the GA when
dealing with constrained optimization problems.
Some of the most common methods are −
Using penalty functions which reduce the fitness of infeasible solutions,
preferably so that the fitness is reduced in proportion to the number of
constraints violated or the distance from the feasible region.
Using repair functions which take an infeasible solution and modify it so that the
violated constraints get satisfied.
Not allowing infeasible solutions to enter into the population at all.
Using a special representation or decoder functions that ensure feasibility of
the solutions.
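A penalty-function sketch for the first approach; the problem, the constraint, and the penalty weight below are made up for illustration:

```python
# Hypothetical problem: maximize x + y subject to x + y <= 10.
PENALTY = 5.0   # illustrative penalty weight

def fitness(x, y):
    raw = x + y
    violation = max(0.0, (x + y) - 10.0)   # distance from the feasible region
    return raw - PENALTY * violation       # infeasible solutions lose fitness

print(fitness(4, 5))    # feasible, no penalty: 9.0
print(fitness(8, 8))    # infeasible, heavily penalized: -14.0
```

Choosing the penalty weight is itself a design decision: too small and infeasible solutions dominate, too large and the GA cannot cross infeasible regions of the search space.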
The defining length is the distance between the two outermost fixed symbols in a
schema.
The schema theorem states that schemata with above-average fitness, short defining
length and low order are more likely to survive crossover and mutation.
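Holland's schema theorem is commonly stated as follows (a standard textbook form, not quoted from this text), where $m(H,t)$ is the number of instances of schema $H$ at generation $t$, $f(H)$ their average fitness, $\bar{f}$ the population's average fitness, $\delta(H)$ the defining length, $o(H)$ the order, $l$ the chromosome length, and $p_c$, $p_m$ the crossover and mutation probabilities:

```latex
E\left[m(H, t+1)\right] \;\ge\; m(H, t)\,\frac{f(H)}{\bar{f}}
\left[1 - p_c\,\frac{\delta(H)}{l-1} - o(H)\,p_m\right]
```

The bracketed term shrinks as the defining length or order grows, which is exactly why short, low-order, above-average schemata tend to survive.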
Building Block Hypothesis
Building blocks are low-order, short-defining-length schemata with above-average
fitness. The building block hypothesis says that such building blocks serve as a
foundation for the GA's success: the GA progresses by successively identifying and
recombining such “building blocks”.
No Free Lunch (NFL) Theorem
Wolpert and Macready in 1997 published a paper titled "No Free Lunch Theorems for
Optimization." It essentially states that if we average over the space of all possible
problems, then all non-revisiting black box algorithms will exhibit the same performance.
It means that as we understand a problem better, our GA becomes more problem-specific
and performs better on it, but it makes up for that by performing poorly on other
problems.