Genetic Algorithms Versus Traditional Methods
Genetic algorithms are substantially different from the more traditional search and
optimization techniques. The five main differences are:
4. Genetic algorithms work on an encoding of the parameter set, not the parameter set
itself (except where real-valued individuals are used); a simple encoding sketch follows below.
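The sketch below illustrates this point for a single real-valued parameter: it is mapped to a fixed-length bit string, which is what the GA would actually manipulate, and decoded back for fitness evaluation. The range, bit length, and function names are illustrative assumptions, not taken from any particular GA library.

# Minimal sketch (Python): encoding a real-valued parameter as a bit string.
# The range [0, 10] and the 16-bit length are arbitrary illustrative choices.
def encode(value, lo=0.0, hi=10.0, n_bits=16):
    """Map a real value in [lo, hi] to an n_bits binary string."""
    scaled = int(round((value - lo) / (hi - lo) * (2 ** n_bits - 1)))
    return format(scaled, f"0{n_bits}b")

def decode(bits, lo=0.0, hi=10.0):
    """Map a binary string back to a real value in [lo, hi]."""
    return lo + int(bits, 2) / (2 ** len(bits) - 1) * (hi - lo)

chromosome = encode(3.7)     # '0101111010111000'
value = decode(chromosome)   # approximately 3.7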
2.2. Generalized t-conorm integral
In this section, we use t-conorms and t-norms. They are binary operators that
generalize addition and multiplication, and also max and min.
(T1) x ⊥ 0 = x.
(T3) x ⊥ y = y ⊥ x.
For unions and intersections: min, max, Hamacher, Frank, Yager, Dubois-Prade, Dombi, Yu, and Weber
For sums and products: drastic, bounded, algebraic, Einstein, and Hamacher
For means: arithmetic, geometric, harmonic, and generalized
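As a concrete illustration, a few of the operator families named above can be written down directly. The sketch below uses the standard min/max, algebraic, and bounded pairs, applied to membership degrees in [0, 1]; the function names are illustrative.

# A few standard t-norm / t-conorm pairs (Python), acting on values in [0, 1].
def t_min(x, y): return min(x, y)                    # minimum t-norm
def s_max(x, y): return max(x, y)                    # maximum t-conorm

def t_algebraic(x, y): return x * y                  # algebraic product t-norm
def s_algebraic(x, y): return x + y - x * y          # algebraic (probabilistic) sum t-conorm

def t_bounded(x, y): return max(0.0, x + y - 1.0)    # bounded difference t-norm
def s_bounded(x, y): return min(1.0, x + y)          # bounded sum t-conorm

# Axioms (T1) and (T3) can be checked directly, e.g.:
assert s_algebraic(0.4, 0.0) == 0.4                  # x ⊥ 0 = x
assert s_bounded(0.3, 0.6) == s_bounded(0.6, 0.3)    # commutativity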
CROSSOVER
Single point crossover - one crossover point is selected; the binary string from the
beginning of the chromosome to the crossover point is copied from the first parent,
and the rest is copied from the second parent
11001011 + 11011111 = 11001111 (crossover after the fourth bit)
Two point crossover - two crossover points are selected; the binary string
from the beginning of the chromosome to the first crossover point is copied from
the first parent, the part from the first to the second crossover point is copied
from the second parent, and the rest is copied from the first parent
Uniform crossover - bits are randomly copied from the first or from the
second parent (all three operators are sketched in code below)
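A minimal Python sketch of the three operators, assuming the parents are equal-length bit strings; the function names are illustrative.

import random

def single_point(p1, p2):
    """Copy p1 up to a random cut point, then the rest from p2."""
    cut = random.randint(1, len(p1) - 1)
    return p1[:cut] + p2[cut:]

def two_point(p1, p2):
    """Copy p1 outside two cut points and p2 between them."""
    a, b = sorted(random.sample(range(1, len(p1)), 2))
    return p1[:a] + p2[a:b] + p1[b:]

def uniform(p1, p2):
    """Copy each bit at random from either parent."""
    return "".join(random.choice(pair) for pair in zip(p1, p2))

# Reproducing the single-point example from the text (cut after the fourth bit):
child = "11001011"[:4] + "11011111"[4:]   # -> '11001111'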
A perceptron can be created using any values for the activated and deactivated
states as long as the threshold value lies between the two. Most perceptrons have
outputs of 1 or -1 with a threshold of 0, and there is some evidence that
such networks can be trained more quickly than networks created from nodes with
different activation and deactivation values.
Perceptrons can be trained by a simple learning algorithm that is usually called the
delta rule. It calculates the error between the calculated output and the target output,
and uses this error to adjust the weights, thus implementing a
form of gradient descent.
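A minimal sketch of such a threshold unit and its training loop, assuming outputs of +1/-1, a threshold of 0, and the error-correction update w <- w + lr * (target - output) * x described above; the function name, learning rate, and toy data are illustrative.

import numpy as np

def train_perceptron(X, targets, lr=0.1, epochs=20):
    """Threshold unit with outputs +1/-1 and threshold 0, trained with the
    error-correction (delta) rule."""
    w = np.random.uniform(-0.05, 0.05, X.shape[1] + 1)   # small random weights + bias
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])        # append a constant bias input
    for _ in range(epochs):
        for x, t in zip(Xb, targets):
            y = 1 if np.dot(w, x) >= 0 else -1           # threshold at 0
            w += lr * (t - y) * x                        # adjust weights by the error
    return w

# Toy usage: learn logical AND with +1/-1 encoding
X = np.array([[-1, -1], [-1, 1], [1, -1], [1, 1]], dtype=float)
t = np.array([-1, -1, -1, 1])
w = train_perceptron(X, t)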
Although a single threshold unit is quite limited in its computational power, it has
been shown that networks of parallel threshold units can approximate any
continuous function from a compact interval of the real numbers into the interval
[-1, 1]. This result can be found in [Peter Auer, Harald Burgsteiner and
Wolfgang Maass: A learning rule for very simple universal approximators
consisting of a single layer of perceptrons, 2008].
A multi-layer neural network can compute a continuous output instead of a step
function. A common choice is the so-called logistic function:
y = 1 / (1 + e^(-x))
(In general form, f(X) takes the place of x, where f(X) is an analytic function of the
inputs.) With this choice, the single-layer network is identical to the logistic regression
model, widely used in statistical modeling. The logistic function is also known as
the sigmoid function. It has a continuous derivative, which allows it to be used in
backpropagation. This function is also preferred because its derivative is easily
calculated:
y' = y(1 − y) (multiplied by df/dX in the general form, by the chain rule)
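A short sketch of the logistic function and its derivative expressed through the output itself, which is the property backpropagation exploits when propagating errors layer by layer; NumPy is used only for convenience.

import numpy as np

def logistic(x):
    """The sigmoid y = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def logistic_derivative(x):
    """The derivative written in terms of the output: y' = y * (1 - y)."""
    y = logistic(x)
    return y * (1.0 - y)

# The derivative is cheap to evaluate once y is already known.
x = np.linspace(-4, 4, 9)
print(logistic(x))
print(logistic_derivative(x))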
KOHONEN NETWORKS
The objective of a Kohonen network is to map input vectors (patterns) of arbitrary
dimension N onto a discrete map with 1 or 2 dimensions. Patterns close to one
another in the input space should be close to one another in the map: they should
be topologically ordered. A Kohonen network is composed of a grid of output units
and N input units. The input pattern is fed to each output unit. The input lines to
each output unit are weighted. These weights are initialized to small random
numbers.
The winning output unit is simply the unit with the weight vector that has the smallest Euclidean
distance to the input pattern. The neighbourhood of a unit is defined as all units within some
distance of that unit on the map (not in weight space). Here, all the
neighbourhoods are assumed to be square. If the size of the neighbourhood is 1, then all units no more than 1 step away,
either horizontally or vertically, from a unit fall within its neighbourhood. The weights of
every unit in the neighbourhood of the winning unit (including the winning unit itself) are
updated using
w_new = w_old + α (x − w_old)     (21)
where α is the learning rate and x is the input pattern.
This will move each unit in the neighbourhood closer to the input pattern. As time progresses the
learning rate and the neighbourhood size are reduced. If the parameters are well chosen the final
network should capture the natural clusters in the input data.
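Putting the pieces together, the sketch below is a minimal self-organizing map along the lines described above: a winner chosen by Euclidean distance, a square neighbourhood, and a learning rate and neighbourhood radius that shrink over time. The grid size, decay schedules, and names are illustrative assumptions, not a definitive implementation.

import numpy as np

def train_som(data, grid=(10, 10), epochs=100, lr0=0.5, radius0=3):
    """Minimal 2-D Kohonen map: find the winning unit by Euclidean distance and
    pull every unit in its square neighbourhood toward the input pattern."""
    rows, cols = grid
    n_inputs = data.shape[1]
    weights = np.random.rand(rows, cols, n_inputs) * 0.1   # small random initial weights

    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1 - frac)                               # decaying learning rate
        radius = max(1, int(round(radius0 * (1 - frac))))   # shrinking neighbourhood
        for x in data:
            # Winner: the unit whose weight vector is closest to the input pattern
            dists = np.linalg.norm(weights - x, axis=2)
            wi, wj = np.unravel_index(np.argmin(dists), dists.shape)
            # Update all units within `radius` (horizontally/vertically) of the winner,
            # moving each one closer to the input pattern as in equation (21)
            for i in range(max(0, wi - radius), min(rows, wi + radius + 1)):
                for j in range(max(0, wj - radius), min(cols, wj + radius + 1)):
                    weights[i, j] += lr * (x - weights[i, j])
    return weights

# Toy usage: map 200 random 3-D patterns onto a 10x10 grid
som = train_som(np.random.rand(200, 3))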