
Genetic Algorithms versus Traditional Methods

Genetic algorithms are substantially different from the more traditional search and
optimization techniques. The five main differences are:

1. Genetic algorithms search a population of points in parallel, not a single point.

2. Genetic algorithms do not require derivative information or other auxiliary
knowledge; only the objective function and the corresponding fitness levels influence
the direction of the search.

3. Genetic algorithms use probabilistic transition rules, not deterministic rules.

4. Genetic algorithms work on an encoding of a parameter set, not the parameter set
itself (except where real-valued individuals are used).

5. Genetic algorithms may provide a number of potential solutions to a given
problem, and the choice of the final solution is left to the user (a toy sketch
illustrating these differences follows this list).
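
The following minimal Python sketch illustrates these points on the "one-max"
problem (maximize the number of 1-bits in a string); the function names, rates, and
population size are arbitrary choices for illustration only.

import random

def fitness(individual):
    # Toy objective: count the 1-bits (the "one-max" problem).
    return sum(individual)

def evolve(pop_size=20, length=8, generations=50):
    # The algorithm works on an encoding (bit strings), not on parameters directly.
    population = [[random.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Probabilistic selection: fitter individuals are more likely to be parents.
        weights = [fitness(ind) + 1 for ind in population]
        parents = random.choices(population, weights=weights, k=pop_size)
        offspring = []
        for i in range(0, pop_size, 2):
            a, b = parents[i], parents[i + 1]
            # Single-point crossover and occasional bit-flip mutation;
            # no derivative information is used anywhere.
            point = random.randint(1, length - 1)
            child = a[:point] + b[point:]
            if random.random() < 0.1:
                child[random.randrange(length)] ^= 1
            offspring.append(child)
        # Keep the population size constant by topping up with parents.
        population = offspring + parents[:pop_size - len(offspring)]
    # A whole population is returned, so several candidate solutions are available.
    return sorted(population, key=fitness, reverse=True)

print(evolve()[:3])   # the best few individuals found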

Define SOFT COMPUTING


Soft computing is a term applied to a field within computer science characterized
by the use of inexact solutions to computationally hard tasks, such as the solution
of NP-complete problems, for which an exact solution cannot be derived in
polynomial time.

DELTA LEARNING RULE


A learning rule that adjusts synaptic weights according to the product of the
pre-synaptic activity and a post-synaptic error signal, obtained by computing the
difference between the actual output activity and a desired or required output
activity.
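
As a rough sketch (the variable names are illustrative, not part of the definition
above), the rule can be written as a single weight update, w ← w + η · (target − output) · input:

import numpy as np

def delta_rule_step(weights, x, target, lr=0.1):
    # Post-synaptic output for a linear unit.
    output = np.dot(weights, x)
    # Error signal: desired output minus actual output.
    error = target - output
    # Weight change is proportional to (pre-synaptic activity) x (error signal).
    return weights + lr * error * x

# Illustrative usage on one made-up training example.
w = np.zeros(3)
x = np.array([1.0, 0.5, -1.0])
for _ in range(20):
    w = delta_rule_step(w, x, target=1.0)
print(w, np.dot(w, x))   # the output approaches the target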
LEAST SQUARES
A method of determining the curve that best describes the relationship between
expected and observed sets of data by minimizing the sum of the squares of the
deviations between observed and expected values.

A statistical method used to determine a line of best fit by minimizing the sum of
squares created by a mathematical function. A "square" is obtained by squaring the
distance between a data point and the regression line. The least squares approach
minimizes the total squared distance between the function and the data points the
function is trying to explain. It is used in regression analysis, often in nonlinear
regression modeling, in which a curve is fitted to a set of data.
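
For instance, fitting a straight line to made-up data by minimizing the sum of
squared deviations can be sketched as follows (the data values are invented for
illustration; np.linalg.lstsq solves the least-squares problem directly):

import numpy as np

# Made-up observed data points.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# Design matrix for a line y = a*x + b; least squares minimizes ||X @ [a, b] - y||^2.
X = np.column_stack([x, np.ones_like(x)])
(a, b), _, _, _ = np.linalg.lstsq(X, y, rcond=None)

print(f"best-fit line: y = {a:.3f} x + {b:.3f}")
print("sum of squared deviations:", np.sum((a * x + b - y) ** 2))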

T-CONORM

T-conorms and t-norms are binary operators that generalize addition and
multiplication, and also max and min.

A triangular conorm (t-conorm) ⊥ is a binary operation on [0, 1] fulfilling the
following conditions:

(T1) x ⊥ 0 = x.

(T2) x ⊥ y ≤ u ⊥ v whenever x ≤ u and y ≤ v.

(T3) x ⊥ y = y ⊥ x.

(T4) (x ⊥ y) ⊥ z = x ⊥ (y ⊥ z).

A t-conorm is said to be strict if and only if it is continuous on [0, 1] and strictly
increasing on [0, 1).

A continuous t-conorm ⊥ is said to be Archimedean if and only if x ⊥ x > x for
all x ∈ (0, 1).
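
Three standard examples are the maximum, the probabilistic sum, and the bounded
sum. The sketch below (illustrative only) defines them and spot-checks conditions
(T1), (T3), and (T4) on a small grid of values:

import itertools

def maximum(x, y):            # idempotent, not Archimedean
    return max(x, y)

def probabilistic_sum(x, y):  # strict and Archimedean: x + y - x*y
    return x + y - x * y

def bounded_sum(x, y):        # Archimedean but not strict (Lukasiewicz t-conorm)
    return min(1.0, x + y)

grid = [0.0, 0.3, 0.7, 1.0]
for op in (maximum, probabilistic_sum, bounded_sum):
    for x, y, z in itertools.product(grid, repeat=3):
        assert abs(op(x, 0.0) - x) < 1e-12                     # (T1) boundary condition
        assert abs(op(x, y) - op(y, x)) < 1e-12                # (T3) commutativity
        assert abs(op(op(x, y), z) - op(x, op(y, z))) < 1e-12  # (T4) associativity
    print(op.__name__, "passes the spot checks")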
AGGREGATION
Aggregation operations are operations that combine or aggregate two or more
fuzzy sets. There are a number of different types of aggregation, including unions
(sums), intersections (products), and means. The Fuzzy Logic package contains a wide
collection of operators, including many nonstandard operators not found in other
fuzzy packages. In addition, Fuzzy Logic provides a function for creating
user-defined aggregators, making it easy for users to experiment with or add their
own aggregators. The following aggregators, among others, can be used in Fuzzy
Logic (a small numeric sketch follows the list):

 For unions and intersections: min, max, Hamacher, Frank, Yager, Dubois-Prade,
Dombi, Yu, and Weber
 For sums and products: drastic, bounded, algebraic, Einstein, and Hamacher
 For means: arithmetic, geometric, harmonic, and generalized
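
The sketch below uses made-up membership values and plain NumPy rather than
any particular fuzzy package:

import numpy as np

# Membership degrees of the same four elements in two fuzzy sets.
a = np.array([0.2, 0.5, 0.9, 1.0])
b = np.array([0.4, 0.5, 0.3, 0.8])

aggregators = {
    "intersection (min)": np.minimum(a, b),
    "union (max)":        np.maximum(a, b),
    "algebraic product":  a * b,
    "algebraic sum":      a + b - a * b,
    "arithmetic mean":    (a + b) / 2,
}

for name, result in aggregators.items():
    print(f"{name:20s} {result}")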

CROSSOVER

Single point crossover - one crossover point is selected; the binary string from the
beginning of the chromosome to the crossover point is copied from one parent, and
the rest is copied from the second parent.
11001011 + 11011111 = 11001111

Two point crossover - two crossover points are selected; the binary string from the
beginning of the chromosome to the first crossover point is copied from one parent,
the part from the first to the second crossover point is copied from the second
parent, and the rest is copied from the first parent.

11001011 + 11011111 = 11011111

Uniform crossover - bits are randomly copied from the first or from the
second parent

11001011 + 11011101 = 11011111

Arithmetic crossover - some arithmetic operation is performed to make a new
offspring.

11001011 + 11011111 = 11001011 (AND)
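
The first and third operators above can be sketched in a few lines of Python (the
function names are illustrative; the 8-bit strings match the examples above):

import random

def single_point(p1, p2, point=None):
    # Copy the head of parent 1 and the tail of parent 2.
    if point is None:
        point = random.randint(1, len(p1) - 1)
    return p1[:point] + p2[point:]

def uniform(p1, p2):
    # Each bit is copied at random from one parent or the other.
    return "".join(random.choice(pair) for pair in zip(p1, p2))

print(single_point("11001011", "11011111", point=4))  # -> 11001111, as above
print(uniform("11001011", "11011101"))                # e.g. 11011111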


SINGLE-LAYER PERCEPTRON

The earliest kind of neural network is a single-layer perceptron network, which
consists of a single layer of output nodes; the inputs are fed directly to the outputs
via a series of weights. In this way it can be considered the simplest kind of
feed-forward network. The sum of the products of the weights and the inputs is
calculated in each node, and if the value is above some threshold (typically 0) the
neuron fires and takes the activated value (typically 1); otherwise it takes the
deactivated value (typically -1). Neurons with this kind of activation function are
also called artificial neurons or linear threshold units.

A perceptron can be created using any values for the activated and deactivated
states as long as the threshold value lies between the two. Most perceptrons have
outputs of 1 or -1 with a threshold of 0, and there is some evidence that such
networks can be trained more quickly than networks created from nodes with
different activation and deactivation values.

Perceptrons can be trained by a simple learning algorithm that is usually called the
delta rule. It calculates the error between the calculated output and the sample
output data, and uses this to adjust the weights, thus implementing a form of
gradient descent.
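
A minimal sketch of this training loop, assuming ±1 outputs, a threshold of 0, and a
made-up linearly separable problem (logical AND with ±1 encoding), might look like
this:

import numpy as np

def train_perceptron(inputs, targets, lr=0.1, epochs=20):
    # Threshold unit: fires +1 if the weighted sum exceeds 0, otherwise -1.
    weights = np.zeros(inputs.shape[1] + 1)              # last weight acts as a bias
    x = np.column_stack([inputs, np.ones(len(inputs))])
    for _ in range(epochs):
        for xi, t in zip(x, targets):
            y = 1 if np.dot(weights, xi) > 0 else -1
            weights += lr * (t - y) * xi                 # delta-rule style adjustment
    return weights

# Linearly separable example: logical AND with +/-1 encoding.
X = np.array([[-1, -1], [-1, 1], [1, -1], [1, 1]], dtype=float)
t = np.array([-1, -1, -1, 1])
w = train_perceptron(X, t)
print([1 if np.dot(w, np.append(xi, 1.0)) > 0 else -1 for xi in X])  # [-1, -1, -1, 1]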

Single-unit perceptrons are only capable of learning linearly separable patterns. In
1969, in a famous monograph entitled Perceptrons, Marvin Minsky and Seymour
Papert showed that it was impossible for a single-layer perceptron network to learn
an XOR function. It is often believed that they also conjectured (incorrectly) that a
similar result would hold for a multi-layer perceptron network. However, this is
not true: both Minsky and Papert already knew that multi-layer perceptrons were
capable of producing an XOR function.

Although a single threshold unit is quite limited in its computational power, it has
been shown that networks of parallel threshold units can approximate any
continuous function from a compact interval of the real numbers into the interval
[-1, 1]. This result can be found in [Peter Auer, Harald Burgsteiner and Wolfgang
Maass: A learning rule for very simple universal approximators consisting of a
single layer of perceptrons, 2008].
A multi-layer neural network can compute a continuous output instead of a step
function. A common choice is the so-called logistic function:

y = 1 / (1 + e^(-x))

(In general form, f(X) takes the place of x, where f(X) is an analytic function of the
set of x's.) With this choice, the single-layer network is identical to the logistic
regression model, widely used in statistical modeling. The logistic function is also
known as the sigmoid function. It has a continuous derivative, which allows it to be
used in backpropagation. This function is also preferred because its derivative is
easily calculated:

y' = y(1 − y) (times df/dX, in general form, according to the Chain Rule)
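
A short sketch confirming this identity numerically (illustrative only) is:

import numpy as np

def logistic(x):
    # y = 1 / (1 + exp(-x)), the sigmoid activation.
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-4, 4, 9)
y = logistic(x)

analytic = y * (1.0 - y)                                    # y' = y(1 - y)
numeric = (logistic(x + 1e-6) - logistic(x - 1e-6)) / 2e-6  # central difference
print(np.max(np.abs(analytic - numeric)))                   # close to zero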

KOHONEN NETWORKS
The objective of a Kohonen network is to map input vectors (patterns) of arbitrary
dimension N onto a discrete map with 1 or 2 dimensions. Patterns close to one
another in the input space should be close to one another in the map: they should
be topologically ordered. A Kohonen network is composed of a grid of output units
and N input units. The input pattern is fed to each output unit. The input lines to
each output unit are weighted. These weights are initialized to small random
numbers.

Learning in Kohonen Networks

The learning process is roughly as follows:

 initialise the weights for each output unit
 loop until weight changes are negligible
o for each input pattern
 present the input pattern
 find the winning output unit
 find all units in the neighbourhood of the winner
 update the weight vectors for all those units
o reduce the size of neighbourhoods if required

The winning output unit is simply the unit with the weight vector that has the smallest Euclidean
distance to the input pattern. The neighbourhood of a unit is defined as all units within some
distance of that unit on the map (not in weight space). If the neighbourhoods are square and the
size of the neighbourhood is 1, then all units no more than 1 away either horizontally or
vertically from a unit fall within its neighbourhood. The weights of every unit in the
neighbourhood of the winning unit (including the winning unit itself) are updated using

w_new = w_old + η (x − w_old)

where x is the input pattern and η is the learning rate.

This will move each unit in the neighbourhood closer to the input pattern. As time progresses the
learning rate and the neighbourhood size are reduced. If the parameters are well chosen the final
network should capture the natural clusters in the input data.
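
A minimal sketch of the whole procedure, assuming a 5x5 grid, square
neighbourhoods, and made-up two-cluster input data (all parameter values are
arbitrary), is:

import numpy as np

rng = np.random.default_rng(0)
grid_h, grid_w, n_inputs = 5, 5, 2

# Weights initialised to small random numbers; made-up inputs form two clusters.
weights = rng.uniform(0.0, 0.1, size=(grid_h, grid_w, n_inputs))
data = np.vstack([rng.normal(0.2, 0.05, (50, 2)), rng.normal(0.8, 0.05, (50, 2))])

lr, radius, epochs = 0.5, 2, 30
for epoch in range(epochs):
    for x in data:
        # Winning unit: smallest Euclidean distance between weight vector and input.
        dists = np.linalg.norm(weights - x, axis=2)
        wr, wc = np.unravel_index(np.argmin(dists), dists.shape)
        # Square neighbourhood: all units within `radius` rows/columns on the map.
        r0, r1 = max(0, wr - radius), min(grid_h, wr + radius + 1)
        c0, c1 = max(0, wc - radius), min(grid_w, wc + radius + 1)
        # Move every unit in the neighbourhood (winner included) towards the input.
        weights[r0:r1, c0:c1] += lr * (x - weights[r0:r1, c0:c1])
    # Reduce the learning rate and, occasionally, the neighbourhood size.
    lr *= 0.9
    if epoch % 10 == 9 and radius > 0:
        radius -= 1

print(weights[0, 0], weights[-1, -1])  # opposite corners tend to settle near different clusters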
