International Journal of Computer Trends and Technology (IJCTT), Volume 23, Number 4, May 2015
A Preliminary Survey on Genetic Algorithm
Techniques
Mayilvaganan M#1, Geethamani G.S*2
#Associate Professor, Department of Computer Science, P.S.G College of Arts and Science, Coimbatore, India.
*Assistant Professor, Research Scholar, Department of Computer Science, Karpagam University, Coimbatore, India.
Abstract: In recent years, genetic algorithms have become an essential tool in data mining for searching and generating association rules among large numbers of itemsets. Genetic algorithms maintain a population pool of candidate solutions called strings or chromosomes. Each chromosome p is a collection of building blocks known as genes, which are instantiated with values from a finite domain. Associated with each chromosome is a fitness value, which is determined by a user-defined function called the fitness function. The performance of a GA depends on the genetic operators in general and on the type of crossover operator in particular. Effective crossover in a GA is achieved by establishing the optimum relationship between the crossover and the search problem itself. In this paper, a preliminary study has been carried out to enable the researcher to identify the various genetic algorithm methods.
Keywords: Data Mining, Genetic Algorithm, itemsets, chromosomes, crossover, fitness function.
I. INTRODUCTION
In 1975, John Holland developed the genetic algorithm at the University of Michigan. Genetic algorithms are inspired by Darwin's theory of evolution, summarized as "survival of the fittest". They simulate natural evolution through a combination of selection, recombination and mutation to evolve a solution to a problem[1]-[4]. A genetic algorithm performs a randomized search to solve optimization problems: better and better solutions evolve from previous generations until a near-optimal solution is obtained. Genetic algorithms provide efficient and effective techniques for optimization and machine learning applications[3], and are widely used today in business, scientific and engineering circles.
A genetic algorithm is an iterative procedure that represents its candidate solutions as strings of genes called chromosomes. A group of individuals (chromosomes) is called a population. The population is modified in each iteration of the algorithm, and these iterations are called generations. A standard genetic algorithm applies genetic operators such as selection, crossover and mutation.
ISSN: 2231-2803
Genetic Operators
The GA maintains a population of n chromosomes (solutions) with associated fitness values. Parents are selected to mate on the basis of their fitness, producing offspring via a reproductive plan (mutation and crossover). Consequently, highly fit solutions are given more opportunities to reproduce (that is, to be selected for the next generation), so that offspring inherit characteristics from each parent[5][7]. As parents mate and produce offspring, room must be made for the new arrivals, since the population is kept at a static size (the population size).
Selection: Following Darwin's theory of evolution, the chromosomes with higher fitness ratings are selected from the population as parents for crossover, so that they survive and create new offspring.
Crossover: Crossover leads to an effective combination of schemata (sub-solutions on different chromosomes). It means choosing a random position in the string and exchanging the segments to the right (or to the left) of this point with another string partitioned at the same position, producing two new offspring.
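The cycle described in this section (fitness-based parent selection, crossover at a random position, mutation, and a population held at a fixed size) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the "count the 1-genes" fitness function, the parameter values and all function names are hypothetical and are not taken from any of the surveyed papers.

```python
import random

def fitness(chromosome):
    # Hypothetical fitness function: the number of 1-genes in the chromosome.
    return sum(chromosome)

def select_parent(population, rng):
    # Fitness-based selection: fitter chromosomes are more likely to be picked.
    weights = [fitness(c) + 1 for c in population]  # +1 keeps every weight positive
    return rng.choices(population, weights=weights, k=1)[0]

def single_point_crossover(a, b, rng):
    # Choose a random position and exchange the segments to its right.
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate(chromosome, rate, rng):
    # Flip each gene independently with probability `rate`.
    return [1 - g if rng.random() < rate else g for g in chromosome]

def evolve(pop_size=20, genes=16, generations=40, mutation_rate=0.02, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(genes)] for _ in range(pop_size)]
    for _ in range(generations):
        offspring = []
        while len(offspring) < pop_size:
            p1, p2 = select_parent(population, rng), select_parent(population, rng)
            c1, c2 = single_point_crossover(p1, p2, rng)
            offspring += [mutate(c1, mutation_rate, rng), mutate(c2, mutation_rate, rng)]
        population = offspring[:pop_size]  # the population size stays fixed
    return max(population, key=fitness)

best = evolve()
print(best, fitness(best))
```

In this sketch the whole population is replaced in every generation; schemes that replace only part of the population are equally possible.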
II. RELATED WORK
Wakabi-Waiswa, P.P., et al. proposed Generalized Association Rule Mining Using Genetic Algorithms[13]. In that paper, association rule mining is performed by combining genetic algorithms with a modified Apriori-based algorithm, which yields very fast results. Given a very large database of transactions, where each transaction contains a set of items, and a classification on the items, the method finds associations between items at any level of the classification[6]-[9]. It improved performance with respect to minimum support and the number of items. It also improves various other characteristics, such as an unlimited number of roots and levels in the classification, the depth ratio and the number of transactions.
Ghosh S, Biswas S, Sarkar D and Sarkar P.P, in Mining Frequent Itemsets Using Genetic Algorithm[6], proposed an algorithm to find frequent itemsets using a genetic algorithm. Association rule mining algorithms such as Apriori, Partition and FP-tree generate the frequent itemsets, but they take too much time to compute them. The main aim of introducing the genetic algorithm is to reduce the computing time. The genetic algorithm performs a global search to generate the frequent itemsets[9][10][12]. The time complexity is lower than that of the association rule mining algorithms because the genetic algorithm is based on the greedy approach.
Dou W, Hu J, Hirasawa K and Wu G, in Quick Response Data Mining Model Using Genetic Algorithm[4], find the maximal frequent itemsets using a genetic algorithm. The authors define a set of parameters that are used by the genetic operators: Individual Identity (IVI), Individual Fitness (IVF), Upgrade Index (UI) and Upgrade Genes (UG).
Individual Identity (IVI) contains the unique symbols of each chromosome in the individual; individuals are distinguished by these symbols. Individual Fitness (IVF) records the number of items: if the individual cannot create a frequent itemset, the IVF is set to 0; otherwise, IVF is the number of items and is set to 1. Upgrade Index (UI) is a negative number that shows the distance to obtaining a frequent itemset from the individual. The larger the value of UI, the more likely it is that a frequent itemset will be generated by applying the genetic operators. Upgrade Genes (UG) is the set of genes needed by the individual to improve the UI. In most situations it is necessary to know whether the individual can produce a frequent itemset, and also which genes in the chromosome are used to produce the frequent itemsets[12][15]. The parameter UG helps to find both the individual and the genes.
The selection operator uses the value of IVF to obtain the current maximal frequent itemsets. The crossover operator adopts a heuristic crossover that checks whether the parent chromosome can be replaced by another chromosome using the UI parameter[8]. The mutation operator adopts a heuristic mutation that uses the UG to judge which transaction has a lower relationship.
Yan X, Zhang C and Zhang S developed a Genetic Algorithm-based Strategy for Identifying Association Rules without Specifying Actual Minimum Support[14], which generates association rules using a genetic algorithm without specifying the minimum support; the confidence is used as the fitness function.
First, the genetic algorithm is developed for Boolean association rule mining. The select operator is applied to the population pop[i] to produce the new population pop[i+1]. Crossover is then applied to the new population with probability cp to reproduce offspring. Each chromosome is mutated with probability mp to produce high-quality chromosomes.
http://www.ijcttjournal.org
III. WORKING PRINCIPLE OF GENETIC ALGORITHMS (GAS)
The working of genetic algorithms (GAs) is based on the Darwinian theory of survival of the fittest. A genetic algorithm involves chromosomes, genes, a population, fitness, a fitness function, breeding, mutation and selection. Genetic algorithms begin with a set of solutions, represented by chromosomes, called the population. Solutions from one population are taken and used to form a new population, motivated by the expectation that the new population will be better than the old one. Further, solutions are selected according to their fitness to form new solutions, that is, offspring. This process is repeated until some condition is satisfied. Algorithmically, the basic genetic algorithm is outlined below[14]-[16]:
Step I [Start] Create a random population of chromosomes, that is, suitable solutions for the problem.
Step II [Fitness] Estimate the fitness of each chromosome in the population.
Step III [New population] Create a new population by repeating the following steps until the new population is complete.
a) [Selection] Choose two parent chromosomes from the population according to their fitness. The better the fitness, the greater the chance of being selected as a parent.
b) [Crossover] With a crossover probability, cross over the parents to form new offspring, that is, children. If no crossover is performed, the offspring are exact copies of the parents.
c) [Mutation] With a mutation probability, mutate the new offspring at each locus.
d) [Accepting] Place the new offspring in the new population.
Step IV [Replace] Use the newly generated population for a further run of the algorithm.
Step V [Test] If the end condition is satisfied, stop and return the best solution in the current population.
Step VI [Loop] Go to Step II.
The performance of genetic algorithms is largely influenced by the crossover and mutation operators.
IV. ENCODING TECHNIQUE IN GENETIC ALGORITHMS (GAS)
Encoding techniques in genetic algorithms (GAs) are problem specific; they transform a problem solution into chromosomes. The encoding techniques used in genetic algorithms are binary encoding, permutation encoding, value encoding and tree encoding.
3.1 Binary encoding
This is the most common form of encoding, in which the data value is converted into a binary string. Binary encoding gives many possible chromosomes with a small number of alleles. Each chromosome is represented as a string of bits (0s and 1s)[8].
3.2 Permutation encoding
Permutation encoding is best suited to ordering or queuing problems. The travelling salesman problem is a challenging optimization problem for which permutation encoding is used. In permutation encoding, every chromosome is a string of numbers that represents an ordering[5].
3.3 Value encoding
In value encoding, every chromosome is a string of values, which can be anything from integers, real numbers or characters to more complicated objects. It is used where such more complicated values are required[4].
3.4 Tree Encoding
Tree encoding is best suited to evolving expressions or programs, as in genetic programming. In tree encoding, every chromosome is a tree of objects, such as functions or commands in a programming language. The Lisp programming language is often used for this purpose, because Lisp programs can be represented directly as trees, which makes crossover and mutation straightforward. There are no fixed rules for choosing the encoding scheme for a given problem; the choice depends on the applicability and the requirements of the problem[7].
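To make the four encodings of this section concrete, the following Python sketch builds one small chromosome of each kind. The chromosome lengths, the six-city tour, the nested-tuple tree layout and the `evaluate` helper are all assumptions chosen for illustration; they are not prescribed by the encodings themselves.

```python
import random

rng = random.Random(42)

# Binary encoding: a fixed-length string of bits.
binary_chromosome = [rng.randint(0, 1) for _ in range(8)]

# Permutation encoding: each city index appears exactly once (e.g. a TSP tour).
permutation_chromosome = rng.sample(range(6), 6)

# Value encoding: genes are arbitrary values, here real-valued parameters.
value_chromosome = [round(rng.uniform(-1.0, 1.0), 3) for _ in range(4)]

# Tree encoding: a nested tuple (operator, left, right) for the expression (x + 2) * x.
tree_chromosome = ("*", ("+", "x", 2), "x")

def evaluate(tree, x):
    """Evaluate an expression tree for a given value of x."""
    if tree == "x":
        return x
    if isinstance(tree, (int, float)):
        return tree
    op, left, right = tree
    a, b = evaluate(left, x), evaluate(right, x)
    return a + b if op == "+" else a * b

print(binary_chromosome)
print(permutation_chromosome)
print(value_chromosome)
print(evaluate(tree_chromosome, 3))  # (3 + 2) * 3 = 15
```

The nested tuple mirrors the way a Lisp expression such as `(* (+ x 2) x)` is already a tree, which is why tree encoding is commonly associated with that language.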
4. Selection Techniques in Genetic Algorithms (GAs)
Selection is an important function in genetic algorithms (GAs). It is based on an evaluation criterion that returns a measurement of worth for any chromosome in the context of the problem. It is the stage of a genetic algorithm in which individual genomes are chosen from the population for breeding. The commonly used techniques for selecting chromosomes are roulette wheel selection, rank selection and steady-state selection[3].
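As a rough illustration of the first two techniques named above, the sketch below picks a parent either in proportion to raw fitness (roulette wheel) or in proportion to rank (worst = 1, best = n). The function names, the four-member population and its fitness values are hypothetical; steady-state selection is a replacement policy rather than a sampling rule, so it is not shown.

```python
import random

def roulette_wheel_select(population, fitnesses, rng):
    # Spin once: draw r between 0 and the total fitness, then walk the wheel
    # until the running sum passes r.
    total = sum(fitnesses)
    r = rng.uniform(0, total)
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if running > r:
            return individual
    return population[-1]  # guard against floating-point round-off

def rank_select(population, fitnesses, rng):
    # Replace raw fitness by rank (worst = 1, best = n), then spin the same wheel.
    order = sorted(range(len(population)), key=lambda i: fitnesses[i])
    ranks = [0] * len(population)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return roulette_wheel_select(population, ranks, rng)

population = ["A", "B", "C", "D"]
fitnesses = [80, 10, 6, 4]  # "A" holds 80 percent of the total fitness
rng = random.Random(0)
trials = 10_000
by_fitness = sum(roulette_wheel_select(population, fitnesses, rng) == "A" for _ in range(trials))
by_rank = sum(rank_select(population, fitnesses, rng) == "A" for _ in range(trials))
print(by_fitness / trials, by_rank / trials)
```

With these assumed values the expected share of "A" is 80/100 under roulette wheel selection and 4/10 under rank selection, which shows how ranking dampens the dominance of a single very fit chromosome.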
4.1 Roulette wheel selection
In this method the parents are selected according to their fitness: better chromosomes have more chances of being selected as parents. It is the most common method for implementing fitness-proportionate selection. Each individual is assigned a slice of a circular roulette wheel, and the size of the slice is proportional to the individual's fitness; the larger the fitness, the larger the slice. The roulette wheel algorithm works as follows[11]:
Step 1 [Sum] Find the sum S of the fitness of all chromosomes in the population.
Step 2 [Select] Generate a random number r from the interval (0, S).
Step 3 [Loop] Go through the population, accumulating the fitness values. When this running sum exceeds r, stop and return the current chromosome.
Figure 3 (a) shows a roulette wheel for six individuals with different fitness values. Since the sixth individual has a higher fitness than any other, roulette wheel selection is expected to choose the sixth individual more often than any other individual.
4.2 Rank selection method
Roulette wheel selection is not satisfactory when the fitness values of the chromosomes differ greatly. For example, if the best chromosome's fitness is 80 percent of the total, its slice occupies 80 percent of the roulette wheel, and the other chromosomes have very little chance of being selected. Rank selection instead first ranks the population according to fitness, and then every chromosome receives a fitness value determined by its rank: the worst has fitness 1, the second worst has fitness 2, and the best has fitness n, where n is the number of chromosomes in the population. The expected value of each individual thus depends on its rank rather than on its absolute fitness. This prevents overly quick convergence, but it can also make convergence slower, because the best chromosomes no longer differ much from the others.
4.3 Steady-state selection
Steady-state selection is not a particular method for selecting parents; rather, it replaces only a few individuals in each generation. A small number of newly created offspring take the place of the least fit individuals. The main idea is that a large part of the chromosomes should survive to the next generation.
5. Genetic Algorithms (GAs) Operators
Genetic algorithms (GAs) can be applied to any process control application for the optimization of different parameters[12]-[14]. Genetic algorithms use various operators, namely crossover and mutation, to arrive at optimized values. The choice of crossover and mutation technique depends on the encoding method and on the requirements of the problem.
5.1 Crossover
Crossover is the process in which genes are selected from the parent chromosomes and a new offspring is created. Crossover can be performed with binary encoding, permutation encoding, value encoding and tree encoding.
5.1.1 Binary encoding crossover
In binary encoding, the chromosomes may cross over at a single point, at two points, uniformly or arithmetically. In single-point crossover, a single crossover point is chosen; the data before this point are copied from the first parent and the data after this point are copied from the second parent to create a new offspring. Swapping the roles of the parents gives a second offspring, so two parents give two new offspring.
5.1.2 Uniform Crossover
In uniform crossover, each bit of the offspring is copied at random from either the first or the second parent chromosome.
5.1.3 Arithmetic Crossover
In arithmetic crossover, a bitwise operation such as AND or OR is applied to the parent chromosomes to create new offspring.
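The binary crossover operators of Sections 5.1.1 to 5.1.3 can be sketched as follows. The function names and the two eight-bit parents are assumptions made for this example; in the uniform variant, the second child is assumed to receive the bit the first child did not take, which is one common convention.

```python
import random

def single_point(p1, p2, point):
    # Head from one parent, tail from the other; swapping roles gives the second child.
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

def uniform(p1, p2, rng):
    # Each position is copied from a randomly chosen parent;
    # the other child receives the bit that was not taken.
    c1, c2 = [], []
    for a, b in zip(p1, p2):
        if rng.random() < 0.5:
            a, b = b, a
        c1.append(a)
        c2.append(b)
    return c1, c2

def arithmetic(p1, p2):
    # Bitwise AND and bitwise OR of the parents give two offspring.
    return [a & b for a, b in zip(p1, p2)], [a | b for a, b in zip(p1, p2)]

p1 = [1, 1, 0, 0, 1, 0, 1, 1]
p2 = [1, 0, 1, 1, 0, 0, 1, 0]
print(single_point(p1, p2, 3))  # ([1, 1, 0, 1, 0, 0, 1, 0], [1, 0, 1, 0, 1, 0, 1, 1])
print(arithmetic(p1, p2))       # ([1, 0, 0, 0, 0, 0, 1, 0], [1, 1, 1, 1, 1, 0, 1, 1])
print(uniform(p1, p2, random.Random(7)))
```

Single-point and uniform crossover only rearrange the parents' bits, whereas the AND/OR form can change the number of 1-bits in each offspring.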
5.1.4 Permutation encoding crossover
In permutation encoding crossover, one crossover point is selected. The permutation is copied from the first parent chromosome up to the crossover point; the second parent is then scanned, and each number that is not yet in the offspring is appended to it. This ensures that no number is left out of the offspring or repeated in it. Travelling salesman problems and task ordering problems can be handled by permutation encoding.
5.1.5 Value encoding crossover
Value encoding crossover can be performed at a single point, at two points, uniformly or arithmetically, as in binary encoding.
5.1.6 Tree encoding crossover
In this type of crossover, one crossover point is selected in each parent tree, and the parents are divided at that point[15]. The parts of the trees below the crossover points are exchanged to produce new offspring. The choice of the type of crossover depends strictly on the problem.
5.2 Mutation
Premature convergence is a critical problem in most population-based optimization techniques. It occurs when highly fit parent chromosomes in the population breed many similar offspring early in the evolution. The crossover operation of genetic algorithms (GAs) cannot generate offspring that are very different from their parents, because it only recombines information already acquired by the parents. An alternative operator, mutation, can search new areas, in contrast to crossover. Crossover is referred to as an exploitation operator, whereas mutation is an exploration operator. Like crossover, mutation can be performed for all types of encoding.
5.2.1 Binary encoding mutation
In binary encoding mutation, the bits selected for creating new offspring are inverted, as illustrated in Figure 5 (a). If a bit 1 is converted into 0, the numerical value of the chromosome decreases; this is known as down mutation. Similarly, if a bit 0 is converted into 1, the numerical value of the chromosome increases; this is referred to as up mutation.
5.2.2 Permutation encoding mutation
In permutation encoding mutation, two numbers in the sequence are selected and their positions are exchanged.
5.2.3 Value encoding mutation
In value encoding mutation, a small numerical value is added to or subtracted from selected values of the chromosome to create new offspring.
5.2.4 Tree encoding mutation
Tree encoding mutation changes selected nodes of the tree to create new offspring.
6. Genetic algorithms (GAs): Issues
Genetic algorithms (GAs) can be applied in complex non-linear process controllers for the optimization of parameters. Some issues must be considered for the proper implementation of genetic algorithms on a plant to be optimized.
V. CONCLUSION & FUTURE SCOPE
In this paper, various genetic algorithm techniques are discussed. Genetic algorithms search for optimal or near-optimal solutions in the search space using operators such as crossover and mutation. They are very effective techniques for quickly finding a reasonable solution to a complex problem[13]-[14]. Most researchers have used genetic algorithms to find frequent itemsets and association rules. In our future research, it is proposed to use GAs to compare the efficiency of various genetic techniques on large datasets and to optimize large input datasets.
REFERENCES
[1] Agrawal R, Imielinski T and Swami A, Mining Association Rules between Sets of Items in Large Databases, Proceedings of the ACM SIGMOD International Conference on Management of Data, Vol. 22, No. 2, pp. 207-216, 1993.
[2] Agrawal R and Srikant R, Fast Algorithms for Mining Association Rules, Proceedings of the 20th International Conference on Very Large Data Bases, pp. 487-499, 1994.
[3] Das S and Saha B, Data Quality Mining using Genetic Algorithm,
International Journal of Computer Science and Security, Vol. 3, No. 2, pp.
105-112, 2009.
[4] Dou W, Hu J, Hirasawa K and Wu G, Quick Response Data Mining
Model Using Genetic Algorithm, Institute for Credentialing Excellence
Annual Conference, pp. 1214-1219, 2008.
[5] Fang W, Lu M, Xiao X, He B and Luo Q, Frequent Itemset Mining on Graphics Processors, Proceedings of the Fifth International Workshop on Data Management on New Hardware, 2009.
[6] Ghosh S, Biswas S, Sarkar D and Sarkar P.P, Mining Frequent Itemsets Using Genetic Algorithm, International Journal of Artificial Intelligence & Applications, Vol. 1, No. 4, pp. 133-143, 2010.
[7] Grahne G and Zhu J, Fast Algorithms for Frequent Itemset Mining Using
FP-Trees, IEEE Transactions on Knowledge and Data Engineering, Vol.17,
No.10, pp. 1347-1362, 2005.
[8] Han J and Kamber M, Data Mining: Concepts and Techniques, Morgan
Kaufmann Publishers, 2000.
[9] Han J, Cheng H, Xin D and Yan X, Frequent pattern mining: current status and future directions, Journal of Data Mining and Knowledge, Vol. 12, pp. 55-86, 2007.
[10] Permanent Magnet Brushless dc Motor, IE(I) Journal-EL, 84, pp. 16-21. B Mrozek, Z Mrozek. (2000). Modeling and Fuzzy Control of DC Drive, in Proceedings of 14th European Simulation Multiconference, pp. 186-190.
[11] Chu-Kuei Tu and Tseng-Hsien Lin. (2000). Applying Genetic Algorithms on Fuzzy Logic System for Underwater Acoustic Signal Recognition, Proceedings of the 2000 International Symposium on Underwater Technology, pp. 405-410.
[12] Shapiro G.P and Frawley W.J, Knowledge Discovery in Databases,
AAAI/MIT Press, 1991.
[13] Wakabi-Waiswa P.P., Baryamureeba V and Sarukesi K, Generalized
Association Rule Mining Using Genetic Algorithms, International Journal
of Computing and ICT Research, Vol. 2 No. 1, pp. 59-69, 2008.
[14] Yan X, Zhang C and Zhang S, Genetic Algorithm-based Strategy for
Identifying Association Rules without Specifying Actual Minimum Support,
Expert Systems with Applications, Elsevier, Vol. 36, No. 2, pp. 3066-3076,
2009.
[15] www.cs.bgu.ac.il/~sipper/courses/ecal051/assaf-ga.ppt.
[16] www.elearning.najah.edu/OldData/pdfs/Genetic.ppt