
Fitness Function Evaluation Using Ga

This document discusses using a genetic algorithm to evaluate the fitness value and predict the accuracy of a neural network model for diagnosing diabetes. It provides an overview of genetic algorithms, including that they are inspired by natural evolution and use processes like selection and reproduction to search for optimal solutions. The document outlines the basic steps of a genetic algorithm, which initializes a population of potential solutions, evaluates them, selects the best ones, and uses genetic operators to breed new solutions, repeating until criteria are met.

Uploaded by

abcbatata

CHAPTER 6

FITNESS FUNCTION EVALUATION USING GENETIC ALGORITHM

6.1 INTRODUCTION

The Genetic Algorithm (GA) is a derivative-free stochastic optimization
method based loosely on the concepts of natural selection and evolutionary
processes, first suggested by Holland (1975). GAs are used in numerous
approaches to representing the diagnostic process. In this chapter, a genetic
algorithm is applied to compute the best fitness value for evaluating the
accuracy of the diabetes prediction obtained in Chapter 4 using a neural
network. This approach helps medical experts and physicians predict diabetes
more reliably.

6.2 OVERVIEW OF GENETIC ALGORITHM

GA computationally applies a natural evolutionary process, similar to the
one first described by Charles Darwin in "The Origin of Species", to solve a
given problem. GA is a global search procedure that searches from one
population of points to another. It is also a probabilistic search procedure
that is frequently applied to difficult optimization and learning problems
(Ephzibah et al 2010, Ephzibah et al 2011).

Genetic algorithms were inspired by the processes observed in natural
evolution. GA is considered a global search approach for optimization
problems. Through a proper evaluation strategy, the best "chromosome" can be
found from the numerous genetic combinations, as proposed by Yi-Ta Wu et al
(2006) and Adel Sewisy (2007). GA is an attempt to mimic some of the
processes taking place in natural evolution (Sivanandam et al 2008,
Vrushali K. Bongirwar et al 2011).

In general, genetic algorithms perform directed random searches through a
given set of alternatives with respect to given criteria of goodness, as
suggested by Rajasekaran et al (2004). These criteria must be expressed in
terms of an objective function, which is usually referred to as the fitness
function, as discussed by Jayaram et al (2010) and Rajdev Tiwari et al
(2010).

Genetic algorithms require that the set of alternatives to be searched
through be finite. If GA is applied to an optimization problem where this
requirement is not satisfied, the set involved must be discretized and an
appropriate finite subset selected. It is further required that the
alternatives be coded as strings of some specific finite length which consist
of symbols from some finite alphabet. These strings are called chromosomes,
the symbols that form them are called genes, and the set of symbols is called
the gene pool.
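
As a minimal sketch of this coding step, the following Python snippet codes integer alternatives as fixed-length strings over a binary gene pool. The function names `encode` and `decode`, the binary alphabet, and the chromosome length of five genes are illustrative assumptions rather than part of the original text.

```python
# Illustrative sketch: chromosomes as fixed-length strings over a finite alphabet.
GENE_POOL = ("0", "1")  # the finite alphabet G (binary is an assumption here)
N_GENES = 5             # chromosome length n (assumed for illustration)


def encode(value: int, n: int = N_GENES) -> str:
    """Code an integer alternative as an n-gene binary chromosome."""
    if not 0 <= value < 2 ** n:
        raise ValueError(f"{value} cannot be coded with {n} binary genes")
    return format(value, f"0{n}b")


def decode(chromosome: str) -> int:
    """Recover the integer alternative represented by a binary chromosome."""
    if any(gene not in GENE_POOL for gene in chromosome):
        raise ValueError("chromosome contains a gene outside the gene pool")
    return int(chromosome, 2)


if __name__ == "__main__":
    print(encode(9))        # 01001
    print(decode("10011"))  # 19
```

With five binary genes, exactly 32 alternatives (0 through 31) can be represented, which is why a continuous interval has to be reduced to a finite set of points before it can be searched this way.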

Genetic algorithms search for the best alternative, in the sense of a given
fitness function, through the evolution of chromosomes. The basic steps of a
genetic algorithm are shown in Figure 6.1. First, an initial population of
chromosomes is randomly selected. Then each chromosome in the population is
evaluated in terms of its fitness (expressed by the fitness function). Next, a
new population of chromosomes is selected from the given population by giving
chromosomes with high fitness a greater chance of being selected. This is
called natural selection according to George Kilr et al (1997) and Kavitha et
al (2010). The new population may contain duplicates. If the given stopping
criteria (no change between the old and new populations, a specified computing
time, etc.) are not met, some specific, genetic-like operations are performed
on chromosomes of the new population. These operations produce new
chromosomes, called offspring. The same steps of this process, evaluation and
natural selection, are then applied to the chromosomes of the resulting
population. The whole process is repeated until the given stopping criteria
are met. The solution is expressed by the best chromosome in the final
population.

[Figure 6.1 flowchart: Initialize population of chromosomes -> Evaluate each
chromosome in the population -> Select good chromosomes -> Are stopping
criteria satisfied? If yes, stop; if no, create a new population of
chromosomes and return to the evaluation step.]

Figure 6.1 Basic Steps of Genetic Algorithm
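
The loop in Figure 6.1 can be sketched in Python as follows. This is only an outline under stated assumptions: `run_ga`, `roulette_select`, and `crossover_breed` are hypothetical names, fitness-proportional (roulette) selection is swapped in as one simple selection rule, and the stopping criterion is "selection leaves the population unchanged" with a generation cap as a safeguard.

```python
import random
from typing import Callable, List, Sequence

Chromosome = str
Fitness = Callable[[Chromosome], float]


def run_ga(
    initial_population: Sequence[Chromosome],
    fitness: Fitness,
    select: Callable[[Sequence[Chromosome], Fitness], List[Chromosome]],
    breed: Callable[[List[Chromosome]], List[Chromosome]],
    max_generations: int = 100,
) -> Chromosome:
    """Follow Figure 6.1: evaluate, select, test the stopping criteria, breed."""
    population = list(initial_population)
    for _ in range(max_generations):
        selected = select(population, fitness)      # evaluate and select
        if sorted(selected) == sorted(population):  # stop: selection changed nothing
            return max(selected, key=fitness)
        population = breed(selected)                # create the new population
    return max(population, key=fitness)


def roulette_select(population: Sequence[Chromosome], fitness: Fitness) -> List[Chromosome]:
    """Fitness-proportional selection (an assumed choice; needs fitness >= 0)."""
    weights = [fitness(c) + 1e-9 for c in population]  # tiny offset avoids all-zero weights
    return random.choices(list(population), weights=weights, k=len(population))


def crossover_breed(selected: List[Chromosome]) -> List[Chromosome]:
    """Single-point crossover with a randomly chosen mate and crossover site."""
    offspring = []
    for x in selected:
        y = random.choice(selected)
        site = random.randint(1, len(x) - 1)
        offspring.append(x[:site] + y[site:])
    return offspring


if __name__ == "__main__":
    def f(c: Chromosome) -> float:
        value = int(c, 2)
        return 2 * value - value ** 2 / 16

    start = ["00010", "01001", "10011", "11000"]
    print(run_ga(start, f, roulette_select, crossover_breed))
```

Passing `select` and `breed` as arguments keeps the loop itself independent of any particular selection rule or genetic operator, which mirrors the remark that many variations on the basic scheme exist.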

There are many variations on these basic ideas of genetic algorithms. To
describe a particular type of genetic algorithm in greater detail, let $G$
denote the gene pool, and let $n$ denote the length of the strings of genes
that form chromosomes. That is, chromosomes are $n$-tuples in $G^n$. The size
of the population of chromosomes is usually kept constant during the
execution of a genetic algorithm (Tan et al 2003). That is, when new members
are added to the population, the corresponding number of old members is
excluded. Let $m$ denote this constant population size. Since each population
may contain duplicates of chromosomes, populations are expressed as
$m$-tuples whose elements are $n$-tuples from the set $G^n$. Finally, let $f$
denote the fitness function employed in the algorithm. The algorithm, which is
iterative, consists of the following six steps.

1. Select an initial population, $P^{(k)}$, of a given size $m$, where
$k = 1$. This selection is made randomly from the set $G^n$. The choice of
the value $m$ is important. If it is too large, the algorithm does not differ
much from an exhaustive search; if it is too small, the algorithm may not
reach the optimal solution.

2. Evaluate each chromosome in population $P^{(k)}$ in terms of its fitness.
This is done by determining, for each chromosome $x$ in the population, the
value of the fitness function, $f(x)$.

3. Generate a new population, $P_n^{(k)}$, from the given population
$P^{(k)}$ by some procedure of natural selection. One possible procedure of
natural selection is deterministic sampling. According to this procedure,
calculate the value $e(x) = m\,g(x)$ for each $x$ in $P^{(k)}$, where $g(x)$
is the relative fitness defined by the formula

$$g(x) = \frac{f(x)}{\sum_{x \in P^{(k)}} f(x)}.$$

Then the number of copies of each chromosome $x$ in $P^{(k)}$ that is chosen
for $P_n^{(k)}$ is given by the integer part of $e(x)$. If the total number of
chromosomes chosen in this way is smaller than $m$ (the usual case), then
select the remaining chromosomes for $P_n^{(k)}$ by the fractional parts of
$e(x)$, from the highest values down. In general, the purpose of this
procedure is to eliminate chromosomes with low fitness and duplicate those
with high fitness.

4. If the stopping criteria are not met, go to Step 5; otherwise, stop.

5. Produce a population of new chromosomes, $P^{(k+1)}$, by operating on
chromosomes in population $P_n^{(k)}$. The operations involved in this step
attempt to mimic genetic operations observed in biological systems. They
include some or all of the following four operations:

A. Simple Crossover: Given two chromosomes

$$x = (x_1, x_2, \ldots, x_n), \qquad y = (y_1, y_2, \ldots, y_n)$$

and an integer $i \in N_{n-1}$, which is called a crossover position (here
$N_{n-1}$ denotes the set $\{1, 2, \ldots, n-1\}$), the operation of simple
crossover applied to $x$ and $y$ replaces these chromosomes with their
offspring,

$$x' = (x_1, \ldots, x_i, y_{i+1}, \ldots, y_n), \qquad y' = (y_1, \ldots, y_i, x_{i+1}, \ldots, x_n).$$

The chromosomes $x$ and $y$ to which this operation is applied are called
mates.

B. Double Crossover: Given the same mates $x$, $y$ as in the simple crossover
and two crossover positions $i, j \in N_{n-1}$ ($i < j$), the operation of
double crossover applied to $x$ and $y$ replaces these chromosomes with their
offspring,

$$x' = (x_1, \ldots, x_i, y_{i+1}, \ldots, y_j, x_{j+1}, \ldots, x_n),$$
$$y' = (y_1, \ldots, y_i, x_{i+1}, \ldots, x_j, y_{j+1}, \ldots, y_n).$$

C. Mutation: Given a chromosome $x = (x_1, x_2, \ldots, x_n)$ and an integer
$i \in N_n$, which is called a mutation position, the operation of mutation
replaces $x$ with

$$x' = (x_1, \ldots, x_{i-1}, z, x_{i+1}, \ldots, x_n),$$

where $z$ is a randomly chosen gene from the gene pool $G$.

D. Inversion: Given a chromosome $x = (x_1, x_2, \ldots, x_n)$ and two
integers $i, j \in N_{n-1}$ ($i < j$), which are called inversion positions,
the operation of inversion replaces $x$ with

$$x' = (x_1, \ldots, x_i, x_j, x_{j-1}, \ldots, x_{i+1}, x_{j+1}, \ldots, x_n).$$

6. Replace population $P_n^{(k)}$ with $P^{(k+1)}$ produced in Step 5,
increase $k$ by one, and go to Step 2.
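
The deterministic sampling procedure of Step 3 can be sketched in Python as below. The function name `deterministic_sampling` is a hypothetical choice, the sketch assumes non-negative fitness values with a positive sum, and ties between equal fractional parts are broken by position in the population, which the text does not specify.

```python
import math
from typing import Callable, List, Sequence


def deterministic_sampling(
    population: Sequence[str], fitness: Callable[[str], float]
) -> List[str]:
    """Select m chromosomes using the integer and fractional parts of e(x) = m * g(x)."""
    m = len(population)
    values = [fitness(x) for x in population]
    total = sum(values)
    if total <= 0:
        raise ValueError("the sum of fitness values must be positive")

    expected = [m * v / total for v in values]  # e(x) = m * g(x)
    copies = [math.floor(e) for e in expected]  # integer parts give guaranteed copies

    # Fill the remaining places by fractional part, from the highest value down.
    remaining = m - sum(copies)
    by_fraction = sorted(range(m), key=lambda i: expected[i] - copies[i], reverse=True)
    for i in by_fraction[:remaining]:
        copies[i] += 1

    selected: List[str] = []
    for chromosome, count in zip(population, copies):
        selected.extend([chromosome] * count)
    return selected


if __name__ == "__main__":
    def f(chromosome: str) -> float:
        value = int(chromosome, 2)
        return 2 * value - value ** 2 / 16

    print(deterministic_sampling(["00010", "01001", "10011", "11000"], f))
```

Because the expected counts $e(x)$ always sum to $m$, the integer parts can never exceed $m$ in total, so the fractional-part pass only ever has to add chromosomes, never remove them.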

A crossover operation is employed in virtually all types of genetic
algorithms, but the operations of mutation and inversion are sometimes
omitted. Their role is to produce new chromosomes not on the basis of the
fitness function, but for the purpose of avoiding a local optimum. This role
is similar to the role of a disturbance employed in neural networks. If these
operations are employed, they are usually applied with small probabilities.
The mates in the crossover operations and the crossover positions in the
algorithm are selected randomly. When the algorithm terminates, the
chromosome in $P^{(k)}$ with the highest fitness represents the solution.
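
The four operations of Step 5 can be written compactly for chromosomes stored as strings. The sketch below uses the 1-based positions of the text; the function names and the default binary gene pool are illustrative assumptions.

```python
import random
from typing import Sequence, Tuple


def simple_crossover(x: str, y: str, i: int) -> Tuple[str, str]:
    """Swap the tails of two mates after crossover position i (1 <= i <= n-1)."""
    if len(x) != len(y) or not 1 <= i <= len(x) - 1:
        raise ValueError("mates must have equal length and 1 <= i <= n-1")
    return x[:i] + y[i:], y[:i] + x[i:]


def double_crossover(x: str, y: str, i: int, j: int) -> Tuple[str, str]:
    """Swap the segment between positions i and j (1 <= i < j <= n-1)."""
    if len(x) != len(y) or not 1 <= i < j <= len(x) - 1:
        raise ValueError("mates must have equal length and 1 <= i < j <= n-1")
    return x[:i] + y[i:j] + x[j:], y[:i] + x[i:j] + y[j:]


def mutation(x: str, i: int, gene_pool: Sequence[str] = ("0", "1"), rng=random) -> str:
    """Replace the gene at mutation position i (1 <= i <= n) with a random gene."""
    if not 1 <= i <= len(x):
        raise ValueError("mutation position must satisfy 1 <= i <= n")
    z = rng.choice(gene_pool)  # z may equal the old gene, as in the definition
    return x[: i - 1] + z + x[i:]


def inversion(x: str, i: int, j: int) -> str:
    """Reverse the genes x_{i+1}, ..., x_j (1 <= i < j <= n-1)."""
    if not 1 <= i < j <= len(x) - 1:
        raise ValueError("inversion positions must satisfy 1 <= i < j <= n-1")
    return x[:i] + x[i:j][::-1] + x[j:]


if __name__ == "__main__":
    print(simple_crossover("01001", "10011", 3))     # ('01011', '10001')
    print(double_crossover("00000", "11111", 1, 3))  # ('01100', '10011')
    print(inversion("abcde", 1, 4))                  # adcbe
    print(mutation("00000", 2))                      # 00000 or 01000
```

Crossover and inversion only rearrange genes that are already present in the population, whereas mutation is the only one of the four that can introduce a gene value that no current chromosome carries at that position.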


To illustrate the use of a genetic algorithm, consider determining the
maximum of the function $f(x) = 2x - x^2/16$ in the interval $[0, 31]$. The
interval is approximated by 32 integer points, 0, 1, ..., 31, and these points
are coded by the corresponding binary numbers. Then the gene pool is
$G = \{0, 1\}$, and all possible chromosomes are binary integers from 00000
through 11111. Assume $m = 4$ and $P^{(1)} = (00010, 01001, 10011, 11000)$ in
Step 1, as shown in Table 6.1a. Using the function $f$ as the fitness
function, the fitness of each chromosome in $P^{(1)}$ is calculated in Step 2.
Then, using deterministic sampling in Step 3, the population
$P_n^{(1)} = (01001, 10011, 10011, 11000)$ is obtained, as shown in Table
6.1b. Assuming that the condition $P_n^{(k)} = P^{(k)}$ was chosen as the
stopping criterion in Step 4, the algorithm does not stop at this point and
proceeds to Step 5. In this step, assume that only simple crossovers are used,
each of which produces one of the two possible offspring. For each $x$ in
$P_n^{(1)}$, a mate $y$ in $P_n^{(1)}$ and a crossover point are chosen
randomly, and then the offspring $x'$ is produced, as shown in Table 6.1b.
Next, in Step 6, the old population $P_n^{(1)}$ is replaced with the new
population $P^{(2)}$ of offspring produced in Step 5, $k$ is increased by one,
and the algorithm returns to Step 2. Steps 2 and 3 are now repeated for
$k = 2$, and the results are shown in Table 6.1c. The stopping criterion in
Step 4 is again not satisfied; consequently, the algorithm proceeds to Step 5.
The result of this step is shown in Table 6.1d. In Step 6, $P_n^{(2)}$ is
replaced with $P^{(3)}$, $k$ is increased by one, and the algorithm returns to
Step 2. The application of Steps 2 and 3 for $k = 3$ results in $P_n^{(3)}$,
shown in Table 6.1e. Now the stopping criterion $P_n^{(3)} = P^{(3)}$ is
satisfied in Step 4, and the algorithm terminates. The chromosome 10000, which
has the highest fitness, represents the solution. This chromosome corresponds
to the integer 16, which is indeed the point at which the function $f$
reaches its maximum.
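
The fitness values used in this illustration are easy to check by direct evaluation. The short Python sketch below evaluates the function on the initial population and, since only 32 candidates exist, confirms the maximum by exhaustive search; the variable names are illustrative.

```python
def f(x: int) -> float:
    """Fitness function of the illustration: f(x) = 2x - x^2 / 16."""
    return 2 * x - x ** 2 / 16


# Initial population P(1) of the illustration, as 5-bit binary chromosomes.
P1 = ["00010", "01001", "10011", "11000"]
fitness_P1 = [f(int(c, 2)) for c in P1]

# Exhaustive check over all 32 chromosomes from 00000 through 11111.
candidates = [format(v, "05b") for v in range(32)]
best = max(candidates, key=lambda c: f(int(c, 2)))

if __name__ == "__main__":
    print(fitness_P1)               # [3.75, 12.9375, 15.4375, 12.0]
    print(best, f(int(best, 2)))    # 10000 16.0
```

The exhaustive check is feasible only because the search space here is tiny; it serves to confirm the answer the genetic algorithm reaches, not to replace it.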


Table 6.1 Illustration of Genetic Algorithm


(a) k = 1: Steps 2 and 3 (P: population; g(x): relative fitness)

Chromosome   Integer   Fitness   g(x)    4g(x)   Number of
in P(1)                                          selected copies
00010         2         3.75     0.084   0.336   0
01001         9        12.94     0.292   1.168   1
10011        19        15.44     0.350   1.400   2
11000        24        12.00     0.291   1.084   1

(b) k = 1: Step 5

Chromosome   Mate                  Crossover site        Resulting chromosome
in Pn(1)     (randomly selected)   (randomly selected)   in P(2)
01001        10011                 3                     01011
10011        01001                 3                     10001
10011        11000                 1                     11000
11000        10011                 1                     10011

Similarly, for k = 2 and k = 3, the fitness values, mates, and crossover
sites are determined in the same way.

(c) k = 2: Steps 2 and 3

Chromosome   Integer   Fitness   g(x)    4g(x)   Number of
in P(2)                                          selected copies
01011        11        14.44     0.250   0.100   0
10001        17        15.94     0.276   1.104   2
11000        24        12.00     0.207   0.828   1
10011        19        15.44     0.267   1.068   1


(d) k = 2: Step 5

Chromosome   Mate (randomly selected,   Crossover site        Resulting chromosome
in Pn(2)     by position in Pn(2))      (randomly selected)   in P(3)
10001        3                          2                     10000
10001        4                          3                     10011
11000        1                          2                     11001
10011        2                          3                     10001

(e) k = 3: Steps 2 and 3

Chromosome   Integer   Fitness   g(x)    4g(x)   Number of
in P(3)                                          selected copies
10000        16        16.00     0.274   1.096   1
10011        19        15.44     0.265   1.060   1
11001        25        10.94     0.188   0.752   1
10001        17        15.94     0.273   1.092   1
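
The crossover steps recorded in Tables 6.1b and 6.1d can be replayed with a few lines of Python. This is a sketch under one assumption about the table layout: the mate column of Table 6.1d is read as the 1-based position of the mate within Pn(2), which is the reading that reproduces the printed offspring. The helper name `offspring` is hypothetical.

```python
def offspring(x: str, y: str, site: int) -> str:
    """Simple crossover keeping one offspring: head of x, tail of mate y."""
    return x[:site] + y[site:]


# Table 6.1b: chromosomes in Pn(1), their mates, and the crossover sites.
pn1 = ["01001", "10011", "10011", "11000"]
mates_b = ["10011", "01001", "11000", "10011"]
sites_b = [3, 3, 1, 1]
p2 = [offspring(x, y, s) for x, y, s in zip(pn1, mates_b, sites_b)]

# Table 6.1d: mates given as 1-based positions in Pn(2) (assumed reading).
pn2 = ["10001", "10001", "11000", "10011"]
mate_positions_d = [3, 4, 1, 2]
sites_d = [2, 3, 2, 3]
p3 = [
    offspring(x, pn2[pos - 1], s)
    for x, pos, s in zip(pn2, mate_positions_d, sites_d)
]

if __name__ == "__main__":
    print(p2)  # ['01011', '10001', '11000', '10011']
    print(p3)  # ['10000', '10011', '11001', '10001']
```

Both printed populations match the "resulting chromosome" columns of the tables, and the chromosome 10000 first appears in the second crossover step.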

6.3 IMPLEMENTATION OF GENETIC ALGORITHM FOR CLINICAL DIABETIC DATABASE

In Chapter 4, using the neural network, it was seen that the overall opinion
of all the experts regarding the occurrence of diabetes is greater than 0.5.
In this chapter, the genetic algorithm is applied to the clinical diabetic
database given in Table A 2.1 to compute the best fitness value and to
compare the result with the one obtained using the neural network. The
clinical diabetic database shown in Table A 2.1 is framed from the overall
opinion of medical experts and contains information about 70 patients and
their medical conditions.


The input attributes taken for this implementation are Age (AGE), Fasting
Plasma Glucose (FPG), Post Prandial Plasma Glucose (PPG), Gender (G)
(value 1: male; value 0: female), and Pregnant or Non-Pregnant (P/NP)
(value 1: pregnant; value 0: non-pregnant). The output attribute is
Diabetes (D) (value ≤ 0.5: patient without diabetes; value > 0.5: patient
with diabetes). The attributes are discussed in detail in Chapter 3. The
parameters of the genetic algorithm used for this implementation are listed
in Table 6.2.

Table 6.2 Parameters of Genetic Algorithm


Parameter                    Type and Value
Generation Number            100
Population Size              25
Type of Selection Operator   Stochastic Uniform
Crossover                    Single Point
Mutation Function            Gaussian
Mutation Rate                1.0
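
Table 6.2 names the operators but does not describe how they work. The sketch below records the parameters in a plain Python dictionary and illustrates one common reading of "stochastic uniform" selection: equally spaced pointers laid over a line whose segments are proportional to each individual's score. This is an assumption for illustration, not the Matlab implementation used in this chapter; the names `GA_PARAMETERS` and `stochastic_uniform_selection` are hypothetical, and the scores are taken to be non-negative with larger meaning better.

```python
import random
from typing import List, Sequence

# Parameters copied from Table 6.2 (the key names are illustrative).
GA_PARAMETERS = {
    "generations": 100,
    "population_size": 25,
    "selection": "stochastic uniform",
    "crossover": "single point",
    "mutation_function": "gaussian",
    "mutation_rate": 1.0,
}


def stochastic_uniform_selection(
    scores: Sequence[float], n_parents: int, rng: random.Random
) -> List[int]:
    """Return indices of selected parents using equally spaced pointers."""
    if any(s < 0 for s in scores) or sum(scores) <= 0:
        raise ValueError("scores must be non-negative with a positive sum")

    step = sum(scores) / n_parents
    pointer = rng.uniform(0, step)  # random start within the first step

    chosen: List[int] = []
    index = 0
    cumulative = scores[0]          # right edge of the current segment
    for _ in range(n_parents):
        # Advance to the segment that contains the current pointer.
        while pointer > cumulative and index < len(scores) - 1:
            index += 1
            cumulative += scores[index]
        chosen.append(index)
        pointer += step
    return chosen


if __name__ == "__main__":
    rng = random.Random(0)
    print(stochastic_uniform_selection([0.1, 0.2, 0.7], 10, rng))
```

Because a single random number positions all pointers, an individual whose share of the total score is $p$ is selected roughly $p$ times `n_parents` times, with far less run-to-run variation than independent roulette draws.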

6.4 PERFORMANCE EVALUATION OF GA FOR PREDICTING DIABETES

The genetic algorithm is applied to a diabetic record from Table A 2.1 with
AGE = 52, FPG = 158, PPG = 224, G = 0, P/NP = 0, and the best fitness value,
computed using Matlab R2007b, is shown in Figure 6.2. From Figure 6.2 it is
observed that the best fitness value is 0.601, which is greater than 0.5 and
signifies that the patient is diabetic.


Figure 6.2 Best Fitness Plot for Diabetic Data

Likewise, the genetic algorithm is applied to a non-diabetic record from
Table A 2.1 with AGE = 36, FPG = 86, PPG = 91, G = 1, P/NP = 0, and the best
fitness value, computed using Matlab R2007b, is shown in Figure 6.3.

Figure 6.3 Best Fitness Plot for Non-Diabetic Data


From Figure 6.3 it is observed that the best fitness value is 0.425, which is
not greater than 0.5 and signifies that the patient is non-diabetic.
Similarly, the genetic algorithm is applied to all the data given in
Table A 2.1, and the prediction was found to be accurate.
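
The decision rule applied to the best fitness value is the 0.5 threshold defined for the output attribute D in Section 6.3. A minimal Python sketch of that rule, with the hypothetical function name `classify`, is:

```python
def classify(best_fitness: float, threshold: float = 0.5) -> str:
    """Apply the output-attribute rule: > 0.5 diabetic, <= 0.5 non-diabetic."""
    return "diabetic" if best_fitness > threshold else "non-diabetic"


if __name__ == "__main__":
    print(classify(0.601))  # diabetic      (Figure 6.2)
    print(classify(0.425))  # non-diabetic  (Figure 6.3)
```

A value of exactly 0.5 falls on the non-diabetic side, following the "value ≤ 0.5" convention stated for attribute D.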

6.5 SUMMARY

This chapter has presented the genetic algorithm for evaluating the
prediction accuracy of diabetes. The best fitness value is evaluated for all
the data in the clinical diabetic database to predict whether or not the
patient is affected by diabetes. It is observed that the results obtained
using the genetic algorithm are similar to the output obtained using the
neural network. For the data in Table A 2.1, the proposed approach predicted
diabetes accurately.
