
Roadmap to the Future of Genetic Algorithm based Software Testing

Anupama Surendran¹ and Philip Samuel²

¹Department of Computer Science, ²Information Technology Division, SOE,
Cochin University of Science & Technology, Kochi, Kerala, India

Abstract— In this paper, we present a roadmap to the future of genetic algorithm based software testing based on a systematic review of literature. The field of software testing has seen an extensive use of search based techniques in the last decade. Among the search based techniques, it is the metaheuristic techniques such as the genetic algorithm that have garnered the major share of attention from researchers. Looking at the large body of work that has happened and is happening in this field, we feel that it is high time someone studied how well genetic algorithm based techniques fare in the practical testing process. This independent review is designed to direct the attention of future researchers to the deficiencies of genetic algorithm based testing, their possible solutions and the extent to which they are correctable. Our review of genetic algorithm based software testing works reveals that the type of genetic algorithm used, fitness function design, population initialization and parameter settings affect the quality of the solution obtained in software testing using genetic algorithm. For this, we have provided a review protocol that helps to collect evidence from literature for the review and to set up the future research scope. We hope that viewing the field from such an angle will help researchers to make a major breakthrough in genetic algorithm based software testing.

Index Terms—Software testing, genetic algorithms, review, population, parameter settings, selection, crossover, mutation, fitness function design

I. INTRODUCTION

Software has an irreplaceable role in our day to day life and this makes software testing a vital cog in the wheel [29, 41]. Even though software testing has evolved tremendously, it still remains both time consuming and personnel dependent. Almost 50% of the cost and time of software development is taken up by testing [5, 27]. Hence, automation of software testing assumes paramount importance. Among automated testing processes, genetic algorithm based testing has gained significant momentum from 2000 onwards [35, 36]. In spite of the large number of works that have taken place in genetic algorithm based software testing, the real issues faced by testers are yet to be addressed. Our work tries to resolve this problem by reviewing genetic algorithm based testing works. By this study, we intend to highlight some of the major factors which are to be given special research focus during genetic algorithm based testing. For this, we have made a systematic categorization of the works in the concerned field and, following this, a critical study of some of the important factors like fitness function design, population initialization and parameter settings in genetic algorithm. We have also pointed out the significance of these factors in genetic algorithm (GA) based software testing. Future researchers may consider these factors when applying genetic algorithm based testing to software. Further, resolving the issues discussed in our review may make genetic algorithm based software testing one of the undefeatable testing methods in practical software testing.

The rest of the paper is organized as follows. In Section II, we provide the motivation behind this review. In Section III, various literatures are reviewed based on the review strategy. Section IV gives the observations from various literatures for this review. Section V points out the validity threats to this review and Section VI gives the conclusions.

II. SYSTEMATIC STUDY - WHY & HOW?

We have made an attempt to look into the issues of GA based software testing due to many reasons. This section explains the purpose of our study and the steps we have followed to conduct the study.

A. Why Systematic Study?

In the past few years search based software testing, especially evolutionary algorithms, has gained immense popularity [1, 10, 22, 37]. A graph is shown in figure 1, which shows an increase in the rate of publications and research works in search based software testing during the period 1975 to 2014 [37]. Among evolutionary algorithms, genetic algorithm based software testing has received wide interest from researchers due to its ability to handle complex problems where an exact solution doesn't exist. The need for a systematic review of genetic algorithm based software testing arises due to a number of reasons.
Even though genetic algorithm based testing has made a great impact in academic research, only very little attention has been given to understanding the complexities of using genetic algorithms in practical software testing.

[Figure 1. Research works in search based (e.g. evolutionary algorithm) software testing during the period 1975 to 2014.]

In this study, we have tried to focus on this problem and to highlight the challenges involved in genetic algorithm based approaches when using them as a practical tool in software testing. The main reason for choosing this problem in our work is the usage of genetic algorithms in software testing without knowing the ambiguities in genetic algorithm based testing. In this paper, we have mentioned some works which utilize genetic algorithm for testing. We can see that none of these works have adopted any general operator setting for testing purposes. This inherent non-deterministic nature of the genetic operators makes program testing a demanding task. The strength of using genetic algorithm mainly depends on setting the genetic parameters to their appropriate values, which in turn depends on the problem to be solved [7, 53]. This itself is a major challenge faced by testers. We have mentioned some of these challenges and have also pointed out the factors that are still not considered by researchers during GA based software testing. Making an unbiased review like this may help to solve the issues in genetic algorithm based software testing and at the same time help future researchers to explore the untouched research areas in GA based software testing.

B. Review Strategy of our Work

We have developed a review strategy to effectively cover all the work which comes under our area of interest. Clearly defining a method to perform the systematic study will help to state the research objectives, findings and conclusions. Our review strategy consists of the following parts:
1. Brief overview of Genetic algorithm
2. Review/Study Protocol
   2.1. Identification of research questions
   2.2. Selection criteria for review/study
      - Identify relevant works from conferences, journals and transactions
      - Test purpose - priority for works which use GA for test data generation
      - Type of testing - structural, functional, model based
      - List the literature by author, title and year
3. From the selected works, identify the factors relevant in testing
   - Concentrate on the approaches followed in the various works
      - GA/GA variation
      - Population representation
      - Selection
      - Crossover operator
      - Mutation operator
      - Fitness function
      - Advantages in testing
      - Disadvantages in testing
      - Future research scope
4. Answer the research questions from the observations made
   - From GA/GA variation
   - From parameter settings (selection, crossover and mutation operators)
   - From issues in response time
   - From fitness function design issues
   - Identify the future research possibilities of the selected works (from the observations of GA/GA variation, parameter settings, fitness function design issues, and advantages and disadvantages of the selected works)
5. Threats to validity of the study
   5.1. Threats due to the selection of literature for the study
   5.2. Threats due to the factors considered for conducting the study

Figure 2 given below shows a diagrammatic representation of the main steps in our review strategy.

[Figure 2. Basic steps in review strategy: issues in GA based software testing lead to research questions and a review protocol; literature is selected from source repositories by test purpose and type of testing; the selected works are reviewed to identify and observe the factors relevant in testing (GA/GA variation, population representation, selection, crossover operator, mutation operator, fitness function, advantages and disadvantages in testing, future research scope); the answers to the research questions and the conclusion on using GA in software testing are drawn from these observations.]
III. REVIEW STRATEGY

A. Brief overview of Genetic algorithm

This section gives an overview of the basic steps involved in genetic algorithm. Genetic algorithm is a type of evolutionary algorithm and is considered the best and the strongest of all evolutionary algorithms [19]. It is a type of metaheuristic search technique developed by John Holland and works on Darwin's principle of survival of the fittest [25]. Genetic algorithm uses the technique of natural genetics, representing a computer model of biological evolution. Genetic algorithms have the ability to solve a variety of optimization and search problems. Several testing techniques use genetic algorithms believing that testing may be carried out in a better way using the natural evolutionary process present in them.

Genetic algorithm identifies an optimal solution for a problem by applying natural evolutionary techniques to a group of possible solutions referred to as the "population" [17, 48, 50]. After each generation, a new generation is formed which is better than the previous generation. The series of steps involved in genetic algorithm are population initialization, selection, crossover, mutation and termination [14]. Each candidate solution is encoded as a string of digits called a chromosome, and each element of the string is called a gene. Each individual in the population has a fitness value which decides the quality and performance of that individual. The greater the fitness value, the better the problem solving capacity of an individual [39]. A collection of chromosomes makes up a population. The initial population is created randomly and the fitness of the individuals in the population is calculated. This information is used to select the best candidates for forming the next generation parents. After selecting parents of the successive generation, the next step is to combine these candidates to form the offspring. The crossover operation is used to perform this step. Crossover enables the selection of good features from parents to form the offspring. Mutation is applied to the offspring to create better quality individuals. Mutation is defined as the process of altering the genes in the chromosome. A new generation is chosen from the offspring based on the fitness of the individuals. These individuals are considered as parents of the next generation. This cycle is repeated until a global solution for the problem is obtained. The basic steps of genetic algorithm are given in figure 3.

    procedure Genetic Algorithm
    begin
        GET (Initial Population);
        CALCULATE FITNESS (Initial Population)
        loop
            FINALIZE POPULATION FOR CROSSOVER (Parent population)
            PERFORM CROSSOVER (Parent population, Child)
            APPLY MUTATION (Child)
            CALCULATE FITNESS (Child)
            GET NEXT GENERATION (Parent population, Child)
            stop process when TERMINATION CRITERIA
        exit loop
    end

Figure 3. Basic steps of genetic algorithm
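To make the steps in figure 3 concrete, the sketch below shows one possible, deliberately minimal implementation of this initialize-evaluate-select-crossover-mutate loop in Python. It is our own illustrative example, not taken from any of the reviewed works: the bit-string encoding, the tournament-of-two selection, the population size, the rates and the toy fitness function (number of 1-bits in the chromosome) are all assumptions chosen only to keep the sketch short.

    import random

    POP_SIZE, CHROM_LEN = 20, 16          # assumed values, for illustration only
    CROSSOVER_RATE, MUTATION_RATE = 0.8, 0.05
    MAX_GENERATIONS = 100

    def fitness(chrom):
        # Toy fitness: count of 1-bits; a real tester would plug in a
        # coverage-oriented measure (e.g. a branch distance) here.
        return sum(chrom)

    def select(population):
        # Tournament of size 2: keep the fitter of two random individuals.
        a, b = random.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    def crossover(p1, p2):
        # One point crossover applied with probability CROSSOVER_RATE.
        if random.random() < CROSSOVER_RATE:
            cut = random.randint(1, CHROM_LEN - 1)
            return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
        return p1[:], p2[:]

    def mutate(chrom):
        # Bit-flip mutation applied gene by gene.
        return [1 - g if random.random() < MUTATION_RATE else g for g in chrom]

    population = [[random.randint(0, 1) for _ in range(CHROM_LEN)]
                  for _ in range(POP_SIZE)]

    for generation in range(MAX_GENERATIONS):
        best = max(population, key=fitness)
        if fitness(best) == CHROM_LEN:        # termination criterion: goal reached
            break
        next_gen = [best]                     # keep the best individual (elitism)
        while len(next_gen) < POP_SIZE:
            c1, c2 = crossover(select(population), select(population))
            next_gen.extend([mutate(c1), mutate(c2)])
        population = next_gen[:POP_SIZE]

    print(generation, fitness(max(population, key=fitness)))

The factors debated in the works reviewed below (representation, population size, operator types and rates, fitness design, termination) correspond one-to-one to the constants and functions in such a loop, which is why small changes to them can change the outcome of the search.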
B. Review Protocol

Clearly defining a review protocol is inevitable for any systematic study as it forms the basis of the study. All the observations, results and the effectiveness of the study depend on the protocol defined. Our review protocol consists of two sections. In the first section the research questions are clearly defined and in the second section, the methods to perform the review are explained.

1) Identification of Research Questions

Research questions may be considered the foundation of any research as they define the objective or aim of the research. A systematic review must be carried out with an aim to answer these research questions. During this process, we may come across several factors which may or may not be related to the research questions. These factors may be categorized by some assumptions and observations drawn from the various literatures. Here, we have defined a research question (RQ) which is given below. This RQ helps to emphasize the strengths and significance of our review.

RQ. What are the factors to be considered for improving genetic algorithm based software testing?

Many research works have claimed genetic algorithms to be one of the most useful methods for software testing without pointing out the real complexities that may occur during genetic algorithm based software testing. This research question tries to highlight such issues in software testing. We have split the research question (RQ) into some sub questions (SRQ) to simplify the review process.

SRQ1. In spite of the large volume of works in genetic algorithm based testing, why have some works considered variations of genetic algorithms?
SRQ2. What is the effect of population representation and size in software testing?
SRQ3. Is there any common method to design the fitness function during software testing?
SRQ4. What is the general strategy adopted in setting parameters during software testing?
SRQ5. Can GA based software testing evolve as an undefeatable technique in the software testing industry? If so, what are the issues to be sorted out in GA based testing?

In order to answer these research questions, we have developed a method for selecting the literature from various repositories, which is explained in the next section.

2) Selection Criteria to Perform Review

Since our study is based on the observations of several literatures, we have to define the selection strategy made for choosing the works. The main steps are given below.
- Identify relevant works from conferences, journals and transactions
- Test purpose - purpose of using GA in test data generation
- Type of testing - structural, functional, model based
- List the literature by author, title and year

a) Identification of Literature from Various Sources

The first step in the selection criteria is to identify research works by searching resources such as IEEE Xplore, ACM digital library, Elsevier, Wiley, Springer etc. We have considered some premium conference works as well as works published in transactions and some reputed journals.

[Figure 4. Literature selection review strategy: from the source repositories (IEEE Xplore transactions & conferences, ACM digital library transactions & conferences, Springer journals & conferences, Wiley journals, Elsevier journals), the keyword "Genetic algorithm based testing" identified about 300 papers and the refined keyword "Genetic algorithm based software testing" about 120; works on structural, functional, mutation and model based testing and on the study of parameter effects in software testing were included, while test case prioritization, test case minimization, regression testing, hardware testing, testing of circuits and all other variants of testing were excluded; about 30 papers were identified after the final iteration and categorized by title, author, year, type of testing and purpose.]

The first step is to identify or finalize the keyword or query which may be applied to the source repository. As we are interested in GA based testing, we gave the keyword "Genetic algorithm based testing" to the various databases. As a result, we got nearly 300 research papers from these sources. A close examination of these papers revealed that most of the works were related to the field of hardware testing, testing of circuits etc. Therefore we refined our keyword as "Genetic algorithm based software testing". Nearly 120 papers were identified. As it is not possible to review all applications of genetic algorithm based software testing, we have ignored some works which illustrate test prioritization and test case minimization etc. using genetic algorithms. Keeping these factors in mind, we defined a selection criterion which is relevant for understanding the core issues in using GA for software testing as well as for exploring the future research perspectives in GA based software testing. In our selection criteria, the papers which used GA, a variation of GA or a combination of GA with other methods for improving the test data generation process during software testing were identified. We also selected some works which use GA for structural, black box and model based testing and which studied GA parameter setting effects during test data generation. Works are categorized according to the type and purpose of testing. Finally, after the second round of iteration, we selected nearly 38 papers which clearly fall within our review scope. Figure 4 gives our selection criteria and the list of the works considered in our review after selection is given in table 1.

C. Conducting the Study to Identify Factors Relevant in Testing

After selecting the papers from the repository, we went through the details of each paper very carefully. Each paper was analyzed from a tester's viewpoint and the future research scope or the enhancements and modifications which may be possible to make the existing work more reliable were considered. We adopted this approach for analyzing the papers
for pointing out the shortcomings present in GA based testing approaches as well as to highlight the untouched and challenging research areas possible in GA based software testing. We have conducted this review with an aim of answering the research questions listed in Section III.B.1. For that, we have categorized the referred works according to the type of testing, purpose and coverage. After that, we have analyzed the various types of population representation, size, generation, parameter settings and fitness function design methods used in these referred works.
TABLE 1. WORKS SELECTED AFTER THE FINAL ROUND

No. | Author & Year | Type of Testing | Purpose
1 | M. Fisher et al. [2012] | Functional testing | Introduced a new type of genetic algorithm called micro genetic algorithm for test data generation
2 | J. Louzada et al. [2012] | Mutation testing | Test data generation using elitist GA
3 | Xue-ying et al. [2005] | Structural testing | Reduce the cost associated with test suite reduction using GA
4 | J. Xiao et al. [2010] | Structural testing | Bug fixation with GA
5 | G. I. Latiu [2012] | Structural testing | Software path testing using GA
6 | M. A. Ahmed et al. [2008] | Structural testing | Test data generation for multipath testing using GA
7 | A. Pachure et al. [2013] | Structural testing | Automated approach for branch testing using GA
8 | M. Roper et al. [1995] | Structural testing | Used GA for testing C programs for attaining the branch coverage criteria
9 | B. Jones et al. [1996] | Structural testing | Branch coverage using GA
10 | R. P. Pargas et al. [1999] | Structural testing | Path coverage & branch coverage using the TGen tool
11 | C. C. Michael et al. [2001] | Structural testing | Developed a tool called GADGET which uses genetic algorithms for generating test data for branch coverage of C programs
12 | A. A. Sofokleous et al. [2008] | Structural testing | Test data generation framework using genetic algorithms to provide edge/partition coverage
13 | C. Chen et al. | Structural testing | Test data generation for branch coverage
14 | D. J. Bernat | Structural testing | Studied the effect of response time in GA based testing
15 | S. Ali et al. [2011] | Model based testing | Studied the effect of response time in model based test data generation and a method for generating test data using GA from OCL constraints generated by UML modelers
16 | N. Sharma et al. [2012] | Functional testing | Used GA for generating test data for character set input
17 | A. Rauf et al. [2010] | GUI testing (represented as state machine) | Automated testing of GUI using genetic algorithm
18 | S. Khor et al. [2004] | Structural testing | Used the concept of GA and formal concept analysis to generate test data for branches
19 | Y. Cao et al. [2009] | Structural testing | Test data generation for a specific path using GA
20 | J. Malburg [2011] | Structural testing | Introduced a hybrid method which combines GA based and constraint based test data generation approaches
21 | W. Zhang et al. [2010] | Structural testing | Test data generation for many paths using GA
22 | G. Fraser et al. [2012] | Structural testing | Studied the effect of seeding in test data generation for object oriented programs
23 | A. Arcuri et al. [2011] | Structural testing | Studied the effect of parameter settings for object oriented programs using the Evosuite tool
24 | P. M. S. Bueno et al. [2000] | Structural testing | Path coverage testing using GA
25 | J. Wegner et al. [2002] | Structural testing | Test data generator for structural testing of real-world embedded software systems using GA
26 | J. Miller et al. [2006] | Structural testing | Test data generation for branch coverage using GA
27 | P. McMinn [2013] | Structural testing | Studied the impact of crossover on C programs during GA based testing
28 | C. Doungsa-ard et al. [2007] | Model based testing | Test data generation from UML diagrams
29 | G. Fraser et al. [2013] | Structural testing | Test suite generation for object oriented programs which covers multiple goals simultaneously
30 | J. Li et al. [2009] | Model based testing | Test data generation for class behavioral testing using GA
31 | D. Gong et al. [2011] | Structural testing | Test data generation for many paths using GA
32 | P. Pocatilu et al. [2013] | Structural testing | Test data generation for embedded systems based on GA using control flow graph construction
33 | C. Mao et al. [2013] | Structural testing | Used a variation of GA called Quantum inspired GA for test data generation to improve program coverage
34 | D. Liu et al. [2013] | Structural testing | Test data generation using a modified GA to avoid premature convergence
35 | Y. Suresh et al. [2013] | Structural testing | Test data generation for basis path testing using GA
36 | A. Arcuri et al. [2013] | Structural testing | Studied the effect of parameter settings for object oriented programs using the Evosuite tool and showed that parameter tuning may or may not give a good result; if search budget and time are a constraint, default parameter values may be used rather than going for parameter tuning
37 | G. Fraser et al. [2013 & 2014] | Structural testing | Extended the GA based Evosuite tool to a memetic algorithm based approach to improve the performance of GA during test data generation
38 | J. P. Galeotti et al. [2014] | Structural testing | Integrated the DSE (dynamic symbolic execution) approach with the GA based Evosuite tool for test data generation of programs

[Figure 5. Research questions and problems addressed: the main research question RQ (factors to be considered for improving genetic algorithm based software testing) is split into SRQ1 (variation of GA in software testing), SRQ2 (population size & representation), SRQ3 (fitness function design strategy), SRQ4 (parameter settings/tuning) and SRQ5 (future issues to be solved); the observations drawn from these factors are directed at establishing GA as an undefeatable method in the software testing industry.]
Figure 5 shows the main objective of this review, the factors addressed and the final conclusion from the observations made in the review. The following subsections give a list of observations made from the literature on GA based software testing. They are categorized as variations of genetic algorithms used in software testing, population setting, parameter setting and finally fitness function design issues during software testing.

1) Variations of Genetic Algorithm in Software Testing

Even when a group of researchers claim that simple GA is best for software testing, we can see that a lot of works have used several variations of GA for software testing. We can see that in software testing, either simple genetic algorithm, parallel GA, multipopulation GA or some type of hybrid method which uses a combination of GA and some other methods may be used. In most of the works it is mentioned that the variations of GA are used to overcome the limitations of simple GA. We have enumerated here certain works using these. M. Fisher et al. have used a genetic algorithm based approach for black box testing [12]. In their work, they have used a new type of genetic algorithm called micro genetic algorithm. They have mentioned that in genetic algorithm, finding an optimal set up for all the parameters is very difficult and that is why they have suggested the concept of micro algorithm. In the micro algorithm, all the individuals are supposed to die except the fittest. The main benefit of the micro algorithm is that only a small population size is required for convergence. J. Louzada et al. used an elitist genetic algorithm for automatically generating test data during mutation testing [32]. In their work, the experimental results show that using metaheuristics for mutation obtains better results compared to results obtained without using heuristics. Xue-ying et al. used genetic algorithm for reducing the cost associated with test suite reduction [55]. They have introduced a new type of genetic algorithm called GeA which considers the cost factor during the calculation of the fitness function. They have compared the performance of their algorithm with a greedy algorithm named MgrA and a simple genetic algorithm called SGeA. The performance of GeA was found to be better than SGeA and MgrA [55]. G. I. Latiu used evolutionary algorithms for software path testing [30]. In their work, they compared the results of two new approaches based on particle swarm optimization and simulated annealing with genetic algorithms.
TABLE 2. WORKS ON VARIATIONS OF GA

Work | GA Variation | Result
M. Fisher et al. [2012] | Micro genetic algorithm for black box testing | Only a small population size is required for convergence
J. Louzada et al. [2012] | Elitist genetic algorithm for mutation testing | Elitist GA better than ordinary GA
Xue-ying et al. [2005] | New type of GA called GeA for reducing the cost associated with test suite reduction | GeA is better than ordinary GA
G. I. Latiu [2012] | New approach based on simulated annealing for path testing | Simulated annealing based approach is better than the GA based approach
J. Malburg et al. [2011] | Hybrid method which combines GA based and constraint based test data generation approaches | Hybrid method proved to have better performance
S. Khor et al. [2004] | Used concept analysis along with GA for structural test data generation | Concept analysis with GA is better than simple GA
S. Ali et al. [2011] | Applied (1+1) EA for model based test data generation | Proved (1+1) EA to be better than GA and GA better than random search
J. Wegner et al. [2002] | Applied GA and multi-population GA for structural testing of real-world embedded software systems | Proved multi-population GA to be better than simple GA
G. Fraser et al. [2013 & 2014] | Extended the GA based Evosuite tool to a memetic algorithm based approach to improve the performance of GA during test data generation | Proved that the memetic algorithm based GA approach gives better branch coverage than GA
J. P. Galeotti et al. [2014] | Integrated the DSE (dynamic symbolic execution) approach with the GA based Evosuite tool for test data generation of programs | Gives better code coverage than GA
C. Mao et al. [2013] | Used a variation of GA called Quantum inspired GA (QIGA) for test data generation to improve program coverage | As the search space is enlarged, QIGA avoids locally optimal solutions better than GA
D. Liu et al. [2013] | Used a modified GA | Modified approach gives higher test data efficiency and avoids premature convergence compared to GA

They took the parameter values based on some assumptions made from previous work. Finally, they concluded that the simulated annealing based approach is better than the genetic algorithm and particle swarm optimization based approaches. J. Malburg and G. Fraser have used a hybrid method which combines GA based and constraint based test data generation approaches [33]. They have proposed this approach in order to eliminate the disadvantages caused by both these methods. In their work, a constraint solver is used to ensure that the offspring produced by mutation do not get stuck in local optima and that they follow a different control flow. Finally, they show that GA-DSE (genetic algorithm based dynamic symbolic execution) has higher branch coverage than random search, GA or DSE. S. Khor et al. have used the concept of GA and formal concept analysis to generate test data for branches [28]. They developed a tool called Genet. The authors have not used any branch functions or any program graphs for generating test data. They mentioned that structural complexity is an unreliable measure to describe programs and they suggested that search space feasibility is a better indicator. They used concept analysis instead of a fitness function in their approach. The concepts were ranked and the concepts with the highest rank are the winners or cover more branches. S. Ali et al. have applied a search based method for model based test data generation [3]. They have developed a method for test data generation which satisfies the OCL constraints generated by UML modelers. In their work, the authors have claimed that there is no guarantee that an optimal solution will be found in reasonable time. The fitness function is expressed as the branch distance of OCL expressions and if the input data satisfies a constraint, the input data is rewarded. The parameter settings were done based on experience from previous experiments. They conducted experiments using GA, (1+1) EA and random search. The experimental results showed (1+1) EA to be better than GA and GA to outperform random search. J. Wegner et al. used simple GA as well as multi-population GA to generate test data for structural testing of real-world embedded software systems. They compared the results and finally multipopulation GA proved to be better than simple GA and both of these proved to be better than the random method [52]. G. Fraser et al. [59, 60] extended the GA based Evosuite tool to a memetic algorithm based approach to improve the performance of GA during test data generation. In their method, local search is applied at regular intervals to improve the performance. The new method gave higher branch coverage than the ordinary GA based method. J. P. Galeotti et al. [58] integrated the DSE (dynamic symbolic execution) approach with the GA based Evosuite tool for test data generation of programs and their results showed that the integrated approach gave better code coverage than GA. C. Mao et al. [63] used a variation of GA called Quantum inspired GA (QIGA) for test data generation to improve program coverage. In their work, the search space is enlarged to avoid local optima. The results showed that unlike GA based approaches, QIGA has less chance of getting stuck in local optima. D. Liu et al. [64] used a modified GA based approach in their work. The modified method avoided premature convergence and converged faster than the ordinary GA based method. The modified GA also gave higher test data efficiency compared to the GA based method. Table 2 gives a summary of the works described in this paragraph. From table 2 we can see that the variants of GA are shown to be better than simple GA.

2) Population Setting during Software Testing

In GA based software testing, population representation, size of population and generations are important factors [21, 49]. The population can be represented as a group of 0's and 1's, as a group of integers, as decimal numbers or as characters. In some problems a tree representation is also possible. Based on the problem, an appropriate method of representation is applied. Improper representation of the individuals in genetic algorithms may cause unexpected variations in the final result. M. Fisher et al. have used a sequence of transitions between components for population representation using black box testing [12]. J. Louzada et al. used a binary representation during mutation testing, whereas Xue-ying et al. used a subset of test cases for population representation during test suite reduction and J. Xiao et al. used a binary representation of the bug id for representation of the candidate solution during bug fixation [32, 54, 55]. In some other works which use genetic algorithm for structural testing, an array of bits, base 10 representation, binary representation, string of characters, integers and binary string were used to represent the population, whereas in model based testing the population is represented as a set of transitions [8, 9, 31]. A. Rauf et al. used transitions in the state diagram to represent the population [45]. Doungsa-ard et al. represented the population as the series of transitions in the UML diagram [11]. The population size can also be a
confounding factor because if the population size is too small the genetic algorithm will not search all the possible solution areas to procure an optimal solution [10, 11]. In this case, the individuals may reproduce abundantly and the resulting diversity in the population may cause the individuals to converge to a point which appears to be better than the neighbouring points. In such a situation, even though there is a chance that a better solution exists, it is missed as the population size is already declared to be very small. This is known as the premature convergence problem [39]. We can see that the size of the population and the number of generations were also set to different values in the selected works. Generations decide the number of runs or the number of times the whole genetic cycle may be repeated to obtain the required solution. As it is not possible to repeat the whole process indefinitely, a stopping criterion should be defined. In other words, the stopping criterion decides the cause of algorithm termination. Genetic algorithm terminates when the required result is obtained, after a specific number of generations, or when there is no improvement in the best fitness value within a specified time interval. Usually in GA, as the number of generations increases the end result also improves, which in turn causes an increase in the execution time of the whole process. Therefore population representation, size and generations are critical factors in GA based software testing.
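To illustrate how the representation choice shapes what the search manipulates, the sketch below initializes the same kind of numeric test input (for example, an integer argument of a function under test) under two of the encodings mentioned above, binary and integer. The bounds, population size and decoding scheme are hypothetical choices made only for this example and are not taken from any of the reviewed works.

    import random

    POP_SIZE = 40            # assumed size; table 3 shows values from 10 to 300 in practice
    CHROM_LEN = 8            # bits per input when using a binary encoding
    LOWER, UPPER = 0, 255    # assumed input domain of the program under test

    def init_binary_population():
        # Each individual is a bit string; it must be decoded before the
        # program under test can be executed with it.
        return [[random.randint(0, 1) for _ in range(CHROM_LEN)]
                for _ in range(POP_SIZE)]

    def decode(bits):
        # Interpret the bit string as an unsigned integer test input.
        return int("".join(map(str, bits)), 2)

    def init_integer_population():
        # Each individual directly stores the integer test input.
        return [random.randint(LOWER, UPPER) for _ in range(POP_SIZE)]

    binary_pop = init_binary_population()
    integer_pop = init_integer_population()
    print(decode(binary_pop[0]), integer_pop[0])

The crossover and mutation operators must match the chosen encoding (bit-flips make no sense on raw integers, for instance), which is one reason the works listed in table 3 report such different settings.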

TABLE 3. POPULATION REPRESENTATION, SIZE AND GENERATIONS

Sl. no | Work | Type of Testing | Purpose | Population Representation | Population Size & Generations
1 | M. Fisher et al. [2012] | Black box | Introduced a new type of genetic algorithm called micro algorithm for test data generation | Sequence of transitions between components | Not specified
2 | J. Louzada et al. [2012] | Mutation testing | Test data generation using elitist GA | Binary | Generations set as 300, 500 and 1000
3 | Xue-ying et al. [2005] | Structural testing | Reduce the cost associated with test suite reduction using GA | Subset of test cases | Not specified
4 | J. Xiao et al. [2010] | Structural testing | Bug fixation | Bug id represented as binary | Population size 100 & 500 generations
5 | G. I. Latiu [2012] | Structural testing | Software path testing using GA | Array of bits | Size set as 40 and generations as 100
6 | M. A. Ahmed et al. [2008] | Structural testing | Test data generation for multipath testing using GA | Base 10 representation of alleles | Size set as 30 and generations as 100
7 | A. Pachure et al. [2013] | Structural testing | Automated approach for branch testing using GA | Binary representation | Size set as 6, 10, 16, 20, 26 ... 110 and generations as 107
8 | M. Roper et al. [1995] | Structural testing | Used GA for testing C programs for attaining the branch coverage criteria | Character string | Not specified
9 | B. Jones et al. [1996] | Structural testing | Branch coverage using GA | Binary representation | Size set as 45, generations not specified
10 | R. P. Pargas et al. [1999] | Structural testing | Path coverage & branch coverage using the TGen tool | String of characters | Size set as 100 and generations run until the specific branch coverage is obtained
11 | C. C. Michael et al. [2001] | Structural testing | Developed a tool called GADGET which uses genetic algorithms for generating test data for branch coverage of C programs | Binary string | Population size set as 24 and 100
12 | A. A. Sofokleous et al. [2008] | Structural testing | Test data generation framework using genetic algorithms to provide edge/partition coverage | Integers | Size set as 100 and generations set as 600
13 | C. Chen et al. | Structural testing | Test data generation for branch coverage | Not specified | Size set as 100
14 | D. J. Bernat | Structural testing | Studied the effect of response time in GA based testing | Not specified | Not specified
15 | S. Ali et al. [2011] | Model based testing | Studied the effect of response time in model based test data generation and a method for generating test data using GA from OCL constraints generated by UML modellers | Set of transitions which satisfy OCL constraints | Size set as 100
16 | N. Sharma et al. [2012] | Black box | Used GA for generating test data for character set input | Character set | Population size = fitness value required * solution size
17 | A. Rauf et al. [2010] | GUI testing (represented as state machine) | Automated testing of GUI using genetic algorithm | Transitions in the state diagram | Generations set as 300 and 500
18 | S. Khor et al. [2004] | Structural testing | Used the concept of GA and formal concept analysis to generate test data for branches | Real value representation | Size set as 250, generations set as 50
19 | Y. Cao et al. [2009] | Structural testing | Test data generation for a specific path using GA | Binary representation | Size = 80, generations = 800
20 | J. Malburg [2011] | Structural testing | Introduced a hybrid method which combines GA based and constraint based test data generation approaches | Bit vector | Not specified
21 | W. Zhang et al. [2010] | Structural testing | Test data generation for many paths using GA | Binary | Size set as 90 for one experiment and 130 for another experiment
22 | G. Fraser et al. [2012] | Structural testing | Studied the effect of seeding in test data generation for object oriented programs | Method calls | Size set as 80 and generations bounded by a 10 minute time out (or 1000000 statements)
23 | A. Arcuri et al. [2011] | Structural testing | Studied the effect of parameter settings for object oriented programs using the Evosuite tool | Method calls | Size set as 4, 10, 50, 100 and 200; generations bounded by a 10 minute time out (or 1000000 statements)
24 | P. M. S. Bueno et al. [2000] | Structural testing | Path coverage testing using GA | Binary string | Size set as 80
25 | J. Wegner et al. [2002] | Structural testing | Test data generator for structural testing of real-world embedded software systems using GA | Integer representation | Generations set as 200
26 | J. Miller et al. [2006] | Structural testing | Test data generation for branch coverage using GA | Integer representation | Population size set as 100 and generations set as 300
27 | P. McMinn [2013] | Structural testing | Studied the impact of crossover | Input vector | Size set as 300
28 | C. Doungsa-ard et al. [2007] | Model based testing | Test data generation from UML diagrams | Sequence of triggers | Size set as 10
29 | G. Fraser et al. [2013] | Structural testing | Test suite generation for object oriented programs which covers multiple goals simultaneously | Method calls | Size set as 80 and generations bounded by a 10 minute time out (or 1000000 statements)
30 | J. Li et al. [2009] | Model based testing | Test data generation for class behavioral testing using GA | Sequence of input alphabet | Size set as 40 and generations set as 50
31 | D. Gong et al. [2011] | Structural testing | Test data generation for many paths using GA | Integer | Generations set as 70 for one experiment and as 130 for another experiment
32 | P. Pocatilu et al. [2013] | Structural testing | Test data generation for embedded systems based on GA using control flow graph construction | Sbyte (integer) | Size set as 100 and generations as 100
33 | C. Mao et al. [2013] | Structural testing | Used a variation of GA called Quantum inspired GA for test data generation to improve program coverage | Q-bit (quantum chromosome) | Size set as 30 and generations as 100
34 | D. Liu et al. [2013] | Structural testing | Test data generation using modified GA to avoid premature convergence | Real numbers | Size set as 15 and generations as 30
35 | Y. Suresh et al. [2013] | Structural testing | Test data generation for basis path testing using GA | Binary | Size set as 100 and generations as 500
36 | A. Arcuri et al. [2013] | Structural testing | Studied the effect of parameter settings for object oriented programs using the Evosuite tool and showed that parameter tuning may or may not give a good result; if search budget and time are a constraint, default parameter values may be used rather than parameter tuning | Method calls | Size set as 4, 10, 50, 100 and 200; generations bounded by a 10 minute time out (or 1000000 statements)
37 | G. Fraser et al. [2013 & 2014] | Structural testing | Extended the GA based Evosuite tool to a memetic algorithm based approach to improve the performance of GA during test data generation | Sequence of statements of length l | Size set as 5, 25, 50, 75, 100 and generations bounded by a 10 minute time out

In table 3, we have listed the type of population, size and generations used in some of the GA based software testing works. We have categorized the works according to the testing strategy and purpose and have listed the type of population, size and generations used in them. Different population settings are used in different works. Hence, we can infer that declaring the correct population size still remains a problem in genetic algorithm.

3) Parameter Setting during Software Testing

In genetic algorithm based program testing, the parameter setting needs special attention. For example, in the case of crossover and mutation, their rates should not be set at either very high or very low levels. In selection, individuals are selected from the parent population for crossover and mutation to produce the next generation individuals [25, 44, 51]. There are different types of selection like roulette wheel, tournament selection, random selection, best selection etc. In roulette wheel selection individuals are selected according to their fitness. Each individual will be assigned a fitness value and the normalized fitness value is calculated. After calculating the normalized fitness value, the accumulated fitness value is calculated by adding the fitness value of the concerned individual and the sum of the fitness values of all previous individuals. A random number is selected between 0 and 1 and the selected individual will have an accumulated fitness value greater than that of all previous individuals but less than that of the remaining individuals. Tournament selection is a refinement of roulette wheel selection. Here, roulette wheel selection is repeatedly applied to produce a group of individuals and the best individual is selected from this group. In the random selection method, the chromosome is selected randomly from the given population, whereas in the best selection method the individual with the highest fitness value is selected. There are many other types of selection methods, but we have mentioned only a few.
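As one concrete reading of the fitness-proportionate scheme just described, the sketch below implements roulette wheel selection with normalized and accumulated fitness values, plus a tournament variant built on top of it. This is a generic illustration under our own assumptions (a maximization problem with non-negative fitness values), not code from any of the surveyed tools.

    import random

    def roulette_wheel_select(population, fitness):
        # Normalize the fitness values so they sum to 1.
        values = [fitness(ind) for ind in population]
        total = sum(values)
        normalized = [v / total for v in values]

        # Accumulate: entry i is the sum of the normalized fitness of
        # individuals 0..i, so the last entry is 1.0.
        accumulated, running = [], 0.0
        for v in normalized:
            running += v
            accumulated.append(running)

        # Draw a random number in [0, 1] and return the first individual
        # whose accumulated fitness exceeds it.
        r = random.random()
        for ind, acc in zip(population, accumulated):
            if r <= acc:
                return ind
        return population[-1]   # guard against floating point round-off

    def tournament_select(population, fitness, rounds=3):
        # Repeatedly apply roulette wheel selection and keep the fittest
        # of the sampled group, as described in the text above.
        group = [roulette_wheel_select(population, fitness) for _ in range(rounds)]
        return max(group, key=fitness)

A caller would supply the problem-specific fitness function, for example one of the coverage-oriented measures listed later in table 5.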
TABLE 4. PARAMETER SETTING

Sl. no | Work | Type of Testing | Purpose | Selection | Crossover Operator & Rate | Type of Mutation & Rate
1 | M. Fisher et al. [2012] | Black box | Component coverage | Only the fittest individuals selected | Average | Different mutation rates evaluated
2 | J. Louzada et al. [2012] | Mutation testing | Coverage of mutants | Tournament (random selection of individuals) | One point | Inversion of bits
3 | Xue-ying et al. [2005] | Structural testing | Test suite coverage | Roulette wheel selection | One point | 1/L, where L is the number of bits in the gene
4 | J. Xiao et al. [2010] | Structural testing | Test data for covering bugs | Ratio method | 0.8 | 0.01
5 | G. I. Latiu [2012] | Structural testing | Path coverage | Best fitness | One point, rate 0.75 | 0.1
6 | M. A. Ahmed et al. [2008] | Structural testing | Multipath coverage | Roulette wheel | Single point, rate set as 0.5 or 0.9 | 0.1 or 0.3
7 | A. Pachure et al. [2013] | Structural testing | Branch coverage | Binary tournament | Two point, rate set as 1.0 | 0.01
8 | M. Roper et al. [1995] | Structural testing | Branch coverage | Random selection; population size set as 45, number of generations not specified | Single, double and uniform crossover point | Simple mutation
9 | B. Jones et al. [1996] | Structural testing | Branch coverage | Random selection | One point | Simple mutation; rate decided by the user
10 | R. P. Pargas et al. [1999] | Structural testing | Path coverage & branch coverage | Random selection | One point | Simple mutation, rate 0.10
11 | C. C. Michael et al. [2001] | Structural testing | Branch coverage | Random selection | One point | Simple mutation, rate set as 0.001
12 | A. A. Sofokleous et al. [2008] | Structural testing | Edge/partition coverage | Tournament | Inter and intra crossover | Inter and intra mutation
13 | C. Chen et al. | Structural testing | Branch coverage | Random | 0.4 | 0.1
14 | D. J. Bernat | Structural testing | Not specified | Not specified | Not specified | Not specified
15 | S. Ali et al. [2011] | Model based testing | Coverage of OCL constraints generated by UML modellers | Rank selection | One point | Mutation probability = 1/n, where n is the number of variables
16 | N. Sharma et al. [2012] | Black box | Cover all possible combinations of the character input sequence | Individuals selected based on the fitness value | Diagonal crossover; crossover rate expressed as the number of individuals from one crossover set | Replacement of an existing character with randomly generated characters, addition of random characters at a random position, or deletion of a character at a random position
17 | A. Rauf et al. [2010] | GUI testing (represented as state machine) | Coverage of transitions in the state machine | Selection based on fitness | One point and two point | One bit
18 | S. Khor et al. [2004] | Structural testing | Branch coverage | Selection based on the ranking of individuals according to the concepts | Uniform crossover, rate 0.8 | Mutation rate 0.4
19 | Y. Cao et al. [2009] | Structural testing | Path coverage | Roulette wheel | 0.80 | 0.15
20 | J. Malburg [2011] | Structural testing | Branch coverage | Random | Several settings | Several settings
21 | W. Zhang et al. [2010] | Structural testing | Path coverage | Roulette wheel | One point, rate 0.9 | One point mutation, rate 0.3
22 | G. Fraser et al. [2012] | Structural testing | Branch coverage | Rank selection based on fitness | Special type of one point crossover where the first part of the sequence of statements of the first parent is merged with the second part of the second parent and vice versa | Mutation probability for a test suite = 1/T, where T is the test suite; mutation probability for a test case = 1/3, where the operations applied are remove, change & insert
23 | A. Arcuri et al. [2011] | Structural testing | Branch coverage | Roulette wheel, tournament & rank selection | Special type of one point crossover as in row 22 | Mutation probability for a test suite = 1/T and for a test case = 1/3 (remove, change & insert)
24 | P. M. S. Bueno et al. [2000] | Structural testing | Path coverage | Individuals selected based on previous knowledge | Single point | Simple, rate set as 0.03
25 | J. Wegner et al. [2002] | Structural testing | Statement & branch coverage | Individuals selected based on fitness | One point | Discrete recombination (different mutation range for each subpopulation)
26 | J. Miller et al. [2006] | Structural testing | Branch coverage | Tournament | One point | Uniform random mutation, non-uniform random mutation or Muhlenbein's mutation, chosen randomly
27 | P. McMinn [2013] | Structural testing | Branch coverage | Best fitness | One point, uniform & discrete recombination | Breeder genetic algorithm mutation operator applied at an inverse of the chromosome length
28 | C. Doungsa-ard et al. [2007] | Model based testing | Coverage of the sequence of transitions in UML diagrams | Best fitness | Two point, rate set as 0.5 | Random mutation, rate set as 0.5
29 | G. Fraser et al. [2013] | Structural testing | Branch coverage | Rank selection based on fitness | Special type of one point crossover as in row 22 | Mutation probability for a test suite = 1/T and for a test case = 1/3 (remove, change & insert)
30 | J. Li et al. [2009] | Model based testing | Transition coverage of a finite state machine | Best fitness | One point, rate 0.8 | Rate set as 0.05
31 | D. Gong et al. [2011] | Structural testing | Multipath coverage | Roulette wheel | One point, rate 0.9 | One point, rate 0.3
32 | P. Pocatilu et al. [2013] | Structural testing | Path coverage | Roulette wheel | 0.1 | 0.8
33 | C. Mao et al. [2013] | Structural testing | Branch coverage | Gambling roulette selection | One point, rate 0.90 | Quantum rotation gate technique, rate 0.05
34 | D. Liu et al. [2013] | Structural testing | Branch & path coverage | Not specified | 0.8 | 0.4
35 | Y. Suresh et al. [2013] | Structural testing | Path coverage | Fitness based selection | Two point, rate 0.5 | Bit wise, rate 0.05
36 | A. Arcuri et al. [2013] | Structural testing | Branch coverage | Roulette wheel, tournament & rank selection | Special type of one point crossover as in row 22 | Mutation probability for a test suite = 1/T and for a test case = 1/3 (remove, change & insert)
37 | G. Fraser et al. [2013 & 2014] | Structural testing | Branch coverage | Rank selection based on fitness | Special type of one point crossover where the first part of the sequence of statements of the first parent is merged with the second part of the second parent and vice versa; if P1 & P2 are parents, then O1 combines the first α fraction of P1 with the last (1−α) fraction of P2 and O2 combines the first α fraction of P2 with the last (1−α) fraction of P1, where O1 & O2 are offspring and α is a random value chosen from [0,1] | Mutation probability for a test suite = 1/T and for a test case = 1/3 (remove, change & insert)
The processes of crossover and mutation have a serious impact on the process of test data generation using genetic algorithm. Some of the most commonly used types of crossover are one point crossover, two point crossover and uniform crossover. By using uniform crossover the diversity in the individuals produced is greater compared to single and two point crossover, and a better result is obtained. Mutation is the process of altering the value of genes present in the chromosome for creating genetic diversity [19]. Diversity in the population will create better individuals compared to a population without genetic diversity. According to the problem to be solved, mutation rates can be set to specific values. If the rate of mutation is set to a high value, the search will become similar to a random search, and if the mutation rate is very low then there will be no diversity in the population. From the list of works given in table 4, we can see that different types of crossover, mutation and selection are used in the referred works. Initially, we have categorized the works according to the type of testing and coverage goal. After that, the parameters used in these works are analyzed and listed. Different parameter settings are used in different works. Hence, from the above mentioned facts we can conclude that setting the correct type of parameters is still an unresolved issue in GA based testing.
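For readers unfamiliar with these operators, the sketch below shows one common way of implementing one point crossover, two point crossover, uniform crossover and bit-flip mutation on bit-string chromosomes. The encoding and the rates are illustrative assumptions on our part; as table 4 shows, the surveyed works each chose their own operators and rates.

    import random

    def one_point_crossover(p1, p2):
        # Swap the tails of the two parents after a single random cut point.
        cut = random.randint(1, len(p1) - 1)
        return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

    def two_point_crossover(p1, p2):
        # Swap the middle segment between two random cut points.
        a, b = sorted(random.sample(range(1, len(p1)), 2))
        return p1[:a] + p2[a:b] + p1[b:], p2[:a] + p1[a:b] + p2[b:]

    def uniform_crossover(p1, p2):
        # Choose each gene independently from either parent, which tends to
        # produce more diverse offspring than the point-based operators.
        c1, c2 = [], []
        for g1, g2 in zip(p1, p2):
            if random.random() < 0.5:
                c1.append(g1); c2.append(g2)
            else:
                c1.append(g2); c2.append(g1)
        return c1, c2

    def bit_flip_mutation(chrom, rate=0.05):
        # Flip each bit with a small probability; too high a rate degrades the
        # search into random search, too low a rate removes diversity.
        return [1 - g if random.random() < rate else g for g in chrom]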
4) Fitness Function Design during Software Testing

Applying genetic algorithm in program testing requires optimizing the specified fitness function [43]. A fitness function should be designed in such a way that it gives an optimal solution for the given problem [34]. Defining the fitness function imprecisely may lead to a wrong solution or may cause the problem to be stuck in local optima [19, 39, 59, 60, 63, 64]. The misleading nature of the fitness function creates several problems. For example, individuals with lower fitness values may be finalized as the optimal solution even when better individuals exist. This mainly occurs when the population size is small, because with a small sized population the result may converge at a faster rate than normal. Thus, in a limited population, if one of the individuals surpasses the neighboring individuals, then that point or individual will be considered as the best solution even when better solutions exist. Considering these local points as the candidate solutions and assigning higher fitness values to them will result in a diversion from the original solution. This results from the inherent weakness of genetic algorithms [39]. A group of researchers used an evolutionary algorithm along with a reprogrammable hardware array and the fitness function was designed to output an oscillating signal [20]. At the final stage of the experiment, the researchers found that the circuit had become a radio receiver which was able to pick up and relay an oscillating signal from a nearby electronic device. Here, there was a deviation from the main goal itself and this was due to the fault in the design of the fitness function [20]. Each one of the many works which use genetic algorithm for software testing has designed its own fitness function [3]. For example, we can see that in path testing the fitness function can be a measure of branch distance and approximation level, or the sum of normalized intermediate fitness functions of multiple paths for multipath testing, or a measure of the weighted hamming distance of the predicate value and the reciprocal of the branch predicate value [4, 42]. Similarly, in model based testing the fitness function can be a measure of the test paths covered by the chromosome, or a measure of proximity to the constraints satisfied, or a measure of the dependencies between components. Here, we have mentioned only a few possibilities.
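As a concrete illustration of the branch distance plus approach level style of fitness that several of the surveyed works use (see table 5), the sketch below computes a Korel-style distance for a simple relational branch condition and combines it with an approach level. The target predicate, the normalization and the example values are hypothetical choices made only for this example; each reviewed work defines its own variant.

    def branch_distance(a, b):
        # Korel-style distance for the target predicate "a == b":
        # 0 when the branch is taken, otherwise grows with how far the
        # inputs are from satisfying it.
        return 0.0 if a == b else abs(a - b) + 1.0

    def normalize(d):
        # Map the unbounded distance into [0, 1) so it can be added to
        # the (integer) approach level without dominating it.
        return d / (d + 1.0)

    def fitness(approach_level, a, b):
        # Lower is better: number of unsatisfied control dependencies on
        # the path to the target plus the normalized distance at the
        # point where execution diverged.
        return approach_level + normalize(branch_distance(a, b))

    # Example: execution diverged two branching nodes away from the target,
    # with the critical condition comparing a = 7 against b = 10.
    print(fitness(approach_level=2, a=7, b=10))   # 2.8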
TABLE 5. FITNESS FUNCTION CALCULATION
(Each entry lists the work, the type of testing and the coverage criterion in parentheses, followed by the fitness function used.)

1. M. Fisher et al. [2012] (black box; component coverage): measure of dependencies between components.
2. J. Louzada et al. [2012] (mutation testing; coverage of mutants): FF = w1*(MutA/MutKill) + w2*(MutL/MutKill) + w3*(MutR/MutKill), where w1, w2 and w3 are the weights assigned to each type of mutation used to perform mutation testing, and MutA, MutL and MutR represent the arithmetic, logical and relational mutation operators.
3. Xue-ying et al. [2005] (structural testing; test suite coverage): measure of coverage and cost.
4. J. Xiao et al. [2010] (structural testing; test data for covering bugs): Value(B) = alpha*priority + beta*severity*HasFinished(B), where B is the bug, alpha and beta are the weights for priority and severity, and HasFinished(B) takes the value 1 if the bug is fixed before the deadline and 0 if it cannot be fixed.
5. G. I. Latiu [2012] (structural testing; path coverage): function of branch distance and approximation level (branch distance designed based on Korel's branch distance function).
6. M. A. Ahmed et al. [2008] (structural testing; multipath coverage): FF = sum of normalized intermediate fitness functions of multiple paths.
7. A. Pachure et al. [2013] (structural testing; branch coverage): FF = approximation level + normalized branch distance.
8. M. Roper et al. [1995] (structural testing; branch coverage): two fitness functions, the weighted Hamming distance of the predicate value and the reciprocal of the branch predicate value.
9. B. Jones et al. [1996] (structural testing; branch coverage): measure of coverage.
10. R. P. Pargas et al. [1999] (structural testing; path and branch coverage): measure of the number of common branch predicates in the control flow graph of the program.
11. C. C. Michael et al. [2001] (structural testing; branch coverage): expressed as a predicate function based on Korel's fitness function.
12. A. A. Sofokleous et al. [2008] (structural testing; edge/partition coverage): for BO-GA, F = (w1*#edges_exec + w2*(#pred_true + #pred_false)) / (w1 + w2), where w1 and w2 are weights in the range [0, 1], #edges_exec is the number of executed edges, and #pred_true and #pred_false are the numbers of simple predicates evaluated at least once to true or to false; for CU-GA, F_close-up = F_vert + 1/F_dist(c), where F_vert is the sum of exercised vertices and F_dist(c) focuses on the condition that prohibits the test case from visiting the next vertex.
13. C. Chen et al. (structural testing; branch coverage): set by analyzing the pre-dominator tree used to construct the EPDG.
14. D. J. Berndt (structural testing; coverage not mentioned): not mentioned.
15. S. Ali et al. [2011] (model based testing; coverage of OCL constraints generated by UML modellers): branch distance of OCL expressions.
16. N. Sharma et al. [2012] (black box; coverage of all possible combinations of character input sequences): measured in terms of the proximity to the constraints satisfied.
17. A. Rauf et al. [2010] (GUI testing, represented as a state machine; coverage of transitions in the state machine): accuracy = test paths covered by the chromosomes / total number of chromosomes.
18. S. Khor et al. [2004] (structural testing; branch coverage): uses concept analysis; a concept pair = (o, a), where o is the object set and a is the attribute set.
19. Y. Cao et al. [2009] (structural testing; path coverage): measure of similarity between the target path and the execution path with sub-path overlapping.
20. J. Malburg [2011] (structural testing; branch coverage): integrates approach level (number of unsatisfied control dependencies) and branch distance.
21. W. Zhang et al. [2010] (structural testing; path coverage): measure of approach level and branch distance level.
22. G. Fraser et al. [2012] (structural testing; branch coverage): fitness(T) = |M| - |MT| + sum_k d(bk, T), where T is the test suite, M is the set of methods in the program, MT is the number of executed methods and d(bk, T) is the branch distance for branch bk on test suite T.
23. A. Arcuri et al. [2011] (structural testing; branch coverage): the same whole-test-suite fitness as above, fitness(T) = |M| - |MT| + sum_k d(bk, T).
24. P. M. S. Bueno et al. [2000] (structural testing; path coverage): FT = NC - EP/MEP.
25. J. Wegner et al. [2002] (structural testing; statement and branch coverage): measure of approximation level and normalized predicate level distance.
26. J. Miller et al. [2006] (structural testing; branch coverage): rules to build fitness functions for basic relational operations; for example, with p a preset penalty value: if (a), F = 0 for true and F = p for false; if (a = b), F = 0 for a = b and F = abs(a - b) + p for a != b; if (a <> b), F = 0 for a != b and F = p for a = b; if (a < b), F = 0 for a < b and F = abs(a - b) + p for a >= b; if (a <= b), F = 0 for a <= b and F = abs(a - b) + p for a > b.
27. P. McMinn [2013] (structural testing; branch coverage).
28. C. Doungsa-ard et al. [2007] (model based testing; coverage of sequences of transitions in UML diagrams): number of transitions fired by the input sequence.
29. G. Fraser et al. [2012] (structural testing; branch coverage): the whole-test-suite fitness fitness(T) = |M| - |MT| + sum_k d(bk, T), as in entry 22.
30. J. Li et al. [2009] (model based testing): fitness function designed to guide the construction of the state splitting tree.
31. D. Gong et al. [2011] (structural testing; multipath coverage): represented as an n-dimensional vector, where each dimension is related to a target path and the fitness of a target path is FF = approximation level + normalized branch distance.
32. P. Pocatilu et al. [2013] (structural testing; path coverage): fitness measured as inverse similarity coverage (ISC), F = len(a)/sym(a), where len(a) gives the length of the path given by chromosome a and sym(a) is the similarity function.
33. C. Mao et al. [2013] (structural testing; branch coverage): measure of the branch distance function for the i-th branch in the program (bch_i) and the corresponding branch weight (w_i).
34. D. Liu et al. [2013] (structural testing; branch and path coverage): ratio of the branches (f) or paths practically covered by the test case to the total branches (z) or paths, F = f/z.
35. Y. Suresh et al. [2013] (structural testing; path coverage): fitness function is a measure of branch distance based on Korel's branch distance function.
36. A. Arcuri et al. [2013] (structural testing; branch coverage): sum of the normalized branch distances of all branches in the program under test.
37. G. Fraser et al. [2013 & 2014] (structural testing; branch coverage): sum of normalized minimal branch distances.
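Entries 22, 23 and 29 use a whole-test-suite fitness of the shape fitness(T) = |M| - |MT| + sum_k d(bk, T). A minimal sketch of that computation is given below; the bookkeeping of methods and branch distances is an illustrative assumption, not the implementation used in those works.

```python
def whole_test_suite_fitness(total_methods, executed_methods, branch_distances):
    """Whole-test-suite fitness of the shape
        fitness(T) = |M| - |MT| + sum_k d(bk, T)
    where |M| is the number of methods in the class under test, |MT| is the
    number of methods executed by the suite T, and d(bk, T) is the minimal
    normalized branch distance observed for branch bk over all executions
    of T. Lower values are better."""
    missed_methods = total_methods - executed_methods
    return missed_methods + sum(branch_distances)


# Toy example: 5 methods, 4 executed; three branches with their best
# (minimal) normalized distances over the whole suite.
print(whole_test_suite_fitness(5, 4, [0.0, 0.33, 1.0]))  # about 2.33
```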
Table 5 given above explains the fitness function design strategies used in various works which use GA for software testing. We have identified the works according to the type of testing and coverage criteria. After that, the fitness function design strategy is listed for each work. From the above facts we can conclude that designing the fitness function deserves prime importance in GA based testing.

IV. REVIEW OBSERVATIONS

In the previous section, we saw the different population representation, parameter setting and fitness function design issues in GA based software testing. We have made a structured ordering of some of the most relevant observations in the referred works for answering the research questions. Finally, we have arrived at some conclusions by referring to the works in GA based software testing which are listed in Section III.

A. SRQ1: In spite of the large volume of works in genetic algorithm based testing, why have some works considered variations of genetic algorithms?

Even when a group of researchers claims that simple GA is best for software testing, we can see that many works have used several variations of GA for software testing. In Section III.C.1, the list of works which use variations of GA in software testing is given, and the observations are given in Table 6. Most of these works show that variations of genetic algorithm perform better compared to simple GA. From the list of observations and data collected from various works, we can see that, so far, researchers have not been able to conclude which type or variation of GA is best for the various software testing strategies.

TABLE 6. OBSERVATIONS ON WORKS USING GA VARIATION FOR SOFTWARE TESTING

Number of works using a GA variation for test data generation: 12
Number of works in which the variation of GA is proved to be better than GA: 12
Number of works in which GA is proved to be better: nil
Comments: since the variation of GA is proved to be better than GA in all of these works, it is high time to assess which variant is best in software testing.

Figure 6. Research direction suggested from the observations in GA variation: for a given problem specification and testing strategy, a solution is produced using simple GA and another using a variation of GA; the results are compared on accuracy, time, performance and cost in order to identify the best method for a particular category of problem.
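As an illustration of the comparison suggested in Figure 6, the sketch below is entirely illustrative: the OneMax fitness, the bit-string encoding and all parameter values are assumptions, not taken from any reviewed work. It runs the same small GA with and without elitism (one common GA variation) so that the two results can be compared on the same problem.

```python
import random

def run_ga(fitness, n_bits=20, pop_size=30, generations=50,
           crossover_rate=0.8, mutation_rate=0.05, elitism=False, seed=0):
    """Tiny bit-string GA; with elitism=True the best individual of each
    generation is copied unchanged into the next one (a simple GA variant)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]

    def tournament():
        a, b = rng.sample(pop, 2)
        return max(a, b, key=fitness)

    for _ in range(generations):
        nxt = [max(pop, key=fitness)[:]] if elitism else []
        while len(nxt) < pop_size:
            p1, p2 = tournament()[:], tournament()[:]
            if rng.random() < crossover_rate:          # one-point crossover
                cut = rng.randrange(1, n_bits)
                p1, p2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            for child in (p1, p2):
                for i in range(n_bits):                # bit-flip mutation
                    if rng.random() < mutation_rate:
                        child[i] ^= 1
                nxt.append(child)
        pop = nxt[:pop_size]
    return max(pop, key=fitness)

# Compare the two configurations on the same (toy) fitness: OneMax.
onemax = sum
best_plain = run_ga(onemax)
best_elite = run_ga(onemax, elitism=True)
print(onemax(best_plain), onemax(best_elite))
```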

What we can infer from all this is that, depending on the system under test or the problem, the type of GA or the variation of GA used also differs. This is one of the areas on which future researchers should concentrate. They should clearly elucidate which type of GA or which variation of GA is universally applicable to a particular category of problems during software testing. For example, a researcher should be able to identify, or at least get a clear idea of, which type of GA gives the best result during model based testing, structural testing, stress testing, mutation testing and so on. Few works have shown interest in this issue to date. This may be due to the wide range of possibilities or variations of GA which may be applied during software testing. We believe that performing such an in-depth study will clear the air regarding the use of GA in software testing to a great extent. We suggest that researchers should try to focus research in the direction given below.

1) Future Research Directions

From the list of works which use GA variations for software testing, we recommend that researchers focus research in the direction represented in Figure 6. This suggests that, for a given problem specification, we should try to find whether simple GA or a variant of GA works better. The performance of the two methods may be compared, and finally the best method may be found and generalized.

2) Advantages of Suggested Research Focus
- Programmers will be confident of the type of GA which may be used for a particular category of problem.
- Unnecessary effort and time spent finding variants of GA in software testing may be reduced.

B. SRQ2. What is the effect of population representation and size in software testing?

We have seen the different types of population representation, the different population sizes and the numbers of generations used in software testing in Table 3 given in Section III.C.2. Some observations from Table 3 are given below in Tables 7, 8 and 9.

TABLE 7. POPULATION REPRESENTATION FOR STRUCTURAL TESTING

Number of works using each population representation for structural testing: binary: 8; base 10: 1; character: 1; array of bits: 2; integer: 1; real: 2; sbyte: 1; qbit: 1.

TABLE 8. POPULATION REPRESENTATION FOR BLACK BOX TESTING

Number of works using each population representation for black box testing: transitions between components: 1; character set: 1.

TABLE 9. POPULATION REPRESENTATION IN MUTATION, MODEL BASED AND GUI TESTING

Mutation testing: binary (1). GUI testing: transitions in the state diagram (1). Model based testing: transitions satisfying various constraints (1).

From Table 7, we can see that for structural testing alone several types of population representation are used. Similar is the case with black box, mutation, GUI and model based testing, which may be inferred from Tables 8 and 9. Most of these works mention that a particular type of population representation was chosen by referring to previous works, whereas some works mention that a particular representation was chosen randomly. Only a very few works have given a clear explanation for choosing a particular type of population representation for the system under test. For example, G. Fraser et al. have claimed that seeding strongly influences the efficiency of search based test data generation, and they have used seeding to improve the process of population initialization [16]. Similar is the case with population size and the number of generations. If researchers are able to get a picture of the different types of population representation and the approximate population size appropriate for each type of software testing, the issues related to the size and representation of the population may be solved.
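To illustrate why the choice of representation matters, the sketch below (illustrative only; the input domain and the decoding scheme are assumptions) shows the same two-integer test input encoded once as a binary chromosome and once directly as an integer vector, the two most frequent representations in Table 7.

```python
import random

rng = random.Random(1)
LOW, HIGH, BITS = 0, 1023, 10   # assumed input domain for two integer arguments

# Integer-vector representation: one gene per input argument.
def random_int_chromosome():
    return [rng.randint(LOW, HIGH), rng.randint(LOW, HIGH)]

# Binary representation: each argument encoded as a fixed-width bit string.
def random_bin_chromosome():
    return [rng.randint(0, 1) for _ in range(2 * BITS)]

def decode_binary(chrom):
    """Turn the bit string back into the two integer test inputs."""
    def to_int(bits):
        return sum(bit << i for i, bit in enumerate(reversed(bits)))
    return [to_int(chrom[:BITS]), to_int(chrom[BITS:])]

# Random initial populations under the two representations.
population_int = [random_int_chromosome() for _ in range(10)]
population_bin = [random_bin_chromosome() for _ in range(10)]
print(population_int[0])                 # e.g. [137, 582]
print(decode_binary(population_bin[0]))  # the same kind of test input
```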

Figure 7. Research direction suggested from the observations in population representation in GA based software testing: for a given testing strategy and problem specification/type, the possible population representations, sizes and numbers of generations (Type 1 to Type n) are evaluated in order to decide or find the best population representation, size and number of generations for a particular category of problem, yielding a guide to population issues in GA based testing.

For example, if a study is conducted to finalize the most suitable type of population representation for structural testing of a particular category of problem, or which type of population representation is most suitable during black box testing, or which type is best during model based testing, software testers would no longer face confusion over population representation. Instead of adopting a particular type of population representation, size and number of generations by referring to previous literature, researchers should try to find a general or most suitable type of population representation and the optimal values of size and number of generations for a particular category of problem under the various testing strategies. There lies immense research scope in this direction. Researchers have not explored this direction yet. Instead, they have randomly selected a particular type of population representation, size and number of generations and proceeded directly to the test data generation process.

1) Future Research Direction

From the list of observations on population representation, size and number of generations in GA based software testing, we recommend that researchers focus research in the direction represented in Figure 7. In the figure we can see that, for a given problem specification, we suggest finding the most suitable type of population representation, size and number of generations. Testing should be done using different population settings and the best population setting should be reported. This in turn may resolve the uncertainties prevailing in population settings during GA based testing.

2) Advantages of Suggested Research Focus
- Problems related to population representation, size and generations in GA based software testing may be resolved to a great extent if general guidelines are formed for population issues.
- Researchers will be confident in the method selected if the base values are set according to some general guidelines.

C. SRQ3. Is there any common method to design the fitness function during software testing?

The fitness function is one of the core aspects of GA based testing, and the result of GA based testing depends on it. The design of the fitness function differs according to the type of testing, the purpose or coverage, and the method used for designing the fitness function. For example, before using GA for structural testing, the tester should have an idea of how to design the fitness function and the factors to be considered while designing it. This is because many factors, such as program dependency and path selection, affect the process of fitness function design. After designing the fitness function, parameters are tuned to get the expected result. The response time of the system is in turn dependent on parameter tuning. Therefore, setting up general guidelines for fitness function design in GA based testing minimizes the issues related to parameter setting in GA. From Table 5 given in Section III.C.4, we can see that several approaches are taken for designing the fitness function according to the coverage criteria. One of the most important factors to be considered during fitness function design is program dependency. In most genetic algorithm based software testing works, program dependency is not correctly followed [23, 35]. Evidence of this fact may be drawn from Table 10 given below, which lists the future scope of the referred works. Table 11 gives some observations drawn from Table 10; there we can see that nearly 13 works have mentioned solving dependency related issues as future work. Solving dependency related issues in turn depends on path identification. Nearly 13 works have reported the path identification problem as their future research perspective. If there is no automated method to identify the potential paths during fitness function design, all the statements in the program must be analyzed to identify the relevant statements. In GA based software testing, only a very few research works have addressed the problem of potential path identification [24].
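The path identification problem mentioned above can be made concrete with a small sketch; it is purely illustrative, and the toy control flow graph and its adjacency-dictionary encoding are assumptions. Given such a graph, candidate target paths from entry to exit can be enumerated automatically, so that a path-oriented fitness function does not have to treat every statement as potentially relevant.

```python
def acyclic_paths(cfg, entry, exit_node):
    """Enumerate all acyclic entry-to-exit paths of a control flow graph
    given as an adjacency dictionary. Each path is a candidate target
    path for path-oriented fitness functions."""
    paths, stack = [], [(entry, [entry])]
    while stack:
        node, path = stack.pop()
        if node == exit_node:
            paths.append(path)
            continue
        for succ in cfg.get(node, []):
            if succ not in path:          # skip back edges -> acyclic paths only
                stack.append((succ, path + [succ]))
    return paths

# Toy CFG: 1 -> 2 -> (3 | 4) -> 5, with a loop edge 5 -> 2.
cfg = {1: [2], 2: [3, 4], 3: [5], 4: [5], 5: [2]}
for p in acyclic_paths(cfg, 1, 5):
    print(p)   # [1, 2, 3, 5] and [1, 2, 4, 5]
```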
TABLE 10. FUTURE ISSUES TO BE SOLVED IN THE REFERRED WORKS

Sl.no. Work Type of Testing Factors to be resolved in future


1 M. Fisher et. al. Black box -Time reduction factors
[2012] -Full coverage of dependencies
- Find the optimal value for parameters for black box testing
2 J. Louzada et.al. Mutation testing - Need to design fitness function which cover all type of mutants
[2012] - Need to test the result with other type of metaheuristic methods
- Find the optimal value for parameters for mutation testing
3 Xue-ying et. al. Structural testing - Effect of different parameter setting may be studied
[2005] -Optimal/Best parameter setting may be found by conducting more experiments
- Dependency between different test cases may be discussed and the effect of applying
metaheuristic on the same
4 J. Xiao et.al. Structural testing -Parameter setting may be studied in detail
[2010] - The side effects of fixing the bugs may be discussed; in other words, solving the
dependency issue between the bugs
5 G. I. Latiu [2012] Structural testing - To find the reason why the GA based method is less competitive than PSO and SA
- Study parameter settings
6 M. A. Ahmed et.al. Structural testing - How to identify target paths
[2008] - Study the dependency issue between multiple paths during testing
- Study parameter settings
7 A. Pachure et.al. Structural testing - How to identify target paths
[2013] - Study the dependency issue between multiple paths during testing
- Study parameter settings

8 M. Roper et al. Structural testing -Solving dependency issues


[1995] - Parameter setting variations

9 B. Jones et al. Structural testing - Scalability of the approach


[1996] -Solving dependency issues
- Parameter setting variations

10 R. P. Pargas et al. Structural testing - The authors claim that their approach is scalable. Experiments are needed to prove their claim
[1999] -Covering full dependency in the program using a program dependence graph instead of control
flow graph
- Parameter setting variations
11 C.C. Michael et al. Structural testing -Method to identify the predicates should be found out
[2001] -Methods to prioritize branches during testing
-Solving the dependency issues in the program
-Parameter setting variation
-As the tool is developed for C program,
12 A. A. Sofokleous Structural testing - The authors claim that their approach is scalable. Experiments are needed to prove their claim
et.al. [2008] -Covering full dependency in the program using a program dependence graph instead of control
flow graph
- Parameter setting variations
13 C. Chen et.al. Structural testing -Solving dependency issues and finding rules for setting parameters and measures needed to
improve fitness function
14 D. J. Berndt Structural testing - May study operator setting variations
15 S. Ali et.al. [2011] Model based testing -Collect evidence on best parameter setting
-Handling the relation between different transition in UML models
16 N. Sharma Black box -How to extract constraints from the flow graph using more efficient methods other that the
et.al.[2012] currently used depth first search strategy
17 A. Rauf et. al GUI testing(represented as -Full Automation of the tool
[2010] State machine) -Define the number of generations or minimum coverage needed for testing the application so as
to minimize cost and time

18 S. Khor et. al. Structural testing Even though, the authors claim that their approach does not use any flow graph, they have also
[2004] mentioned that it is difficult to cover nested predicates using their approach. This issue ought
to be solved in future.
19 Y. Cao et al. Structural testing -Automatic selection of path from control flow graph
[2009]. -Study the influence of different type of parameters
20 J. Malburg [2011] Structural testing -Methods to improve the coverage using different options of fitness function
21 W. Zhang et.al. Structural testing -Full automation
[2010] -Finding Suitable value of Subpopulation size
22 G. Fraser et.al. Structural Testing -Study different settings for parameters during test suite generation
- Study the effect of seeding for other search based techniques
23 A. Arcuri et. al Structural Testing -May extend the work to different languages
[2011]
24 P.M. S. Bueno et. Structural Testing - Study parameter settings
al. [2000] - Better methods to solve random variations

25 J. Wegner et. al. Structural Testing -Assumes that the target is already given
[2002] -Multipath coverage not discussed
-Study the dependency issue in the program when using multiple subpopulation
during testing
- Study parameter settings
26 J. Miller et. al. Structural Testing -Handle complex data structures like arrays, improve scalability and improve path coverage
[2006] -Study parameter settings
27 P. McMinn [2013] Structural Testing -May conduct study on various type of systems to study the impact of crossover, so that the
method may be generalized
28 C. Doungsa-ard et Model Based testing -Modify the method to cover all transitions
al. [2007] -Modify the method to cover loops
-Study parameter settings
-Handle dependencies in the UML models
-Handle more complex UML designs
29 G. Fraser et. al. Structural testing -Methods to handle collateral coverage of individuals (As the overhead increases when collateral
[2012] coverage is considered)
-Study the effects of different coverage
-Test the approach using other type of algorithms like simulated annealing in addition to genetic
algorithms

30 J. Li et.al. [2009] Model based testing -Handle more complex state charts
-Better fitness function design
-Comparison of genetic algorithm with other metaheuristic algorithms
-Study parameter settings

31 D. Gong et.al. Structural testing -Full automation of the approach


[2011] - As the initial population is split up into subpopulations, assigning the correct size to each subpopulation remains an issue


-Study parameter settings

32 P. Pocatilu et. al. Structural testing -Fitness function improvement


[2013] -Compare the performance with other genetic algorithm based test data generation methods
33 C. Mao et. Structural testing -Fitness function improvement
al.[2013] -Improve coverage of the method
34 D. Liu et. al. Structural testing -Improve time efficiency (Parameter tuning & parameter selection)
[2013]
35 Y. Suresh et. Structural testing -Improve path coverage
al.[2013] -Handle complex program
36 A. Arcuri et. al. Structural testing -Study the effect of parameter tuning to different kind of problems [As parameter tuning
[2013] may/may not result in worse result compared to the result obtained without parameter tuning]
37 G. Fraser et.al. Structural testing - Find optimal parameter configurations as the result of the suggested work depends on the class
[2013 & 2014] on which test data generation is applied
38 J. P. Galeotti et. al. Structural testing -Improve program coverage
[2014] -Improve performance of the method

TABLE 11. OBSERVATIONS FROM TABLE 10

Number of works which report solving dependency related issues as their future enhancement: 13
Number of works which report solving path identification related issues as their future enhancement: 13
Number of works which report solving parameter tuning issues as their future enhancement: 26

Even though fitness function design varies according to the system under test, general guidelines for designing the fitness function, based on the category of the system under test or the factors to be considered during fitness function design, may be established by researchers. Though a very few works have mentioned such possibilities, this problem is yet to be discussed deeply. If, in future, such a study is accomplished, it will go a long way in making GA based software testing applicable to all types of systems, irrespective of whether the system is large or small. Unfortunately, little attempt has been made at such a study, whereas most of the effort is spent on designing newer fitness functions for testing. Figure 8 given below gives the prime reason why the design of the fitness function is considered the most important factor in GA based software testing. From Figure 8, we can arrive at a conclusion which summarizes the steps to be taken while designing the fitness function during GA based testing. Table 12 gives a gist of our conclusion on fitness function design issues. Some of the future research perspectives in fitness function design are also mentioned below.

Figure 8. Factors affecting and affected by fitness function design: the path selection strategy and the dependency and path coverage feed into fitness function design; parameter tuning for fitness function optimization influences the response/optimization time; and the optimization of the fitness function determines the end result.



TABLE 12. MAIN ISSUES IN FITNESS FUNCTION DESIGN AND THEIR POSSIBLE SOLUTIONS

Issues in fitness function design:
- Trace the potential/relevant paths for designing the fitness function.
- Trace all possible dependencies in the program so that test data generation using GA may be applied in all practical situations.
- Try to attain maximum coverage during test data generation.

Suggested solutions:
- Use methods to attain maximum coverage, identify potential paths and trace dependencies while designing the fitness function.

1) Future Research Direction

From Table 12, we saw the issues in fitness function design and their suggested solutions. Therefore, we suggest that future researchers give more emphasis to finding general methods for fitness function design according to the category of problem specification. The main steps to be followed during this process are given in Figure 9, which lists a general strategy for designing the fitness function for a given problem specification. The advantage of using such a general strategy for fitness function design is given below.

Step 1. Specify the purpose of testing.
Step 2. Identify the testing category.
Step 3. Analyze the system under test or the type of system (critical, application, small, medium, real time).
Step 4. Identify the methods available to design the fitness function for each type of system.
Step 5. Consider the factors affecting the design of the fitness function for the system under test: identification of relevant paths; maximum coverage to generate possible test data values; coverage of all possible dependencies in the program; optimal parameter setting during fitness function design; time required to optimize the fitness function.
Step 6. Design the fitness function according to the category of problem specification.

Figure 9. Main steps to be followed during fitness function design

2) Advantages of Suggested Research Focus

One of the main issues in using GA for practical testing may be solved by alleviating the difficulties faced during the design process of the fitness function.

D. SRQ4. What is the general strategy adopted in setting parameters during software testing?

Finding optimal values of the parameters is one of the most sought-after issues in GA based testing, yet it has not received much attention till now. In Table 4, the parameter settings used in various works are shown. From Table 4 we can notice that, even though several works exist which explain different types of operators and their relevance in different contexts, the use of these operators in a specific context still remains unexploited. Tables 13-17 show evidence of this. In most of the works we can notice that the operator settings are chosen randomly, or that specific operators are used by referring to similar works in the field. A few works by researchers like Fraser et al. and McMinn have considered the possibility of studying parameter settings during GA based software testing [4, 38], and even these researchers claim that more studies are required to reach a concrete conclusion. In GA based testing, even after testing a program using the best available genetic parameters, a better solution or the same solution can be obtained even if we use less competitive methods of crossover, selection and mutation for solving the same problem. This shows the uncertain nature of genetic algorithms [35]. Most of the works given in Table 4 have mentioned this risk. Another issue involved in parameter tuning is the time taken for optimizing the fitness function. Fitness function optimization is a heuristic process, and the optimization time and effort vary according to the nature of the problem [1, 6]. Therefore, the exact time required for testing a program cannot be accurately predicted [3]. The time varies as the parameter settings are changed.
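A minimal sketch of the kind of tuning experiment implied here is given below. Everything in it is an assumption for illustration: in particular, achieved_coverage only simulates a run of a GA-based test generator; in practice it would invoke the actual tool. The sketch only shows the basic shape of comparing operator settings over repeated runs before fixing them.

```python
import itertools
import random
import statistics

def achieved_coverage(crossover_rate, mutation_rate, seed):
    """Stand-in for one run of a GA-based test generator: returns the branch
    coverage obtained with the given operator rates. Here it is a noisy
    synthetic function so the sketch stays self-contained."""
    rng = random.Random(seed)
    ideal = 1.0 - abs(crossover_rate - 0.75) - abs(mutation_rate - 0.02) * 5
    return max(0.0, min(1.0, ideal + rng.gauss(0, 0.03)))

def tune(crossover_rates, mutation_rates, repetitions=10):
    """Simple grid search: evaluate every rate combination several times
    (GAs are stochastic) and keep the one with the best mean coverage."""
    best = None
    for cx, mut in itertools.product(crossover_rates, mutation_rates):
        mean_cov = statistics.mean(
            achieved_coverage(cx, mut, seed) for seed in range(repetitions))
        if best is None or mean_cov > best[0]:
            best = (mean_cov, cx, mut)
    return best

print(tune([0.6, 0.75, 0.9], [0.01, 0.02, 0.05]))
```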



TABLE 13. STRUCTURAL TESTING: NUMBER OF WORKS USING DIFFERENT TYPES OF SELECTION IN TABLE 4

Tournament: 4; roulette wheel: 8; ratio: 1; binary tournament: 1; random selection: 6; rank based: 8; gambling roulette wheel selection: 1.

TABLE 14. BLACK BOX, GUI & MUTATION TESTING: NUMBER OF WORKS USING DIFFERENT TYPES OF SELECTION IN TABLE 4

Black box testing: individuals selected based on fitness (2). GUI testing: best fitness (1). Mutation testing: random selection (1). Model based testing: best fitness (3).

TABLE 15. STRUCTURAL TESTING: NUMBER OF WORKS USING DIFFERENT TYPES OF CROSSOVER IN TABLE 4

One point: 17; two point: 4; uniform: 4; special types of crossover: 1 (intra and inter crossover).

TABLE 16. STRUCTURAL TESTING: NUMBER OF WORKS USING DIFFERENT TYPES OF MUTATION IN TABLE 4

Simple mutation: 8; expressed in terms of probability: 4; inter & intra mutation: 1; breeder GA mutation: 1; quantum rotation gate operator: 1.

TABLE 17. BLACK BOX, GUI & MUTATION TESTING: NUMBER OF WORKS USING DIFFERENT TYPES OF CROSSOVER & MUTATION IN TABLE 4

Black box testing: crossover: average (1), diagonal (1); mutation: different rates of mutation, replacement method.
GUI testing: crossover: one point (1), two point (1); mutation: one bit (1).
Mutation testing: crossover: one point (1); mutation: inversion of bits (1).
Model based testing: crossover: one point (2), two point (1); mutation: random (1), probability based (1).

Figure 10. Suggested research directions in parameter settings: for a given testing strategy and problem under test, the fitness function is designed and its parameters are tuned; the results of optimization with basic operators and with advanced operators or operator variants are compared, taking into account cost constraints and the time required for optimization, in order to select the optimal values and finalize the best parameter settings for the given system under test.

Another factor which has a major role in parameter tuning is the "search budget" [13]. As the budget plays a critical role in software testing, parameter tuning should be a function of the budget. From Table 11 we can see that nearly 26 works have reported parameter tuning as a future enhancement. It is surprising that little attempt has been made to overcome the issues in parameter setting. Instead, these issues are shifted to the "Threats to validity" section in most of the works. One reason for this could be the difficulty of designing and carrying out such a study. Using GA based testing, even though a problem may be solved in less time, the field is still dormant when it comes to deciding the best possible combination of parameter settings suitable for a given problem. This leaves immense possibilities for research in this field.

1) Future Research Direction

Given a problem specification, although it may not be possible to find exact values for all parameters, we suggest that researchers look into the future issues in parameter settings given in Figure 10. In Figure 10 we can see that, after identifying the properties of the system under test, the next step is the design of the fitness function; this is already visible from Figure 8. Therefore, for finding the optimal values of the parameters, the primary necessity is a fitness function, as parameter settings are used to optimize the fitness function. After finding the fitness function, the next step is to find the optimal values of the parameters needed to optimize it. For this, we have to analyze the outcome of applying basic and advanced operators during software testing. Finally, the best possible parameter setting for a given category of problem may be found. After finding the optimal parameter settings, the relation between fitness function optimization and the time taken for optimization, as well as the relation between fitness function optimization and the search budget, may be identified. Very few works have even suggested such a possibility, and therefore there lies great research scope in this direction [2, 4, 15, 57].

2) Advantages of Suggested Research Focus
- General guidelines for setting the parameters according to the problem.
- Time and effort spent in parameter tuning may be minimized.
- Solves the ambiguities in setting parameters during GA based testing.

E. SRQ5. Can GA based software testing evolve as an undefeatable technique in the software testing industry? If so, what are the issues to be sorted out in GA based testing?

This research question forms the heart of this review, and the answer to it lies within the observations made on GA variations, population, fitness function design and parameter settings given in the previous sections. Solving the problems mentioned in SRQ1, SRQ2, SRQ3 and SRQ4 may make GA based testing an
undefeatable method in the software industry. If researchers are able to find solutions for the future issues mentioned in SRQ1, SRQ2, SRQ3 and SRQ4, the uncertainties which exist in using GA based methods in practical software testing may be eliminated.

V. THREATS TO VALIDITY

The main threats to the validity of this review may arise from the literature referred to, the method of conducting the review, and the observations made during the review.

A. Literature Referred for Review

The main threat to the validity of our work may be due to the limitation in the scope of the works which we have referred to. We have limited our analysis to only those works which mention the application of genetic algorithms to test data generation. For this, we refined our search several times with the query "Genetic algorithm based software testing" and finally identified a set of works which clearly fall within our area of focus. The downside of such a restriction in the selection of works is that all the possible variants of genetic algorithms and their applications may not have been analysed, since some works may use different terms that do not match our search query.

B. Method of Conducting the Review

We have made this study in order to expose the real practical issues in GA based software testing and to explore the research directions in GA based testing which have not received much attention till now. For this, we have made a study of the various parameters used in GA based testing. There exist a large number of parameters in GA, and as the number of parameters increases, the number of parameter combinations increases exponentially [4]. The number of parameters also varies according to the system under test. Therefore, we have considered only the parameters which play a critical role in GA based testing. We feel that such a narrowing of the field of our study has sharpened its focus and enabled us to do an in-depth analysis of our chosen study objectives, these being the identification of the shortcomings of genetic algorithms and the future research scope in GA based software testing. We have not discussed any issues related to parameter control in this review. The values of the parameters set initially need to be changed as the search based testing proceeds; we have not addressed any issues in this area. Parameter control is still a non-discussed area in search based software testing, with many issues to be explored, discussed and solved.

C. Observations Made During the Review

All the findings and suggestions made in this study have been made by carefully analyzing the papers by the two researchers involved in this work. We have conducted the review based on the facts drawn from the referred literature. Much more work and experimentation may be done in order to substantiate the claims made in this work.

VI. CONCLUSION

Of late, software test data generation has gained wide interest among software engineers. In software testing, test data generation using search based techniques, especially metaheuristic algorithms like GA, has been widely explored by many researchers during the last decade. As a large number of research works on test data generation using GA are being carried out, it is high time to conduct a study that throws light on the practical implications of this research trend. Therefore, we have tried to perform a systematic review to collect evidence from various literatures on the effectiveness of GA in the software testing process. In this review, we have concentrated on the most critical aspects of GA based software testing: the effectiveness of GA in software testing and how to overcome the limitations of GA in practical software testing. Most of the works have tried to tide over the issues of population initialization, parameter settings and fitness function calculation as and when they arise, without addressing the basic cause. We feel that the present direction in GA based software testing is like attempting to treat symptoms; treating symptoms will not cure the sickness, for which the disease itself needs to be treated. More holistic approaches and generalized solutions are the need of the hour in GA based testing. We have reviewed and suggested directions to tackle the root cause of these problems and correct them once and for all. For this, we have critically analyzed numerous works, studied the pros and cons of various works and set out some suggestions on how to improve the process of GA based testing. Therefore, we suggest that researchers set up some general guidelines on how to carry out the testing process, set up the parameters and design the fitness function in GA based software testing, so that the potential of GA based software testing may be utilized to the full extent.

REFERENCES

[1] M. A. Ahmed and I. Hermadi, "GA-based multiple paths test data generator", Computers & Operations Research, vol. 35, pp. 3107-3127, 2008.
[2] S. Ali, L. C. Briand, H. Hemmati and R. K. Panesar-Walawege, "A Systematic Review of the Application and Empirical Investigation of Search-Based Test Case Generation", IEEE Transactions on Software Engineering, vol. 99, 2009.
[3] S. Ali, M. Z. Iqbal, A. Arcuri and L. Briand, "A Search-based OCL Constraint Solver for Model-based Test Data Generation", in Proc. 11th International Conference on Quality Software, 2011, pp. 41-50.
[4] A. Arcuri and G. Fraser, "On parameter tuning in search based software engineering", in Proc. SSBSE, 2011, pp. 33-47.
[5] B. Beizer, "Software Testing Techniques", Van Nostrand Reinhold, New York, 1990.
[6] D. J. Berndt and A. Watkins, "Investigating the Performance of Genetic Algorithm-Based Software Test Case Generation", in Proc. HASE, 2004.
[7] D. Binkley, M. Harman and K. Lakhotia, "FlagRemover: A Testability Transformation for Transforming Loop Assigned Flags", ACM Transactions on Software Engineering and Methodology, vol. 2, no. 3, pp. 110-146, 2009.

[8] P. M. Bueno and S. Jino, "Automatic Test Data Generation for Program Paths Using Genetic Algorithms", International Journal of Software Engineering and Knowledge Engineering, vol. 12, no. 6, pp. 691-709, 2002.
[9] Y. Cao, C. Hu and L. Li, "An Approach to Generate Software Test Data for a Specific Path Automatically with Genetic Algorithm", in Proc. ICRMS, Chengdu, 2009, pp. 888-892.
[10] C. Chen, X. Xu, Y. Chen, X. Li and D. Guo, "A New Method of Test Data Generation for Branch Coverage in Software Testing Based on EPDG and Genetic Algorithm", in Proc. ASID, 2009, pp. 307-310.
[11] C. Doungsa-ard, K. Dahal, A. Hossain and T. Suwannasart, "Test Data Generation from UML State Machine Diagrams using GAs", in Proc. ICSEA, 2007.
[12] M. Fischer, "Generating Test Data for Black-Box Testing using Genetic Algorithms", in Proc. 17th ETFA, 2012, pp. 1-6.
[13] G. Fraser and A. Arcuri, "It is not the length that matters, it is how you control it", in Proc. IEEE International Conference on Software Testing, Verification and Validation, 2011, pp. 150-159.
[14] G. Fraser and A. Arcuri, "EvoSuite: Automatic Test Suite Generation for Object-Oriented Software", in Proc. ESEC/FSE, Szeged, Hungary, Sep. 2011, pp. 416-419.
[15] G. Fraser and A. Arcuri, "Whole test suite generation", IEEE Transactions on Software Engineering, vol. 39, no. 2, pp. 276-291, 2013.
[16] G. Fraser and A. Arcuri, "The Seed is Strong: Seeding Strategies in Search-Based Software Testing", in Proc. ICST, 2012, pp. 121-130.
[17] M. R. Girgis, "Automatic test data generation for data flow testing using a genetic algorithm", Journal of Universal Computer Science, vol. 11, no. 5, pp. 898-915, 2005.
[18] D. Gong, W. Zhang and X. Yao, "Evolutionary generation of test data for many paths coverage based on grouping", The Journal of Systems and Software, vol. 84, pp. 2222-2233, 2011.
[19] D. Goldberg, "Genetic Algorithms in Search, Optimization and Machine Learning", Addison-Wesley, Boston, Massachusetts, 1989.
[20] D. Graham-Rowe, "Radio Emerges from the Electronic Soup", New Scientist, vol. 175, no. 2358, p. 19, Aug. 2002.
[21] M. Harman, "The current state and future of search based software engineering", in Proc. FOSE, 2007, pp. 342-357.
[22] M. Harman, S. A. Mansouri and Y. Zhang, "Search-Based Software Engineering: Trends, Techniques and Applications", ACM Computing Surveys, vol. 45, no. 1, pp. 1-66, 2012.
[23] M. Harman and P. McMinn, "A theoretical and empirical study of search based testing: Local, global and hybrid search", IEEE Transactions on Software Engineering, vol. 36, no. 2, pp. 226-247, 2010.
[24] M. Harman, P. McMinn and J. Wegener, "The Impact of Input Domain Reduction on Search Based Test Data Generation", in Proc. ESEC/FSE, Croatia, Sep. 2007, pp. 155-164.
[25] J. H. Holland, "Adaptation in Natural and Artificial Systems", University of Michigan Press, 1975.
[26] B. Jones, H. Sthamer and D. Eyres, "Automatic structural testing using genetic algorithms", Software Engineering Journal, vol. 11, no. 5, pp. 299-306, 1996.
[27] P. C. Jorgensen, "Software Testing: A Craftsman's Approach", Auerbach Publications (Taylor and Francis Group), 2008.
[28] S. Khor and P. Grogono, "Using a Genetic Algorithm and Formal Concept Analysis to Generate Branch Coverage Test Data Automatically", in Proc. ASE, 2004.
[29] B. Korel, "Automated Software Test Data Generation", IEEE Transactions on Software Engineering, vol. 16, no. 8, pp. 870-879, 1990.
[30] G. I. Latiu, O. A. Cret and L. Vacariu, "Automatic Test Data Generation for Software Path Testing using Evolutionary Algorithms", in Proc. Third International Conference on Emerging Intelligent Data and Web Technologies, 2012, pp. 1-8.
[31] J. Li, W. Bao, Y. Zhao, Z. Ma and H. Dong, "Evolutionary generation of unique input/output sequences for class behavioral testing", Computers and Mathematics with Applications, vol. 57, pp. 1800-1807, 2009.
[32] J. Louzada, C. G. Camilo-Junior, A. Vincenzi and C. Rodrigues, "An Elitist Evolutionary Algorithm for Automatically Generating Test Data", in Proc. IEEE WCCI, Brisbane, Australia, 2012.
[33] J. Malburg and G. Fraser, "Combining Search-based and Constraint-based Testing", in Proc. IEEE ASE, Lawrence, KS, USA, 2011, pp. 436-439.
[34] T. Mantere and J. T. Alander, "Evolutionary Software Engineering, A Review", Journal of Applied Soft Computing, vol. 5, pp. 315-33, 2005.
[35] P. McMinn, "Search-based software test data generation: A survey", Software Testing, Verification and Reliability, pp. 105-156, 2004.
[36] P. McMinn, "Evolutionary Search for Test Data in the Presence of State Behavior", Ph.D. Thesis, University of Sheffield, Sheffield, England, 2005.
[37] P. McMinn, "Search-Based Software Testing: Past, Present and Future", in Proc. 4th International Conference on Software Testing, Verification and Validation Workshops, 2011, pp. 153-163.
[38] P. McMinn, "An identification of program factors that impact crossover performance in evolutionary test input generation for the branch coverage of C programs", Information and Software Technology, vol. 55, pp. 153-172, 2013.
[39] C. C. Michael, G. E. McGraw and M. A. Schatz, "Generating software test data by evolution", IEEE Transactions on Software Engineering, vol. 27, no. 12, pp. 1085-1110, 2001.
[40] J. Miller, M. Reformat and H. Zhang, "Automatic test data generation using genetic algorithm and program dependence graphs", Information and Software Technology, vol. 48, pp. 586-605, 2006.
[41] G. Myers, "The Art of Software Testing", Wiley, New York, 1979.
[42] A. Pachauri and Gursaran, "Software Test Data Generation using Path Prefix Strategy and Genetic Algorithm", in Proc. International Conference on Science and Engineering, 2011, pp. 131-140.
[43] R. P. Pargas, M. J. Harrold and R. R. Peck, "Test data generation using genetic algorithms", Journal of Software Testing, Verification, and Reliability, vol. 9, pp. 263-282, 1992.
[44] M. Pei, E. D. Goodman, Z. Gao and K. Zhong, "Automated Software Test Data Generation Using a Genetic Algorithm", Technical Report, Michigan State University, 1994.
[45] A. Rauf, S. Anwar, M. A. Jaffer and A. Shahid, "Automated GUI Test Coverage Analysis using GA", in Proc. 7th International Conference on Information Technology, 2010, pp. 1057-1062.
[46] N. Sharma, A. Pasala and R. Kommineni, "Generation of Character Test Input Data using GA for Functional Testing", in Proc. 19th APSEC-SATA Workshop, 2012, pp. 87-94.
[47] A. A. Sofokleous and A. S. Andreou, "Automatic, evolutionary test data generation for dynamic software testing", The Journal of Systems and Software, vol. 81, pp. 1883-1898, 2008.
[48] M. Srinivas, "Genetic Algorithms: A Survey", Computer, vol. 27, pp. 17-26, June 1994.
[49] H. H. Sthamer, "The Automatic Generation of Software Test Data Using Genetic Algorithms", Ph.D. Thesis, University of Glamorgan, Pontypridd, Wales, UK, 1996.
[50] N. Tracey, "A Search-Based Automated Test Data Generation Framework for Safety Critical Software", Ph.D. Thesis, University of York, 2000.
[51] A. Watkins, "The automatic generation of test data using genetic algorithms", in Proc. 4th Software Quality Conference, 1995, pp. 300-309.
[52] J. Wegener, A. Baresel and H. Sthamer, "Evolutionary test environment for automatic structural testing", Journal of Information and Software Technology, vol. 43, pp. 841-854, 2001.
[53] S. Xanthakis, C. Ellis, C. Skourlas, A. Le Gall, S. Katsikas and K. Karapoulios, "Application of genetic algorithms to software testing", in Proc. 5th ICSE, 1992, pp. 625-636.
[54] J. Xiao and W. Afzal, "Search-based resource scheduling for bug fixing tasks", in Proc. 2nd SSBSE, 2010, pp. 133-14.
[55] Xue-ying Ma, Bin-kui Sheng, Zhen-feng He and Cheng-qing Ye, "A Genetic Algorithm for Test-Suite Reduction", in Proc. ICSMC, 2005, pp. 133-139.
[56] W. Zhang, D. Gong, X. Yao and Y. Zhang, "Evolutionary generation of test data for many paths coverage", in Proc. Chinese Control and Decision Conference, 2007, pp. 230-235.

[57] J. Campos, A. Arcuri, G. Fraser and R. Abreu, "Continuous Test Generation: Enhancing Continuous Integration with Automated Test Generation", in Proc. Automated Software Engineering (ASE), 2014.
[58] J. P. Galeotti, G. Fraser and A. Arcuri, "Extending a Search-Based Test Generator with Adaptive Dynamic Symbolic Execution", in Proc. ISSTA, 2014.
[59] G. Fraser, A. Arcuri and P. McMinn, "A Memetic Algorithm for Whole Test Suite Generation", Journal of Systems and Software, DOI: 10.1016/j.jss.2014.05.032, 2014.
[60] G. Fraser, A. Arcuri and P. McMinn, "Test Suite Generation with Memetic Algorithms", in Proc. GECCO, 2013.
[61] A. Arcuri and G. Fraser, "Parameter tuning or default values? An empirical investigation in search-based software engineering", Empirical Software Engineering, vol. 18, no. 3, pp. 594-623, 2013.
[62] P. Pocatilu and I. Ivan, "A Genetic Algorithm Based System for Automatic Control of Test Data Generation", Studies in Informatics and Control, vol. 22, no. 2, pp. 219-226, 2013.
[63] C. Mao and X. Yu, "Test data generation for software testing based on quantum-inspired genetic algorithm", International Journal of Computational Intelligence and Applications, vol. 12, no. 1, p. 1350004 (21 pages), 2013.
[64] D. Liu, X. Wang and J. Wang, "Automatic test case generation based on genetic algorithm", Journal of Theoretical and Applied Information Technology, vol. 48, no. 1, pp. 411-416, 2013.
[65] Y. Suresh and S. K. Rath, "A genetic algorithm based approach for test data generation in basis path testing", International Journal of Soft Computing and Software Engineering [JSCSE], Special Issue: The Proceeding of International Conference on Soft Computing and Software Engineering, vol. 3, no. 3, 2013.
