
Test Case Prioritization based on Historical Failure Patterns using ABC and GA

Conference Paper · January 2019
DOI: 10.1109/CONFLUENCE.2019.8776936
Available at: https://www.researchgate.net/publication/334768435


Test Case Prioritization based on Historical Failure Patterns using ABC and GA

Pushkar Padmnav, Gaurav Pahwa, Dinesh Singh, Saurabh Bansal
Aricent, Gurgaon, India
[email protected]

Abstract— Regression testing is a type of software testing carried out to ensure that no new bugs are introduced by modifications to existing code or the addition of new features. Prioritization techniques order the set of test cases for improved and effective testing. Test Case Prioritization (TCP) in corrective regression testing is an indispensable tool for discovering faults faster during the initial phase of testing. Many techniques have been proposed for TCP based on requirement correlation, test coverage and information retrieval, all of which depend on data that is not easily available. Our approach considers the historical execution of the regression cycles through the use of Artificial Bee Colony Optimization (Swarm Intelligence) and a Genetic Algorithm for fault detection, with improved results.

Keywords— Test Case Prioritization, Regression Testing, APFD, Artificial Bee Colony Optimization, Genetic Algorithm, Corrective Regression Testing, Swarm Intelligence

I. INTRODUCTION

Testing is an integral part of the software development lifecycle and ensures a product's reliability, compatibility, efficiency and robustness. Testing is an activity to check whether the actual results match the expected results and the software is defect free [1]. Close to a hundred testing strategies [2] exist today; the most common ones used by companies are unit testing, integration testing, regression testing, smoke testing, alpha testing, beta testing, and stress and performance testing. It has been observed that at least 40-50% of the effort is expended in testing a software product.

A specification (requirements) or structural (code) change can introduce bugs which are detected during regression testing. Although regression is an important exercise for testing a product's robustness, a complete set of executions can span from days to weeks. Faults revealed late in the cycle delay bug fixes and project releases. Prioritizing test cases improves the performance of the testing cycle through detection of faults earlier in the cycle, thereby permitting more time for troubleshooting and debugging of issues.

Multiple approaches have been researched for test case prioritization using traditional code-coverage-based techniques, which have not proved to be very effective [3]. Techniques based on requirement correlation and a weights procedure [4] are better than code-coverage-based prioritization. However, the unavailability of a mapping of test cases to requirements (requirements traceability), and the effort to pull such information, poses a serious challenge to already struggling project deliveries and timelines. To address this issue, we have proposed a swarm- and GA-based artificial intelligence technique which utilizes historical regression testing data to generate a prioritized set of test cases based on the number of faults each uncovers and its execution time, with better results.

Section II describes the background on testing techniques. Then we briefly describe the evolutionary and swarm algorithms in Section III. Section IV details the experimental analysis. In Section V, we discuss the experiment results. Lastly, we present our future work in Section VI and the conclusion in Section VII.

II. BACKGROUND

A. Regression Testing

Regression testing is a process applied after a program is modified. It tests the modified program to build confidence that the changed program will perform as per the (possibly modified) specification. Regression forms an integral part of the testing strategy during the maintenance phase, where the software system may be corrected, adapted to a new environment, or enhanced to improve its performance [7]. As the regression suites grow with each newly found defect, test automation is frequently involved, which often poses a challenge due to time constraints.

B. Types of Regression Testing

Corrective regression testing and progressive regression testing are forms of regression testing [7]. Whenever a new requirement is incorporated into a system, the specification is modified. Progressive testing tests the program for correctness against the modified specification. The cycle is invoked at regular intervals, and fewer test cases can be reused during the testing. However, conducting this testing helps ensure that no features present in the previous version have been compromised in the new and updated version. In corrective regression testing, the specification doesn't change; only a few areas of code may be modified for fixes. According to [21], fixes involve
software failures, performance failures, and implementation failures to keep the system working properly. The set of test cases from the previous test plan will be valid but might not be able to test the previously targeted program constructs during the corrective regression testing.

Progressive testing is done after enhancements related to a specification change are carried out, while corrective regression testing is done during the development cycle and after corrective maintenance. It is important to understand that both these types of regression testing involve the execution of two sets of test cases, viz. specification-based test cases and structural test cases. The approach in this paper is intended for corrective regression testing.

III. TEST CASE PRIORITIZATION: PROPOSED APPROACH

A. Motivation

Test case prioritization has been explored across many research papers through the use of multiple techniques, and empirical studies have supported the methodologies used.

Charitha et al. [4] proposed prioritizing the test cases based on requirements and risks. The technique uses the risk level of potential defect types to identify the risky requirements, and then prioritizes the test cases by establishing the relation between test cases and requirements. For example, a security threat defect will correspond to non-functional requirements with multiple associated test cases, and the prioritization will be done depending on the risk severity of the defect.

Arafeen et al. [9] suggested the use of requirements-based clustering using a similarity measure and selecting the test cases from each cluster following specific methods.

Jung-Hyun Kwon et al. [10] tried to address the limitations of the traditional code coverage techniques for infrequently tested code by use of information retrieval concepts. Using coverage score and similarity score features calculated by TF (Term Frequency) and IDF (Inverse Document Frequency), a linear regression model is trained and evaluated to assign weights to the test cases based on both features.

All the above approaches rely either on source code information, or on factors related to requirements such as implementation complexity and customer priority. Our approach, however, is based on the historical regression data and execution time of the test cases, which are captured through various test management tools.

B. Understanding Swarm Intelligence

Swarm intelligence is a collective behavior exhibited by groups of organisms or artificial systems with the ability to manage complex systems of interacting individuals through minimal communication, producing a global emergent behavior [12]. These systems are self-organizing and achieve their objectives through the interactions of the entire group. Bee swarming, ant colonies and fish schooling are natural swarm intelligence systems. These groups are smarter when they are thinking together than they would be as individuals on their own. A swarm requires no leader: there is no single ant or bee influencing the decisions to exploit a food source or create trails for movement.

Modern science has been researching swarm capabilities to build systems which are autonomous in nature and more capable than a single intelligent unit. These properties of collaboration, cooperation and learning from biological systems can be employed to solve several optimization problems in various domains.

C. ABC (Artificial Bee Colony Optimization) and GA (Genetic Algorithm)

In our experiment, we have used a combination of the Artificial Bee Colony algorithm (ABC), a swarm-inspired meta-heuristic, and an evolutionary method (Genetic Algorithm) [11] for prioritizing the test cases. A metaheuristic is a high-level framework for developing heuristic optimization algorithms that may provide a sufficiently good solution to an optimization problem, especially with incomplete information. Metaheuristics can often find good solutions with less computational effort than exact optimization algorithms, iterative methods or simple heuristics. Evolutionary computation, genetic algorithms and particle swarm optimization [14] are population-based metaheuristics. Since its introduction by Karaboga in 2005 [15] for solving numerical optimization problems, the Artificial Bee Colony algorithm has stirred a lot of interest among researchers. As put across by Karaboga, two vital notions, self-organization and division of labor, are essential for a system to self-organize, adapt to a given environment and obtain swarm intelligence behavior.

• Self-organization is a set of rules for interactions between the components of the system. These are essentially positive feedback, negative feedback, fluctuations and multiple interactions.
• Division of labor is the segregation of different sets of tasks to be performed by specialized individuals, which helps the swarm respond to changed environments in the search space.

In the Artificial Bee Colony algorithm, there are three components: food sources, employed bees and unemployed bees. Unemployed bees are grouped into either onlooker or scout bees. Food sources are chosen according to the distance from the nest and the nectar amount, which can also be termed the profitability or fitness value of the source. Employed bees, or foragers, are associated with the food sources where they are employed. These bees share the fitness information with the other groups of bees by performing a waggle dance. Scouts search for new food sources around the nest, and onlooker bees wait in the nest for the establishment of the most profitable food source from the information shared by the employed bees.
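The employed/onlooker/scout division can be made concrete with a toy numeric example. The following is a minimal, illustrative sketch of a Karaboga-style ABC loop minimizing the sphere function; the parameter names, limits and perturbation rules are our own illustrative choices, not taken from this paper.

```python
# Illustrative ABC sketch (not the paper's algorithm): minimize f(x) = sum(x^2).
import random

def f(x):
    return sum(v * v for v in x)

def abc_minimize(dim=2, n_food=10, limit=20, iters=200, lo=-5.0, hi=5.0):
    rand = lambda: [random.uniform(lo, hi) for _ in range(dim)]
    food = [rand() for _ in range(n_food)]   # food sources = candidate solutions
    trials = [0] * n_food                    # stagnation counters per source
    best = min(food, key=f)
    for _ in range(iters):
        # Employed phase: each bee perturbs its source toward a random neighbour.
        for i in range(n_food):
            k, j = random.randrange(n_food), random.randrange(dim)
            cand = food[i][:]
            cand[j] += random.uniform(-1, 1) * (food[i][j] - food[k][j])
            if f(cand) < f(food[i]):
                food[i], trials[i] = cand, 0
            else:
                trials[i] += 1
        # Onlooker phase: sources are picked with probability proportional to fitness.
        weights = [1.0 / (1.0 + f(s)) for s in food]
        for _ in range(n_food):
            i = random.choices(range(n_food), weights)[0]
            j = random.randrange(dim)
            cand = food[i][:]
            cand[j] += random.uniform(-1, 1)
            if f(cand) < f(food[i]):
                food[i], trials[i] = cand, 0
        # Scout phase: abandon exhausted sources and search afresh.
        for i in range(n_food):
            if trials[i] > limit:
                food[i], trials[i] = rand(), 0
        best = min(food + [best], key=f)
    return best
```

The three phases mirror the description above: employed bees exploit known sources, onlookers reinforce the most profitable ones (the waggle-dance information, here a roulette-wheel selection), and scouts replace abandoned sources with fresh random ones.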
Fig. 1: ABC process

Genetic Algorithms have been used for problem solving since their inception, but not on a larger scale, due to the time complexity they carry. However, with the advent of fast computers, computational power is not a challenge anymore [11]. GA uses the analogy of natural evolution, where the strongest of all are most likely to survive.

Fig. 2: GA process

The process of a GA begins with initializing a population, then selecting the parents to create a new set of offspring through crossover. Mutation is applied to each of the offspring after crossover. The desired output is a fitter set of offspring, i.e. a better solution to the problem. The cycle continues till the exit criteria are met, i.e. a solution is reached for the problem. The usage of ABC and GA is elaborated in the experimental evaluation section.

IV. EXPERIMENTAL EVALUATION

Having laid down the fundamentals of swarm intelligence, ABC and GA, we will now detail the test case prioritization using ABC and GA with the historical execution data from the dataset.

A. Metrics and Evaluation Criteria

APFD is a standard metric which can be used to evaluate the performance of a prioritization technique. APFD stands for Average Percentage of Faults Detected [10], which captures how quickly the faults are detected during a test execution cycle.

    APFD = 1 - (TF1 + TF2 + ... + TFm) / (n * m) + 1 / (2n)        (1)

where
    TFi = the position of the first test in T that exposes fault i
    m   = number of faults in the program
    n   = number of test cases

The APFD metric ranges from 0 to 1. A value closer to 1 indicates that faults are detected early in the execution of the prioritized cycle.

B. Experimental Setup

An open industrial dataset from ABB Robotics Norway [18] has been used for Test Case Prioritization. Out of the three datasets available (Paint Control, IOF/ROL and GSDTSR, the Google shared dataset for test suite results), IOF/ROL has been selected as the input to the experiment.

a) Dataset

The dataset is available on Bitbucket¹ with the following features.

TABLE I: Dataset Features
Column        Description
Cycle         Cycle number to which the test execution belongs
Id            Unique id for each execution
Duration      Execution time of the test run
Name          Test case name
Verdict       Test verdict (Failed: True, Passed: False)
Last results  List of previous results
Last run      Last execution of the test case in timestamp format

The features are filtered to select only those relevant to the TCP algorithm.

TABLE II: Filtered Features
Column                 Description
Test Execution record  Unique incrementing value
Test case id           Test case ID / name
Time Taken             Execution time in milliseconds
Result                 Test verdict (Failed: True, Passed: False)

There are in total 2086 unique test cases and 32260 execution records for 320 execution cycles in the selected dataset.

b) Data Selection

Finding an open regression test dataset has been a challenge for the experiment. The dataset¹ cycles were not exclusively regression and had mixed executions, including specific test case selection for feature testing. The maximum number of test case executions in any of the cycles was found to be 1064. Ten cycles were identified in decreasing order of the number of test records executed. The identified cycles are as follows.

TABLE III: Training & Validation Data Selection
S.No.  Cycle  Number of executions  Unique test cases  Training cycle  Validation cycle
1      178    1064                  1058               177             178
2      180    965                   949                179             180
3      245    935                   935                244             245
4      163    806                   774                162             163
5      249    792                   175                248             249
6      263    788                   558                262             263
7      103    757                   757                102             103
8      270    748                   667                269             270
9      275    735                   680                274             275
10     67     690                   690                66              67

¹ Data set extracted from https://bitbucket.org/helges/atcs-data
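As a quick illustration of Eq. (1), the following sketch computes APFD for a hypothetical four-test ordering; the test and fault names here are made up for the example, not drawn from the dataset.

```python
# Illustrative APFD computation per Eq. (1); example data is hypothetical.
def apfd(prioritized, fault_matrix):
    """prioritized: ordered list of test names.
    fault_matrix: dict mapping fault -> set of tests that expose it.
    Returns 1 - sum(TF_i)/(n*m) + 1/(2n)."""
    n, m = len(prioritized), len(fault_matrix)
    pos = {t: i + 1 for i, t in enumerate(prioritized)}   # 1-based positions
    tf = [min(pos[t] for t in tests if t in pos)          # first exposing position
          for tests in fault_matrix.values()]
    return 1 - sum(tf) / (n * m) + 1 / (2 * n)

faults = {"F1": {"T3"}, "F2": {"T1", "T4"}}
print(apfd(["T3", "T1", "T2", "T4"], faults))   # 0.75
```

Running the fault-exposing tests first (TF values 1 and 2 here) pushes the score toward 1, exactly the behavior the metric is meant to reward.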
c) Training Procedure

In this section, we use the test case data from Table III in the order specified. For each of the historically executed cycles, the data has been trained and then validated on the last cycle to measure its efficiency in detecting faults through the APFD metric.

The bees start foraging for each test cycle to prioritize the test cases. The process starts with initial foraging, with half as many bees as there are test cases, adding one test case to each bee. The matrix in Table IV gives an overview of a sample execution cycle of test cases. The table captures six days of execution history, which in practice can span more than a year. The data runs from 01-Sept-2018 till 06-Sept-2018, where the columns represent the execution dates for test cases T1 to T4.

Table IV: Test Case Execution Matrix over 6 Days
Test Case  09/01  09/02  09/03  09/04  09/05  09/06
T1         1      0      0      0      0      0
T2         0      0      1      0      1      0
T3         0      1      0      0      0      1
T4         1      0      0      0      1      1

TABLE V: Test Case Execution Time
Test Case  Execution Time
T1         6
T2         2
T3         4
T4         4

The 1's are test failures and the 0's are pass values. In the first round the bees return to the hive, but no crossover is done, as each bee has only a single test case. However, the exit criteria are applied to verify whether any bee has discovered all faults.

If the fitness value increases on adding a new test case to a bee, i.e. there are more 1's after an OR operation, the test case is added; otherwise the cycle continues till the remaining list is exhausted. On successful addition of a new test case, the bees return to the hive for crossover. The new set of offspring is only added to the set of foraging bees if the following crossover criteria are met:

• None of the existing bees should have the same state as the newly created offspring, i.e. the same number of 1's.
• The execution time of the new bees should be less than the maximum execution time of any other foraging bee.

If any of the bees discovers all failures or has exhausted all the test cases, the exit criteria are met and the process ends. At this juncture, the bee performs a waggle dance to announce the best food source, i.e. the best solution for Test Case Prioritization. The test cases of that bee are ranked according to minimum execution time and are subtracted from the total test cases. The process is then repeated for the remaining test cases till all the test cases are ranked or prioritized.

More weightage is allotted to the test case with the minimum execution time if two test cases have the same fault discovery profile. The fault discovery profile, or fitness value, refers to the goodness of the test case for solving the problem: the more 1's for a foraging bee, the better the fitness value.

For the best results, the complete cycle is repeated five times and the weighted average of the ranks is taken as the final ranks for the prioritized test cases. Finally, the ranked test cases are grouped into five prioritized buckets, P1 through P5, with P1 containing the highest-prioritized test cases. Fig. 3 gives an overview of the process.

Fig. 3: Training process with ABC and GA

TABLE VI: Initial Assignment
Bee  Test Cases  Fault Detecting Capacity  Fitness Value  Time
B1   T2          001010                    2              2
B2   T3          010001                    2              4

Step 1:

TABLE VII: Test Case Assignment
Bee  Test Cases  Fault Detecting Capacity  Fitness Value  Time
B1   T2, T1      101010                    3              8
B2   T3, T4      110011                    4              8

Step 2:

B3: T2 | T4

TABLE VIII: Crossover
Bee  Test Cases  Fault Detecting Capacity  Fitness Value  Time
B1   T2, T1      101010                    3              8
B2   T3, T4      110011                    4              8
B3   T2, T4      101011                    4              6
B4   T3, T1      110001                    3              10

Bee B4 will not be added to the list, as its execution time does not meet the crossover criteria. The bees will again forage for the addition of a new set of test cases.

TABLE IX: Foraging
Bee  Test Cases  Fault Detecting Capacity  Fitness Value  Time
B1   T2, T1, T3  111011                    5              12
B2   T3, T4, T2  111011                    5              10
B3   T2, T4, T3  111011                    5              10
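The OR-based fitness bookkeeping in Tables IV through IX can be sketched in a few lines. This is an illustrative fragment, assuming the six-day failure strings of Table IV and the execution times of Table V; it reproduces the fault-detecting capacity, fitness value and time columns for any bee.

```python
# Sketch of a bee's fitness bookkeeping from the Tables IV-IX walkthrough.
history = {   # six days of verdicts per test case: 1 = failed, 0 = passed (Table IV)
    "T1": "100000", "T2": "001010", "T3": "010001", "T4": "100011",
}
exec_time = {"T1": 6, "T2": 2, "T3": 4, "T4": 4}   # Table V

def combine(tests):
    """OR the failure patterns of a bee's test cases.
    Returns (fault-detecting capacity, fitness = number of 1s, total time)."""
    pattern = 0
    for t in tests:
        pattern |= int(history[t], 2)   # bitwise OR of failure days
    bits = format(pattern, "06b")
    return bits, bits.count("1"), sum(exec_time[t] for t in tests)

# Offspring B3 from the crossover step (Table VIII): T2 combined with T4
print(combine(["T2", "T4"]))   # ('101011', 4, 6)
```

The same call on `["T2", "T1", "T3"]` yields `('111011', 5, 12)`, matching bee B1 in Table IX; a candidate test case is accepted only when it raises the count of 1's in the combined pattern.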
In this cycle, bee B3 will not be assigned test case T1, as it does not add to B3's fitness value. Now, adding T3 to B2 meets the exit criteria. Bee B3 has the minimum execution time and shares the same fitness as B1; hence B3 will be chosen for the first set of prioritized test cases.

The prioritized test cases will be T2, T4, T3 and T1, in that order.

V. EXPERIMENT RESULTS

a) Validation

The results of the prioritization for each cycle in Table III are divided into buckets P1 to P5, with P1 containing the highest-ranked test cases. Each validation cycle is then verified for fault detection against the model generated for that cycle. As seen from the results, the failures for each validation cycle are detected in the top-most prioritized buckets.

TABLE X: First Cycle Validation Results
Priority  Passed Test Cases  Failed Test Cases  Cumulative Detection %  Not Executed
P1        92                 74                 50.68                   196
P2        214                51                 85.61                   97
P3        226                11                 93.15                   124
P4        78                 7                  97.94                   275
P5        302                3                  100                     56

Table X contains the validation result for the first cycle. Examining the results shows that around 85% of the failures are detected during the execution of the P1 & P2 bucket test cases. The APFD value for the above set of prioritizations improved from a default value of 0.54 to 0.78. Here, the total number of test cases trained was 1806. Out of the 1806 test cases, 1064 were executed in the 178th cycle. The sum of Passed, Failed and Not Executed test cases is 1806. The sum of Passed and Failed is 1058, as there were a few multiple executions of the same test case in the same validation cycle.

Similarly, Tables XI through XIX present the validation results for the other nine cycles, with desirable outcomes. However, there are three validation cycles where the failures are detected late, in the P5 bucket. In these cycles, it has been observed that the test cases falling into the P5 bucket are ones which never failed previously and were prioritized low, but failed in the validation cycle. A mixed-solution approach for handling this edge case will be discussed in the Future Work section. The other cycles present excellent results and a drastic improvement in APFD.

TABLE XI: Second Cycle Validation Results
Priority  Passed Test Cases  Failed Test Cases  Cumulative Detection %  Not Executed
P1        52                 67                 17.58                   243
P2        155                84                 39.63                   122
P3        157                64                 56.43                   140
P4        54                 12                 59.58                   295
P5        150                154                100                     57

TABLE XII: Third Cycle Validation Results
Priority  Passed Test Cases  Failed Test Cases  Cumulative Detection %  Not Executed
P1        100                20                 50                      254
P2        188                16                 90                      169
P3        204                2                  95                      168
P4        225                1                  97.5                    147
P5        173                1                  100                     200

TABLE XIII: Fourth Cycle Validation Results
Priority  Passed Test Cases  Failed Test Cases  Cumulative Detection %  Not Executed
P1        62                 33                 38.37                   260
P2        148                34                 77.9                    173
P3        177                18                 98.83                   160
P4        71                 0                  98.83                   284
P5        230                1                  100                     124

TABLE XIV: Fifth Cycle Validation Results
Priority  Passed Test Cases  Failed Test Cases  Cumulative Detection %  Not Executed
P1        3                  24                 16.32                   348
P2        4                  19                 29.25                   351
P3        12                 13                 38.09                   350
P4        4                  13                 46.93                   357
P5        0                  78                 100                     297

TABLE XV: Sixth Cycle Validation Results
Priority  Passed Test Cases  Failed Test Cases  Cumulative Detection %  Not Executed
P1        30                 56                 33.73                   294
P2        60                 20                 45.78                   300
P3        99                 12                 53.02                   268
P4        107                54                 85.54                   219
P5        89                 24                 100                     267

TABLE XVI: Seventh Cycle Validation Results
Priority  Passed Test Cases  Failed Test Cases  Cumulative Detection %  Not Executed
P1        91                 37                 84.09                   180
P2        143                3                  90.90                   161
P3        168                0                  90.90                   139
P4        116                0                  90.90                   191
P5        194                4                  100                     110

TABLE XVII: Eighth Cycle Validation Results
Priority  Passed Test Cases  Failed Test Cases  Cumulative Detection %  Not Executed
P1        78                 38                 35.84                   267
P2        155                24                 58.49                   204
P3        130                38                 94.33                   214
P4        139                6                  100                     238
P5        59                 0                  100                     324

TABLE XVIII: Ninth Cycle Validation Results
Priority  Passed Test Cases  Failed Test Cases  Cumulative Detection %  Not Executed
P1        35                 81                 47.64                   269
P2        82                 88                 99.41                   215
P3        99                 1                  100                     285
P4        71                 0                  100                     314
P5        101                0                  100                     284

TABLE XIX: Tenth Cycle Validation Results
Priority  Passed Test Cases  Failed Test Cases  Cumulative Detection %  Not Executed
P1        70                 41                 73.21                   108
P2        166                11                 92.85                   41
P3        159                2                  96.42                   57
P4        132                2                  100                     84
P5        107                0                  100                     111
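The Cumulative Detection % column in these tables is simply a running share of the failed counts. The sketch below recomputes it from the Table X failed counts; the tiny discrepancies against the printed 85.61 and 97.94 appear to come from truncation versus rounding.

```python
# Recompute cumulative fault-detection percentages from Table X failed counts.
failed = [74, 51, 11, 7, 3]          # failed test cases in buckets P1..P5
total = sum(failed)                  # 146 failures in the validation cycle
cum, out = 0, []
for n in failed:
    cum += n
    out.append(round(100 * cum / total, 2))
print(out)   # [50.68, 85.62, 93.15, 97.95, 100.0]
```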
Fig. 4: APFD Trend

The graph illustrates the APFD trend for the ten cycles of validation against the trained prioritized model. The average APFD stands at 0.70 across all cycles, and at 0.79 for the cycles other than the three identified in the results section where failures were detected late in the cycle.

VI. FUTURE WORK

The proposed TCP technique ranks the test cases in order of their effectiveness at uncovering faults purely based on execution history. It has no knowledge of new features introduced in the system, which can also lead to failures of regression test cases that have no prior history of failure. Future work is planned to undertake test case selection to determine the subset of regression test cases that are potentially affected by a given modification to the system.

Additionally, it is planned to further optimize the TCP by considering the severity of the bug associated with the failed test cases. This can be achieved by incorporating a bug-severity weight during the training cycle: test case failures associated with higher-severity bugs will carry a higher weightage.

VII. CONCLUSION

In this paper, we proposed a novel approach for prioritization of test cases using historical failure patterns for corrective regression cycle testing. The results clearly indicate a significant improvement in fault detection capability at the start of the regression testing itself. The training data used by our model is mostly maintained by organizations already, through various test management tools. Our technique relies on past execution cycles and has no dependency on requirements, feature risks or code coverage analysis, thereby reducing the effort a software company must invest in the collection and preparation of data for training.

ACKNOWLEDGMENT

This research was supported by Aricent Technology Innovation.

REFERENCES
[1]  G. Saini, K. Rai, "An Analysis on Objectives, Importance and Types of Software Testing"
[2]  https://www.softwaretestinghelp.com/types-of-software-testing/
[3]  L. Inozemtseva, R. Holmes, "Coverage is Not Strongly Correlated with Test Suite Effectiveness"
[4]  C. Hettiarachchi, H. Do, B. Choi, "Effective Regression Testing Using Requirements and Risks"
[5]  J. Hartmann, D. J. Robson, "Approaches to Regression Testing"
[6]  R. R. Bate, G. T. Ligler, "An Approach to Software Testing: Methodology and Tools"
[7]  H. K. N. Leung, L. White, "Insights into Regression Testing (Software Testing)"
[8]  J. Gaur, A. Goyal, T. Choudhury, S. Sabitha, "A Walk Through of Software Testing Techniques"
[9]  Md. J. Arafeen, H. Do, "Test Case Prioritization Using Requirements-Based Clustering"
[10] J.-H. Kwon, I.-Y. Ko, G. Rothermel, M. Staats, "Test Case Prioritization Based on Information Retrieval Concepts"
[11] K. F. Man, K. S. Tang, S. Kwong, "Genetic Algorithms: Concepts and Applications [in Engineering Design]"
[12] Y.-F. Zhu, X.-M. Tang, "Overview of Swarm Intelligence"
[13] Y. Yi, R. He, "A Novel Artificial Bee Colony Algorithm"
[14] I. Koohi, V. Z. Groza, "Optimizing Particle Swarm Optimization Algorithm"
[15] D. Karaboga, "An Idea Based on Honey Bee Swarm for Numerical Optimization"
[16] Md. I. Kayes, "Test Case Prioritization for Regression Testing Based on Fault Dependency"
[17] R. Gulati, P. Vats, "A Literature Review of Bee Colony Optimization Algorithms"
[18] H. Spieker, A. Gotlieb, D. Marijan, M. Mossige, "Reinforcement Learning for Automatic Test Case Prioritization and Selection in Continuous Integration"
[19] B. Suri, I. Mangal, V. Srivastava, "Regression Test Suite Reduction Using a Hybrid Technique Based on BCO and Genetic Algorithm"
[20] https://www.testbytes.net/blog/types-of-regression-testing/
[21] B. P. Lientz, E. B. Swanson, "Software Maintenance Management", Addison-Wesley, 1980
[22] https://en.wikipedia.org/wiki/Metaheuristic
