
Mutation Testing Cost Reduction Techniques

The document surveys techniques for reducing the costs associated with mutation testing, a valuable but underutilized software testing method due to its high expenses. It discusses the challenges of mutant generation, execution, and analysis, particularly the issue of equivalent mutants that complicate result interpretation. The authors propose selective mutation strategies and highlight the importance of efficient test case design to improve the practicality of mutation testing in industrial applications.


feature

testing

Mutation Testing Cost Reduction Techniques: A Survey
Macario Polo Usaola and Pedro Reales Mateo, University of Castilla-La Mancha

Although mutation’s main steps (mutant generation, test case execution, and result analysis) can be costly, research allows developers to apply it to industry.

From the research perspective, mutation is a mature testing technique that has often shown its value for evaluating both software and software testing techniques. However, to the best of our knowledge, there’s an important gap between its current research status and the possibilities of adopting it for the industrial world, owing to its high costs.

For three decades, researchers have made considerable effort and obtained sufficient results regarding mutation. However, neither software practitioners nor testing-tool developers have put the results to work. Here, we describe research on cost reduction in mutation testing, focusing on techniques that could easily transfer to industrial practice.

Preliminary concepts
Richard DeMillo and his colleagues proposed mutation as a testing technique in 1978.1 They describe this basic idea as follows:

A programmer enters from a terminal a program, P, and a proposed test data set whose adequacy is to be determined. The mutation system first executes the program on the test data: if the program gives incorrect answers then certainly the program is in error. On the other hand, if the program gives correct answers, then it may be that the program is still in error, but the test data is not sensitive enough to distinguish that error: it is not adequate. The mutation system then creates a number of mutations of P that differ from P only in the occurrence of simple errors.1

So, a mutant M of a program under test P is a copy of P that contains a small code change that’s interpreted as a fault. Mutation relies on the ability of the test data set (the test suite) to find faults in the set of mutants.

Test engineers typically use automated tools to generate mutants. These tools apply a set of mutation operators to P. They define each mutation operator to introduce some type of syntactic change to a statement. For example, a simple instruction such as return a + b (where a and b are integers) can mutate in at least 20 different ways (a − b, a × b, a / b, a + b++, −a + b, a + −b, 0 + b, a + 0, |a| + b, a + |b|, and so on), depending on the mutation operators. Thus, the number of mutants generated even for a medium-size program can be very large. Dealing with this number of mutants has implications regarding the time needed to compile, link, and execute them.

Automated tools typically execute test cases against the original program and the mutants, registering the results with each program version (original or mutant). When the result of executing a test case against a mutant M differs from the same test case against P, the test case has found the fault introduced in M, and the mutant is killed;

80 IEEE SOFTWARE Published by the IEEE Computer Society 0740-7459/10/$26.00 © 2010 IEEE


otherwise, the mutant is alive. Thus, mutation testing aims “to kill all the mutants.” An oracle compares states of the mutant and the original program after executing each test case. Here, the main problem is the number of executions required to execute all the test cases against all the mutants, which, in principle, should be |T| × |M|, where T is the set of test cases and M is the set of mutants (you must also consider |T| additional executions for the original program).

Subsumption of Coverage Criteria
A coverage criterion C1 subsumes another criterion C2 if for every program, any test set T that satisfies C1 also satisfies C2.1 Three examples of such subsumption follow. Exercising all statements in a class subsumes all methods. If a test suite traverses all the edges in a connected graph, all its nodes are also traversed. So, all-edges subsumes all-nodes. The decision coverage criterion requires that each decision take each of its outcomes (true and false) at least once. The logical connector replacement mutation operator replaces each decision in a program by true and false. To kill them, test cases respectively taking the false and the true branches must be written to cover decisions.2

References
1. P.G. Frankl and E.J. Weyuker, “An Applicable Family of Data Flow Testing Criteria,” IEEE Trans. Software Eng., vol. 14, no. 10, 1988, pp. 1483–1498.
2. A.J. Offutt and J.M. Voas, Subsumption of Condition Coverage Techniques by Mutation Testing, tech. report ISSE-TR-96-01, Dept. of Information and Software Systems Eng., George Mason Univ., 1996.

Many mutants that remain alive will never be killed because they’re equivalent mutants and will always produce the same output as P for any test case. Actually, the “fault” introduced in equivalent mutants isn’t a fault but an optimization or deoptimization of the code (for example, the Java instructions return a and return a++ provide the same result). Equivalent mutants are really noise and make the third step of mutation testing, analyzing test case execution results, difficult. Taking into account the set of equivalent mutants, Equation 1 gives the quality of a test suite (measured in terms of the number of mutants killed) and defines the mutation score:

$$MS(P, T) = \frac{K}{M - E}, \qquad (1)$$

where P is the program under test; T is the test suite; K is the number of mutants killed; M is the number of mutants generated; and E is the number of equivalent mutants.
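As a minimal sketch, Equation 1 and the |T| × |M| execution bound described above can be evaluated as follows. The function names and the sample counts are assumptions made for illustration; they don't come from any particular mutation tool.

```python
def mutation_score(killed: int, generated: int, equivalent: int) -> float:
    """Equation 1: MS(P, T) = K / (M - E)."""
    non_equivalent = generated - equivalent
    if non_equivalent <= 0:
        raise ValueError("there must be at least one non-equivalent mutant")
    if not 0 <= killed <= non_equivalent:
        raise ValueError("killed mutants must be between 0 and M - E")
    return killed / non_equivalent


def worst_case_executions(num_tests: int, num_mutants: int) -> int:
    """|T| x |M| runs on the mutants plus |T| runs on the original program."""
    return num_tests * num_mutants + num_tests


# Hypothetical figures: 100 mutants, 20 of them equivalent, 60 killed.
print(mutation_score(killed=60, generated=100, equivalent=20))  # 0.75
# Six test cases and seven mutants: 42 mutant runs plus 6 runs on the original.
print(worst_case_executions(num_tests=6, num_mutants=7))        # 48
```

Note that E is only known after the equivalent mutants have been identified, so in practice a tool can report K / M as a lower bound on the score until that analysis is done.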
So, mutation testing’s main difficulties come from the number of mutants the operators generate, the number of required executions, and the result analysis step, which is hampered by the equivalent mutants introduced, usually around 20 percent. These difficulties, together with the “strange” nature of mutation (discovery of artificially seeded faults), mean that this testing technique hasn’t received much attention from the industrial community, which is more interested in detecting real faults in the actual application. Some studies have discussed how discovering all the faults seeded by mutation operators might subsume (see the “Subsumption of Coverage Criteria” sidebar),2 probably subsume,3 or correspond to several widely accepted coverage criteria (such as decision, condition, condition/decision, and modified condition/decision). From here, you can consider the mutation score as an adequate-coverage criterion if good mutation operators are applied.4 Indeed, killing a set of mutants generated with a good set of operators helps to fulfill two goals. The first is to have a good test suite, T (if T discovers all the artificial faults). The second is to have a reliable original program P, if T finds no faults in P.

[Figure 1: flowchart. Boxes: Input test program; Test suite (T); Run T on P; P(T) correct?; Fix P; Create mutants; Run T on each alive mutant; Eliminate ineffective TCs; Define threshold; Threshold reached?; Are there valid mutants?; Remove equivalent mutants; Add new test cases to T.]
Figure 1. A modified version of mutation testing where T is the test suite, P is the program under test, and TC is a test case. The tester checks the correct behavior of the original program before generating mutants.

Mutation Testing
Figure 1 shows a testing process that slightly modifies the one that A. Jefferson Offutt proposed.5

May/June 2010 IEEE SOFTWARE 81
Offutt’s process proposes generating mutants and iteratively executing the test cases against the live ones. As long as the process doesn’t reach a minimal, preestablished mutation score threshold, the tester must add new test cases to the suite until he or she finds the desired number of introduced faults. Then, the tester compares the results of executing the test cases against the original program with the expected results. If any are incorrect, the developers must fix the original program and restart the process. So, it’s possible to detect errors in the original program after having achieved the desired mutation score, which may require a new execution of all the steps. Once the original program has changed, some or all of the first-generation mutants are no longer valid, so the tester (perhaps using a tool) must create and execute new test cases. Then the tester, assisted by the mutation tool, should analyze the results again.

To mitigate this possibility, the tester should first execute the test cases against the input program to find any faults as soon as possible. In the refined process in Figure 1, the tester evaluates the correctness of the input program against the initial test suite. It’s obviously important to write a good set of test cases that provide as much coverage as possible. To this end, the literature includes a broad set of test data selection techniques and combination strategies to produce good test suites.6–7 (For additional resources, tools, and so on, see Mutation Testing Online; www.mutationtest.net.)

Once the tester has a process model to follow, the next point of interest is reducing costs in the “create mutants,” “run T on each alive mutant,” and “threshold reached” boxes of the figure. Eliminating ineffective test cases can occur during test case execution or after, by applying a test-suite-reduction algorithm based on mutation.

Mutant Generation
When mutants are generated, several operators can mutate almost every executable instruction in the original program, meaning that the number of mutants generated for a normal program can be huge. Depending on the system, this can result in high costs for compilation and further steps. To reduce such costs, the tester can select either random mutants or the best mutation operators (selective mutation).

Regarding random selection, research has shown that a 100 percent mutation score for 10 percent of the mutants is nearly adequate for a full mutation analysis.8–11 However, such selection requires generating, compiling, and linking all the mutants.

Researchers have concentrated on selective mutation, which consists of generating mutants using only a reduced subset of mutation operators. The criterion for selecting the operators lies in the “goodness” of the mutants generated. Elfurjani S. Mresa and Leonardo Bottaci conducted an empirical study to determine the best operators and when it’s preferable to use random selection.6 They observed, for example, that SVR (scalar variable replacement), ASR (array reference for scalar variable replacement), and CSR (constant for scalar variable replacement) operators generate the most mutants (confirming a previous analysis12) and that these operators perhaps shouldn’t be included in a selective set. Mresa and Bottaci classify the mutation operators in several categories. They reached two main conclusions.

■ If the program under test requires a mutation score very close to 100 percent, then random selection is more efficient than selective mutation.
■ If less stringent test coverage is acceptable, then selective mutation is more efficient when based on a restricted set of efficient operators: AOR (arithmetic operator replacement), SAN (statement analysis), SDL (statement deletion), ROR (relational operator replacement), and UOI (unary operator insertion).

In a previous study, Offutt and his colleagues concluded that test sets that are adequate for the mutants generated by AOR, ROR, UOI, ABS (absolute value insertion), and LCR (logical connector replacement) achieve a full mutation score of 99 percent, reducing the number of mutants generated by 77 percent.13

The suitability of mutation operators for testing a program might depend on its programming language. The selected operators apply to almost any programming language; however, they don’t consider, for example, the manipulation of pointers in languages such as C or C++ or the characteristics of object orientation, such as inheritance and polymorphism. In this respect, James Andrews and his colleagues conducted an experiment on C programs by applying Offutt’s selected operators and adding SDL because the subject programs “contained a large number of pointer-manipulation and field-assignment statements that would not be vulnerable to any of the sufficient mutation operators.”14

Mutation operators try to imitate common errors that programmers commit (such as using a null pointer or not overriding an inherited operation) and rely on the coupling effect, in which



a test data set that detects all simple faults in a program is so sensitive that it also detects more complex faults.15–16 One criticism of mutation testing is the artificial nature of the faults seeded. Andrews and his colleagues reached two important conclusions. First, using selectively generated mutants (from which the equivalent mutants must be removed) can indicate a test suite’s fault detection ability. Second, their experiment “shows the danger of using faults selected by humans, since it leads to underestimating the fault detection ability of test suites.” Thus, they also note the convenience of automatic mutant generation, which provides a well-defined, fault-seeding process and the possibilities of replication and criteria subsumption.14

Table 1
A killing matrix for a supposed program
Each X represents that the tci test case has killed the mj mutant

Mutant tc1 tc2 tc3 tc4 tc5 tc6
m1 X X
m2 X X X
m3 X X
m4 X X
m5 X X
m6 X X
m7 X
In a study regarding sufficient mutation operators for C, Ellen F. Barbosa and her colleagues selected from the mutation operators in the Proteum tool this set of operators: SWDD (while replacement by do-while), SMTC (n-trip continue), SSDL (statement deletion), OLBN (logical operator by bitwise operator), OASN (arithmetic operator by shift operator), ORRN (relational operator mutation), VTWD (twiddle mutations), VDTR (domain traps), Cccr (constant for constant replacement), and Ccsr (constant for scalar replacement).17

In addition, Yu-Seung Ma and her colleagues have developed MuJava, a Java mutation testing tool that uses mutant schemata generation to directly manipulate Java bytecode, thus saving time in mutant compilation.18

Test Case Generation and Execution
Having a reduced-size test suite is important, especially for regression testing during software maintenance. Mats Grindal and his colleagues reviewed strategies for test case generation, each with its advantages and drawbacks: Each choice, for example, produces small test suites but provides low coverage. All combinations provides the highest coverage but produces the largest test suites. Moreover, many of those cases are redundant, because they don’t increase the coverage reached by other test cases in the same suite.7

Regarding test case execution, the most common way to eliminate the ineffective test cases (see the box in Figure 1) is to execute each test case only against the mutants that remain alive. So, after the execution, the tester obtains a reduced test suite that reaches the same mutation score as the whole test suite. Consider, for example, Table 1, which shows a program with seven mutants and a test suite with six test cases. At first glance, the complete execution requires 6 × 7 = 42 executions (setting aside the six executions in the original program). If test cases execute only against those mutants that remain alive, then the number of executions might decrease significantly: tc1 kills m1 and m2, which are removed from the mutant suite (there are seven executions at this point). Then, tc2 executes against m3 to m7 (five executions), and m3 is removed from the mutant suite because it’s killed. Then, tc3 executes on m4 to m7 (four executions), removing m4, m5, and m6 from the mutant set. In this example, tc4 and tc5 execute with no positive results against the only live mutant (m7). Finally, one more execution of tc6 kills m7. In this way, only 19 test case executions are required instead of 42, and the test suite can be reduced to four test cases.

Another possibility is reducing the suite after all test cases execute. Although the problem of minimizing a test suite (the “optimal test-suite reduction problem”) has been shown to be NP-hard19 (and thus has no known polynomial-time solution), several approaches present greedy algorithms for its solution (along with several authors, Neelam Gupta has worked intensively in this area20). These approaches require complete execution of all test cases against all the mutants: in Table 1, tc1 and tc6 reach the same mutation score as the complete test suite. If the test case selection occurs during mutant execution (as in the example given in the previous paragraph, where four test cases were selected), the reduced suite can be larger than the one a greedy algorithm obtains (as in this example, where only two test cases are selected). Since testing is often programmed as

unattended, nightly batch processes, the complete execution and further application of a greedy algorithm is a good choice for approaching the optimal reduced suites, which can be an important benefit in regression testing.

Another promising approach in test case execution is weak mutation,21–23 which requires the continuous observation of the mutant being executed to check its intermediate state changes with respect to the original program. Classic mutation (also called strong mutation) considers that a test case kills a mutant when the test case’s output on the mutant differs from its output on the original program. More formally, this requires three conditions: reachability (the mutated statement must be reached), necessity (once the statement has been reached, the test case must cause an erroneous state on the mutant), and sufficiency (the erroneous state must be propagated to the output). Weak mutation requires only the first two conditions to detect the change introduced in the mutant, considering it killed as soon as the different state is detected.

Result Analysis
The most important obstacle in this third step is the presence of equivalent mutants. Phyllis Frankl and her colleagues discuss the almost prohibitive cost of detecting equivalent mutants.24 Bernhard Grün and his colleagues report that assessing the equivalence of a single mutation took 15 minutes.25

From a formal point of view, the problem of detecting all equivalent mutants is undecidable, although in practice you can detect some by annotating the program under test with restrictions26 and program slicing.27 However, the industry doesn’t usually apply these techniques, so they aren’t easily adaptable to common software development practice.

Many selective-mutation concepts aim to reduce the number of equivalent mutants, which implies a considerable reduction during result analysis. From the automatable-techniques perspective, a recent paper discusses perhaps the most significant results and relies on n-order mutation.28 An n-order mutant contains n faults instead of 1 and proceeds from a previous generation’s combination of mutants. Thus, two first-order mutants (each with a fault) are combined into a second-order mutant with two faults, which might in turn be combined with another first-order mutant to obtain a third-order mutant. The paper describes three algorithms for producing second-order mutants from first-order mutants and shows meaningful cost reductions in mutation testing, especially during result analysis. At first glance, the number of second-order mutants corresponding to a set of first-order mutants is one-half (although each algorithm produces different quantities of second-order mutants). In general, there will be about 1/n as many n-order mutants as first-order mutants. The number of test case executions also decreases (test cases run against 1/n of the mutants instead of all of them). Perhaps more important, the percentage of equivalent mutants significantly decreases because with about 20 percent of first-order equivalent mutants, the probability of combining two equivalent mutants to produce a new one decreases to 4 percent. Obviously, the counterpart is the possibility of killing all the n-order mutants with test cases that only discover one of the n seeded faults. The authors of the paper include an experimental study with benchmark programs and some pieces of industrial software, concluding that, as long as the tester is aware of this risk, even sixth-order mutation can be effective.28

An industrially applicable mutation-testing tool should have these requirements:

■ Users should be able to generate mutants with a selective set of generally applicable mutation operators, most likely AOR, ROR, UOI, ABS, and LCR. Additionally, and for specific languages or environments, the tool should consider including other concrete operators.
■ Users should be able to select a random set of mutants.
■ Also depending on the specific environment, the tool should allow mutation at compiled-code level (bytecode for Java, Microsoft Intermediate Language for .NET, and so on).
■ In test execution, the tool should support both executing test cases on only the mutants remaining alive and, regarding batch, unattended testing cycles, selecting a reduced test suite with, for example, a greedy algorithm.
■ The tool should support instrumentation of both the original program and the mutants to keep a log of the execution. Changes in a log would highlight a behavior difference, meaning that the corresponding mutant has been killed, and making this technique a type of weak mutation.
■ To reduce result analysis costs, the tool should allow n-order mutation, which is easily automatable and transferable to industry.

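As a back-of-the-envelope sketch of the n-order arithmetic, the following estimates how many n-order mutants result from a set of first-order mutants and how likely it is that all the combined mutants are equivalent. It assumes that each n-order mutant combines n distinct first-order mutants and that equivalence is independent across them; the function name and the sample inputs are hypothetical, and the sketch is not one of the three combination algorithms from the cited paper.

```python
import math


def n_order_estimates(first_order: int, equivalent_ratio: float, n: int = 2):
    """Rough size and equivalence estimates for n-order mutation."""
    if n < 1 or first_order < 0 or not 0.0 <= equivalent_ratio <= 1.0:
        raise ValueError("invalid arguments")
    # Each n-order mutant uses up n first-order mutants.
    mutants = math.ceil(first_order / n)
    # Probability that all n combined first-order mutants are equivalent,
    # under the independence assumption stated above.
    all_equivalent_probability = equivalent_ratio ** n
    return mutants, all_equivalent_probability


mutants, p_equiv = n_order_estimates(first_order=78, equivalent_ratio=0.20, n=2)
print(mutants)            # 39
print(round(p_equiv, 2))  # 0.04
```

Real combination algorithms may reuse some first-order mutants, so the actual number of n-order mutants can be somewhat higher than this lower estimate.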


Data from some experiments shows that a mean of 18.66 percent of the mutants generated are equivalent.28 Taking the triangle-type problem as a possible baseline (a small program which many researchers have used for testing experiments), the MuJava tool generated 309 mutants (70 of them equivalent, 22.65 percent):

■ Applying selective mutation could reduce the number of mutants by three-fourths, to 78 mutants. As we’ve discussed, there’s significant confidence that a test suite killing these 78 mutants would also kill the original 309.
■ Supposing a uniform distribution of equivalent mutants per operator (which actually isn’t true, because each operator has a different proneness to produce this kind of noise), the process would generate 16 equivalent mutants.
■ Combining the 78 selected mutants with a good combination algorithm (DifferentOperators is the best of the three presented) would produce between 50 and 55 percent as many second-order mutants (39 to 43), with about 5 percent (2) of equivalent mutants.

Combining these techniques means a cost savings that could surpass 75 percent of the original costs (important savings, for example, for the case of Grün, who reported 40 percent of equivalent mutants in an industrial project25). Additionally, this type of tool could even reduce the cost of test case execution via code instrumentation for supporting weak mutation (because it wouldn’t require executing each test case until its termination). Currently, we’re developing Bacterio, a tool with many of these characteristics. To view Bacterio, along with the experimental material cited in this section, visit http://alarcos.esi.uclm.es/testing.

Acknowledgments
The PRALÍN (Pruebas en Líneas de Producto, Junta de Comunidades de Castilla-La Mancha/European Social Fund, grant PAC08-121-1374) and the PEGASO/MAGO (Ministerio de Ciencia e Innovación, grant TIN2009-13718-C02-01) projects partially supported this work.

About the Authors
Macario Polo Usaola is a professor of computer science in the Department of Information Systems and Technologies at the University of Castilla-La Mancha. He’s also an active member of the Alarcos Research Group. His research interests relate to the automation of software testing tasks. Polo has a PhD in computer science from the University of Castilla-La Mancha. Contact him at [email protected].

Pedro Reales Mateo is a PhD student of computer science in the University of Castilla-La Mancha’s Department of Information Systems and Technologies. His research interests relate to the automation of software testing. Reales has an MSc in computer science from the University of Castilla-La Mancha. Contact him at [email protected].

References
1. R. DeMillo, R.J. Lipton, and F.G. Sayward, “Hints on Test Data Selection: Help for the Practicing Programmer,” IEEE Computer, vol. 11, no. 4, 1978, pp. 34–41.
2. A.J. Offutt and J.M. Voas, Subsumption of Condition Coverage Techniques by Mutation Testing, tech. report ISSE-TR-96-01, Dept. of Information and Software Systems Eng., George Mason Univ., 1996.
3. A.J. Offutt et al., “An Experimental Evaluation of Data Flow and Mutation Testing,” Software: Practice and Experience, vol. 26, no. 2, 1996, pp. 165–176.
4. M. Polo, M. Piattini, and S. Tendero, “Integrating Techniques and Tools for Testing Automation,” Software Testing, Verification and Reliability, vol. 17, no. 1, 2007, pp. 3–39.
5. A.J. Offutt, “A Practical System for Mutation Testing: Help for the Common Programmer,” Proc. 12th Int’l Conf. Testing Computer Software (ICST 95), IEEE CS Press, 1995, pp. 99–109.
6. E.S. Mresa and L. Bottaci, “Efficiency of Mutation Operators and Selective Mutation Strategies: An Empirical Study,” Software Testing, Verification and Reliability, vol. 9, no. 4, 1999, pp. 205–232.
7. M. Grindal, A.J. Offutt, and S.F. Andler, “Combination Testing Strategies: A Survey,” Software Testing, Verification and Reliability, vol. 15, no. 3, 2005, pp. 167–199.
8. R.A. DeMillo and E.H. Spafford, “The Mothra Software Testing Environment,” Proc. 11th NASA Software Eng. Laboratory Workshop, Goddard Space Center, 1986.
9. A.T. Acree, “On Mutation,” doctoral dissertation, School of Information and Computer Science, Georgia Inst. of Technology, 1980.
10. R.A. DeMillo et al., “An Extended Overview of the Mothra Software Testing Environment,” Proc. 2nd Workshop Software Testing, Verification, and Analysis, IEEE CS Press, 1988, pp. 142–151.
11. K.N. King and A.J. Offutt, “A Fortran Language System for Mutation-Based Software Testing,” Software: Practice and Experience, vol. 21, no. 7, 1991, pp. 685–718.
12. A.P. Mathur, “Performance, Effectiveness, and Reliability Issues in Software Testing,” Proc. 15th Ann. Int’l Computer Software and Applications Conf., IEEE CS Press, 1991, pp. 604–605.
13. A.J. Offutt et al., “An Experimental Determination of Sufficient Mutant Operators,” ACM Trans. Software Eng. and Methodology, vol. 5, no. 2, 1996, pp. 99–118.
14. J. Andrews, L. Briand, and Y. Labiche, “Is Mutation an Appropriate Tool for Testing Experiments?” Proc. 2005 Int’l Conf. Software Eng. (ICSE 05), ACM Press, 2005, pp. 402–411.
15. R. DeMillo, R.J. Lipton, and F.G. Sayward, “Hints on Test Data Selection: Help for the Practicing Programmer,” IEEE Computer, vol. 11, no. 4, 1978, pp. 34–41.
16. A.J. Offutt, “Investigations of the Software Testing Coupling Effect,” ACM Trans. Software Eng. and Methodology, vol. 1, no. 1, 1992, pp. 15–20.
17. E.F. Barbosa et al., “Toward the Determination of Sufficient Mutant Operators for C,” Software Testing, Verification and Reliability, vol. 11, no. 2, 2001, pp. 113–136.
18. Y.-S. Ma, “MuJava: An Automated Class Mutation System,” Software Testing, Verification and Reliability, vol. 15, no. 2, 2005, pp. 97–133.
19. M.R. Garey and D.S. Johnson, Computers and Intractability, W.H. Freeman, 1979.
20. D. Jeffrey and N. Gupta, “Test Suite Reduction with Selective Redundancy,” Proc. 21st Int’l Conf. Software Maintenance (ICSM 05), IEEE CS Press, 2005, pp. 549–558.
21. R. DeMillo, E. Krauser, and A. Mathur, “Compiler-Integrated Program Mutation,” Proc. 15th Ann. Computer Software and Applications Conf. (Compsac 91), 1991, pp. 351–356.
22. W.E. Howden, “Weak Mutation Testing and Completeness of Test Sets,” IEEE Trans. Software Eng., vol. 8, no. 4, 1982, pp. 371–379.
23. A.J. Offutt and S.D. Lee, “An Empirical Evaluation of Weak Mutation,” IEEE Trans. Software Eng., vol. 20, no. 5, 1994, pp. 337–344.
24. P.G. Frankl, S.N. Weiss, and C. Hu, “All-Uses versus Mutation Testing: An Experimental Comparison of Effectiveness,” J. Systems and Software, vol. 38, no. 3, 1997, pp. 235–253.
25. B.J.M. Grün, D. Schuler, and A. Zeller, “The Impact of Equivalent Mutants,” Proc. IEEE Int’l Conf. Software Testing, Verification, and Validation Workshops (ICST 09), IEEE CS Press, 2009, pp. 192–199.
26. A.J. Offutt and J. Pan, “Automatically Detecting Equivalent Mutants and Infeasible Paths,” Software Testing, Verification and Reliability, vol. 7, no. 3, 1997, pp. 165–192.
27. R. Hierons and M. Harman, “Using Program Slicing to Assist in the Detection of Equivalent Mutants,” Software Testing, Verification and Reliability, vol. 9, no. 4, 1999, pp. 233–262.
28. M. Polo, M. Piattini, and I. García-Rodríguez, “Decreasing the Cost of Mutation Testing with Second-Order Mutants,” Software Testing, Verification and Reliability, vol. 19, no. 2, 2008, pp. 111–131.
