RELIABLE CODE
Assertive Testing
Gerard J. Holzmann, NASA/JPL
A COLLEAGUE ASKED me recently, “Are there any generally accepted methods for accurately predicting software reliability?” Sadly, the honest answer is no. Surely there are generally accepted, and practiced, methods, but no one would claim that they can make accurate predictions. And if the predictions aren’t accurate, how useful are they really?

If that sounds overly pessimistic, it’s because the question was phrased more or less as an absolute. Instead of asking whether methods exist that can predict reliability accurately, it’s perhaps more helpful to ask whether methods exist that can improve reliability. Here we’re on firmer ground. Indeed, generally accepted methods exist that can measurably improve reliability. Software testing is an obvious example of such a method, but not the only, and perhaps not even the best, such method. Here, I look at simple, effective ways to augment standard software testing.

Measuring Reliability
How can we measure software reliability? Does a generally accepted metric exist? A familiar dictum is “If you can’t measure it, you can’t manage it.”

Reliability clearly has something to do with the absence of failures. A common approach is therefore to define reliability by measuring its opposite: the probability of failure. This is similar to trying to define health as the absence of illness. If you’re healthy, the probability that you’ll get sick in some interval of time should be small, although it likely will never be zero. So it is for software.

To measure a software application’s reliability, then, we can try to express the rate of discovery of defects that might lead to failure as a probability.

For instance, if the long-term probability of an application exhibiting a failure is p, that application’s reliability (the probability of failure-free operation) is 1 - p. If p is 10^-9 per hour of operation, we shouldn’t expect to see more than one failure per 100,000 years of operation on average, which should satisfy even the most demanding applications.

Reaching that target of 10^-9 failures per hour can be extraordinarily difficult. For instance, a recent government report specified the required
12 IEEE SOFTWARE | PUBLISHED BY THE IEEE COMPUTER SOCIETY 0740-7459/15/$31.00 © 2015 IEEE
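The arithmetic in “Measuring Reliability” can be checked with a few lines of C. This is only a sketch under stated assumptions: the function names and the 365.25-day year are illustrative choices, not part of the column, and the code treats p as a constant failure rate per hour, so that the mean time between failures is 1/p hours.

```c
#include <assert.h>

/* Assumption: an average year of 365.25 days. */
#define HOURS_PER_YEAR (365.25 * 24.0)

/* Reliability as defined in the text: the probability of
 * failure-free operation, 1 - p. (Hypothetical helper name.) */
double reliability(double p_failure)
{
    assert(p_failure >= 0.0 && p_failure <= 1.0);
    return 1.0 - p_failure;
}

/* Mean number of years between failures, assuming a constant
 * failure rate per hour of operation. (Hypothetical helper name.) */
double mean_years_between_failures(double failures_per_hour)
{
    assert(failures_per_hour > 0.0);
    return (1.0 / failures_per_hour) / HOURS_PER_YEAR;
}
```

With a rate of 10^-9 failures per hour, mean_years_between_failures returns roughly 114,000 years, which agrees with the column's figure of no more than one failure per 100,000 years.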
MAY/JUNE 2015 | IEEE SOFTWARE
3. Boundary cases. Test the code for the correct handling of boundary conditions, where the code is exercised at the edge of its operational profile.
4. Stress testing. Test the code under stress or overload conditions.
5. Error handling. Test the code for the correct handling of all conceivable error conditions, such as invalid inputs, and ideally for different combinations of component failures.

Error-handling code is often the least thoroughly tested part of any software system and therefore the most likely to contain latent defects. This is precisely the part of the system you want to be the most robust, but it rarely is. An effective technique in this stage is to use test randomization, also called fuzz testing, which has proven remarkably effective in finding unsuspected breaking points.

Another way to improve the rigor of software testing is to use model-based testing. First, the system engineer or software developer constructs a high-level model of how the software should work. This high-level model can then be used to derive, often automatically, a suite of test cases. The model should encapsulate as many software requirements as possible, which means that the tests can check that the requirements are met. If the tests generated from the high-level model don’t cover all of the code, the model is incomplete and should be extended. It’s also possible that the software contains too many parts that are unrelated to the software requirements. This can mean that you should delete them to slim the code base down to a more manageable (and testable) size.

In running the tests, look for cases in which the results differ from the model’s predictions. The problem can be with the model, the software, or the requirements. Model-based testing can also make it easier for formal-methods types like me to apply more rigorous forms of software verification, for instance with the help of logic-model checkers.

Assert Yourself
Another way to improve the thoroughness of a software test, and with it the reliability of the target application, is relatively simple: use assertions. As a rule of thumb, aim for an average assertion density of one to two percent across all your code. If you follow this rule, you won’t be alone: Microsoft follows it in the Office software suite [3], and NASA’s Jet Propulsion Laboratory (JPL) uses it in the development of its mission-critical flight code.

Using assertions can ensure that you catch defects at the earliest possible point in an execution, not only during normal system test phases but also later, when your code has reached the end user.

For instance, you can place an assertion in the body of every loop in the code, to ensure that a reasonable maximum number of iterations is never exceeded. You’d be surprised how many bugs this one measure can catch early in software development. If you’re unsure about what upper bound to use, multiply your most generous guess by a thousand or more. The real problem you’re defending against is an execution getting stuck in an infinite loop, for instance when a linked list accidentally becomes circular.

Another good strategy is to place an assertion before every division operation, to ensure you’re not accidentally dividing by zero or a number very close to zero. Similarly, place an assertion before pointer dereference operations, to check that they can’t cause a crash. You can use assertions similarly to check that parameters passed to a function are in a safe range or that the result returned to a caller passes a sanity check. If you’re worried that in a time-critical system, you can’t afford the cost of evaluating a few extra Boolean expressions, you’re operating too close to the margin. You should take this as an indication that it’s time to refactor the code. No policeman will be persuaded either if you claim that you had no time to stop at a red traffic light.

Statement Coverage
A common goal in testing, inspired by guidelines such as DO-178B/C (which deals with software safety for airborne systems), is to ensure that all your tests combined secure full statement and branch coverage. This means that each statement in
your code must be exercised by at least one test, and every clause in every conditional test must independently evaluate to true and to false in at least one test. What’s sometimes forgotten is that it’s not enough to merely execute a statement; a test must also actually check something. This is where assertions can again prove their value: they provide some additional independent checks of an execution’s sanity.

The insight that assertions can help make systems more reliable isn’t new, of course. The familiar include file <assert.h>, with the definition of a few macros to support the use of assertions in C code, was added to the Unix C compilers as early as 1978. Mike Lesk (also responsible for the Unix tools lex and uucp) first added this file as one of several improvements he made to the C preprocessor.

An assert keyword appeared earlier in the 1972 definition of Algol W. The language report on Algol

And, oh yeah, don’t disable those carefully crafted assertions when you ship a product to your customers. Microsoft doesn’t do so in Office, and neither does JPL when its embedded software hitches a ride to Mars. The assertions can help you detect, diagnose, and fix the latent defects in your code before they can do harm. In a sense, removing or disabling software assertions before shipping a system to customers would make as much sense as a car maker removing the seatbelts and airbags from a car after all crash tests have been completed.

References
1. F-35 Joint Strike Fighter: Problems Completing Software Testing May Hinder Delivery of Expected Warfighting Capabilities, GAO-14-322, US Government Accountability Office, Mar. 2014, p. 18; www.gao.gov/assets/670/661842.pdf.
2. C.M. Holloway, “Why You Should Read Accident Reports,” presentation at the Software and Complex Electronic Hardware Standardization Conf., 2005.
3. C.A.R. Hoare, “Assertions: A Personal Perspective,” IEEE Annals of the History of Computing, vol. 25, no. 2, 2003, pp. 14–25.
4. A. Van Wijngaarden, B.J. Mailloux, and J.E.L. Peck, Revised Report on the Algorithmic Language Algol 68, Springer, 1976.
5. L.A. Clarke and D.S. Rosenblum, “A Historical Perspective on Runtime Assertion Checking in Software Development,” ACM SIGSOFT Software Eng. Notes, vol. 31, no. 3, 2006, pp. 25–37.

GERARD J. HOLZMANN works at the Jet Propulsion Laboratory on developing stronger methods for software analysis, code review, and testing. Contact him at [email protected].
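The assertion placements described under “Assert Yourself” (loop bounds, division, pointer dereference, parameter ranges, and result sanity checks) might look like this in C with <assert.h>. This is a minimal sketch, not a description of JPL's or Microsoft's actual code: the function names, the bounds MAX_LIST_ITERATIONS and MAX_SAMPLES, and the MIN_DIVISOR threshold are all hypothetical and would need to be chosen per application.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical limits: a generous guess, multiplied by a large factor. */
#define MAX_LIST_ITERATIONS 1000000L
#define MAX_SAMPLES         1000000u
/* Hypothetical threshold for "a number very close to zero". */
#define MIN_DIVISOR         1e-9

struct node {
    int value;
    struct node *next;
};

/* Loop-bound assertion: fails if the list has accidentally become
 * circular, instead of spinning forever. */
long long list_sum(const struct node *head)
{
    long long sum = 0;
    long iterations = 0;

    for (const struct node *p = head; p != NULL; p = p->next) {
        iterations++;
        assert(iterations <= MAX_LIST_ITERATIONS);
        sum += p->value;
    }
    return sum;
}

/* Assertion before a division: rejects zero and near-zero divisors. */
double safe_ratio(double numerator, double denominator)
{
    assert(denominator > MIN_DIVISOR || denominator < -MIN_DIVISOR);
    return numerator / denominator;
}

/* Assertions on a pointer, on a parameter range, and on the result. */
double average(const int *samples, size_t n)
{
    assert(samples != NULL);            /* safe to dereference */
    assert(n > 0 && n <= MAX_SAMPLES);  /* parameter in a safe range */

    long long sum = 0;
    int lo = samples[0];
    int hi = samples[0];
    for (size_t i = 0; i < n; i++) {
        sum += samples[i];
        if (samples[i] < lo) lo = samples[i];
        if (samples[i] > hi) hi = samples[i];
    }

    double avg = (double)sum / (double)n;
    assert(avg >= lo && avg <= hi);     /* result passes a sanity check */
    return avg;
}
```

The standard assert macro compiles to nothing when NDEBUG is defined, so keeping these checks active in shipped code, as the column recommends, means building without that definition.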