Today: Random Testing Again
More YAFFS & project
Grill Alex! Maybe I've forgotten something important
CUTE: concolic testing
Random Testing
Random testing is, of course, the most used and least useful method
We mean random in the mathematical sense
Take a stream of pseudo-random numbers and map them into test operations/cases
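A minimal sketch of that mapping, under assumptions: the operation set, the function names, and the choice of Marsaglia's xorshift32 generator are all illustrative and not taken from any particular tester. Seeding the generator makes every generated test reproducible.

```c
#include <stdint.h>

/* Hypothetical operation set for a file-system tester. */
enum test_op { OP_OPEN, OP_WRITE, OP_READ, OP_CLOSE, OP_COUNT };

/* xorshift32 (Marsaglia): a small deterministic PRNG.
   The state must be nonzero. */
uint32_t next_random(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Map one random number to one test operation.
   The slight modulo bias is ignored in this sketch. */
enum test_op op_from_random(uint32_t r)
{
    return (enum test_op)(r % OP_COUNT);
}

/* Fill ops[0..n-1] with a pseudo-random operation sequence.
   The same seed always yields the same sequence. */
void generate_test(uint32_t seed, enum test_op *ops, int n)
{
    uint32_t state = seed ? seed : 1u;  /* avoid the all-zero state */
    for (int i = 0; i < n; i++)
        ops[i] = op_from_random(next_random(&state));
}
```

A real tester would also draw the arguments of each operation (paths, lengths, flags) from the same stream, so that the seed alone identifies the whole test case.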
Random Testing
In program testing, with systematic methods we know what we are doing, but not what it means; only by giving up all systematization can the significance of testing be known. - Hamlet, Random Testing
Hamlet talks about one advantage of random testing (that often doesn't really appear):
With random testing and an operational profile giving usage patterns for the program, with probabilities, we can make claims like:
It's 99% certain that P will fail no more than 1 in 1,000,000 times.
It's 95% certain that P has a mean-time-to-failure greater than 100 hours of operation.
Real statistics!
Sadly, usable operational profiles with probabilities attached are very rare
And the numbers mean nothing if the profile is something you make up
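As a worked illustration of where such numbers come from, assume N tests are drawn independently from the true operational profile and all of them pass. The symbols N, C (the confidence level) and \theta (the true per-run failure probability) are introduced here for the sketch only.

```latex
% Probability of seeing no failure in N independent runs
\[
  \Pr[\text{no failure in } N \text{ runs}] = (1-\theta)^{N}
\]
% To claim "failure probability at most \theta" with confidence C,
% that probability must be at most 1 - C:
\[
  (1-\theta)^{N} \le 1 - C
  \quad\Longrightarrow\quad
  N \ge \frac{\ln(1-C)}{\ln(1-\theta)}
\]
% Example: C = 0.99 and \theta = 10^{-6}
\[
  N \ge \frac{\ln 0.01}{\ln\left(1-10^{-6}\right)} \approx 4.6 \times 10^{6}
\]
```

So the "99% certain, no more than 1 failure in 1,000,000" claim needs roughly 4.6 million failure-free tests, and it holds only if the tests really are sampled from the profile the program will see in use.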
Random Testing
Hamlet also notes that random testing is a good baseline for other methods to compare to
Keeps us honest
If systematic is no better, then it may not be a very good approach
What's good about 80% (no loop) path coverage?
If, on the other hand, a comparison with random testing as the standard were available, it might help us to write better standards, or to improve the significance of systematic methods. - Hamlet, Random Testing
Hamlet's Claims
Two cases when only random testing will do (Hamlet, Workshop on Random Testing '06)
Cases where systematic testing is meaningless (no plan has a rational basis)
Cases where systematic testing is too difficult to carry out
Hamlet emphasizes the dangers of adding systematic choice without justification: confusing what software should do with what it does do
Hamlet's Claims
Danger of ignoring a test case because
"Oh come on, it couldn't possibly fail to handle that correctly" or "Nobody would ever do that"
Compare to game theory: in cases where we really know something about the opponent's play, we can take advantage of it
But, lacking that, a random strategy may be inefficient but is the only strategy that cannot be gamed if the opponent knows what we're up to
This is not to imply that the programs we test are adversaries, out to get us, but it's sometimes useful to act as if they are
...
int fs_read(int fd, char *buf, size_t nbytes) {
    if (buf == NULL) {        /* reject a null buffer */
        errno = EINVAL;
        return -1;
    }
    if (!in_table(fd)) {      /* reject an unknown descriptor */
        errno = EBADF;
        return -1;
    }
    assert(1);
    ...
}

int main() {
    int i;
    int fd = nondet_int();             /* arbitrary descriptor */
    size_t nbytes = nondet_size_t();   /* arbitrary length */
    havoc(file_system_state);          /* arbitrary starting state */
    old_file_system_state = file_system_state;
    ...
    int res = fs_read(fd, NULL, nbytes);
    /* a rejected read must leave the file system unchanged */
    assert(file_system_state == old_file_system_state);
}
What, exactly, is a random C program?
Is a random C program going to fit any sane (but unknown) operational profile?
Are these the bugs we care about most?
For some programs, producing well-formed input that makes for interesting tests is fundamentally hard
If the program only breaks when x = 2^31, don't expect to find that randomly
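A uniform draw over 32-bit values hits one specific value with probability 2^-32, about 2.3e-10 per test. One common response, sketched here under assumptions, is to mix hand-picked special values into the generator. The function name, the list of special values, and the 1-in-8 mixing ratio are all illustrative choices.

```c
#include <stdint.h>

/* Values that uniform sampling over 32 bits essentially never produces. */
static const uint32_t special_values[] = {
    0u, 1u, 0x7FFFFFFFu, 0x80000000u /* 2^31 */, 0xFFFFFFFFu
};
#define NUM_SPECIAL (sizeof special_values / sizeof special_values[0])

/* Choose a test input from two raw random words.
   When selector is a multiple of 8 (about 1 draw in 8 for a uniform
   selector), return a special value indexed by raw; otherwise return
   raw itself as a plain uniform input. */
uint32_t pick_input(uint32_t selector, uint32_t raw)
{
    if (selector % 8u == 0u)
        return special_values[raw % NUM_SPECIAL];
    return raw;
}
```

With this mix, x = 2^31 comes up about once every 40 inputs instead of once every 4 billion, at the price of encoding a guess about where bugs live. That is exactly the kind of systematic choice Hamlet warns needs justification.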
Good luck finding that if you don't bake it into the random tester explicitly...
Document, preferably a PDF
More on how to submit in a second
Submit as .c (or .h I guess) file, where the name is original_yaffs_name.login.bug#.c
And two test cases (more on this too)
Tester
And see it run
Use whatever language you see fit, so long as that holds true
Admittedly, if I can't make head or tail of your tester (say it's in FORTRAN or unlambda), grading it fairly will be harder
Tester Output
If YAFFS passes the test, your tester should terminate with error code 0 and print (on standard output) the string:
TEST FAILED
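The reporting convention could be wrapped in one helper, sketched below. Only exit code 0 for a pass and the string TEST FAILED appear in the text above; the passing string TEST PASSED, the nonzero exit code 1 for a failure, and the name report_result are assumptions made for the sketch.

```c
#include <stdio.h>

/* Print the verdict on standard output and return the exit code
   the tester should terminate with. */
int report_result(int passed)
{
    if (passed) {
        puts("TEST PASSED");  /* ASSUMPTION: passing string not given in the text */
        fflush(stdout);
        return 0;             /* exit code 0 when YAFFS passes */
    }
    puts("TEST FAILED");
    fflush(stdout);           /* flush so the verdict is visible even if we die later */
    return 1;                 /* ASSUMPTION: any nonzero code signals failure */
}
```

A tester's main would end with something like `return report_result(all_checks_ok);`, so the verdict string and the exit code can never disagree.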
Tester Output
I'll count it as a case where you find a bug in YAFFS if the program hangs:
Hasn't terminated by the time limit of 60 minutes
Is not producing any new output
Tester Output
Sanity check
I'm going to make sure none of your testers say TEST FAILED or hang when run with the original YAFFS
So let me know if you have found a YAFFS bug
Tester Output
If you want to use a script to have your directtest2k run another tool on YAFFS, and then parse the output to produce that result, that's ok with me
It's worth some points, but not strictly required, that your tool also be able to produce a test case when a test fails: something more specific than "run the tester"
Tester Output
Bonus if you include delta-debugging tools for your test case format
C programs (or Python scripts) are very nice test cases, and easily delta-debuggable
Document your test case format and why you chose it in the test report
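To give a feel for what a reduction tool does, here is a simplified sketch. It is not the full ddmin algorithm: it deletes one operation at a time rather than chunks. The test case is assumed to be an array of integer operation codes, and fails_fn, reduce_test_case and example_fails are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical predicate: nonzero if running the n operations in ops
   still makes the tester report a failure. */
typedef int (*fails_fn)(const int *ops, size_t n);

/* Greedy reduction: try deleting each operation in turn and keep the
   deletion whenever the shorter sequence still fails. Repeats until a
   full pass removes nothing, so (for a deterministic predicate) the
   result is 1-minimal. Reduces ops in place and returns the new length. */
size_t reduce_test_case(int *ops, size_t n, fails_fn still_fails)
{
    int changed = 1;
    while (changed) {
        changed = 0;
        size_t i = 0;
        while (i < n) {
            int removed = ops[i];
            /* delete position i by shifting the tail left */
            memmove(&ops[i], &ops[i + 1], (n - i - 1) * sizeof ops[0]);
            if (still_fails(ops, n - 1)) {
                n--;              /* keep the deletion, retry same index */
                changed = 1;
            } else {
                /* undo: shift the tail back and restore the operation */
                memmove(&ops[i + 1], &ops[i], (n - i - 1) * sizeof ops[0]);
                ops[i] = removed;
                i++;
            }
        }
    }
    return n;
}

/* Illustrative predicate: the "bug" needs both op 3 and op 7 present. */
int example_fails(const int *ops, size_t n)
{
    int has3 = 0, has7 = 0;
    for (size_t i = 0; i < n; i++) {
        if (ops[i] == 3) has3 = 1;
        if (ops[i] == 7) has7 = 1;
    }
    return has3 && has7;
}
```

In a real tool the predicate would rerun the tester on the candidate test case and check for TEST FAILED, which is why a compact, replayable test case format pays off.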
What to Test
You must test these functions:
yaffs_StartUp
yaffs_mount
yaffs_unmount
yaffs_open
yaffs_write
yaffs_read
yaffs_close
yaffs_mkdir
yaffs_rmdir
yaffs_unlink
What to Test
Can use other functions to figure out what's going on with YAFFS
Might make it easier to find some bugs
But only use these basics in the test cases you submit for your bugs: make sure the bug can be exposed using only the core operations!
For open, need to test these options:
How to Test
Perform all tests on /ram2k
Use my replacement yaffscfg2k.c
On the website
Warning: academic software, don't expect it to work (I'm having difficulties right now)
CIL is a very useful tool if your testing ambitions involve instrumenting the code somehow (http://hal.cs.berkeley.edu/cil)
E.g., want to compute path coverage? Instrument every branch with a bit vector insertion
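The idea can be sketched by hand on a toy function. Everything below is illustrative: record_branch, BR, path_reset and clamp are hypothetical names, the instrumentation is written manually rather than inserted by CIL, and only the first 64 branch outcomes of a run are kept.

```c
#include <stdint.h>

static uint64_t path_bits;  /* bit i = outcome of the i-th branch executed */
static unsigned path_len;   /* number of branches executed in this run */

/* Record one branch outcome and pass it through unchanged. */
static int record_branch(int taken)
{
    if (path_len < 64)
        path_bits |= (uint64_t)(taken != 0) << path_len;
    path_len++;
    return taken;
}

/* Wrap a branch condition so its outcome is appended to the path. */
#define BR(cond) record_branch((cond) != 0)

/* Start a fresh path before each test run. */
static void path_reset(void)
{
    path_bits = 0;
    path_len = 0;
}

/* Toy function under test with every branch instrumented. */
int clamp(int x, int lo, int hi)
{
    if (BR(x < lo)) return lo;
    if (BR(x > hi)) return hi;
    return x;
}
```

After each run, the pair (path_len, path_bits) identifies the path taken. Collecting the distinct pairs over a test campaign gives the number of paths covered.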
Look around: you might find something useful that will save you a lot of work
Warning, again: the academic software is often not-quite-ready-for-prime-time
Questions?
Some advice
Take a look at my tester
Take a look at one of the bugs
Figure out why my tester can't find it