STMT Unit-2
UNIT II
UNIT-II: Flow graphs and Path testing: Basic concepts of path testing,
predicates, path predicates and achievable paths, path sensitizing, path
instrumentation, application of path testing.
If the set of paths is properly chosen, then we have achieved some measure
of test thoroughness.
Motivation:
Path testing is most applicable to new software for unit testing. It is a
structural technique.
Path testing is rarely used for system testing. For the programmer it is the
basic test technique.
Path-testing techniques are the oldest of all structural test techniques. They
are recorded as being in use at IBM for more than two decades.
Path testing requires complete knowledge of the program's structure (i.e.,
source code). It is most often used by programmers to unit test their own code.
The effectiveness of path testing rapidly deteriorates [becomes worse, degrades]
as the size of the software aggregate under test increases.
Process blocks
Decisions
Case statements
Junctions
A process has one entry and one exit. It can consist of a single statement or
instruction, or a sequence of statements or instructions.
Process:
Decision:
Case Statement:
From the point of test design, there are no differences between decisions and
case statements.
2.2.3. Junctions
A junction is a point in the program where the control flow can merge.
Examples of junctions are: the target of a jump or skip (in assembly language),
a label that is the target of a GOTO, the END-IF and CONTINUE statements in
FORTRAN, and END and UNTIL (in Pascal).
Junctions:
The flowchart focuses on process steps, whereas the control flow graph
ignores them.
The first step in translating this code to a control flow graph is shown in the
figure below, where we have the typical one-to-one flowchart.
Note that the complexity has increased, clarity has decreased, and that we had
to add auxiliary labels (LOOP, XX, and YY), which have no actual program
counterpart. In the figure below, we merged the process steps and replaced
them with a single process box. We now have a control flow graph.
But this representation is still too busy. We simplify this notation further to
achieve figure 4. To do that, we had to make several more notational changes.
2. We don't need to know the specifics of the decisions, just the fact that
there is a branch, so we can do away with things such as "U>V?",
"yes", "no", and so on.
3. The specific target label names aren't important, just the fact that
they exist. So we replace them with simple numbers.
Figure 4 is the way we usually represent the program's control flow graph. In
it, there are two types of components: circles, and arrows that join the
circles.
Note: The entry and exit are also denoted by circles (nodes).
Nodes are usually numbered (or) labeled by using the original program labels.
The link name can be formed from the names of the nodes it spans. Thus a link
from node 7 to node 4 is called link (7, 4), whereas a link from node 4 to node
7 is called link (4, 7). For parallel links between a pair of nodes (nodes 12 and
13 in figure 4), we can use subscripts to denote each one, or some
unambiguous notation such as "(12, 13 upper)" and "(12, 13 lower)".
An alternate way to name links that avoids this problem is to use a unique
lowercase letter for each link in the flow graph.
The final transformation is shown in figure 5, where we've dropped the node
numbers to achieve an even simpler representation.
The way to work with control flow graphs is to use the simplest possible
representation, that is, no more information than is needed to correlate back
to the source program.
The linked list is the representation of choice for programs that manipulate or
create flow graphs.
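As a rough illustration of a list-based representation, the sketch below stores
each node together with the list of links that leave it. The graph itself is
hypothetical (its node numbers and link letters are assumptions made for this
example and are not taken from the figures).

```python
# Hypothetical flow graph: node -> list of (link_name, target_node).
# Node 1 is assumed to be the entry and node 2 the exit.
FLOW_GRAPH = {
    1: [("a", 3)],
    3: [("b", 4), ("c", 5)],   # node 3 is a decision with two outlinks
    4: [("d", 5)],
    5: [("e", 2)],
    2: [],                     # exit node: no outlinks
}

def outlinks(graph, node):
    """Return the link letters that leave a node."""
    return [name for name, _ in graph[node]]

def links_between(graph, source, target):
    """Return the node-pair names of all links from source to target."""
    return [(source, t) for _, t in graph[source] if t == target]
```

With this layout a link can be named either by its letter or by the pair of
nodes it spans, which matches the two naming schemes described earlier.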
The translation from a flow graph element to a statement and vice versa is not
always unique. Myers cites an anomaly based on different representations of
the FORTRAN statement "IF (A=0) .AND. (B=1) THEN ...". It has the alternate
representations shown in figure 7.
Note: The control Flow graph is a simplified version of the earlier flowchart.
The name of a path is the name of the nodes along the path. Alternatively,
if we choose to label the links, the name of the path is the succession of link
names along the path.
A path has a loop in it if any node (link) name is repeated in that path.
The terms entry/exit path and complete path are also used in the
literature to denote a path that starts at an entry and goes to an exit.
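The naming rule and the loop rule above can be sketched in a few lines. The
function names and the sample links are hypothetical and only illustrate the
definitions.

```python
def path_name_by_nodes(links):
    """Name a path by the nodes it visits.

    links is a list of (source, target) pairs in traversal order.
    """
    if not links:
        return []
    return [links[0][0]] + [target for _, target in links]

def has_loop(names):
    """A path has a loop if any node (or link) name is repeated in it."""
    return len(set(names)) < len(names)
```

For example, the links (1, 3), (3, 4), (4, 2) give the node name 1, 3, 4, 2,
which contains no repeated node and therefore no loop.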
There are many paths between the entry and exit of a typical routine. Every
decision doubles the number of potential paths, and every loop multiplies the
number of potential paths by the number of different iteration values that are
possible for the loop.
A lavish test approach might consist of testing all the paths, but that would not
be a complete test, because a bug could create unwanted paths or make
mandatory paths unexecutable. Just because all paths are right doesn't
mean that the routine is doing the required processing along those paths.
If P1 is followed, then P2 and P3 are automatically followed; but P1 is
impractical for most routines. It can be done only for routines that
have no loops. (Here P1 denotes path coverage, P2 statement coverage, and P3
branch coverage.)
P2 and P3 might appear to be equivalent, but they are not. Here is a correct
version of the routine:
For X negative, the output is X+A, while for X greater than or equal to
zero, the output is X+2A. Following P2 and executing every statement, but not
every branch, would not reveal the bug in the following incorrect version:
The hidden loop around label 100 is not revealed by tests based on P3 alone
because no test forces the execution of statement 100 and the following GOTO
statement.
Any testing strategy based on paths must at least exercise every instruction
and take branches in all directions. Here, we are discussing three different
testing criteria or strategies out of a potentially infinite family of strategies.
They are:
Execute all possible control flow paths through the program. Typically it is
restricted to all possible entry/exit paths through the program. If we achieve
this prescription, we are said to have achieved 100% path coverage. This is
the strongest criterion in the path-testing strategy family. It is generally
impossible to achieve.
Execute all statements in the program at least once under some test. If we do
enough tests to achieve this, we are said to have achieved 100% statement
coverage (100% node coverage). We denote this by C1. This is the weakest
criterion in the family. Testing less than this for new software is
unconscionable [not guided by or conforming to reason] and should be
criminalized.
Execute enough tests to assure that every branch alternative has been
exercised at least once under some test. If we do enough tests to achieve this
prescription, then we have achieved 100% branch coverage (100% link
coverage). We denote branch coverage by C2.
NOTE:
1) We usually drop the "100%", so when we say that we have "achieved
branch coverage" we mean that we have achieved 100% branch coverage.
The same applies to path and statement coverage.
Statement and Branch coverage have also been used for more than two
decades as minimum mandatory unit test requirements for new code at IBM
and other major software companies.
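To make the difference between C1 and C2 concrete, the following sketch
measures node and link coverage for a set of executed paths. The three-node
graph is a hypothetical IF-without-ELSE, and parallel links between the same
pair of nodes are not modelled.

```python
def coverage(graph, executed_paths):
    """Return (statement_coverage, branch_coverage) as fractions.

    graph: node -> list of successor nodes.
    executed_paths: list of paths, each a list of node names.
    """
    all_nodes = set(graph)
    all_links = {(n, s) for n, succs in graph.items() for s in succs}
    seen_nodes, seen_links = set(), set()
    for path in executed_paths:
        seen_nodes.update(path)
        seen_links.update(zip(path, path[1:]))
    return (len(seen_nodes & all_nodes) / len(all_nodes),
            len(seen_links & all_links) / len(all_links))

# Hypothetical graph: node 1 is a decision, node 2 the THEN part, node 3 the exit.
IF_WITHOUT_ELSE = {1: [2, 3], 2: [3], 3: []}
```

The single path 1, 2, 3 executes every node, so C1 is achieved, but it never
takes link (1, 3), so C2 is not. Adding the path 1, 3 achieves both.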
You must pick enough paths to achieve C1+C2. The question of what is the
fewest number of such paths is interesting to the designer of test tools that
help automate path testing.
It is better to take many simple paths than a few complicated paths. There's no
harm in taking paths that will exercise the same code more than once.
Start at the beginning and take the most obvious path to the exit. The most
obvious path in the above figure is (1, 3, 4, 5, 6, 2), if we name it by nodes, or
abcde if we name it by the links.
Then take the next most obvious path, abhkgde. All other paths in this example
lead to loops. Take a single loop first, building if possible on a previous path,
such as abhlibcde. Then take another loop, abcdfjgde. Finally, abcdfmibcde.
4) Continue tracing paths until all lines on the master sheet are
covered, which indicates that you appear to have achieved C1+C2.
As you trace the paths, create a table that shows the paths, the coverage status
of each process, and each decision. The above path leads to the following table:
After you have traced a covering path set on the master sheet and filled in the
table for every path, check the following:
Note:
You could select paths with the idea of achieving coverage without
knowing anything about what the routine is supposed to do.
Favor short paths over long paths, simple paths over complicated paths.
Don't follow the rules slavishly [blindly imitative], except for
coverage.
2.4. Loops
Cases for a single loop: A single loop can be covered with two cases: looping
and not looping. But experience shows that many loop-related bugs are not
discovered by C1+C2. Bugs hide themselves in corners and congregate at
boundaries; in the case of loops, at or around the minimum or maximum
number of times the loop can be iterated. The minimum number of iterations is
often zero, but it need not be.
1. Try bypassing the loop (zero iterations). If you can't, you either have a
bug, or zero is not the minimum and you have the wrong case.
1. Try one less than the expected minimum. What happens if the loop
control variable's value is less than the minimum? What prevents the value
from being less than the minimum?
2. The minimum number of iterations.
3. One more than the minimum number of iterations.
4. Once, unless covered by a previous test.
5. Twice, unless covered by a previous test.
6. A typical value.
7. One less than the maximum value.
8. The maximum number of iterations.
9. Attempt one more than the maximum number of iterations.
1. Treat single loops with excluded values as two sets of tests consisting of
loops without excluded values, such as case 1 and 2 above.
2. For example, suppose the total range of the loop control variable is 1 to 20,
but the values 7, 8, 9, and 10 are excluded. The two sets of tests are 1-6 and 11-20.
Prepared by: Dept. of CSE, RGMCET Page 19
SOFTWARE TESTING METHODOLOGIES AND TOOLS
3. The test cases to attempt would be 0,1,2,4,6,7 for the first range and
10,11,15,19,20,21 for the second range.
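A minimal sketch of how such boundary-oriented iteration counts could be
generated is given below. The function names are hypothetical, and the typical
values 4 and 15 are taken from the example above. Note that this sketch always
emits minimum+1 and maximum-1, so for the two ranges it also produces 5 and 12,
which the hand-picked lists above leave out.

```python
def loop_test_values(minimum, maximum, typical):
    """Candidate iteration counts around the boundaries of one allowed range."""
    candidates = [minimum - 1, minimum, minimum + 1, typical,
                  maximum - 1, maximum, maximum + 1]
    # Drop duplicates while preserving order (matters for short ranges).
    return list(dict.fromkeys(candidates))

def excluded_range_tests(ranges):
    """ranges: list of (minimum, maximum, typical), one per allowed sub-range."""
    return [loop_test_values(lo, hi, typ) for lo, hi, typ in ranges]

# Loop control variable ranges 1-6 and 11-20 (values 7 to 10 excluded).
TESTS = excluded_range_tests([(1, 6, 4), (11, 20, 15)])
```

The values just outside each range (0, 7, 10, and 21) are the attempts that
should be rejected or handled by the routine under test.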
2.4.1. The Kinds of Loops
Nested loops
Concatenated loops
Horrible loops
i) Nested Loops:
If you had five tests (assuming that one less than the minimum and one
more than the maximum were not achievable) for one loop, a pair of nested
loops would require 25 tests, and three nested loops would require 125. This is
a heavy process. You can't always afford to test all combinations of nested loops'
iteration values. Here's a tactic to use to discard some of these values:
1. Start at the innermost loop. Set all the outer loops to their minimum
values.
2. Test the minimum, minimum+1, typical, maximum-1, and maximum for
the innermost loop, while holding the outer loops at their
minimum-iteration-parameter values. Expand the tests as required for
out-of-range and excluded values.
3. If you've done the outermost loop, go to step 5; otherwise move out one loop
and set it up as in step 2, with all other loops set to typical values.
4. Continue outward in this manner until all loops have been covered.
5. Do the five cases for all loops in the nest simultaneously.
This procedure works out to twelve tests for a pair of nested loops,
sixteen for three nested loops, and nineteen for four nested loops. Practicality
may prevent testing in which all loops achieve their maximum values
simultaneously.
ii) Concatenated Loops:
Concatenated loops fall between single and nested loops with respect to
test cases. Two loops are concatenated if it's possible to reach one after exiting
the other while still on a path from entrance to exit. If the loops cannot be on
the same path, then they are not concatenated and can be treated as individual
loops. Even if the loops are on the same path and you can be sure that they are
independent of each other, you can still treat them as individual loops; but if
the iteration values in one loop are directly or indirectly related to the iteration
values of another loop, and they can occur on the same path, then treat them
as you would nested loops. The problem of excessive processing time for
combinations of loop-iteration values should not occur because the loop-
iteration values are additive rather than multiplicative as they are for nested
loops.
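The difference between multiplicative and additive growth can be checked with a
short sketch. The five case labels follow the list used for nested loops; the
function names are hypothetical.

```python
from itertools import product

FIVE_CASES = ["min", "min+1", "typical", "max-1", "max"]

def nested_combinations(depth):
    """All combinations of the five cases across `depth` nested loops."""
    return list(product(FIVE_CASES, repeat=depth))

def concatenated_cases(count):
    """Independent concatenated loops are tested one at a time, so cases add up."""
    return [(loop_index, case)
            for loop_index in range(count)
            for case in FIVE_CASES]
```

Two nested loops give 25 combinations and three give 125, whereas two
independent concatenated loops need only 5 + 5 = 10 cases.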
iii) Horrible Loops:
Although some methods give you some insight into the design of test
cases for horrible loops, the resulting cases are not definitive and are usually
too many to execute. The thinking required to check the end points and
looping values for intertwined loops appears to be unique for each program. It's
also difficult at times to see how deeply nested the loops are, or indeed whether
there are any nested loops. The use of code that jumps into and out of loops
makes iteration-value selection for test cases an awesome and ugly task, which
is another reason such structures should be avoided.
Predicates:
Path Predicate:
A predicate associated with a path is called a Path Predicate. For example, "x is
greater than zero", "x+y>=90", "w is either negative or equal to 10" is a
sequence of predicates whose truth values will cause the routine to take a
specific path.
Multiway Branches:
The path taken through a multiway branch such as a computed GOTO, case
statement, or jump table cannot be directly expressed in TRUE/FALSE terms,
but it can be rewritten as an equivalent set of nested two-way decisions.
For example, a three-way case statement can be written as: IF Case=1 DO A1
ELSE (IF Case=2 DO A2 ELSE DO A3 ENDIF) ENDIF.
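The same rewriting can be sketched in Python. The action names A1, A2, and A3
come from the example above; the function names are hypothetical.

```python
def three_way_case(case):
    """Express a three-way case as nested two-way (TRUE/FALSE) decisions."""
    if case == 1:
        return "A1"
    else:
        if case == 2:
            return "A2"
        else:
            return "A3"

def predicate_outcomes(case):
    """Truth values of the binary predicates evaluated along the path taken."""
    outcomes = [case == 1]
    if case != 1:
        outcomes.append(case == 2)
    return outcomes
```

Each branch of the original case statement now corresponds to a sequence of
TRUE/FALSE outcomes, for example FALSE then TRUE for Case=2.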
Inputs:
In testing, the word input is not restricted to direct inputs, such as variables
in a subroutine call, but includes all data objects referenced by the routine
whose values are fixed prior to entering it.
For example, inputs in a calling sequence, objects in a data structure,
values left in registers, or any combination of object types.
The input for a particular test is mapped as a one-dimensional array called
an Input Vector.
Predicate interpretation:
The simplest predicate depends only on input variables.
For example, if x1 and x2 are inputs, the predicate might be x1+x2>=7. Given the
values of x1 and x2, the direction taken through the decision based on the
predicate is determined at input time and does not depend on processing.
The path predicates take on truth values based on the values of input
variables, either directly or indirectly.
If a variable's value does not change as a result of processing, that
variable is independent of the processing.
If the variable's value can change as a result of the processing, the
variable is process dependent.
A predicate whose truth value can change as a result of the processing is
said to be process dependent and one whose truth value does not
change as a result of the processing is process independent.
Example: X1+3X2+17>=0
X3=17
X4-X1>=14X2
Any set of input values that satisfy all of the conditions of the path
predicate expression will force the routine to the path.
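As a small check of the example expression, the sketch below tests whether a
candidate input vector satisfies all three conditions. The function name and
the sample vectors are hypothetical.

```python
def satisfies_path_predicates(x1, x2, x3, x4):
    """True if the input vector meets every condition of the example expression."""
    return (x1 + 3 * x2 + 17 >= 0      # X1 + 3X2 + 17 >= 0
            and x3 == 17               # X3 = 17
            and x4 - x1 >= 14 * x2)    # X4 - X1 >= 14X2
```

The vector (0, 0, 17, 0) satisfies all three conditions and would therefore
force the routine down this path, while (0, 1, 17, 5) fails the third one.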
Compound Predicate:
Predicate Coverage:
Testing Blindness:
1. Assignment Blindness
2. Equality Blindness
3. Self-Blindness
1. Assignment Blindness:
For Example:
Correct                     Buggy
X=7                         X=7
........                    ........
if Y > 0 then ...           if X+Y > 0 then ...
If the test case sets Y=1 the desired path is taken in either case, but there
is still a bug.
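The effect can be reproduced with a short sketch of the two versions. The
function names and the return labels are hypothetical; only the predicates come
from the example.

```python
def correct_version(y):
    x = 7  # assignment that the buggy predicate wrongly depends on
    return "then" if y > 0 else "else"

def buggy_version(y):
    x = 7
    return "then" if x + y > 0 else "else"
```

With Y=1 both versions take the same branch, so the test is blind to the bug.
A value such as Y=-3 separates them, because X+Y is still positive while Y is
not.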
2. Equality Blindness:
For Example:
Correct                     Buggy
if Y = 2 then               if Y = 2 then
........                    ........
if X+Y > 3 then ...         if X > 1 then ...
The first predicate, if Y=2, forces the rest of the path, so that for any
positive value of X the path taken at the second predicate will be the
same for the correct and buggy versions.
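A sketch of this situation follows. The helper names are hypothetical; the two
second-stage predicates are the ones from the example.

```python
def correct_second(x, y):
    return x + y > 3

def buggy_second(x, y):
    return x > 1

def path_taken(second_predicate, x, y):
    """Return the branch outcomes along the path as a tuple of 'T'/'F'."""
    if y == 2:
        return ("T", "T" if second_predicate(x, y) else "F")
    return ("F",)
```

Once the first predicate has fixed Y at 2, X+Y > 3 and X > 1 are the same
condition, so no choice of X on this path distinguishes the two versions. The
predicates differ only for values of Y that the first decision never lets
through.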
3. Self-Blindness:
For Example:
Correct                     Buggy
X=A                         X=A
........                    ........
if X-1 > 0 then ...         if X+A-2 > 0 then ...
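In this example the assignment X=A makes the buggy predicate X+A-2 > 0
equivalent to 2(X-1) > 0, which has the same truth value as the correct
predicate X-1 > 0. A minimal sketch (function names are hypothetical):

```python
def correct_predicate(a):
    x = a
    return x - 1 > 0

def buggy_predicate(a):
    x = a
    return x + a - 2 > 0   # equals 2 * (x - 1) > 0 because x == a
```

No value of A can distinguish the two versions as long as X still equals A
when the predicate is evaluated.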
4. PATH SENSITIZING:
ADFGHIJKL+AEFGHIJKL+BCDFGHIJKL+BCEFGHIJKL
Each product term denotes a set of inequalities that if solved will yield an
input vector that will drive the routine along the designated path.
Solve any one of the inequality sets for the chosen path and you have
found a set of input values for the path.
If you can find a solution, then the path is achievable.
If you can't find a solution to any of the sets of inequalities, the path is
unachievable.
The act of finding a set of solutions to the path predicate expression is
called PATH SENSITIZATION.
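One crude way to sensitize a path is to search a small space of input vectors
for one that satisfies every predicate on the path. The sketch below does this
by brute force; the function name, the search window of -5 to 5, and the sample
predicates are all assumptions made for illustration.

```python
from itertools import product

def sensitize(predicates, num_inputs, candidates=range(-5, 6)):
    """Return the first input vector satisfying every predicate, or None."""
    for vector in product(candidates, repeat=num_inputs):
        if all(p(*vector) for p in predicates):
            return vector
    return None

# Hypothetical path predicates over two inputs x and y.
ACHIEVABLE = [lambda x, y: x > 0, lambda x, y: x + y >= 4]
CONTRADICTORY = [lambda x, y: x > 0, lambda x, y: x < 0]
```

A returned vector shows that the path is achievable. A result of None only
shows that no solution exists inside the searched window; it proves the path
unachievable only when the predicates contradict each other, as in the second
set.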
5. PATH INSTRUMENTATION:
If we run the tested routine under a trace, then we have all the
information we need to confirm the outcome and, furthermore, to confirm
that it was achieved by the intended path.
The trouble with traces is that they give us far more information than we
need. In fact, the typical trace program provides so much information
that confirming the path from its massive output dump is more work
than simulating the computer by hand to confirm the path.
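A lighter alternative to a full trace is to record only which links were
traversed. The sketch below instruments a hypothetical routine with one marker
per link and then confirms both the outcome and the path; the routine, the link
letters, and the helper name are assumptions made for illustration.

```python
def classify(x, trace):
    """Hypothetical routine instrumented with link markers appended to `trace`."""
    trace.append("a")            # link from entry to the decision
    if x < 0:
        trace.append("b")        # TRUE branch
        result = "negative"
    else:
        trace.append("c")        # FALSE branch
        result = "non-negative"
    trace.append("d")            # link to exit
    return result

def run_and_confirm(x, expected_outcome, expected_path):
    """Check that the outcome was right and was reached by the intended path."""
    trace = []
    outcome = classify(x, trace)
    return outcome == expected_outcome and "".join(trace) == expected_path
```

A test passes only if the recorded link names match the intended path, which
guards against a correct outcome that was reached by the wrong route.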
4. Link Counter:
New Code:
New code should always be subjected to enough path testing to achieve C2
Stubs are used where it is clear that the bug potential for the stub is
significantly lower than that of the called components
Old, trusted components will not be replaced by stubs
Some consideration is given to paths within called components
Typically, we will try to use the shortest entry/exit path that will do the task
Maintenance:
There is a great difference between maintenance testing and new code
testing
Maintenance testing is a completely different situation
It involves modifications which are accommodated in the system, as
required
Path testing is used first on the modified component
Rehosting:
Path testing with C1+C2 coverage is a powerful tool for rehosting old software
We get a very powerful, effective, rehosting process when C1+C2 coverage is
used in conjunction with automatic or semiautomatic structural test
generators
Software is rehosted because it is no longer cost effective to support the
environment in which it runs
The objective of rehosting is to change the operating environment and not
the rehosted software
Rehosting from one COBOL environment to another is easy by comparison
Rehosted software can be modified to improve efficiency and/or to
implement new functionality, which had been difficult in the old
environments
The test suites (collections) and all outcomes of the old environment become
the specification for the rehosted software