Software Testing Methodologies Course Page: Paths, Path Products and Regular Expressions
Interpret the control flowgraph and identify the path products, path sums and
path expressions.
Find all possible paths (maximum path count) of a given flowgraph.
Calculate the probability of paths and understand the need for finding the
probabilities.
MOTIVATION:
PATH PRODUCTS:
For example, if you traverse links a,b,c and d along some path,
the name for that path segment is abcd. This path name is also
called a path product. Figure 5.1 shows some examples:
PATH EXPRESSION:
Denote a set of paths by an uppercase letter such as X or Y. From
Figure 5.1c, the members of the path set can be listed as
follows:
ac, abc, abbc, abbbc, abbbbc, ...
The same set of paths can also be written as a sum:
ac + abc + abbc + abbbc + abbbbc + ...
An expression of this form, which denotes a set of paths, is called a
"path expression".
PATH PRODUCTS:
If X = abcde and Y = fghij, then
XY = abcdefghij
Similarly,
YX = fghijabcde
aX = aabcde
Xa = abcdea
XaX = abcdeaabcde
Repeated products are written with exponents:
a^1 = a; a^2 = aa; a^3 = aaa; a^n = aaaa . . . n times.
Similarly, if
X = abcde
then
X^1 = abcde
X^2 = abcdeabcde = (abcde)^2
X^3 = abcdeabcdeabcde = (abcde)^2 abcde
= abcde(abcde)^2 = (abcde)^3
RULE 1: A(BC)=(AB)C=ABC
where A, B, C are path names, sets of path names or path
expressions.
The zeroth power denotes a path of zero length, written 1:
a^0 = 1
X^0 = 1
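The path-product notation can be tried out directly on strings. The following is a minimal sketch, assuming each link name is a single character and representing the zero-length path (the "1" of the text) as the empty string; the function names `product` and `power` are illustrative only.

```python
def product(*segments: str) -> str:
    """Path product: traverse the given segments one after another."""
    return "".join(segments)


def power(segment: str, n: int) -> str:
    """n-th power of a segment; n = 0 gives the zero-length path ""."""
    return segment * n


X, Y = "abcde", "fghij"
print(product(X, Y))                            # abcdefghij
print(product(X, "a", X))                       # abcdeaabcde
print(power(X, 3) == product(power(X, 2), X))   # True
```

Because string concatenation is associative, the sketch also behaves as Rule 1 requires.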
PATH SUMS:
The "+" sign was used to denote the fact that path names were
part of the same set of paths.
Links a and b in Figure 5.1a are parallel paths and are denoted
by a + b. Similarly, links c and d are parallel paths between the
next two nodes and are denoted by c + d.
If X and Y are sets of paths that lie between the same pair of
nodes, then X+Y denotes the UNION of those sets of paths. For
example, in Figure 5.2:
RULE 2: X+Y=Y+X
RULE 3: (X+Y)+Z=X+(Y+Z)=X+Y+Z
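DISTRIBUTIVE LAWS:
The product and sum operations are distributive:
RULE 4: A(B+C)=AB+AC and (B+C)D=BD+CD
For example,
e(a+b)(c+d)f = e(ac+ad+bc+bd)f
= eacf+eadf+ebcf+ebdf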
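The expansion above can be reproduced mechanically. This is a small sketch, assuming a path sum is modelled as a Python set of strings and a path product as all pairwise concatenations; `path_product` is a hypothetical helper name.

```python
from itertools import product


def path_product(*path_sets):
    """Multiply out sets of paths: one choice from each set, concatenated."""
    return {"".join(parts) for parts in product(*path_sets)}


# e(a+b)(c+d)f
paths = path_product({"e"}, {"a", "b"}, {"c", "d"}, {"f"})
print(sorted(paths))  # ['eacf', 'eadf', 'ebcf', 'ebdf']
```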
ABSORPTION RULE:
If X and Y denote the same set of paths, then the union of these
sets is unchanged; consequently,
RULE 5: X+X=X
For example, if X=a+aa+abc+abcd+def then
X+a = X+aa = X+abc = X+abcd = X+def = X
It follows that any arbitrary sum of identical path expressions
reduces to the same path expression.
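Since a path sum is a set union, the sum rules can be checked with Python sets. The sketch below uses the X from the example; the sets Y and Z are made-up values added only to exercise the commutative and associative rules.

```python
X = {"a", "aa", "abc", "abcd", "def"}
Y = {"abc", "xyz"}   # hypothetical second path set
Z = {"q"}            # hypothetical third path set

print(X | Y == Y | X)                # True  (Rule 2)
print((X | Y) | Z == X | (Y | Z))    # True  (Rule 3)
print(X | {"a"} == X, X | X == X)    # True True  (absorption)
```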
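LOOPS:
For a loop that consists of the single link b:
ab*c = ac+abc+abbc+abbbc+...
Evidently,
aa* = a*a = a+ and XX* = X*X = X+
A loop that can be taken at most n times is written with an
underlined exponent, shown here as X^{\underline{n}}:
X^{\underline{n}} = X^0+X^1+X^2+X^3+...+X^n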
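A loop term can be enumerated once an upper bound on the number of traversals is chosen. This is a minimal sketch, assuming single-character link names; the bound of 4 is arbitrary and only keeps the list finite.

```python
def loop_paths(prefix: str, body: str, suffix: str, max_iterations: int):
    """Paths through prefix, then body repeated 0..max_iterations times, then suffix."""
    return [prefix + body * k + suffix for k in range(max_iterations + 1)]


print(loop_paths("a", "b", "c", 4))
# ['ac', 'abc', 'abbc', 'abbbc', 'abbbbc']
```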
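RULES 6 - 16:
In the rules below, X^{\underline{n}} denotes X taken at most n times.
RULE 6: X^{\underline{n}} + X^{\underline{m}} = X^{\underline{n}} if n>m
RULE 6: X^{\underline{n}} + X^{\underline{m}} = X^{\underline{m}} if m>n
RULE 7: X^{\underline{n}} X^{\underline{m}} = X^{\underline{n+m}}
RULE 8: X^{\underline{n}} X* = X* X^{\underline{n}} = X*
RULE 9: X^{\underline{n}} X+ = X+ X^{\underline{n}} = X+
RULE 10: X*X+ = X+X* = X+
RULE 11: 1 + 1 = 1
RULE 12: 1X = X1 = X
RULE 13: 1^n = 1^{\underline{n}} = 1* = 1+ = 1
No matter how often you traverse a path of zero length, it is a
path of zero length.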
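Some of these rules can be spot-checked by enumeration. The sketch below assumes that the exponents in Rules 6 and 7 mean "taken at most n times", models a path set as a Python set of strings, and uses the empty string for the zero-length path; `at_most` and `path_product` are illustrative names.

```python
def at_most(x: str, n: int) -> set:
    """The set x^0 + x^1 + ... + x^n (a loop taken at most n times)."""
    return {x * k for k in range(n + 1)}


def path_product(left: set, right: set) -> set:
    """All concatenations of one path from left with one path from right."""
    return {l + r for l in left for r in right}


x = "ab"
ONE = {""}  # the zero-length path

print(at_most(x, 3) | at_most(x, 2) == at_most(x, 3))               # True (Rule 6)
print(path_product(at_most(x, 3), at_most(x, 2)) == at_most(x, 5))  # True (Rule 7)
print(path_product(ONE, at_most(x, 2)) == at_most(x, 2))            # True (Rule 12)
```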
REDUCTION PROCEDURE:
(a + b)(c + d + e) = ac + ad + ae + bc + bd + be
In the first way, we remove the self-loop and then multiply all
outgoing links by Z*.
In the second way, we split the node into two equivalent nodes,
call them A and A' and put in a link between them whose path
expression is Z*. Then we remove node A' using steps 4 and 5 to
yield outgoing links whose path expressions are Z*X and Z*Y.
PARALLEL TERM (STEP 6):
Removal of node 8 above led to a pair of parallel links between
nodes 4 and 5. Combine them to create a path expression for an
equivalent link whose path expression is c+gkh.
LOOP TERM (STEP 7):
Removing node 4 leads to a loop term. The graph has now been
replaced with the following equivalent simpler graph:
a(bgjf)*b(c+gkh)d((ilhd)*imf(bjgf)*b(c+gkh)d)*(ilhd)*e
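One way to experiment with the resulting path expression is to hand it to a regular-expression engine. The sketch below assumes single-character link names and translates the path-sum sign "+" into the alternation operator "|" of Python's `re` module; `to_regex` and `is_path` are hypothetical helper names, not part of the course material.

```python
import re

PATH_EXPRESSION = "a(bgjf)*b(c+gkh)d((ilhd)*imf(bjgf)*b(c+gkh)d)*(ilhd)*e"


def to_regex(path_expression: str) -> re.Pattern:
    """Translate a path expression into a compiled regular expression."""
    return re.compile(path_expression.replace("+", "|"))


def is_path(candidate: str, pattern: re.Pattern) -> bool:
    """True if the whole candidate string is one of the denoted paths."""
    return pattern.fullmatch(candidate) is not None


pattern = to_regex(PATH_EXPRESSION)
print(is_path("abcde", pattern))    # True
print(is_path("abgkhde", pattern))  # True
print(is_path("abde", pattern))     # False
```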
APPLICATIONS:
Identify a property of interest and derive an appropriate set of
"arithmetic" rules that characterizes the property.
The question is not simple. Here are some ways you could ask
it:
1. What is the maximum number of different paths
possible?
2. What is the fewest number of paths possible?
3. How many different paths are there really?
4. What is the average number of paths?
Label each link with a link weight that corresponds to the number
of paths that link represents.
Also mark each loop with the maximum number of times that
loop can be taken. If the answer is infinite, you might as well stop
the analysis because it is clear that the maximum number of
paths will be infinite.
There are three cases of interest: parallel links, serial links, and
loops.
EXAMPLE:
1. The following is a reasonably well-structured program.
Path expression:
a(b+c)d{e(fi)*fgj(m+l)k}*e(fi)*fgh
2. A: Annotate the flowgraph by replacing each link name with
the maximum number of paths through that link (1), and
note the number of times each loop can be taken.
3. B: Combine the first pair of parallel links outside the
loop and also the pair in the outer loop.
4. C: Multiply things out and remove nodes to clear
the clutter.
5. For the inner loop:
D: Calculate the total weight of the inner loop, which can
execute a minimum of 0 times and a maximum of 3 times.
The inner loop can therefore be evaluated as follows:
1^{\underline{3}} = 1^0 + 1^1 + 1^2 + 1^3 = 1 + 1 + 1 + 1 = 4
6. E: Multiply the link weights inside the loop: 1 x 4 = 4
7. F: Evaluate the loop by multiplying the link weights: 2
x 4 = 8.
8. G: Simplifying the loop further gives the total
maximum number of paths in the flowgraph:
2 x 8^4 x 4 = 32,768.
Alternatively, you could have substituted a "1" for each link in the
path expression and then simplified, as follows:
a(b+c)d{e(fi)*fgj(m+l)k}*e(fi)*fgh
= 1(1 + 1)1(1(1 x 1)^{\underline{3}} 1 x 1 x 1(1 + 1)1)^4 1(1 x 1)^{\underline{3}} 1 x 1 x 1
= 2(1^{\underline{3}} x 2)^4 x 1^{\underline{3}}
= 2(4 x 2)^4 x 4
= 2 x 8^4 x 4 = 32,768
Actually, the outer loop should be taken exactly four times. That
doesn't mean it will be taken zero or four times. Consequently,
there is a superfluous "4" on the outlink in the last step.
Therefore the maximum number of different paths is 8192 rather
than 32,768.
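The arithmetic of this example can be replayed in a few lines. This is a sketch under stated assumptions: parallel links add, serial links multiply, a loop taken at most n times sums the powers 0 through n, and a loop taken exactly n times is a single power. The helper names are illustrative, and the loop bounds (3 for the inner loop, 4 for the outer loop) are the ones used in the example.

```python
import math


def parallel(*weights):
    """Parallel links: path counts add."""
    return sum(weights)


def series(*weights):
    """Serial links: path counts multiply."""
    return math.prod(weights)


def loop_at_most(weight, n):
    """Loop taken 0..n times: weight^0 + weight^1 + ... + weight^n."""
    return sum(weight ** k for k in range(n + 1))


def loop_exactly(weight, n):
    """Loop taken exactly n times."""
    return weight ** n


# a(b+c)d{e(fi)*fgj(m+l)k}*e(fi)*fgh with every link weight set to 1
inner = loop_at_most(series(1, 1), 3)                 # (fi)* taken 0..3 times
body = series(1, inner, 1, 1, 1, parallel(1, 1), 1)   # e(fi)*fgj(m+l)k
total = series(1, parallel(1, 1), 1, loop_exactly(body, 4), 1, inner, 1, 1, 1)

print(inner, body, total)   # 4 8 32768
print(total // inner)       # 8192, the count once the superfluous 4 is dropped
```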
STRUCTURED FLOWGRAPH:
EXAMPLE:
1. Applying the arithmetic to the earlier example gives
us the identical steps until step 3 (C) as below:
previous example:
Path selection should be biased toward the low- rather than the
high-probability paths.
This raises an interesting question:
uninteresting nodes.
because PL + PA + PB + PC = 1, 1 - PL = PA + PB + PC,
and
EXAMPLE:
1. Here is a complicated bit of logic. We want to know
the probability associated with cases A, B, and C.
3. Case B is simpler:
probability.
7. How about path probabilities? That's easy. Just trace
the path of interest and multiply the probabilities as
you go.
8. Alternatively, write down the path name and do the
indicated arithmetic operation.
9. Say that a path consisted of links a, b, c, d, e, and
the associated probabilities were 0.2, 0.5, 1.0, 0.01, and 1
respectively. Path abcbcbcdeabddea would have a
probability of 5 x 10^-10.
10. Long paths are usually improbable.
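The path probability in item 9 can be checked by multiplying the link probabilities along the path name. The sketch below assumes the last probability in the list is 1 (the source character is ambiguous) and that each link name is a single character; `LINK_PROBABILITY` and `path_probability` are illustrative names.

```python
import math

# Link probabilities from the example; the value for e is assumed to be 1.
LINK_PROBABILITY = {"a": 0.2, "b": 0.5, "c": 1.0, "d": 0.01, "e": 1.0}


def path_probability(path: str) -> float:
    """Multiply the probabilities of the links along the path."""
    return math.prod(LINK_PROBABILITY[link] for link in path)


print(path_probability("abcbcbcdeabddea"))  # approximately 5e-10
```

The path uses link a three times, b four times and d three times, so the product is 0.2^3 x 0.5^4 x 0.01^3, which also illustrates why long paths are usually improbable.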
The model has two weights associated with every link: the
processing time for that link, denoted by T, and the probability of
that link P.
EXAMPLE:
1. Start with the original flow graph annotated with
probabilities and processing time.
2. PUSH/POP, GET/RETURN:
The question is:
P(P + 1)1{P(HH)^n1 HP1(P + H)1}^n2 P(HH)^n1 HPH
5. Simplifying by using the arithmetic tables,
4. G(G + R)G(GR)*GGR*R
= G(G + R)G^3 R*R
= (G + R)G^3 R*
= (G^4 + G^2)R*
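The GET/RETURN simplification can be sanity-checked by brute force. The sketch below assumes that G stands for GET and R for RETURN, that one RETURN cancels one GET, and that every loop runs at most `MAX_LOOPS` times (an arbitrary bound chosen here so the enumeration stays finite). It compares the net number of GETs over all paths of the original and the simplified expression.

```python
MAX_LOOPS = 3  # assumed bound on each "*" loop, for enumeration only


def net_gets(path: str) -> int:
    """Net effect of a path: number of GETs minus number of RETURNs."""
    return path.count("G") - path.count("R")


# G(G + R)G(GR)*GGR*R
original = {
    "G" + branch + "G" + "GR" * k + "GG" + "R" * j + "R"
    for branch in ("G", "R")
    for k in range(MAX_LOOPS + 1)
    for j in range(MAX_LOOPS + 1)
}

# (G^4 + G^2)R*
simplified = {
    head + "R" * j
    for head in ("GGGG", "GG")
    for j in range(MAX_LOOPS + 1)
}

original_nets = {net_gets(p) for p in original}
simplified_nets = {net_gets(p) for p in simplified}
print(original_nets == simplified_nets)  # True
```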
The node-by-node reduction procedure, and most graph-theory-based
algorithms, work well when all paths are possible, but may
provide misleading results when some paths are unachievable.
THE PROBLEM:
The generic flow-anomaly detection problem (note: not just
data-flow anomalies, but any flow anomaly) is that of looking for a
specific sequence of operations considering all possible paths
through a routine.
THE METHOD:
You now have a regular expression that denotes all the possible
sequences of operators in that graph. You can now examine that
regular expression for the sequences of interest.
As an example, let
A = pp
B = srr
C = rp
T = ss
However, let
A = p + pp + ps
B = psr + ps(r + ps)
C = rp
T = p^4
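For small cases, the search for a target sequence can be done by brute force. The sketch below assumes the expression being examined has the form A B^n C (A, then B repeated n times, then C), which is not stated in the surviving text, and it only tries n up to a chosen bound, so it is an illustration rather than a proof. `contains_sequence` is a hypothetical helper name.

```python
from itertools import product


def contains_sequence(A, B, C, target, max_n):
    """True if target occurs in some string of A B^n C for 0 <= n <= max_n.

    A, B and C are lists of alternative operator strings.
    """
    for n in range(max_n + 1):
        for a in A:
            for middle in product(B, repeat=n):
                for c in C:
                    if target in a + "".join(middle) + c:
                        return True
    return False


# First case: A = pp, B = srr, C = rp, T = ss
print(contains_sequence(["pp"], ["srr"], ["rp"], "ss", max_n=4))     # False

# Second case: B = psr + ps(r + ps) multiplied out, T = p^4
A2 = ["p", "pp", "ps"]
B2 = ["psr", "psr", "psps"]
print(contains_sequence(A2, B2, ["rp"], "pppp", max_n=3))            # False
print(contains_sequence(A2, B2, ["rp"], "ppp", max_n=3))             # True
```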
LIMITATIONS:
There are some nice theorems for finding sequences that occur
at the beginnings and ends of strings but no nice algorithms for
finding strings buried in an expression.
SUMMARY:
A flowgraph annotated with link names for every link can be converted into a
path expression that represents the set of all paths in that flowgraph.
By substituting link weights for all links, and using the appropriate arithmetic
rules, the path expression is converted into an algebraic expression that can
be used to determine the minimum and maximum number of possible paths
in a flowgraph, the probability that a given node will be reached, the mean
processing time of a routine, and other models.
With links annotated with the appropriate weights, the path expression is
converted into a regular expression that denotes the set of all operator
sequences over the set of all paths in a routine. Rules for determining
whether a given sequence of operations is possible are given. In other
words, we have a generalized flow-anomaly detection method that works for
data-flow anomalies or any other flow anomaly.
All flow analysis methods lose accuracy and utility if there are unachievable
paths. Expand the accuracy and utility of your analytical tools by creating
designs for which all paths are achievable. Such designs are always possible.