Paths, Path Products and Regular Expressions: UNIT-3
[Figure: example flowgraphs with labeled links and the path set (eacf, eadf, ebcf, ebdf)]
Path Sums (continued)
If X and Y are sets of paths that lie
between the same pair of nodes, then X+Y
denotes the UNION of those
sets of paths.
[Figure: flowgraphs illustrating path sums between a pair of nodes]
Absorption Rule
If X and Y denote the same set of paths, then the
union of these sets is unchanged; consequently,
RULE 5: X+X=X (Absorption Rule)
If a set consists of path names and a member of
that set is added to it, the “new” name, which is
already in that set of names, contributes nothing
and can be ignored.
For example, if X = a+aa+abc+abcd+def, then
X+a = X+aa = X+abc = X+abcd = X+def = X
It follows that any arbitrary sum of identical path
expressions reduces to the same path expression.
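The absorption rule behaves like ordinary set union. A minimal Python sketch, assuming a path expression is modelled as a set of path-name strings (the function name path_sum is a modelling choice made here for illustration, not something the slides define):

```python
# Assumption: a path expression is modelled as a set of path names,
# so the path sum "+" becomes set union.
X = {"a", "aa", "abc", "abcd", "def"}


def path_sum(x, y):
    """Return X+Y, the union of two sets of paths."""
    return x | y


# Adding the set to itself, or adding a member it already holds,
# leaves the set unchanged (RULE 5).
same = path_sum(X, X)
with_member = path_sum(X, {"abc"})
```

Both same and with_member compare equal to X, which is the rule X+X=X restated for sets.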
Loop
Loops can be understood as an infinite set
of parallel paths. Say that the loop consists
of a single link b. Then the set of all paths
through that loop point is
b^0 + b^1 + b^2 + b^3 + b^4 + b^5 + ...
[Figure: the loop drawn as parallel paths b^0, b^1, b^2, b^3, ..., b^n]
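Loops (continued)
This potentially infinite sum is denoted by b* for an
individual link and by X* when X is a path expression.
[Figure: link a from node 1 to node 2, loop b on node 2, link c from node 2 to node 3]
[Figure: in-links A and B and out-links C, D and E of a node, with the products AC, AD, AE, BC, BD and BE]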
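The star notation can be illustrated by listing the first few terms of the sum. This is a small sketch, assuming link names are single characters and using an iteration cap (max_iterations) that is introduced here only to keep the list finite:

```python
import re


def loop_paths(link, max_iterations):
    """Return b^0, b^1, ..., b^n as strings; b^0 is the empty path."""
    return [link * j for j in range(max_iterations + 1)]


terms = loop_paths("b", 3)

# Every path a b^j c is one member of the path expression ab*c.
# Python's regex star has the same meaning as the path-expression star.
through_loop = ["a" + term + "c" for term in terms]
all_match = all(re.fullmatch("ab*c", path) for path in through_loop)
```

The list stops at b^3 only because of the cap; the expression b* itself places no limit on the number of iterations.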
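A Reduction Procedure: Example
Applying this algorithm to the following
graph, we remove several nodes in order:
[Figure: flowgraph with nodes 1, 3, 4, 5, 6, 2 joined in sequence by links a, b, c, d, e, and nodes 7, 8, 9, 10 attached through links f, g, h, i, j, k, l, m]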
Remove node 10 by applying step 4 and
combine by step 5 to yield
[Figure: node 10 removed; new links il and im]
Remove node 9 by applying steps 4 and 5 to
yield
[Figure: node 9 removed; new links kh and ilh]
Remove node 7 by steps 4 and 5, as
follows:
[Figure: node 7 removed; new links jf and imf]
Remove node 8 by steps 4 and 5, to obtain
[Figure: node 8 removed; new links gkh and gjf]
Parallel Term (step 6)
Removal of node 8 above led to a pair of
parallel links between nodes 4 and 5.
Combine them to create a path expression
for an equivalent link whose path
expression is c+gkh; that is
[Figure: link c+gkh between nodes 4 and 5, with the remaining links gjf, ilh and imf]
Loop Term (step 7)
Removing node 4 leads to a loop term. The
graph has now been replaced with the
following equivalent simpler graph:
[Figure: simplified graph with loop bgjf, link b(c+gkh), and the remaining links a, d, e, ilh and imf]
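Loop-removal operations
[Figure: a node with self-loop z and out-links x and y is replaced by a node with out-links z*x and z*y]
[Figure: link (bgjf)*b(c+gkh)d into node 6, with the remaining links a, e and imf]
[Figure: loop at node 6 removed, giving out-links (ilhd)*e and (ilhd)*imf]
Remove node 3 to yield
[Figure: link a(bgjf)*b(c+gkh)d from node 1 to node 6 and link (ilhd)*e from node 6 to node 2]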
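The node-by-node reduction can be sketched in Python. This is a compact illustration rather than the exact procedure on the slides: it folds a self-loop into the cross terms at the moment its node is removed, so the final expression is equivalent to, but not written in the same form as, the one derived step by step above. The link endpoints in the edges dictionary are read off the example figure as best they can be recovered and should be treated as assumptions; all function names are hypothetical.

```python
import re


def is_sum(expr):
    """True if expr has a '+' outside all parentheses."""
    depth = 0
    for ch in expr:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "+" and depth == 0:
            return True
    return False


def series(x, y):
    """Path product XY, parenthesizing any operand that is a sum."""
    def wrap(e):
        return f"({e})" if is_sum(e) else e
    return wrap(x) + wrap(y)


def star(x):
    """Loop term X*."""
    return f"{x}*" if len(x) == 1 else f"({x})*"


def remove_node(edges, node):
    """Remove one node, replacing it with cross terms (in-link, loop*, out-link)."""
    loop = edges.pop((node, node), None)
    ins = {u: e for (u, v), e in edges.items() if v == node}
    outs = {w: e for (v, w), e in edges.items() if v == node}
    for u in ins:
        del edges[(u, node)]
    for w in outs:
        del edges[(node, w)]
    for u, x in ins.items():
        for w, y in outs.items():
            head = x if loop is None else series(x, star(loop))
            new = series(head, y)
            # Parallel term: sum with any link already joining u and w.
            edges[(u, w)] = f"{edges[(u, w)]}+{new}" if (u, w) in edges else new


def to_regex(expr):
    """Assumption: single-letter link names, so '+' maps to regex '|'."""
    return expr.replace("+", "|")


# Assumed endpoints for the example graph: (from_node, to_node) -> link name.
edges = {
    (1, 3): "a", (3, 4): "b", (4, 5): "c", (5, 6): "d", (6, 2): "e",
    (7, 3): "f", (4, 8): "g", (9, 5): "h", (6, 10): "i",
    (8, 7): "j", (8, 9): "k", (10, 9): "l", (10, 7): "m",
}
for node in [10, 9, 7, 8, 4, 5, 3, 6]:
    remove_node(edges, node)
expression = edges[(1, 2)]
```

Converting the result with to_regex makes it possible to check individual paths, for example that the straight-through path abcde is a member of the expression.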
[Figure: flowgraph with links a through m, containing an outer loop and an inner loop]
Each link represents a single link and consequently is given a
weight of “1” to start. Let's say the outer loop will be taken exactly
four times and the inner loop can be taken zero to three times.
Path expression: a(b+c)d{e(fi)*fgj(m+l)k}*e(fi)*fgh
[Figure: flowgraph with every link weighted 1; outer loop marked (4-4), inner loop marked (0-3)]
Annotate the flowgraph by replacing each link name with the
maximum number of paths through that link (1), and note the number of
times each loop is taken.
[Figure: flowgraph after the parallel links are combined]
Combine the first pair of parallel links outside the loop and also
the pair in the outer loop.
[Figure: flowgraph after the serial links are multiplied out]
Multiply things out and remove nodes to clear the clutter.
For the inner loop:
[Figure: flowgraph after the inner loop is reduced]
Take care of the inner loop: there are four possibilities (zero to three
iterations), leading to a weight of 4. Then multiply by the following link weight.
[Figure: flowgraph with the outer loop, taken exactly four times, as the last remaining loop]
2 × 8^4 × 4 = 32768
Alternatively, you could have substituted a “1” for each link in the
path expression and then simplified, as follows:
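1(1+1)1 (1(1×1)^3 1×1×1(1+1)1)^4 1(1×1)^3 1×1×1
= 2(1^3 × 1 × (2))^4 1^3
but 1^3 = 1 + 1^1 + 1^2 + 1^3 = 4 (the exponent 3 marks a loop taken zero to three times)
= 2(4 × 2)^4 × 4 = 2 × 8^4 × 4 = 32768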
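The same arithmetic can be checked mechanically. The sketch below assumes the usual maximum-path-count rules implied by the calculation above: parallel weights add, series weights multiply, and a loop taken between a minimum and a maximum number of times contributes the sum of the corresponding powers. The function names are hypothetical.

```python
from math import prod


def parallel(*weights):
    """A+B: path counts add."""
    return sum(weights)


def series(*weights):
    """AB: path counts multiply."""
    return prod(weights)


def loop(weight, min_times, max_times):
    """A loop taken min_times..max_times times: sum of weight**j."""
    return sum(weight ** j for j in range(min_times, max_times + 1))


# Path expression: a(b+c)d{e(fi)*fgj(m+l)k}*e(fi)*fgh, every link weighted 1.
inner = loop(series(1, 1), 0, 3)                            # (fi)*, 0 to 3 times
outer_body = series(1, inner, 1, 1, 1, parallel(1, 1), 1)   # e(fi)*fgj(m+l)k
outer = loop(outer_body, 4, 4)                              # exactly 4 times
total = series(1, parallel(1, 1), 1, outer, 1, inner, 1, 1, 1)
```

Here inner evaluates to 4, outer_body to 8, and total to 2 × 8^4 × 4 = 32768, matching the hand calculation.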
Structured Flowgraph
A structured flowgraph is one that can be
reduced to a single link by successive
application of the following
transformations:
[Figure: the four transformations, each reducing a construct to a single link: PROCESS, IF THEN ELSE, WHILE DO, REPEAT UNTIL]
Structured Flowgraph (continued)
Flow graphs that do not contain one or
more of the graphs shown below as
subgraphs are structured.
Jumping into loops
Jumping out of loops
Branching into decisions
Branching out of decisions
[Figure: unstructured subgraphs, including branching into decisions and branching out of decisions]
Lower Path Count Arithmetic
A lower bound on the number of paths in a
routine can be approximated for
structured flow graphs.
The values of the weights are the number of
members in a set of paths.
The arithmetic is as follows:

Case       Path expression   Weight expression
Parallels  A+B               W_A + W_B
Series     AB                max(W_A, W_B)
Loop       A^n               1, W_1
Minimum Path Count: Example
[Figure: the flowgraph with links a through m, every link weighted 1; outer loop marked (4-4), inner loop marked (0-3)]
[Figure: flowgraph after the parallel links are combined]
Combine the first pair of parallel links outside the loop and also
the pair in the outer loop.
[Figure: successive reductions of the weighted flowgraph, ending in a lower bound of 2]
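The lower-bound arithmetic can be sketched the same way. The parallel and series rules come straight from the table (sum and max). The loop rule is listed only as "1, W_1", so the reading used here is an assumption: a loop that may be skipped contributes 1, and a loop that must be taken contributes the weight of its body. The function names are hypothetical.

```python
def lb_parallel(*weights):
    """A+B: W_A + W_B."""
    return sum(weights)


def lb_series(*weights):
    """AB: max(W_A, W_B)."""
    return max(weights)


def lb_loop(body_weight, min_times):
    """Assumed reading of '1, W_1': 1 if the loop can be skipped, else the body weight."""
    return 1 if min_times == 0 else body_weight


# Path expression: a(b+c)d{e(fi)*fgj(m+l)k}*e(fi)*fgh, every link weighted 1.
lb_inner = lb_loop(lb_series(1, 1), 0)                                # (fi)*, 0 to 3 times
lb_outer_body = lb_series(1, lb_inner, 1, 1, 1, lb_parallel(1, 1), 1)
lb_outer = lb_loop(lb_outer_body, 4)                                  # exactly 4 times
lower_bound = lb_series(1, lb_parallel(1, 1), 1, lb_outer, 1, lb_inner, 1, 1, 1)
```

Under that reading lower_bound comes out as 2, the value the example reduces to.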
Mean Processing Times of Routines
Given the execution time of all statements
or instructions for every link in a flowgraph,
and the probability of each direction for all
decisions, find the mean processing
time for the routine as a whole.
The model has two weights associated with
every link: the processing time for that link,
denoted by T, and the probability of that
link, denoted by P.
The rules for mean processing times are:

Case       Path expression   Weight expression
Parallels  A+B               T_{A+B} = (P_A T_A + P_B T_B)/(P_A + P_B)
                             P_{A+B} = P_A + P_B
Series     AB                T_{AB} = T_A + T_B
                             P_{AB} = P_A P_B
Loop       A^n               T_A = T_L P_L/(1 - P_L)
                             P_A = P_A/(1 - P_L)
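The three rules translate directly into small functions. This is a minimal sketch with hypothetical function names; each link is represented by a (time, probability) pair, and the loop rule is implemented exactly as written in the table, returning the mean time contributed by the loop.

```python
def mean_parallel(t_a, p_a, t_b, p_b):
    """A+B: probability-weighted mean time, and the summed probability."""
    p = p_a + p_b
    return (p_a * t_a + p_b * t_b) / p, p


def mean_series(t_a, p_a, t_b, p_b):
    """AB: times add, probabilities multiply."""
    return t_a + t_b, p_a * p_b


def loop_time(t_l, p_l):
    """Mean time spent in a loop of time t_l taken with probability p_l."""
    return t_l * p_l / (1 - p_l)


def loop_exit_probability(p_a, p_l):
    """Probability of the exit link once the loop has been removed."""
    return p_a / (1 - p_l)


# Sample values taken from the example figure in these slides.
t_par, p_par = mean_parallel(20, 0.95, 300, 0.05)   # parallel links 20 and 300
t_loop = loop_time(116, 0.3)                        # loop of 116 taken with P = 0.3
```

With these inputs t_par is about 34 and t_loop about 49.714, the values that appear in the example's reduction.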
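Example
[Figure: flowgraph with a processing time on each link and branch probabilities in parentheses]
[Figure: parallel links combined, giving links of 34 and 35.5]
Combine as many serial links as you can.
[Figure: serial links combined, giving links of 63 and 61.5]
Use the cross-term step to eliminate a node
and to create the inner self-loop.
[Figure: inner self-loop of 20 with probability 0.6, followed by a link of 13]
[Figure: inner loop removed, contributing 20 × 0.6 / (1 - 0.6) = 30]
[Figure: serial links combined, giving a link of 53]
[Figure: outer loop of 116 with probability 0.3, between links of 61.5 and 60]
[Figure: outer loop removed, contributing 116 × 0.3 / (1 - 0.3) = 49.714]
Mean processing time: 61.5 + 49.714 + 60 = 171.214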
Regular Expressions and Flow Anomaly
Detection
The Problem
The generic flow-anomaly detection problem is that of looking for a
specific sequence of operations considering all possible paths
through a routine.
Example:
Let the operations be SET and RESET, denoted by s and r
respectively, and we want to know if there is a SET followed
immediately by a SET or a RESET followed immediately by a RESET
(i.e., an ss or an rr sequence).
1) A file can be opened (o), closed (c), read (r), or written (w).
If the file is read or written to after it’s been closed, the sequence is nonsensical.
Therefore, cr and cw are anomalous.
Similarly, if the file is read before it’s been written, just after opening, we may
have a bug. Therefore, the sequence or is also anomalous.
2) A tape transport can do a rewind (d), fast-forward (f), read (r), write (w), stop
(p), and skip (k). The following sequences are anomalous: df, dr, dw, fd, and fr.
Does the flowgraph lead to anomalous sequences on any path? If so, what
sequences and under what circumstances?
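For a single path, the check amounts to scanning adjacent pairs of operations. The sketch below assumes the operations along one path have already been written out as a string of single-letter codes. It does not enumerate the paths of a flowgraph, and the function and constant names are hypothetical.

```python
def find_anomalies(sequence, anomalous_pairs):
    """Return every adjacent pair in sequence that is listed as anomalous."""
    found = []
    for first, second in zip(sequence, sequence[1:]):
        pair = first + second
        if pair in anomalous_pairs:
            found.append(pair)
    return found


# Anomalous pairs as listed above for the file and the tape transport.
FILE_ANOMALIES = {"cr", "cw", "or"}
TAPE_ANOMALIES = {"df", "dr", "dw", "fd", "fr"}

clean = find_anomalies("owrc", FILE_ANOMALIES)     # open, write, read, close
faulty = find_anomalies("orwcr", FILE_ANOMALIES)   # read after open, read after close
```

The sequence owrc yields no anomalies, while orwcr reports both or and cr.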
The Method
Annotate each link in the graph with the appropriate operator or the
null operator 1.
Simplify things to the extent possible, using the fact that
a + a = a and 1^2 = 1.
We get a regular expression that denotes all the possible
sequences of operators in that graph.
Examine that regular expression for the sequences of interest.
Huang's Theorem
As an example, let
A = pp
B = srr
C = rp
T = ss
The theorem states that ss will appear in pp(srr)^n rp if it appears in
pp(srr)^2 rp. We don’t need the theorem to see that ss does not
appear in the given string. However, let
A = p + pp + ps
C = rp
T = p^4
Huang’s theorem can be easily generalized to cover sequences of
greater length than two characters.
Beyond three characters, though, things get complex and this method
has probably reached its utilitarian limit for manual application.
If A, B, and C are nonempty sets of strings of one or more characters,
and if T is a string of k characters, and if T is a substring of AB^n C,
where n is greater than or equal to k, then T is a substring of AB^k C.
A sufficient test for strings of length k can be obtained by
substituting P^k for every appearance of P* (or P^n, where n is
greater than or equal to k). Recall that P^k, used as a loop bound,
stands for 1 + P + P^2 + P^3 + . . . + P^k.
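The sufficient test can be tried out on the first example (A = pp, B = srr, C = rp). The sketch below assumes A, B and C are given as finite sets of strings and expands AB^nC only for n from 0 to k, where k is the length of T. The function names are hypothetical.

```python
from itertools import product


def concat(*string_sets):
    """All concatenations formed by taking one string from each set, in order."""
    return {"".join(parts) for parts in product(*string_sets)}


def power(strings, n):
    """B^n: n-fold concatenation of a set of strings (B^0 is the empty string)."""
    result = {""}
    for _ in range(n):
        result = concat(result, strings)
    return result


def may_contain(A, B, C, T):
    """Check T against A B^n C for n = 0..k, where k = len(T)."""
    k = len(T)
    return any(
        T in candidate
        for n in range(k + 1)
        for candidate in concat(A, power(B, n), C)
    )


has_ss = may_contain({"pp"}, {"srr"}, {"rp"}, "ss")
has_rr = may_contain({"pp"}, {"srr"}, {"rp"}, "rr")
```

For this example has_ss is False and has_rr is True: no expansion contains ss, while rr already appears in ppsrrrp.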
A warning concerning the use of regular expressions: there are
almost no other useful identities beyond those shown earlier for
the path expressions.
All flow analysis methods lose accuracy and utility if there are
unachievable paths.
Regular Expressions and Flow
Anomaly Detection
The flow anomaly detection problem is that of looking for
a specific sequence of operations considering all possible
paths through a routine.
Here we are interested in knowing whether a specific
sequence occurred but not what the net effect of the
routine is.
The method of anomaly detection:
Annotate each link in the graph with the appropriate operator or
the null operator (1).
Simplify things to the extent possible.
After performing the above two steps you obtain a Regular
Expression that denotes the possible sequences of operators in
that graph.
You can now examine that regular expression for the sequence of
interest.