Lecture3 E
• C statement
printf("Total = %d\n", score);
• Both printf and score are lexemes matching the
pattern for token id, and "Total = %d\n" is a lexeme
matching the pattern for token literal.
Covering most or all of the tokens
1. One token for each keyword. The pattern for a
keyword is the same as the keyword itself.
2. Tokens for the operators, either individually or in
classes.
3. One token representing all identifiers.
4. One or more tokens representing constants, such
as numbers & literal strings.
5. Tokens for each punctuation symbol, such as left
& right parentheses, comma, & semicolon.
Attributes for Tokens
• When more than one lexeme can match a pattern, the
lexical analyzer must provide the subsequent compiler
phases additional information about the particular lexeme
that matched.
• For example, the pattern for token number matches both
0 and 1, but it is extremely important for the code
generator to know which lexeme was found in the source
program.
• Thus, in many cases the lexical analyzer returns to the
parser not only a token name, but also an attribute value that
describes the lexeme represented by the token.
• Token name influences parsing decisions, while the
attribute value influences translation of tokens after the
parse.
Attributes for Tokens
• Tokens have at most one associated attribute,
although this attribute may have a structure
that combines several pieces of information.
• Normally, information about an identifier (e.g.,
its lexeme, its type, and the location at which
it is first found) is kept in the symbol table.
• Thus, the appropriate attribute value for an
identifier is a pointer to the symbol-table
entry for that identifier.
Example 3.2 : The token names & associated
attribute values for the Fortran statement
E = M * C ** 2
• Sequence of pairs:
<id, pointer to symbol-table entry for E>
<assign_op> [no attribute value needed]
<id, pointer to symbol-table entry for M>
<mult_op> [no attribute value needed]
<id, pointer to symbol-table entry for C>
<exp_op> [no attribute value needed]
<number, integer value 2>
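The pairs in Example 3.2 can be reproduced with a small sketch. The token patterns, the names TOKEN_SPEC and tokenize, and the use of a list index as the "pointer" into the symbol table are all assumptions made for illustration; they are not taken from the lecture.

```python
import re

# Hypothetical token patterns. Order matters: ** must be tried before *.
TOKEN_SPEC = [
    ("number",    r"\d+"),
    ("id",        r"[A-Za-z][A-Za-z0-9]*"),
    ("exp_op",    r"\*\*"),
    ("mult_op",   r"\*"),
    ("assign_op", r"="),
    ("ws",        r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(source):
    """Return (tokens, symtab); each token is a (name, attribute) pair."""
    symtab = {}   # lexeme -> index of its symbol-table entry (stands in for a pointer)
    tokens = []
    pos = 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise ValueError(f"no token pattern matches at offset {pos}")
        kind, lexeme = m.lastgroup, m.group()
        pos = m.end()
        if kind == "ws":
            continue                      # whitespace is not passed to the parser
        if kind == "id":
            tokens.append((kind, symtab.setdefault(lexeme, len(symtab))))
        elif kind == "number":
            tokens.append((kind, int(lexeme)))
        else:
            tokens.append((kind, None))   # operators need no attribute value
    return tokens, symtab

tokens, symtab = tokenize("E = M * C ** 2")
for token in tokens:
    print(token)
```

Operators carry None as their attribute, which mirrors the pairs above where the token name alone identifies the lexeme.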
Lexical Errors
• It is hard for a lexical analyzer to tell, without the
aid of other components, that there is a source-
code error.
• For instance, if the string fi is encountered for the
first time in a C program in the context :
fi ( a == f (x) ) . . .
• A lexical analyzer cannot tell whether fi is a
misspelling of the keyword if or an undeclared
function identifier.
• Since fi is a valid lexeme for the token id, the
lexical analyzer must return the token id to the
parser.
Lexical Errors
• Let the parser handle an error due to transposition of the
letters (here, fi in place of if).
• However, suppose a situation arises in which the lexical
analyzer is unable to proceed because none of the patterns
for tokens matches any prefix of the remaining input.
• Error Recovery
The simplest recovery strategy is "panic mode" recovery.
We delete successive characters from the remaining input,
until the lexical analyzer can find a well-formed token at the
beginning of what input is left.
This recovery technique may confuse the parser, but in an
interactive computing environment it may be quite
adequate.
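A minimal sketch of panic-mode recovery, assuming a tiny hypothetical token set (identifiers, integers, and a few one-character operators). The names TOKEN and scan are illustrative only. When no pattern matches at the current position, the scanner deletes one character and tries again.

```python
import re

# Hypothetical token patterns for a tiny language (an assumption for this sketch).
TOKEN = re.compile(r"(?P<id>[A-Za-z_]\w*)|(?P<number>\d+)|(?P<op>[=+*()])")

def scan(source):
    """Return (tokens, skipped), where skipped holds the deleted characters."""
    tokens, skipped = [], []
    pos = 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1
            continue
        m = TOKEN.match(source, pos)
        if m:
            tokens.append((m.lastgroup, m.group()))
            pos = m.end()
        else:
            # Panic mode: no pattern matches any prefix of the remaining
            # input, so delete one character and retry.
            skipped.append(source[pos])
            pos += 1
    return tokens, skipped

tokens, skipped = scan("x = 3 @# + y")
print(tokens)
print(skipped)
```

A real lexer would also report the position of each deleted character so that the error message is useful to the programmer.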
• Other Recovery
This last rule says that we can add additional pairs of parentheses
around expressions without changing the language they denote.
• We may drop certain pairs of parentheses
• Conventions
• a) The unary operator * has highest precedence & is
left associative.
• b) Concatenation has second highest precedence
and is left associative.
• c) | has lowest precedence and is left associative.
• Example: (a)|((b)*(c)) = a|b*c.
• Both expressions denote the set of strings that are
either a single a or zero or more b's followed by
one c.
Example 3.4 : Let Σ = {a, b}.
1. The regular expression a|b denotes the language {a, b}.
2. (a|b)(a|b) denotes {aa, ab, ba, bb}, the language of all
strings of length two over the alphabet Σ. Another regular
expression for the same language: aa | ab | ba | bb
3. a* denotes the language consisting of all strings of zero or
more a's: {ε, a, aa, aaa, ... }.
4. (a|b)* denotes the set of all strings consisting of zero or
more instances of a or b, that is,
all strings of a's and b's: {ε, a, b, aa, ab, ba, bb, aaa, ... }.
Another regular expression for the same language: (a*b*)*.
5. a|a*b denotes the language {a, b, ab, aab, aaab, ... }:
the string a & all strings consisting of zero or more a's &
ending in b.
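The languages in Example 3.4 can be spot-checked by enumerating short strings. This is only a sketch: the helper name language is hypothetical, and Python's re syntax is used because it happens to agree with the notation above for these simple expressions.

```python
import re
from itertools import product

def language(pattern, max_len, alphabet="ab"):
    """Strings over the alphabet, up to max_len long, matched in full by pattern."""
    regex = re.compile(pattern)
    return [
        "".join(chars)
        for n in range(max_len + 1)
        for chars in product(alphabet, repeat=n)
        if regex.fullmatch("".join(chars))
    ]

print(language("a|b", 3))          # item 1
print(language("(a|b)(a|b)", 3))   # item 2
print(language("a*", 3))           # item 3 (the empty string stands for ε)
print(language("(a|b)*", 2) == language("(a*b*)*", 2))   # item 4
print(language("a|a*b", 3))        # item 5
```

Enumeration up to a fixed length cannot prove two expressions equivalent; it only shows that they agree on every string that short.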
• Regular set: A language that can be defined by
a regular expression
• If two regular expressions r and s denote the
same regular set, we say they are equivalent
and write r = s. For instance, (a|b) = (b|a).
Algebraic laws for regular expressions r, s, & t
Regular Definition
• If Σ is an alphabet of basic symbols, then a
regular definition is a sequence of definitions
of the form:
• d1 → r1
• d2 → r2
  ...
• dn → rn
• 1. Each di is a new symbol, not in Σ and not the
same as any other of the d's,
• 2. Each ri is a regular expression over the
alphabet Σ U {d1, d2, ..., di-1}.
• How to obtain a regular expression over Σ alone for each di: first replace
uses of d1 in r2 (which cannot use any of the d's except for d1), then replace
uses of d1 and d2 in r3 by r1 and (the substituted) r2, and so on.
• digit → 0 | 1 | ··· | 9
• optionalFraction → . digits | ε
• optionalExponent → ( E ( + | - | ε ) digits ) | ε
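The substitution procedure for regular definitions can be sketched as follows. The entries digits and number are assumptions added to make the example complete (they follow the usual textbook definition of unsigned numbers); the slide lists only digit, optionalFraction, and optionalExponent. The {name} placeholder syntax and the function expand are also hypothetical.

```python
import re

# Regular definitions in order; each body may use only earlier names.
# "digits" and "number" are assumed here, not taken from the slide.
DEFINITIONS = [
    ("digit",            r"[0-9]"),
    ("digits",           r"{digit}{digit}*"),
    ("optionalFraction", r"(\.{digits})?"),
    ("optionalExponent", r"(E[+-]?{digits})?"),
    ("number",           r"{digits}{optionalFraction}{optionalExponent}"),
]

def expand(definitions):
    """Replace each use of an earlier name d_j by its already expanded r_j."""
    expanded = {}
    for name, body in definitions:
        for earlier, regex in expanded.items():
            body = body.replace("{" + earlier + "}", "(?:" + regex + ")")
        expanded[name] = body
    return expanded

NUMBER = re.compile(expand(DEFINITIONS)["number"])

for text in ["5280", "0.01234", "6.336E4", "1.89E-4", "1.", ".5"]:
    print(text, bool(NUMBER.fullmatch(text)))
```

Because each definition may refer only to earlier ones, a single left-to-right pass is enough; no recursion is possible.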
re       Meaning
+        single + character
!        single ! character
=        single = character
!=       2-character sequence
<=       2-character sequence
xyzzy    5-character sequence
Extensions of Regular Expressions
• 1. One or more instances. The unary, postfix operator + represents the
positive closure of a regular expression and its language.
That is, if r is a regular expression, then (r)+ denotes the language (L(r))+.
The operator + has the same precedence and associativity as the operator *.
Two useful algebraic laws, r* = r+ | ε and r+ = rr* = r*r, relate the Kleene
closure & positive closure.
• 2. Zero or one instance. The unary postfix operator ? means "zero or one
occurrence." That is, r? is equivalent to r | ε, or put another way, L(r?) = L(r)
U {ε}.
The ? operator has the same precedence and associativity as * and +.
• 3. Character classes. A regular expression a1 | a2 | ··· | an, where the ai's are
each symbols of the alphabet, can be replaced by the shorthand [a1 a2 ···
an].
When a1, a2, ..., an form a logical sequence, e.g., consecutive uppercase
letters, lowercase letters, or digits, we can replace them by a1-an, that is,
just the first and last separated by a hyphen.
[abc] = a|b|c, [a-z] = a|b| ··· |z
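The laws for the extended operators can be spot-checked on sample strings. This is a sketch, not a proof: same_language is a hypothetical helper, and an empty alternative in Python's re syntax (as in a+| ) plays the role of ε.

```python
import re

def same_language(p, q, samples):
    """True if patterns p and q agree on every sample string (a spot check only)."""
    return all(bool(re.fullmatch(p, s)) == bool(re.fullmatch(q, s)) for s in samples)

SAMPLES = ["", "a", "aa", "aaa", "b", "ab", "c", "abc"]

print(same_language(r"a+", r"aa*", SAMPLES))        # r+ = rr*
print(same_language(r"a*", r"a+|", SAMPLES))        # r* = r+ | ε
print(same_language(r"a?", r"a|", SAMPLES))         # r? = r | ε
print(same_language(r"[abc]", r"a|b|c", SAMPLES))   # character class shorthand
```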
Recognition of Tokens
• Build a piece of code that examines the input
string & finds a prefix that is a lexeme
matching one of the patterns.
Example: Patterns
Terminals: if, then, else, relop, id, number (names of tokens)
ws → ( blank | tab | newline )+
a) The same symbol can label edges from one state to several
different states,
b) An edge may be labeled by ε, the empty string, instead of, or
in addition to, symbols from the input alphabet.
• Example: R = (a|b)*abb
NFA
Thus, the only strings getting to the accepting state are those that
end in abb.
Transition Tables
• Rows: states
• Columns: input symbols and ε.
• The entry for a given state & input is the
value of the transition function
applied to those arguments.
• If the transition function has no
information about that state-input
pair, put Ø.
• Adv: Easily find the transitions on a
given state and input.
• Disadv: Takes a lot of space when
the input alphabet is large.
Acceptance of Input Strings by
Automata
• An NFA accepts input string x if & only if there
is some path in the transition graph from the
start state to one of the accepting states, such that
the symbols along the path spell out x.
• ε labels along the path are effectively ignored,
since the empty string does not contribute to
the string constructed along the path.
Example : The string aabb is accepted by the NFA
Example: Chessboard
• States = squares.
• Inputs = r (move to an adjacent red square)
and b (move to an adjacent black square).
• Start state & final state are in opposite
corners.
Example: Chessboard – (2)
Board:
1 2 3
4 5 6
7 8 9
Transition table (start state 1; * marks the final state):
       r          b
  1    2,4        5
  2    4,6        1,3,5
  3    2,6        5
  4    2,8        1,5,7
  5    2,4,6,8    1,3,7,9
  6    2,8        3,5,9
  7    4,8        5
  8    4,6        5,7,9
* 9    6,8        5
Input r b b: {1} → {2,4} → {1,3,5,7} → {1,3,5,7,9}
Accept, since final state 9 is reached.
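The acceptance check for the chessboard NFA can be sketched by tracking the set of squares reachable after each input symbol. The table is copied from the slide; the names DELTA, run, and accepts are illustrative.

```python
# Transition table of the chessboard NFA, copied from the slide.
DELTA = {
    1: {"r": {2, 4},       "b": {5}},
    2: {"r": {4, 6},       "b": {1, 3, 5}},
    3: {"r": {2, 6},       "b": {5}},
    4: {"r": {2, 8},       "b": {1, 5, 7}},
    5: {"r": {2, 4, 6, 8}, "b": {1, 3, 7, 9}},
    6: {"r": {2, 8},       "b": {3, 5, 9}},
    7: {"r": {4, 8},       "b": {5}},
    8: {"r": {4, 6},       "b": {5, 7, 9}},
    9: {"r": {6, 8},       "b": {5}},
}
START, FINAL = 1, {9}

def run(word):
    """Return the list of state sets reached after each prefix of word."""
    current = {START}
    trace = [current]
    for symbol in word:
        # Follow every possible move from every state we might be in.
        current = set().union(*(DELTA[s][symbol] for s in current))
        trace.append(current)
    return trace

def accepts(word):
    """Accept if at least one path ends in a final state."""
    return bool(run(word)[-1] & FINAL)

print(run("rbb"))
print(accepts("rbb"), accepts("rb"))
```

Tracking a set of states instead of a single state is the same idea that the subset construction later turns into a DFA.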
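Example
• An NFA accepting all strings that end in 01
[Figure: transition diagram. Start state q0 loops on 0,1; q0 goes to q1 on 0; q1 goes to q2 on 1; q2 is accepting.]
Input: 00101
States reached after each symbol:
(start)   {q0}
0         {q0, q1}
0         {q0, q1}   (the earlier q1 branch is stuck)
1         {q0, q2}
0         {q0, q1}   (the q2 branch is stuck)
1         {q0, q2}   Accepted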
Example
• NFA that has an input alphabet {0} consisting of a
single symbol. It accepts all strings of the form 0^k
where k is a multiple of 2 or 3 (accept: ε, 00, 000, 0000,
000000 but not 0 or 00000)
Transition Table
NFA A = ({q0, q1, q2}, {0, 1}, δ, q0, {q2})
[Figure: transition diagram. Start state q0 loops on 0,1; q0 goes to q1 on 0; q1 goes to q2 on 1.]
        0           1
 q0     {q0,q1}     {q0}
 q1     Ø           {q2}
*q2     Ø           Ø
Transition Table
• Accept all strings that contain either 101
or 11 as a substring (e.g., 010110)
[Figure: NFA transition diagram with states q1 (start), q2, q3, q4]
Deterministic Finite Automata (DFA)
1. There are no moves on input ε.
2. For each state s & input symbol a, there is
exactly one edge out of s labeled a.
• If we are using a transition table to represent
a DFA, then each entry is a single state.
• Represent this state without the curly braces
that we use to form sets.
• Lexical Analyzer---DFA
Algorithm: Simulating a DFA.
• INPUT: An input string x terminated by an end-of-
file character eof. A DFA D with start state s0 ,
accepting states F, and transition function move.
• OUTPUT: Answer "yes" if D accepts x ; "no"
otherwise.
• METHOD: Apply the algorithm to the input string
x. The function move(s, c) gives the state to which
there is an edge from state s on input c. The
function nextChar returns the next character of
the input string x.
Example: for the DFA of (a|b)*abb and input string ababb,
the simulation enters the
sequence of states 0, 1, 2, 1, 2, 3
& returns "yes."
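The simulation can be sketched as below. The transition table MOVE is an assumption: it is the usual four-state DFA for (a|b)*abb, chosen because it reproduces the state sequence 0, 1, 2, 1, 2, 3 quoted above. Here next_char returns None in place of the eof character.

```python
# Assumed transition function move(s, c) for a DFA of (a|b)*abb.
MOVE = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 1, (2, "b"): 3,
    (3, "a"): 1, (3, "b"): 0,
}
S0, F = 0, {3}   # start state and accepting states

def simulate(x):
    """Return ("yes" or "no", list of states visited) for input string x."""
    chars = iter(x)
    next_char = lambda: next(chars, None)   # None plays the role of eof
    s = S0
    visited = [s]
    c = next_char()
    while c is not None:
        s = MOVE[(s, c)]      # exactly one edge out of s on c
        visited.append(s)
        c = next_char()
    return ("yes" if s in F else "no"), visited

print(simulate("ababb"))
print(simulate("abab"))
```

Each input character costs one table lookup, so the running time is linear in the length of x.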
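Example
Draw the Transition Diagram for the DFA
accepting all strings with a substring 01.
[Figure: transition diagram. Start state q0 loops on 1; q0 goes to q2 on 0; q2 loops on 0; q2 goes to q1 on 1; q1 loops on 0,1 and is accepting.]
A = ({q0, q1, q2}, {0, 1}, δ, q0, {q1})
Check with the strings 01, 11010, 100011,
0111, 110101, 11101101, 111000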
Transition Function & Table
Transition function:
δ(q0,0) = q2
δ(q0,1) = q0
δ(q1,0) = q1
δ(q1,1) = q1
δ(q2,0) = q2
δ(q2,1) = q1
Transition table (start state q0; * marks the accepting state):
        0     1
 q0     q2    q0
*q1     q1    q1
 q2     q2    q1
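The check strings listed for this DFA can be run through the transition function directly. This sketch copies δ from the slide; the names DELTA and accepts are illustrative.

```python
# Transition function of the DFA for "contains the substring 01".
DELTA = {
    ("q0", "0"): "q2", ("q0", "1"): "q0",
    ("q1", "0"): "q1", ("q1", "1"): "q1",
    ("q2", "0"): "q2", ("q2", "1"): "q1",
}
START, ACCEPTING = "q0", {"q1"}

def accepts(w):
    """Run the DFA on w and report whether it ends in an accepting state."""
    state = START
    for c in w:
        state = DELTA[(state, c)]
    return state in ACCEPTING

for w in ["01", "11010", "100011", "0111", "110101", "11101101", "111000"]:
    print(w, accepts(w))
```

Only 111000 is rejected: it is the one check string in which no 0 is ever followed by a 1.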
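Example
Let us design a DFA to accept the language
L = {w | w has both an even number of 0's
and an even number of 1's}
• q0: number of 0's even, number of 1's even
• q1: number of 0's even, number of 1's odd
• q2: number of 0's odd, number of 1's even
• q3: number of 0's odd, number of 1's odd
Transition table (q0 is both the start state and the accepting state):
        0     1
*q0     q2    q1
 q1     q3    q0
 q2     q0    q3
 q3     q1    q2
[Figure: transition diagram over q0, q1, q2, q3; a 0 toggles the parity of 0's and a 1 toggles the parity of 1's]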
Example: Try Yourself
• A = {w | w contains at least one 1 and an even
number of 0s follow the last 1}
• Hints: A1 = (Q, Σ, δ, q1, F)
1. Q = {q1, q2, q3}
2. Σ = {0, 1}
3. δ: try yourself
4. Start state: q1
5. Final state: {q2}
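DFA vs. NFA
[Figure: a deterministic computation is a single path ending in accept/reject; a nondeterministic computation is a parallel computation tree whose branches end in accept or reject]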
NFA to DFA
• Subset Construction Algorithm
Subset Construction
• Given an NFA with states Q, inputs Σ,
transition function δN, start state q0, and
final states F, construct equivalent DFA with:
– States 2^Q (set of subsets of Q).
– Inputs Σ.
– Start state {q0}.
– Final states = all those with a member of F.
Subset Construction
• Given NFA: N = (QN, Σ, δN, q0, FN)
• Goal: DFA D = (QD, Σ, δD, {q0}, FD)
• L(D) = L(N)
States
QD is the set of subsets of QN
- QD is the power set of QN
- If QN has n states, QD will have 2^n states
Inaccessible states can be thrown away, so
effectively the number of states of D is often much smaller than 2^n
Subset construction
Final States
• FD is the set of subsets S of QN such that S ∩ FN ≠ Ø.
That is, FD is all sets of N's states that
include at least one accepting state of N.
Transition Function
• The transition function δD is defined by:
δD({q1,…,qk}, a) is the union over all i = 1,…,k of
δN(qi, a).
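The definitions above can be sketched as a "lazy" subset construction that builds only the DFA states reachable from {q0}, so inaccessible subsets are never created. The function name subset_construction and the dictionary layout of δN are assumptions; the sample NFA is the one accepting strings that end in 01, and ε-moves are not handled.

```python
from collections import deque

def subset_construction(delta_n, start, finals, alphabet):
    """Build the reachable part of the DFA for an NFA without ε-moves."""
    start_d = frozenset({start})
    delta_d = {}
    seen = {start_d}
    queue = deque([start_d])
    while queue:
        S = queue.popleft()
        for a in alphabet:
            # δD(S, a) is the union of δN(q, a) over all q in S.
            T = frozenset().union(*(delta_n.get((q, a), set()) for q in S))
            delta_d[(S, a)] = T
            if T not in seen:
                seen.add(T)
                queue.append(T)
    finals_d = {S for S in seen if S & finals}   # sets containing a final state of N
    return seen, delta_d, start_d, finals_d

# NFA accepting all strings over {0, 1} that end in 01.
DELTA_N = {
    ("q0", "0"): {"q0", "q1"},
    ("q0", "1"): {"q0"},
    ("q1", "1"): {"q2"},
}
states, delta_d, start_d, finals_d = subset_construction(DELTA_N, "q0", {"q2"}, "01")
for S in states:
    print(sorted(S), "accepting" if S in finals_d else "")
```

For this NFA only three of the 2^3 = 8 possible subsets are reachable.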
Subset Construction: Example 1
• Example: We’ll construct the DFA
equivalent of our “chessboard” NFA.
1 2 3
4 5 6
7 8 9
Example: Subset Construction
Starting from {1}, each newly discovered set of squares becomes a DFA state,
and its r and b entries are the unions of the corresponding NFA entries.
The rows are listed in the order in which the states are discovered
(start state {1}; * marks the final states, which contain square 9):
                   r            b
  {1}              {2,4}        {5}
  {2,4}            {2,4,6,8}    {1,3,5,7}
  {5}              {2,4,6,8}    {1,3,7,9}
  {2,4,6,8}        {2,4,6,8}    {1,3,5,7,9}
  {1,3,5,7}        {2,4,6,8}    {1,3,5,7,9}
* {1,3,7,9}        {2,4,6,8}    {5}
* {1,3,5,7,9}      {2,4,6,8}    {1,3,5,7,9}
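Example 2
[Figure: NFA transition diagram. Start state q0 loops on 0,1; q0 goes to q1 on 0; q1 goes to q2 on 1.]
Complete subset construction (start state {q0}; * marks accepting states):
                  0           1
  Ø               Ø           Ø
  {q0}            {q0,q1}     {q0}
  {q1}            Ø           {q2}
* {q2}            Ø           Ø
  {q0,q1}         {q0,q1}     {q0,q2}
* {q0,q2}         {q0,q1}     {q0}
* {q1,q2}         Ø           {q2}
* {q0,q1,q2}      {q0,q1}     {q0,q2}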
Example 2
• NFA N accepts all strings that end in 01
• N's set of states: {q0, q1, q2}, i.e., 3 states
• Subset construction: the DFA needs 2^3 = 8 states
• Assign new names: A for Ø, B for {q0}, and so on
      0    1
  A   A    A
  B   E    B
  C   A    D
* D   A    A
  E   E    F
* F   E    B
* G   A    D
* H   E    F
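Example 2
[Figure: DFA restricted to the accessible states. B (start) loops on 1 and goes to E on 0; E loops on 0 and goes to F on 1; F goes to E on 0 and to B on 1; F is accepting.]
• Of the 8 states, starting in start state B, we can only reach states B, E
& F.
• The other 5 states are inaccessible from B.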
Example 3
• N = (Q, {a, b}, δ, 1, {1})
• Q = {1, 2, 3}, i.e., 3 states
• DFA states: 2^3 = 8
• {Ø, {1}, {2}, {3}, {1, 2}, {1, 3},
{2, 3}, {1, 2, 3}}
[Figure: transition diagram of N with states 1, 2, 3]
a b
{1} {2} {3}
{2} {2, 3} {3}
{3} {1, 3}
{1, 2} {2, 3} {2, 3}
{1, 3} {1, 3} {2}
{2, 3} {1, 2, 3} {3}
{1, 2, 3} {1, 2, 3} {2, 3}
[Figure: transition diagram of the resulting DFA over the states {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}]
Example 3
Simplified: no incoming arrows point at states {1} & {1, 2}.
They may be removed without affecting the performance.
[Figure: simplified DFA over the states {1, 3}, {3}, {2}, {2, 3}, {1, 2, 3}]
Closure of States
• CL(q) = set of states you can reach from state
q following only arcs labeled ε.
• Example: CL(A) = {A};
CL(E) = {B, C, D, E}.
[Figure: ε-NFA with states A, B, C, D, E, F and arcs labeled 0, 1, and ε]
Set of states
The subset construction
Computing ε-closure(T)
Example: NFA accepting R = (a|b)*abb
Σ = {a, b}
• ε-closure(0) = {0, 1, 2, 4, 7} = A
• Mark A, compute Dtran[A, a] & Dtran[A, b]
• Dtran[A, a] = ε-closure(move(A, a))
= ε-closure(move({0, 1, 2, 4, 7}, a))
= ε-closure({3, 8})
= {3, 6, 7, 1, 2, 4} U {8}
= {1, 2, 3, 4, 6, 7, 8} = B
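The computation of ε-closure and Dtran[A, a] can be sketched as follows. The edge lists EPS and SYM are an assumption: they encode the usual eleven-state NFA for (a|b)*abb, chosen because it reproduces the sets shown above. The function names are illustrative.

```python
# Assumed NFA for (a|b)*abb with states 0..10 (state 10 accepting).
EPS = {0: {1, 7}, 1: {2, 4}, 3: {6}, 5: {6}, 6: {1, 7}}          # ε-transitions
SYM = {(2, "a"): {3}, (4, "b"): {5}, (7, "a"): {8},
       (8, "b"): {9}, (9, "b"): {10}}                            # labeled transitions

def eps_closure(T):
    """States reachable from some state in T using ε-transitions alone."""
    stack = list(T)
    closure = set(T)
    while stack:
        t = stack.pop()
        for u in EPS.get(t, ()):
            if u not in closure:
                closure.add(u)
                stack.append(u)
    return closure

def move(T, a):
    """States reachable from some state in T by one transition on symbol a."""
    return set().union(*(SYM.get((s, a), set()) for s in T))

A = eps_closure({0})
B = eps_closure(move(A, "a"))
print(sorted(A))
print(sorted(move(A, "a")))
print(sorted(B))
```

Repeating the same two steps for every unmarked state and every input symbol fills in the rest of Dtran.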
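Parse tree
[Figure: parse tree for (a|b)*abb]
Step 1: For subexpression r1 = a
Step 9: For subexpression r10 = b
[Figure: NFA fragments for the subexpressions; the final steps add the transitions on b from state 8 to 9 and from state 9 to 10]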
Important States of NFA
• A state of an NFA is important if it has a non-ε out-transition.
• Notice that the subset construction uses only the important
states in a set T when it computes
ε-closure(move(T, a)),
the set of states reachable from T on input a.
• The set of states move(s, a) is nonempty only if state s is
important.
• During the subset construction, two sets of NFA states can
be identified (treated as if they were the same set) if they:
• 1. Have the same important states, and
• 2. Either both have accepting states or neither does.
• The only important states are those introduced as
initial states in the basis part for a particular
symbol position in the regular expression.
• Each important state corresponds to a particular
operand in the regular expression.
• The constructed NFA has only one accepting
state, but this state, having no out-transitions, is
not an important state
By concatenating a unique right endmarker # to a regular expression r, we
give the accepting state for r a transition on #, making it an important state of
the NFA for (r)#.
By using the augmented regular expression (r)#, we can forget about accepting
states as the subset construction proceeds:
when the construction is complete, any state with a transition on # must be
an accepting state.
Nodes
• The important states of the NFA correspond directly to
the positions in the regular expression that hold
symbols
• Represent the regular expression by its syntax tree
-leaves correspond to operands
-interior nodes correspond to operators
• Interior nodes:
• cat-node: concatenation operator (dot)
• or-node: union operator (|)
• star-node: star operator (*)
Syntax tree: (a|b)*abb#
• Leaves in a syntax tree are labeled by ε or by an
alphabet symbol.
To each leaf not labeled ε, attach a unique
integer
(the position of the leaf and also a position of
its symbol).
A symbol can have several positions (a: 1 & 3).
• The positions in the syntax tree correspond to the
important states of the constructed NFA.
Example: NFA [for r = (a|b)*abb#] with the important states numbered and
other states represented by letters
[Figure: NFA for (a|b)*abb# with important states numbered]
Functions Computed From the Syntax Tree
• To construct a DFA directly from a regular
expression, we construct its syntax tree and
then compute four functions:
nullable
firstpos
lastpos
followpos
Four Functions
1. nullable(n) is true for a syntax-tree node n if & only if the
subexpression represented by n has ε in its language.
That is, the subexpression can be "made null" or the empty string, even
though there may be other strings it can represent as well.
2. firstpos(n) is the set of positions in the subtree rooted at n that
correspond to the first symbol of at least one string in the
language of the subexpression rooted at n.
3. lastpos(n) is the set of positions in the subtree rooted at n that
correspond to the last symbol of at least one string in the
language of the subexpression rooted at n.
4. followpos(p), for a position p, is the set of positions q in the
entire syntax tree such that there is some string x = a1 a2 ... an
in L((r)#) such that for some i, there is a way to explain the
membership of x in L((r)#) by matching ai to position p of the
syntax tree and ai+1 to position q.
Example: Consider the cat-node n that corresponds
to the expression (a|b)*a (it generates strings such as aa, ba, aba).
Cat node
• nullable(n) is false, since this node generates all strings of
a's & b's ending in an a; it does not generate ε.
• The star-node below it is nullable; it generates ε
along with all other strings of a's & b's.
• firstpos(n) = {1, 2, 3}
• lastpos(n) = {3}
• followpos(1) = {1, 2, 3}
Computing nullable, firstpos, & lastpos
• Compute nullable, firstpos, & lastpos by a
straightforward recursion on height of the tree
• Basis & inductive rules for nullable & firstpos
Example: only the star-node is nullable.
• None of the leaves are nullable, because they
each correspond to non-ε operands.
• The or-node is not nullable, because neither
of its children is.
• The star-node is nullable, because every star-node
is nullable.
• Each of the cat-nodes, having at least one
non-nullable child, is not nullable.
[Figure: syntax tree annotated with firstpos(n) to the left of each node n, and lastpos(n) to its right]
Each of the leaves has only itself for firstpos & lastpos, as required by
the rule for non-ε leaves.
For the or-node, we take the union of firstpos
at the children and do the same for lastpos.
• Consider the lowest cat-node, which we shall call n.
• To compute firstpos(n), we first consider whether the
left operand is nullable, which it is in this case.
• Therefore, firstpos for n is the union of firstpos for each
of its children, that is {1, 2} U {3} = {1, 2, 3}.
• The rules for lastpos are the same as for firstpos, with
the children interchanged.
• To compute lastpos(n) we must ask whether its right
child (the leaf with position 3) is nullable, which it is
not.
• Therefore, lastpos(n) is the same as lastpos of the right
child, or {3}.
Computing Followpos
• two ways that a position of a regular
expression can be made to follow another:
1. If n is a cat-node with left child C1 & right child
C2 , then for every position i in lastpos(C1) , all
positions in firstpos(C2) are in followpos(i).
2. If n is a star-node, & i is a position in
lastpos(n) , then all positions in firstpos(n) are
in followpos(i).
Example: Rule 1 for followpos requires that we look
at each cat-node, & put each position in firstpos of
its right child in followpos for each position in
lastpos of its left child.
For the lowest cat-node, that rule says position 3 is in
followpos(1) and followpos(2).
The next cat-node says that 4 is in followpos(3); the
remaining two cat-nodes give us 5 in followpos(4) & 6 in
followpos(5).
Rule 2 applies to the star-node: positions 1 & 2 are in both followpos(1) &
followpos(2), since both firstpos & lastpos for this node are {1, 2}.
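The rules for nullable, firstpos, lastpos, and followpos can be sketched as one recursive pass over the syntax tree of (a|b)*abb#. The tuple encoding of tree nodes and the function name analyze are assumptions for illustration, and ε-leaves are omitted because this expression has none.

```python
def analyze(node, followpos):
    """Return (nullable, firstpos, lastpos) of node; fill followpos as a side effect."""
    kind = node[0]
    if kind == "leaf":                       # ("leaf", symbol, position)
        pos = node[2]
        followpos.setdefault(pos, set())
        return False, {pos}, {pos}
    if kind == "or":                         # ("or", left, right)
        n1, f1, l1 = analyze(node[1], followpos)
        n2, f2, l2 = analyze(node[2], followpos)
        return n1 or n2, f1 | f2, l1 | l2
    if kind == "cat":                        # ("cat", left, right)
        n1, f1, l1 = analyze(node[1], followpos)
        n2, f2, l2 = analyze(node[2], followpos)
        for i in l1:                         # rule 1: firstpos(C2) follows lastpos(C1)
            followpos[i] |= f2
        first = (f1 | f2) if n1 else f1
        last = (l1 | l2) if n2 else l2
        return n1 and n2, first, last
    if kind == "star":                       # ("star", child)
        _, f, l = analyze(node[1], followpos)
        for i in l:                          # rule 2: firstpos(n) follows lastpos(n)
            followpos[i] |= f
        return True, f, l
    raise ValueError(f"unknown node kind: {kind}")

# Syntax tree for (a|b)*abb# with positions 1..6.
STAR = ("star", ("or", ("leaf", "a", 1), ("leaf", "b", 2)))
TREE = ("cat", ("cat", ("cat", ("cat", STAR, ("leaf", "a", 3)),
                        ("leaf", "b", 4)), ("leaf", "b", 5)), ("leaf", "#", 6))

followpos = {}
nullable, firstpos, lastpos = analyze(TREE, followpos)
print(nullable, sorted(firstpos), sorted(lastpos))
for p in sorted(followpos):
    print(p, sorted(followpos[p]))
```

Position 6 (the endmarker #) ends up with an empty followpos set, since nothing can follow it.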
Directed graph for the function followpos
Converting a Regular Expression
Directly to a DFA
Algorithm: Construction of a DFA from a regular expression r.
INPUT : A regular expression r.
OUTPUT: A DFA D that recognizes L (r) .
METHOD:
1 . Construct a syntax tree T from the augmented regular
expression (r) #.
2. Compute nullable, firstpos, lastpos, & followpos for T
3. Construct Dstates, the set of states of DFA D , & Dtran, the
transition function for D. The states of D are sets of positions in T.
Initially, each state is "unmarked," & a state becomes "marked" just
before we consider its out-transitions.
The start state of D is firstpos(n0), where node n0 is the root of T.
The accepting states are those containing the position for the
endmarker symbol #.
Construction of a DFA directly from a
regular expression
Example: construct a DFA for the regular expression
r = (a|b)*abb.
The value of firstpos for the root of the tree: {1, 2, 3}
A = {1, 2, 3} ----Start state
• Compute Dtran[A, a] & Dtran[A, b].
• Among the positions of A, 1 & 3 correspond to a, while 2
corresponds to b.
• Dtran[A, a] = followpos(1) U followpos(3) = {1, 2, 3, 4} = B
• Compute Dtran[A, b].
• Among the positions only 2 corresponds to b.
• Dtran[A, b] = followpos(2) = {1, 2, 3} = A
• Compute Dtran[B, a] = Dtran[{1, 2, 3, 4}, a]
• Among the positions 1, 3 corresponds to a.
• Dtran[B, a] = followpos(1) U followpos(3)
= {1, 2, 3, 4} = B
• Compute Dtran[B, b] = Dtran[{1, 2, 3, 4}, b]
• Among the positions 2 & 4 corresponds to b.
• Dtran[B, b] = followpos(2) U followpos(4)
= {1, 2, 3, 5} = C
• Compute Dtran[C, a] = Dtran[{1, 2, 3, 5}, a]
• Among the positions 1 & 3 corresponds to a.
• Dtran[C, a] = followpos(1) U followpos(3)
= {1, 2, 3, 4} = B
• Compute Dtran[C, b] = Dtran[{1, 2, 3, 5}, b]
• Among the positions 2 & 5 corresponds to b.
• Dtran[C, b] = followpos(2) U followpos(5)
= {1, 2, 3, 6} = D
• Compute Dtran[D, a] = Dtran[{1, 2, 3, 6}, a]
• Among the positions 1 & 3 corresponds to a.
• Dtran[D, a] = followpos(1) U followpos(3)
= {1, 2, 3, 4} = B
• Compute Dtran[D, b] = Dtran[{1, 2, 3, 6}, b]
• Among the positions only 2 corresponds to b.
• Dtran[D, b] = followpos(2)
= {1, 2, 3} = A
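The Dtran computations above can be sketched as a loop over unmarked states. The followpos table and the symbol at each position are taken from the example; the names build_dfa, FOLLOWPOS, and SYMBOL_AT are illustrative, and the state names A, B, C, D are assigned in order of discovery.

```python
# followpos and position symbols for (a|b)*abb#, as in the worked example.
FOLLOWPOS = {1: {1, 2, 3}, 2: {1, 2, 3}, 3: {4}, 4: {5}, 5: {6}, 6: set()}
SYMBOL_AT = {1: "a", 2: "b", 3: "a", 4: "b", 5: "b", 6: "#"}
ROOT_FIRSTPOS = frozenset({1, 2, 3})
END_POSITION = 6   # position of the endmarker #

def build_dfa(alphabet="ab"):
    """Return (names, dtran, accepting) with states named in discovery order."""
    dstates = [ROOT_FIRSTPOS]
    unmarked = [ROOT_FIRSTPOS]
    raw = {}
    while unmarked:
        S = unmarked.pop(0)                  # mark S
        for a in alphabet:
            # Union of followpos(p) for all positions p in S that hold symbol a.
            U = frozenset().union(*(FOLLOWPOS[p] for p in S if SYMBOL_AT[p] == a))
            raw[(S, a)] = U
            if U not in dstates:
                dstates.append(U)
                unmarked.append(U)
    names = {S: "ABCDEFGH"[i] for i, S in enumerate(dstates)}
    dtran = {(names[S], a): names[U] for (S, a), U in raw.items()}
    accepting = {names[S] for S in dstates if END_POSITION in S}
    return names, dtran, accepting

names, dtran, accepting = build_dfa()
for (state, a), target in sorted(dtran.items()):
    print(f"Dtran[{state}, {a}] = {target}")
print("accepting:", accepting)
```

State D = {1, 2, 3, 6} is the only accepting state, because it is the only one that contains the position of #.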
[Figure: transition diagram of the resulting DFA with states A, B, C, D]
Conclusion
• Tokens
• Lexemes
• Patterns
• Regular Expressions
• Regular Definitions
• Transition Diagrams
• Finite Automata
• DFA & NFA
• Conversion (NFA to DFA, Regular Expression to
NFA/DFA)