6. Implementing a Lexical Analyzer Using Finite Automata

This document describes implementing a lexical analyzer using finite automata. It discusses using transition diagrams to recognize tokens like keywords, identifiers, and numbers. The lexical analyzer uses a series of states and transitions between those states based on the input characters. When a valid token is recognized, the lexical analyzer returns the token to the parser and proceeds to find the next token. Pseudocode is provided for the main function of the lexical analyzer that switches on the current state and transitions to new states based on the input character.

Uploaded by

Sam Alex

IMPLEMENTING A LEXICAL ANALYZER USING FINITE AUTOMATA
 We are given the following regular definitions:
if -> if
then -> then
else -> else
relop -> < | <= | = | <> | > | >=
id -> letter (letter | digit)*
num -> digit+ (. digit+)? (E (+|-)? digit+)?
letter -> [a-z] | [A-Z]
digit -> [0-9]
 Recognize the keywords if, then, else and the tokens relop, id, num.
 delim -> blank | tab | newline
ws -> delim+
If a match for ws is found, the lexical analyzer does not return a token to the parser. It proceeds to find the token following the white space and returns that to the parser.
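The num definition above can be read as a small hand-written matcher. The following is a minimal sketch, not the deck's own code: the name match_num is hypothetical, and the function assumes the input is a NUL-terminated string. It returns the length of the longest prefix that matches num, so a trailing "." or "E" that is not followed by digits is left for the next token.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Length of the longest prefix of s that matches
   num -> digit+ (. digit+)? (E (+|-)? digit+)?
   Returns 0 when s does not start with a digit.
   match_num is a hypothetical name used only in this sketch. */
size_t match_num(const char *s)
{
    size_t i = 0;

    assert(s != NULL);
    if (!isdigit((unsigned char)s[i]))
        return 0;
    while (isdigit((unsigned char)s[i]))
        i++;                                   /* digit+ */

    /* optional fraction: take '.' only if a digit follows it */
    if (s[i] == '.' && isdigit((unsigned char)s[i + 1])) {
        i++;
        while (isdigit((unsigned char)s[i]))
            i++;
    }

    /* optional exponent: take 'E' only if the whole exponent is well formed */
    if (s[i] == 'E') {
        size_t j = i + 1;
        if (s[j] == '+' || s[j] == '-')
            j++;
        if (isdigit((unsigned char)s[j])) {
            while (isdigit((unsigned char)s[j]))
                j++;
            i = j;
        }
    }
    return i;
}
```

For example, match_num("12.5E-3 ") covers all seven characters of the number, while match_num("12.") stops after "12" because no digit follows the dot.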
TRANSITION DIAGRAMS
 A transition diagram depicts the actions that take place when the lexical analyzer is called by the parser to get the next token.
 A transition diagram (TD) keeps track of information about the characters seen as the forward (fwd) pointer scans the input.
 Positions in a TD are drawn as circles called states.
 States are connected by arrows called edges.
 Edges leaving state s have labels indicating the input characters that can appear next after the transition diagram has reached state s.
 Start state: the state where control resides when we begin to recognize a token.
 If no valid transition exists for the next input character, the diagram fails.
 Accepting state: a state in which a token has been found.
 * marks a state in which the fwd pointer must be retracted.
[Transition diagram for identifiers: start -> state 0 --letter--> state 1 (loops on letter/digit) --delimiter--> state 2 *]
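The three-state identifier diagram (states 0, 1 and 2) can be sketched directly as a loop over a state variable. This is an illustrative sketch only: scan_id is a hypothetical name, the input is assumed to be a NUL-terminated string, and any character that is not a letter or digit is treated as the "delimiter" edge into state 2.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Runs the identifier diagram on s.
   Returns the lexeme length, or 0 if there is no valid transition
   out of state 0. scan_id is a hypothetical name. */
size_t scan_id(const char *s)
{
    int state = 0;
    size_t forward = 0;

    assert(s != NULL);
    for (;;) {
        int c = (unsigned char)s[forward++];    /* read next character */
        switch (state) {
        case 0:
            if (isalpha(c)) state = 1;
            else return 0;                       /* failure in start state */
            break;
        case 1:
            if (isalnum(c)) state = 1;           /* letter/digit loop */
            else state = 2;                      /* any other character */
            break;
        }
        if (state == 2) {
            forward--;                           /* '*': retract one character */
            return forward;
        }
    }
}
```

The retraction in state 2 matters because the character that ended the identifier has already been read; it belongs to the next token and must be given back.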
 There may be several transition diagrams.
 If failure occurs while following one transition diagram, retract the fwd pointer to where it was in the start state of that diagram and activate the next transition diagram.
 If failure occurs in all transition diagrams, a lexical error has been detected and the error recovery routines are invoked.
 e.g. DO 5 I=1.25
DO 5 I=1,25
(In Fortran the first is an assignment to the variable DO5I and the second begins a DO loop; the lexical analyzer cannot tell them apart until it reads the . or the ,)
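One way to picture "retract and activate the next diagram" is a list of matcher functions that are all started from the same position. The sketch below makes several assumptions: the names td_relop, td_id, td_digits and try_diagrams are hypothetical, each diagram is collapsed into a function returning the lexeme length (0 meaning failure), and numbers are simplified to digit+.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* A diagram returns the lexeme length, or 0 on failure. */
typedef size_t (*diagram_fn)(const char *s);

static size_t td_relop(const char *s)
{
    if (s[0] == '<') return (s[1] == '=' || s[1] == '>') ? 2 : 1;
    if (s[0] == '>') return (s[1] == '=') ? 2 : 1;
    return (s[0] == '=') ? 1 : 0;
}

static size_t td_id(const char *s)
{
    size_t n = 0;
    if (!isalpha((unsigned char)s[0])) return 0;
    while (isalnum((unsigned char)s[n])) n++;
    return n;
}

static size_t td_digits(const char *s)       /* simplified: digit+ only */
{
    size_t n = 0;
    while (isdigit((unsigned char)s[n])) n++;
    return n;
}

/* Tries each diagram in turn from the same starting position.
   Returns the index of the diagram that accepted and stores the lexeme
   length in *len, or returns -1 when every diagram fails (lexical error). */
int try_diagrams(const char *lexeme_beginning, size_t *len)
{
    static const diagram_fn diagrams[] = { td_relop, td_id, td_digits };
    size_t count = sizeof diagrams / sizeof diagrams[0];

    assert(lexeme_beginning != NULL && len != NULL);
    for (size_t i = 0; i < count; i++) {
        /* every call restarts at lexeme_beginning, which plays the role
           of retracting the fwd pointer after a failure */
        size_t n = diagrams[i](lexeme_beginning);
        if (n > 0) {
            *len = n;
            return (int)i;
        }
    }
    return -1;
}
```

A return value of -1 corresponds to the point where the error recovery routines would be invoked.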
RECOGNITION OF RESERVED WORDS
 Initialize the symbol table, in which information about identifiers is stored.
 Enter the reserved words into the symbol table before any characters in the input are seen.
 Record in the symbol table the token to be returned when each keyword is recognized.
 The return statement next to the accepting state uses gettoken() and install_id() to obtain the token and attribute value.
 When a lexeme is identified, the symbol table is checked:
 if the lexeme is found as a keyword, install_id() returns 0
 if it is an identifier, install_id() returns a pointer to its symbol table entry
 gettoken() returns the corresponding token
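The reserved-word scheme can be sketched with a small array-based symbol table that is preloaded with the keywords. This is only one possible layout, under stated assumptions: the table size, the token enum values, the lookup() helper and the lexeme parameter on install_id() and gettoken() are all hypothetical (the slides call both functions without arguments), and an array index stands in for the "pointer to symbol table entry".

```c
#include <assert.h>
#include <string.h>

enum token { IF = 1, THEN, ELSE, ID };

struct entry { const char *lexeme; enum token tok; };

#define MAX_ENTRIES 64
#define NUM_KEYWORDS 3

/* Reserved words occupy the first slots, entered before any input is read. */
static struct entry symtab[MAX_ENTRIES] = {
    { "if", IF }, { "then", THEN }, { "else", ELSE }
};
static int n_entries = NUM_KEYWORDS;

/* Hypothetical helper: index of lexeme in the table, or -1. */
static int lookup(const char *lexeme)
{
    for (int i = 0; i < n_entries; i++)
        if (strcmp(symtab[i].lexeme, lexeme) == 0)
            return i;
    return -1;
}

/* Returns 0 for a keyword, the entry index (always >= NUM_KEYWORDS) for an
   identifier, or -1 if the table is full. The table stores the caller's
   pointer, so the lexeme string must outlive the table. */
int install_id(const char *lexeme)
{
    assert(lexeme != NULL);
    int i = lookup(lexeme);
    if (i < 0) {
        if (n_entries == MAX_ENTRIES)
            return -1;
        i = n_entries++;
        symtab[i].lexeme = lexeme;
        symtab[i].tok = ID;
    }
    return symtab[i].tok == ID ? i : 0;
}

/* Returns the keyword's token if the lexeme is reserved, otherwise ID. */
enum token gettoken(const char *lexeme)
{
    int i = lookup(lexeme);
    return i >= 0 ? symtab[i].tok : ID;
}
```

Because the keywords fill slots 0 to 2, an identifier's index is never 0, so the value 0 can safely mean "this lexeme is a keyword".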
RECOGNITION OF NUMBERS
 When the accepting state is reached:
 call the procedure install_num(), which enters the lexeme into the table of numbers and returns a pointer to the created entry
 return the token NUM
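A table of numbers could look like the sketch below. The slides do not say what an entry holds, so the layout here is an assumption: each entry keeps a copy of the lexeme plus its value converted with the standard strtod(). The struct name, the size limits and the lexeme parameter are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_NUMS 64
#define MAX_LEXEME 32

struct num_entry {
    char lexeme[MAX_LEXEME];   /* text as it appeared in the input */
    double value;              /* converted value (assumption of this sketch) */
};

static struct num_entry num_table[MAX_NUMS];
static int n_nums = 0;

/* Enters the lexeme into the table of numbers and returns a pointer to the
   new entry, or NULL if the table is full or the lexeme is too long. */
struct num_entry *install_num(const char *lexeme)
{
    assert(lexeme != NULL);
    if (n_nums == MAX_NUMS || strlen(lexeme) >= MAX_LEXEME)
        return NULL;
    struct num_entry *e = &num_table[n_nums++];
    strcpy(e->lexeme, lexeme);          /* length was checked above */
    e->value = strtod(lexeme, NULL);
    return e;
}
```

The returned pointer is what the lexical analyzer would attach as the attribute of the NUM token.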
IMPLEMENTING LEXICAL ANALYZER
Token nexttoken( )
{
    while (1)
    {
        switch (state) {
        case 0: c = nextchar();
            /* skip white space and advance the start of the lexeme */
            if (c == blank || c == tab || c == newline) {
                state = 0;
                lexeme_beginning++;
            }
            else if (c == '<') state = 1;
            else if (c == '=') state = 5;
            else if (c == '>') state = 6;
            else state = fail();
            break;
        case 1: c = nextchar();
            if (c == '=') state = 2;
            else if (c == '>') state = 3;
            else state = 4;
            break;
        case 2: token.attribute = LE;
            token.name = relop;
            return token;
        /* cases 3-7 here for the other relational operators */
        case 8: retract(1);
            token.attribute = GT;
            token.name = relop;
            return token;
        case 9: c = nextchar();
            if (isletter(c)) state = 10;
            else state = fail();
            break;
        case 10: c = nextchar();
            if (isletter(c)) state = 10;
            else if (isdigit(c)) state = 10;
            else state = 11;
            break;
        case 11: retract(1);
            entry = install_id();
            name = gettoken();
            token.name = name;
            token.attribute = entry;
            return token;
        /* cases 12-24 here for numbers */
        case 25: c = nextchar();
            if (isdigit(c)) state = 26;
            else state = fail();
            break;
        case 26: c = nextchar();
            if (isdigit(c)) state = 26;
            else state = 27;
            break;
        case 27: retract(1); install_num();
            return (NUM);
        }
    }
}
CODE TO FIND THE NEXT START STATE
int state = 0, start = 0;
int lexical_value;
int fail()
{
    /* retract the forward pointer to the start of the lexeme */
    forward = token_beginning;
    /* move on to the start state of the next transition diagram */
    switch (start) {
    case 0: start = 9; break;
    case 9: start = 12; break;
    case 12: start = 20; break;
    case 20: start = 25; break;
    case 25: recover(); break;
    default: /* compiler error */ break;
    }
    return start;
}
