
Lecture 09

The document discusses compiler construction and lexical analyzers. It describes how to minimize DFA states using Hopcroft's algorithm and how lexical analyzers work by using regular expressions to recognize tokens in a character stream and return the longest matching token.

Uploaded by

Hammad Rajput

Compiler Construction

LECTURE 9
DFA Minimization 2

The generated DFA may have a large number of states.
Hopcroft’s algorithm: minimizes DFA states
DFA Minimization 4

Idea: find groups of equivalent states.
For each input symbol, all transitions from states in one group G1 go to states in the same group G2.
DFA Minimization 6

Construct the minimized DFA such that there is one state for each group of states from the initial DFA.
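The construction step can be sketched as follows. This is a minimal illustration, assuming a two-symbol alphabet (a, b) and that the groups have already been computed; the function name `buildMinimized` and the table layout are hypothetical, not taken from the lecture.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// delta[s] holds the target states of original state s on 'a' and 'b'.
// group[s] is the index of the group that state s belongs to.
// Both layouts are assumptions made for this sketch.
std::vector<std::array<int, 2>> buildMinimized(
        const std::vector<std::array<int, 2>>& delta,
        const std::vector<int>& group, int groupCount) {
    std::vector<std::array<int, 2>> out(groupCount, std::array<int, 2>{{-1, -1}});
    for (std::size_t s = 0; s < delta.size(); ++s) {
        // Any member can represent its group: equivalent states agree on
        // the target group for every input symbol, so later members just
        // overwrite the entry with the same values.
        out[group[s]] = {{group[delta[s][0]], group[delta[s][1]]}};
    }
    return out;
}
```

Each row of the result is one state of the minimized DFA, so a five-state DFA whose states fall into four groups yields a four-row table.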
DFA Minimization 7
[Figure: DFA for (a | b)*abb, with states A, B, C, D, E and transitions on a and b]
DFA Minimization 8
[Figure: Minimized DFA for (a | b)*abb, with states {A,C}, B, D, E and transitions on a and b]
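The grouping of A with C can be reproduced with a short sketch. Note that this is plain iterative partition refinement, not Hopcroft's worklist algorithm, which computes the same groups more efficiently. The transition table below is assumed from the usual textbook DFA for (a | b)*abb with E as the only accepting state; it is not read off the slide, and the names `kDelta`, `kAccepting`, and `findGroups` are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <map>
#include <vector>

// States A..E are numbered 0..4; column 0 is input 'a', column 1 is 'b'.
// Assumed table (textbook DFA for (a | b)*abb), not taken from the slide.
constexpr int kStates = 5;
const std::array<std::array<int, 2>, kStates> kDelta = {{
    {1, 2},  // A
    {1, 3},  // B
    {1, 2},  // C
    {1, 4},  // D
    {1, 2},  // E
}};
const std::array<bool, kStates> kAccepting = {false, false, false, false, true};

// Returns group[s], the index of the equivalence group of state s.
std::vector<int> findGroups() {
    // Start with two groups: non-accepting (0) and accepting (1).
    std::vector<int> group(kStates);
    for (int s = 0; s < kStates; ++s) group[s] = kAccepting[s] ? 1 : 0;
    std::size_t count = 0;
    while (true) {
        // Two states stay together only if they are in the same group and
        // their transitions on 'a' and on 'b' lead to the same groups.
        std::map<std::array<int, 3>, int> ids;
        std::vector<int> next(kStates);
        for (int s = 0; s < kStates; ++s) {
            const std::array<int, 3> sig = {group[s], group[kDelta[s][0]], group[kDelta[s][1]]};
            next[s] = ids.emplace(sig, static_cast<int>(ids.size())).first->second;
        }
        // Each round can only split groups, so an unchanged group count
        // means the partition is stable.
        if (ids.size() == count) return next;
        count = ids.size();
        group = next;
    }
}
```

Under the assumed table, `findGroups()` puts A and C in one group and leaves B, D, and E in groups of their own, which matches the four states of the minimized DFA.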


Optimized Acceptor 9

RE R → RE=>NFA → NFA=>DFA → Min. DFA → Simulate DFA

input string w → Simulate DFA → yes, if w ∈ L(R)
                                no, if w ∉ L(R)
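The "Simulate DFA" box can be sketched as a table-driven loop. This is an illustrative sketch that assumes the minimized DFA for (a | b)*abb, with states numbered 0 = {A,C}, 1 = B, 2 = D, 3 = E, and E accepting; the function name `accepts` is hypothetical.

```cpp
#include <string>

// Returns true if w is in L((a | b)*abb).
// delta[state][0] is the move on 'a', delta[state][1] the move on 'b'.
bool accepts(const std::string& w) {
    static const int delta[4][2] = {{1, 0}, {1, 2}, {1, 3}, {1, 0}};
    int state = 0;  // start state {A,C}
    for (char c : w) {
        if (c != 'a' && c != 'b') return false;  // symbol outside the alphabet
        state = delta[state][c == 'b' ? 1 : 0];
    }
    return state == 3;  // accept only if the run ends in state E
}
```

For example, `accepts("aababb")` is true because the run ends in E, while `accepts("abba")` is false because the final a leads back to B.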
Lexical Analyzers 10
• Lexical analyzers (scanners) use the same mechanism, but they:
  • Have multiple RE descriptions for multiple tokens
  • Have a character stream at the input
Lexical Analyzers 13
  • Return a sequence of matching tokens at the output (or an error)
  • Always return the longest matching token
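One common way to get the longest match is to keep running the DFA past an accepting state, remember the last accepting position, and back up to it when the DFA gets stuck. The sketch below assumes two hypothetical token classes, INT (`[0-9]+`) and FLOAT (`[0-9]+\.[0-9]+`), chosen only because they share a prefix; the names `Token`, `Match`, and `longestMatch` are not from the lecture.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

enum Token { NONE, INT, FLOAT };  // hypothetical token codes

struct Match {
    Token kind;          // token recognized, or NONE
    std::size_t length;  // number of characters consumed
};

// Runs a small hand-coded DFA from `pos`, remembering the most recent
// accepting state so the scanner can back up to it.
// States: 0 start, 1 digits (accept INT), 2 digits '.', 3 fraction (accept FLOAT).
Match longestMatch(const std::string& in, std::size_t pos) {
    Match last = {NONE, 0};
    int state = 0;
    for (std::size_t i = pos; i < in.size(); ++i) {
        const char c = in[i];
        const bool digit = std::isdigit(static_cast<unsigned char>(c)) != 0;
        if ((state == 0 || state == 1) && digit) state = 1;
        else if (state == 1 && c == '.') state = 2;
        else if ((state == 2 || state == 3) && digit) state = 3;
        else break;  // no transition: stop and fall back to `last`
        if (state == 1) last = {INT, i - pos + 1};
        if (state == 3) last = {FLOAT, i - pos + 1};
    }
    return last;
}
```

On the input `3.14+` the scanner returns FLOAT with length 4 rather than stopping at the INT `3`. On `12.x` it reads the dot, finds no digit after it, and backs up to return INT with length 2.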
Lexical Analyzers 15

R1…R2 → RE=>NFA → NFA=>DFA → Min. DFA → Simulate DFA

character stream → Simulate DFA → token stream
Lexical Analyzer Generators 16
• The lexical analysis process can be automated
• We only need to specify:
  • Regular expressions for tokens
  • Rule priorities for multiple longest-match cases
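A rule priority matters when two rules match the same longest prefix, for example a keyword rule and an identifier rule on the input `if`. The sketch below illustrates one common policy: the longest match wins, and the earliest rule wins a tie. It uses `std::regex` as a stand-in for the generated DFA; the `Rule` record, `pickRule`, and the two rules in `kRules` are hypothetical.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

struct Rule {  // hypothetical rule record: token name plus its pattern
    std::string name;
    std::regex pattern;
};

// Returns the name of the rule chosen at the start of `text`.
// The strict '>' keeps the earlier rule when two matches have equal length.
std::string pickRule(const std::vector<Rule>& rules, const std::string& text) {
    std::string best = "ERROR";
    std::size_t bestLen = 0;
    for (const Rule& r : rules) {
        std::smatch m;
        // match_continuous anchors the match at the first character.
        if (std::regex_search(text, m, r.pattern, std::regex_constants::match_continuous)) {
            const std::size_t len = static_cast<std::size_t>(m.length(0));
            if (len > bestLen) {
                bestLen = len;
                best = r.name;
            }
        }
    }
    return best;
}

// Rules in priority order: the keyword is listed before the identifier.
const std::vector<Rule> kRules = {
    {"IF", std::regex("if")},
    {"ID", std::regex("[a-zA-Z_][a-zA-Z_0-9]*")},
};
```

With these rules, `pickRule(kRules, "if (")` selects IF because both rules match two characters and IF comes first, while `pickRule(kRules, "ifx")` selects ID because its match is longer.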
Lexical Analyzer Generators 18
• Flex: generates a lexical analyzer in C or C++
• JLex: written in Java; generates a lexical analyzer in Java
Using Flex 20
• Provide a specification file
• Flex reads this file and produces a C or C++ output file that contains the scanner
• The file consists of three sections
Flex Specification File 25

1. C or C++ and flex definitions
%%
2. token definitions and actions
%%
3. user code
Specification File lex.l 26
%{
#include "tokdefs.h"
%}
D [0-9]
L [a-zA-Z_]
id {L}({L}|{D})*
%%
"void" {return(TOK_VOID);}
"int" {return(TOK_INT);}
"if" {return(TOK_IF);}
Specification File lex.l 27

"else" {return(TOK_ELSE);}
"while" {return(TOK_WHILE);}
"<=" {return(TOK_LE);}
">=" {return(TOK_GE);}
"==" {return(TOK_EQ);}
"!=" {return(TOK_NE);}
{D}+ {return(TOK_NUM);}
{id} {return(TOK_ID);}
[\n]|[\t]|[ ] ;
%%
File tokdefs.h 28
#define TOK_VOID 1
#define TOK_INT 2 /* keyword "int" */
#define TOK_IF 3
#define TOK_ELSE 4
#define TOK_WHILE 5
#define TOK_LE 6
#define TOK_GE 7
#define TOK_EQ 8
#define TOK_NE 9
#define TOK_NUM 10 /* integer literal */
#define TOK_ID 111
Invoking Flex 29

lex.l → flex → lex.cpp


Using Generated Scanner 30
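#include <iostream>
#include <FlexLexer.h>
using namespace std;

int main()
{
    yyFlexLexer lex;       // concrete scanner class; FlexLexer itself is abstract
    int tc = lex.yylex();  // token code; 0 signals end of input
    while (tc != 0) {
        cout << tc << "," << lex.YYText() << endl;
        tc = lex.yylex();
    }
    return 0;
}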
Creating Scanner EXE 31

flex lex.l
g++ -c lex.cpp
g++ -c main.cpp
g++ -o lex.exe lex.o main.o

lex < main.cpp
