Compiler Lecture 3
Today’s lecture:
Towards automated Lexical Analysis.
CSE 359 - Compiler Design, Masud Ibn Afjal, CSE, HSTU
The Big Picture
The first step in any translation is to determine whether the text to be
translated is well constructed in terms of the input language. Syntax is
specified in terms of parts of speech; syntax checking matches parts of
speech against a grammar.
In natural languages, mapping words to parts of speech is idiosyncratic.
In formal languages, mapping words to parts of speech is syntactic:
• based on denotation
• makes this a matter of syntax
• reserved keywords are important
A hand-written lexical analyser can be efficient, but it requires a lot of work and may be difficult to modify!
Building Lexical Analysers “automatically”
Idea: try the regular expressions one by one and find the longest match:
set (token.class, token.length) ← (NULL, 0)
// first
find max_length such that input matches T1 → RE1
if max_length > token.length
    set (token.class, token.length) ← (T1, max_length)
// second
find max_length such that input matches T2 → RE2
if max_length > token.length
    set (token.class, token.length) ← (T2, max_length)
…
// n-th
find max_length such that input matches Tn → REn
if max_length > token.length
    set (token.class, token.length) ← (Tn, max_length)
// error
if (token.class == NULL) { handle no_match }
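The pseudocode above can be sketched in Python roughly as follows. This is a minimal illustration, not a production scanner: the token classes (IF, IDENT, NUMBER, WS), their regular expressions, and the function names next_token and tokenize are all assumptions made for the example. The strict ">" comparison means that when two classes match the same length, the one listed first wins, which is how a keyword such as "if" can take priority over an identifier.

```python
import re

# Hypothetical token classes T1..Tn with their regular expressions RE1..REn,
# listed in priority order (earlier entries win ties).
TOKEN_SPECS = [
    ("IF", re.compile(r"if")),
    ("IDENT", re.compile(r"[A-Za-z_][A-Za-z0-9_]*")),
    ("NUMBER", re.compile(r"[0-9]+")),
    ("WS", re.compile(r"[ \t\n]+")),
]


def next_token(text, pos=0):
    """Try every regular expression at `pos` and keep the longest match."""
    token_class, token_length = None, 0
    for name, pattern in TOKEN_SPECS:
        # Python's `re` is greedy rather than strictly longest-match;
        # for the simple patterns above the two coincide.
        m = pattern.match(text, pos)
        max_length = len(m.group(0)) if m else 0
        # Strict '>' keeps the earlier class when lengths are equal.
        if max_length > token_length:
            token_class, token_length = name, max_length
    if token_class is None:
        # Corresponds to "handle no_match" in the pseudocode.
        raise ValueError(f"no token matches at position {pos}")
    return token_class, token_length


def tokenize(text):
    """Repeatedly apply next_token, skipping whitespace tokens."""
    pos, tokens = 0, []
    while pos < len(text):
        token_class, length = next_token(text, pos)
        if token_class != "WS":
            tokens.append((token_class, text[pos:pos + length]))
        pos += length
    return tokens
```

For instance, tokenize("if ifx 42") classifies "if" as IF (a tie with IDENT, resolved by order) and "ifx" as IDENT (the longer match wins). Trying every expression in turn is simple but repeats work on the same input characters, which is one motivation for a single combined automaton.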
[Figure: transition diagram with start state S0 and states S1, S2; edge labels: r, digit, digit]