Compiler Design Lexical Analysis
Sakhi Bandyopadhyay
Department of Computer Science and BCA
Kharagpur College
The role of the lexical analyzer
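[Figure: Source program -> Lexical Analyzer -> (token) -> Parser -> to semantic analysis. The Parser requests each token with getNextToken; both the Lexical Analyzer and the Parser consult the Symbol table.]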
Why separate lexical analysis and parsing
1. Simplicity of design
2. Improving compiler efficiency
3. Enhancing compiler portability
Tokens, Patterns and Lexemes
• E = M * C ** 2
• <id, pointer to symbol table entry for E>
• <assign-op>
• <id, pointer to symbol table entry for M>
• <mult-op>
• <id, pointer to symbol table entry for C>
• <exp-op>
• <number, integer value 2>
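The token stream above can be reproduced with a small scanner. The following is a minimal sketch, not a production lexer: the token names come from the slide, while `TOKEN_SPEC`, `tokenize`, and the use of a list index as the "pointer to symbol table entry" are assumptions made for illustration.

```python
import re

# Patterns are tried in order, so "**" is listed before "*".
# "ws" (whitespace) is an assumed helper class that produces no token.
TOKEN_SPEC = [
    ("number", r"[0-9]+"),
    ("id", r"[A-Za-z_][A-Za-z_0-9]*"),
    ("exp-op", r"\*\*"),
    ("mult-op", r"\*"),
    ("assign-op", r"="),
    ("ws", r"\s+"),
]
COMPILED = [(name, re.compile(pattern)) for name, pattern in TOKEN_SPEC]


def tokenize(source):
    """Return (tokens, symbol_table) for the given source string."""
    symbol_table = []  # position in this list stands in for a pointer
    tokens = []
    pos = 0
    while pos < len(source):
        for name, pattern in COMPILED:
            m = pattern.match(source, pos)
            if m:
                break
        else:
            # No pattern matches at this position: a lexical error.
            raise ValueError(f"lexical error at position {pos}: {source[pos]!r}")
        lexeme = m.group()
        pos = m.end()
        if name == "ws":
            continue
        if name == "id":
            if lexeme not in symbol_table:
                symbol_table.append(lexeme)
            tokens.append((name, symbol_table.index(lexeme)))
        elif name == "number":
            tokens.append((name, int(lexeme)))
        else:
            tokens.append((name, None))  # operators carry no attribute
    return tokens, symbol_table


if __name__ == "__main__":
    toks, table = tokenize("E = M * C ** 2")
    for tok in toks:
        print(tok)
    print("symbol table:", table)
```

Each lexeme (for example `E`) matches a pattern (the identifier pattern) and yields a token (`id` plus an attribute). A character that matches no pattern raises an error, which is the simplest possible response to a lexical error.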
Lexical errors
E = M * C * * 2 eof
Sentinels
d1 -> r1
d2 -> r2
…
dn -> rn
• Example:
letter_ -> A | B | … | Z | a | b | … | z | _
digit -> 0 | 1 | … | 9
id -> letter_ (letter_ | digit)*
Extensions
• Example:
• letter_ -> [A-Za-z_]
• digit -> [0-9]
• id -> letter_ (letter_ | digit)*
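The regular definition for identifiers maps directly onto a regular expression built from named pieces. This is a small sketch using Python's `re` module; the names `LETTER_`, `DIGIT`, `ID`, and `is_identifier` are chosen here for illustration and are not part of the slides.

```python
import re

# Each definition may use only the names defined before it,
# just as in d1 -> r1, d2 -> r2, ...
LETTER_ = r"[A-Za-z_]"
DIGIT = r"[0-9]"
ID = rf"{LETTER_}({LETTER_}|{DIGIT})*"

ID_PATTERN = re.compile(ID)


def is_identifier(lexeme):
    """True when the whole lexeme matches the id definition."""
    return ID_PATTERN.fullmatch(lexeme) is not None


if __name__ == "__main__":
    for candidate in ["count_1", "_tmp", "2nd", ""]:
        print(repr(candidate), is_identifier(candidate))
```

`fullmatch` is used so that a string such as `2nd` is rejected as a whole rather than accepted for its trailing `nd`.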
Recognition of tokens
[Figure: Lex source program lex.l -> Lexical Compiler -> lex.yy.c; lex.yy.c -> C compiler -> a.out]
declarations
%%
translation rules Pattern {Action}
%%
auxiliary functions
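The Pattern {Action} idea in the translation rules section can be imitated with a table of rules. The sketch below is written in Python rather than Lex and only approximates what a generated scanner does: it picks the longest match and, on a tie, the rule listed first, which is how Lex resolves conflicts. The rule set, the token names `IF`, `ID`, `NUMBER`, and the function `scan` are all hypothetical.

```python
import re

# Each rule pairs a pattern with an action. An action receives the
# lexeme and returns a token, or None to discard it (e.g. whitespace).
RULES = [
    (re.compile(r"[ \t\n]+"), lambda lexeme: None),
    (re.compile(r"if"), lambda lexeme: ("IF", lexeme)),
    (re.compile(r"[A-Za-z_][A-Za-z_0-9]*"), lambda lexeme: ("ID", lexeme)),
    (re.compile(r"[0-9]+"), lambda lexeme: ("NUMBER", int(lexeme))),
]


def scan(text):
    """Apply RULES repeatedly: longest match wins, earliest rule breaks ties."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_len, best_action = 0, None
        for pattern, action in RULES:
            m = pattern.match(text, pos)
            # Strictly greater, so an earlier rule keeps a tie.
            if m and len(m.group()) > best_len:
                best_len, best_action = len(m.group()), action
        if best_action is None:
            raise ValueError(f"no rule matches at position {pos}")
        token = best_action(text[pos:pos + best_len])
        if token is not None:
            tokens.append(token)
        pos += best_len
    return tokens


if __name__ == "__main__":
    print(scan("if iffy 42"))
```

Here `if` becomes a keyword token because the keyword rule precedes the identifier rule, while `iffy` becomes an identifier because the identifier pattern matches a longer lexeme.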