LEXICAL ANALYSER
ROLE OF LEXICAL ANALYSER
● Lexical Analysis Vs Parsing
● Tokens, Patterns and Lexemes
● Attributes for tokens
● Lexical Errors

Role of Lexical Analyser
Other Tasks
Short Form

1. Lexical Analysis vs Parsing
   1. Simplicity of design
      a. The separation of lexical and syntactic analysis often allows us to simplify at least one of these tasks.
      b. For example, a parser that had to deal with comments and whitespace as syntactic units would be considerably more complex than one that can assume comments and whitespace have already been removed by the lexical analyzer.
   2. Compiler efficiency is improved.
      a. A separate lexical analyzer allows us to apply specialized techniques that serve only the lexical task, not the job of parsing.
      b. In addition, specialized buffering techniques for reading input characters can speed up the compiler significantly.
   3. Compiler portability is enhanced.
      a. Input-device-specific peculiarities can be restricted to the lexical analyzer.

2. Tokens, Patterns and Lexemes
● Token
   ○ A token is a pair consisting of a token name and an optional attribute value.
   ○ The token name is an abstract symbol representing a kind of lexical unit, e.g., a particular keyword, or a sequence of input characters denoting an identifier.
   ○ The token names are the input symbols that the parser processes.
● Pattern
   ○ A pattern is a description of the form that the lexemes of a token may take. In the case of a keyword as a token, the pattern is just the sequence of characters that form the keyword.
   ○ For identifiers and some other tokens, the pattern is a more complex structure that is matched by many strings.
● Lexeme
   ○ A lexeme is a sequence of characters in the source program that matches the pattern for a token and is identified by the lexical analyzer as an instance of that token.

Classes of Tokens

3. Attributes for Tokens - Why value?

4. Lexical Errors
Panic mode recovery.
We delete successive characters from the remaining input until the lexical analyzer can find a well-formed token at the beginning of what input is left.
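
The token, pattern and lexeme definitions and the panic-mode rule can be illustrated with a small sketch in Python. Everything named here is an assumption made for illustration and does not come from the slides: the token classes (IF, ID, NUMBER, RELOP, ASSIGN), their regular-expression patterns, and the function name tokenize. Each token is returned as a (token name, attribute value) pair, and the attribute is None when the token name alone is enough for the parser.

```python
import re

# Hypothetical token classes. Each pattern describes the form that the
# lexemes of one token may take.
TOKEN_PATTERNS = [
    ("IF",     r"if\b"),                    # keyword: the pattern is the keyword itself
    ("ID",     r"[A-Za-z_][A-Za-z0-9_]*"),  # identifier: matched by many strings
    ("NUMBER", r"\d+"),
    ("RELOP",  r"<=|>=|==|!=|<|>"),
    ("ASSIGN", r"="),
]
WHITESPACE = re.compile(r"\s+")
# One combined pattern; the named group that matched gives the token name.
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_PATTERNS))

# Token names whose lexeme carries information the later phases need.
NEEDS_ATTRIBUTE = {"ID", "NUMBER", "RELOP"}


def tokenize(source):
    """Return (tokens, skipped): the token pairs and the text deleted by panic mode."""
    tokens, skipped = [], []
    pos = 0
    while pos < len(source):
        ws = WHITESPACE.match(source, pos)
        if ws:
            # Whitespace is removed here so the parser never sees it.
            pos = ws.end()
            continue
        m = MASTER.match(source, pos)
        if m:
            name, lexeme = m.lastgroup, m.group()
            attribute = lexeme if name in NEEDS_ATTRIBUTE else None
            tokens.append((name, attribute))
            pos = m.end()
        else:
            # Panic mode: delete successive characters until a well-formed
            # token (or whitespace) begins at the current position.
            start = pos
            while (pos < len(source)
                   and not MASTER.match(source, pos)
                   and not WHITESPACE.match(source, pos)):
                pos += 1
            skipped.append(source[start:pos])
    return tokens, skipped
```

For the input "if x1 <= 42 @# y = 7", this sketch returns the tokens IF, ID "x1", RELOP "<=", NUMBER "42", ID "y", ASSIGN and NUMBER "7", and reports "@#" as the text discarded by panic mode. A production lexer would typically also record line and column positions for error messages, which is omitted here.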