CS606 Assignment 1
ASSIGNMENT NO 1
NAME: NARMEEN SHAHID
VU ID: BC230428551
Question 1
Explain the role of a lexical analyzer in a compiler. What are the different phases it
involves?
Use a small code snippet in C to illustrate how the lexical analyzer breaks down the
source code into tokens.
Solution:
A lexical analyzer (also known as a lexer or scanner) is the first phase of a compiler that
processes the source code and converts it into a sequence of tokens. These tokens are the
basic building blocks of the source code, such as keywords, operators, identifiers, literals, and
punctuation. The primary role of the lexical analyzer is to scan the input source code and
group characters into meaningful units that can be further processed by the parser in later
stages of compilation.
Lexical analysis involves the following phases:
1. Input Buffering: The source code is read into a buffer so that characters can be scanned efficiently.
2. Lexeme Recognition: A lexeme is a sequence of characters in the source code that
matches a regular expression or pattern for a token. The lexical analyzer identifies
lexemes and groups them into tokens.
3. Token Classification: Each lexeme is classified into a specific token type based on its
pattern (e.g., keywords, identifiers, operators, etc.).
4. Output Tokens: The analyzer outputs a sequence of tokens to the parser, often along
with additional information like the lexeme's value or position in the source code.
5. Error Reporting: If an invalid lexeme is encountered, the lexical analyzer reports an
error.
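The phases above can be sketched as a small hand-written scanner in C. This is only a minimal illustration under stated assumptions, not how a production compiler does it: the names Token, TokenType, next_token and tokenize are hypothetical, the keyword list is a small subset, the input buffer is simply an in-memory string, and only single-character operators are recognized.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Illustrative token categories (hypothetical names, not from the assignment). */
typedef enum {
    TOK_KEYWORD, TOK_IDENTIFIER, TOK_NUMBER,
    TOK_OPERATOR, TOK_PUNCTUATION, TOK_ERROR, TOK_EOF
} TokenType;

typedef struct {
    TokenType type;
    char lexeme[32];  /* longer lexemes are truncated in this sketch */
    int line;         /* position information passed along to the parser */
} Token;

static int is_keyword(const char *s) {
    /* Assumed keyword subset; a real C lexer knows every reserved word. */
    static const char *const keywords[] = { "int", "if", "else", "while", "return" };
    for (size_t i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(s, keywords[i]) == 0) return 1;
    return 0;
}

/* Append c to the token's lexeme, always leaving room for the terminator. */
static void push_char(Token *t, size_t *len, char c) {
    if (*len < sizeof t->lexeme - 1) t->lexeme[(*len)++] = c;
    t->lexeme[*len] = '\0';
}

/* Lexeme recognition and token classification: scan one token from *src. */
Token next_token(const char **src, int *line) {
    const char *p = *src;
    Token t;
    size_t len = 0;

    /* Skip whitespace between lexemes, counting lines as we go. */
    while (isspace((unsigned char)*p)) {
        if (*p == '\n') (*line)++;
        p++;
    }
    t.type = TOK_EOF;
    t.lexeme[0] = '\0';
    t.line = *line;

    if (*p == '\0') {
        /* End of buffer: keep TOK_EOF. */
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        while (isalnum((unsigned char)*p) || *p == '_') push_char(&t, &len, *p++);
        t.type = is_keyword(t.lexeme) ? TOK_KEYWORD : TOK_IDENTIFIER;
    } else if (isdigit((unsigned char)*p)) {
        while (isdigit((unsigned char)*p)) push_char(&t, &len, *p++);
        t.type = TOK_NUMBER;
    } else {
        /* Single-character tokens only; "==" or "<=" would need lookahead. */
        if (strchr("=+-*/<>", *p)) t.type = TOK_OPERATOR;
        else if (strchr("(){};,", *p)) t.type = TOK_PUNCTUATION;
        else t.type = TOK_ERROR;  /* error reporting: invalid lexeme */
        push_char(&t, &len, *p++);
    }
    *src = p;
    return t;
}

/* Output tokens: fill out[] with up to max tokens and return the count. */
size_t tokenize(const char *src, Token *out, size_t max) {
    int line = 1;
    size_t n = 0;
    while (n < max) {
        Token t = next_token(&src, &line);
        if (t.type == TOK_EOF) break;
        out[n++] = t;
    }
    return n;
}
```

Calling tokenize on the statement int x = 20; with room for a handful of tokens would yield five tokens: the keyword int, the identifier x, the operator =, the number 20, and the punctuation ;. A character outside the recognized sets, such as @, comes back as a TOK_ERROR token rather than stopping the scan, which is one simple way to keep reporting errors for the rest of the input.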
Breaking Down C Code into Tokens:
int x = 20;
if (x > 10) {
    x = x + 5;
}
Solution
Breakdown of Lexemes and Tokens