CS606 1

CS606

ID: BC210208298

Name: Muhammad Umar Raza


CS606 Assignment 1

Solution
Question 1 (10 Marks)

Explain the role of a lexical analyzer in a compiler. What are the different phases it
involves? Use a small code snippet in C to illustrate how the lexical analyzer breaks
down the source code into tokens.

Solution

Introduction
The lexical analyzer (scanner) is the first phase of a compiler. Its principal function is
to read the source code character by character and convert it into a stream of tokens.
Tokens such as keywords, identifiers, operators, and literals are the input to the later
stages of the compiler. The lexical analyzer also checks that the source code is well
formed at the lexical level, i.e., that every character sequence forms a valid token;
checking the grammar itself is left to the parser.

Key Functions of the Lexical Analyzer


1. Tokenization: Divides the input source code into smaller pieces called tokens, which
represent the meaningful units of the code.
2. Error Detection: Finds characters or character sequences that do not form a valid token.
3. Whitespace and Comment Removal: Discards spaces, tabs, newlines, and
comments so that later stages do not have to handle them.
4. Symbol Table Generation: Enters identifiers (such as variable
names) into the symbol table for use during compilation.
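
Functions 3 and 4 above can be illustrated with a minimal sketch in C. It assumes a fixed-size table of names and C-style comments; the names `skip_blanks_and_comments`, `symbol_intern`, `MAX_SYMBOLS`, and `MAX_NAME` are hypothetical and chosen only for this illustration, not taken from any real compiler.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define MAX_SYMBOLS 64   /* assumed table capacity (illustrative) */
#define MAX_NAME    32   /* assumed maximum identifier length + 1 */

/* Hypothetical symbol table: a flat list of distinct identifier names. */
static char symbols[MAX_SYMBOLS][MAX_NAME];
static int symbol_count = 0;

/* Return the index of the name in the table, adding it if it is new.
   Returns -1 if the table is full or the name is too long. */
int symbol_intern(const char *name, size_t len)
{
    assert(name != NULL);
    if (len >= MAX_NAME)
        return -1;
    for (int i = 0; i < symbol_count; i++) {
        if (strlen(symbols[i]) == len && strncmp(symbols[i], name, len) == 0)
            return i;                 /* already recorded */
    }
    if (symbol_count == MAX_SYMBOLS)
        return -1;
    memcpy(symbols[symbol_count], name, len);
    symbols[symbol_count][len] = '\0';
    return symbol_count++;
}

/* Skip whitespace, line comments and block comments.
   Returns a pointer to the first character that can start a token. */
const char *skip_blanks_and_comments(const char *p)
{
    assert(p != NULL);
    for (;;) {
        while (isspace((unsigned char)*p))
            p++;
        if (p[0] == '/' && p[1] == '/') {
            while (*p != '\0' && *p != '\n')
                p++;
        } else if (p[0] == '/' && p[1] == '*') {
            p += 2;
            while (*p != '\0' && !(p[0] == '*' && p[1] == '/'))
                p++;
            if (*p != '\0')
                p += 2;               /* step over the closing marker */
        } else {
            return p;
        }
    }
}
```

A scanner would call `skip_blanks_and_comments` before reading each lexeme and `symbol_intern` whenever the lexeme turns out to be an identifier. A linear search keeps the sketch short; a production symbol table would more likely use a hash table.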

Phases Involved in Lexical Analysis


1. Input Buffering: Reads the source code efficiently, typically in blocks held in a buffer,
so that characters can be scanned without an I/O operation for each one.
2. Lexeme Recognition: Uses the language's lexical rules (patterns) to group characters
into lexemes, the fundamental units of the source text.
3. Token Generation: Converts each lexeme into a token.
4. Error Handling: Reports illegal sequences or invalid characters.

Example: Tokenization of a C Code Snippet


Consider the following C code:
int main() {
    int a = 10;
    float b = 20.5;
    a = a + b;
    return 0;
}

The lexical analyzer would break this code into tokens as follows:
| Lexeme | Token Type |
| --- | --- |
| int | Keyword |
| main | Identifier |
| ( | Left Parenthesis |
| ) | Right Parenthesis |
| { | Left Brace |
| int | Keyword |
| a | Identifier |
| = | Assignment Operator |
| 10 | Integer Constant |
| ; | Semicolon |
| float | Keyword |
| b | Identifier |
| = | Assignment Operator |
| 20.5 | Float Constant |
| ; | Semicolon |
| a | Identifier |
| = | Assignment Operator |
| a | Identifier |
| + | Addition Operator |
| b | Identifier |
| ; | Semicolon |
| return | Keyword |
| 0 | Integer Constant |
| ; | Semicolon |
| } | Right Brace |

Explanation of Lexical Analysis with the Code Example


1. Lexeme Recognition: Every keyword, name, constant, and symbol (such as "int", "main",
and "10") is recognized as a lexeme.
2. Tokenization: The lexical analyzer assigns each lexeme a token type (e.g., 'int' →
Keyword).
3. Output to Parser: The tokens are passed to syntax analysis, the compiler's next stage, for
further processing.

Question 2 (10 Marks)


Consider the following code snippet. Identify the lexemes and corresponding tokens for
each line:
```
int x = 20;
if (x > 10) {
    x = x + 5;
}
```

Solution
The following table identifies the lexemes and their corresponding tokens for each part of
the code:
| Lexeme | Token |
| --- | --- |
| int | KEYWORD |
| x | IDENTIFIER |
| = | ASSIGNMENT_OPERATOR |
| 20 | NUMERIC_LITERAL |
| ; | SEMICOLON |
| if | KEYWORD |
| ( | LEFT_PARENTHESIS |
| x | IDENTIFIER |
| > | RELATIONAL_OPERATOR |
| 10 | NUMERIC_LITERAL |
| ) | RIGHT_PARENTHESIS |
| { | LEFT_BRACE |
| x | IDENTIFIER |
| = | ASSIGNMENT_OPERATOR |
| x | IDENTIFIER |
| + | ARITHMETIC_OPERATOR |
| 5 | NUMERIC_LITERAL |
| ; | SEMICOLON |
| } | RIGHT_BRACE |
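
The mapping from lexeme to token used in this table can be written as a lookup function. The sketch below assumes the lexeme has already been isolated by the scanner and covers only the lexemes that appear in this question; the function name `token_name` and the fallback name `INVALID` are assumptions made for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Map one already-isolated lexeme to its token name.
   Only the lexemes from this question are covered (assumption). */
const char *token_name(const char *lexeme)
{
    static const char *const keywords[] = { "int", "if" };
    static const struct { const char *lexeme; const char *name; } fixed[] = {
        { "=", "ASSIGNMENT_OPERATOR" }, { ">", "RELATIONAL_OPERATOR" },
        { "+", "ARITHMETIC_OPERATOR" }, { ";", "SEMICOLON" },
        { "(", "LEFT_PARENTHESIS" },    { ")", "RIGHT_PARENTHESIS" },
        { "{", "LEFT_BRACE" },          { "}", "RIGHT_BRACE" },
    };
    size_t i;

    assert(lexeme != NULL);

    /* Keywords are checked first so they are never reported as identifiers. */
    for (i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(lexeme, keywords[i]) == 0)
            return "KEYWORD";

    /* Operators and punctuation have exactly one spelling each. */
    for (i = 0; i < sizeof fixed / sizeof fixed[0]; i++)
        if (strcmp(lexeme, fixed[i].lexeme) == 0)
            return fixed[i].name;

    /* One or more digits: a numeric literal. */
    if (isdigit((unsigned char)lexeme[0])) {
        for (i = 1; isdigit((unsigned char)lexeme[i]); i++)
            ;
        return lexeme[i] == '\0' ? "NUMERIC_LITERAL" : "INVALID";
    }

    /* A letter or underscore followed by letters, digits or underscores. */
    if (isalpha((unsigned char)lexeme[0]) || lexeme[0] == '_') {
        for (i = 1; isalnum((unsigned char)lexeme[i]) || lexeme[i] == '_'; i++)
            ;
        return lexeme[i] == '\0' ? "IDENTIFIER" : "INVALID";
    }

    return "INVALID";   /* hypothetical name for anything unrecognized */
}
```

For example, `token_name("x")` yields `IDENTIFIER` and `token_name(">")` yields `RELATIONAL_OPERATOR`, as in the table. Checking keywords before identifiers matters because `int` and `if` also match the identifier pattern.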
