CS606 Assignment 1

The document discusses the role of a lexical analyzer in a compiler, detailing its functions such as reading input, tokenization, classification, and error detection. It outlines the phases involved in lexical analysis, including input buffering, lexeme recognition, and output tokens. Additionally, it provides examples of breaking down C code into tokens, illustrating the identification of lexemes and their corresponding token types.

Uploaded by

narmeenshahid388
Copyright: © All Rights Reserved

CS606

ASSIGNMENT NO 1
NAME: NARMEEN SHAHID
VU ID: BC230428551

Question 1
Explain the role of a lexical analyzer in a compiler. What are the different phases it
involves?
Use a small code snippet in C to illustrate how the lexical analyzer breaks down the
source code into tokens.

Solution:
A lexical analyzer (also known as a lexer or scanner) is the first phase of a compiler that
processes the source code and converts it into a sequence of tokens. These tokens are the
basic building blocks of the source code, such as keywords, operators, identifiers, literals, and
punctuation. The primary role of the lexical analyzer is to scan the input source code and
group characters into meaningful units that can be further processed by the parser in later
stages of compilation.

Role of a Lexical Analyzer in a Compiler:


1. Reading Input: It reads the raw source code, which is typically in the form of text,
character by character.
2. Tokenization: The lexical analyzer splits the input into a series of tokens. A token is
a meaningful sequence of characters, often associated with a specific type (such as a
keyword, identifier, operator, etc.).
3. Classification: It classifies each token into a specific type (e.g., keyword, identifier,
operator, literal, etc.), which helps the parser understand the structure of the code.
4. Handling Whitespace and Comments: It removes whitespace (spaces, tabs,
newlines) and comments from the source code. They aid human readability but are
not needed for syntax analysis.
5. Error Detection: If the lexical analyzer encounters a character or sequence of
characters that does not match the pattern of any valid token, it reports a lexical
error.
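
As an illustration of role 4, here is a minimal sketch in C of how a scanner might skip whitespace and comments before it looks for the next token. The function name skip_blanks_and_comments is hypothetical, and the sketch assumes the source is held in a NUL-terminated string and that only // and /* */ comments need to be handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: advance past whitespace and C comments and
   return a pointer to the first character that can start a token.
   An unterminated block comment consumes the rest of the input. */
const char *skip_blanks_and_comments(const char *p)
{
    assert(p != NULL);
    for (;;) {
        if (isspace((unsigned char)*p)) {
            p++;                                   /* space, tab, newline */
        } else if (p[0] == '/' && p[1] == '/') {
            while (*p != '\0' && *p != '\n')       /* line comment */
                p++;
        } else if (p[0] == '/' && p[1] == '*') {
            p += 2;                                /* block comment body */
            while (*p != '\0' && !(p[0] == '*' && p[1] == '/'))
                p++;
            if (*p != '\0')
                p += 2;                            /* consume the closing marker */
        } else {
            return p;                              /* start of a token, or end of input */
        }
    }
}
```

Called on the text "  /* note */ int x;", the function returns a pointer to the i of int, so the token recognizer never sees the blanks or the comment. A real scanner would probably also report an unterminated block comment as an error instead of silently consuming the rest of the input.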
Phases of a Lexical Analyzer:

1. Input Buffering: The source code is read into a buffer to allow efficient scanning.
2. Lexeme Recognition: A lexeme is a sequence of characters in the source code that
matches a regular expression or pattern for a token. The lexical analyzer identifies
lexemes and groups them into tokens.
3. Token Classification: Each lexeme is classified into a specific token type based on its
pattern (e.g., keywords, identifiers, operators, etc.).
4. Output Tokens: The analyzer outputs a sequence of tokens to the parser, often along
with additional information like the lexeme's value or position in the source code.
5. Error Reporting: If an invalid lexeme is encountered, the lexical analyzer reports an
error.
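
The phases above can be sketched as a small hand-written scanner in C. This is only an illustration under stated assumptions: the names TokenKind, Token, is_keyword and next_token are hypothetical, the whole source is assumed to sit in a NUL-terminated buffer, the keyword list is a small sample, only single-character operators are recognized, and comments are not handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Token categories used in this document's tables, plus two
   bookkeeping kinds (TOK_EOF, TOK_ERROR) added for this sketch. */
typedef enum {
    TOK_KEYWORD, TOK_IDENTIFIER, TOK_LITERAL,
    TOK_OPERATOR, TOK_PUNCTUATION, TOK_EOF, TOK_ERROR
} TokenKind;

typedef struct {
    TokenKind kind;
    const char *start;   /* first character of the lexeme in the buffer */
    size_t length;       /* number of characters in the lexeme */
} Token;

/* Small illustrative keyword set, not the full C keyword list. */
static int is_keyword(const char *s, size_t n)
{
    static const char *const words[] = { "int", "if", "else", "return", "while" };
    for (size_t i = 0; i < sizeof words / sizeof words[0]; i++)
        if (strlen(words[i]) == n && strncmp(words[i], s, n) == 0)
            return 1;
    return 0;
}

/* Scan one token starting at *cursor and advance *cursor past it. */
Token next_token(const char **cursor)
{
    assert(cursor != NULL && *cursor != NULL);
    const char *p = *cursor;

    while (isspace((unsigned char)*p))           /* whitespace is discarded */
        p++;

    Token t = { TOK_EOF, p, 0 };
    if (*p == '\0') {                            /* end of the input buffer */
        *cursor = p;
        return t;
    }

    const char *begin = p;                       /* lexeme recognition starts here */
    if (isalpha((unsigned char)*p) || *p == '_') {
        while (isalnum((unsigned char)*p) || *p == '_')
            p++;
        t.kind = is_keyword(begin, (size_t)(p - begin)) ? TOK_KEYWORD
                                                        : TOK_IDENTIFIER;
    } else if (isdigit((unsigned char)*p)) {
        while (isdigit((unsigned char)*p))
            p++;
        t.kind = TOK_LITERAL;
    } else if (strchr("=+-*/<>", *p) != NULL) {
        p++;
        t.kind = TOK_OPERATOR;
    } else if (strchr("(){};,", *p) != NULL) {
        p++;
        t.kind = TOK_PUNCTUATION;
    } else {
        p++;                                     /* no pattern matches this character */
        t.kind = TOK_ERROR;
    }
    t.length = (size_t)(p - begin);
    *cursor = p;                                 /* output: token plus new position */
    return t;
}
```

A parser would call next_token repeatedly with the same cursor until it receives TOK_EOF. Returning the start pointer and length of each lexeme, rather than a copy, is one simple way to pass the lexeme's value and position along with its token type.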
Breaking Down C Code into Tokens:

Simple C code snippet


int main() {
    int x = 10;
    x = x + 5;
    return 0;
}

Lexeme   Token Type    Description

int      Keyword       Indicates a data type.
main     Identifier    Name of the function.
(        Punctuation   Opening parenthesis, part of function declaration.
)        Punctuation   Closing parenthesis, part of function declaration.
{        Punctuation   Opening curly brace, start of function body.
int      Keyword       Indicates a data type.
x        Identifier    Variable name.
=        Operator      Assignment operator.
10       Literal       Integer literal value.
;        Punctuation   Semicolon, statement terminator.
x        Identifier    Variable name.
=        Operator      Assignment operator.
x        Identifier    Variable name.
+        Operator      Addition operator.
5        Literal       Integer literal value.
;        Punctuation   Semicolon, statement terminator.
return   Keyword       Return statement in function.
0        Literal       Integer literal value.
;        Punctuation   Semicolon, statement terminator.
}        Punctuation   Closing curly brace, end of function body.
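
The classification step behind this table can be sketched in C as follows. The function name classify_lexeme is hypothetical. It assumes the lexeme has already been isolated by the scanner, and it covers only the keywords, operators and punctuation that occur in this assignment's snippets. The "Invalid" result is an addition of this sketch for lexemes that match no pattern.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: map one already-isolated lexeme to the
   token-type name used in the table. Only the lexemes that occur in
   this document's snippets are covered. */
const char *classify_lexeme(const char *lexeme)
{
    assert(lexeme != NULL);

    /* Keywords are checked first so that "int" is not taken for an identifier. */
    static const char *const keywords[] = { "int", "if", "return" };
    for (size_t i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(lexeme, keywords[i]) == 0)
            return "Keyword";

    if (lexeme[0] != '\0' && lexeme[1] == '\0') {      /* single character */
        if (strchr("=+>", lexeme[0]) != NULL)
            return "Operator";
        if (strchr("(){};", lexeme[0]) != NULL)
            return "Punctuation";
    }

    if (isdigit((unsigned char)lexeme[0])) {           /* digits only */
        for (const char *p = lexeme; *p != '\0'; p++)
            if (!isdigit((unsigned char)*p))
                return "Invalid";
        return "Literal";
    }

    if (isalpha((unsigned char)lexeme[0]) || lexeme[0] == '_') {
        for (const char *p = lexeme; *p != '\0'; p++)  /* letters, digits, '_' */
            if (!isalnum((unsigned char)*p) && *p != '_')
                return "Invalid";
        return "Identifier";
    }

    return "Invalid";
}
```

For example, classify_lexeme("main") yields "Identifier" and classify_lexeme("10") yields "Literal", matching the rows above. A lexeme such as "1x" is neither a valid literal nor a valid identifier in this sketch, so it is reported as "Invalid".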
Question 2:
Consider the following code snippet. Identify the lexemes and corresponding tokens for
each line.

int x = 20;
if (x > 10) {
    x = x + 5;
}

Solution
Breakdown of Lexemes and Tokens

Line                     Lexeme   Token Type    Description

Line 1: int x = 20;      int      Keyword       A keyword indicating the data type int.
                         x        Identifier    An identifier representing a variable name.
                         =        Operator      The assignment operator.
                         20       Literal       An integer literal value.
                         ;        Punctuation   Statement terminator (semicolon).
Line 2: if (x > 10) {    if       Keyword       A keyword that introduces a conditional statement.
                         (        Punctuation   Opening parenthesis for the conditional expression.
                         x        Identifier    An identifier representing a variable name.
                         >        Operator      The greater-than operator.
                         10       Literal       An integer literal value.
                         )        Punctuation   Closing parenthesis for the conditional expression.
                         {        Punctuation   Opening curly brace indicating the start of the block.
Line 3: x = x + 5;       x        Identifier    An identifier representing a variable name.
                         =        Operator      The assignment operator.
                         x        Identifier    An identifier representing a variable name.
                         +        Operator      The addition operator.
                         5        Literal       An integer literal value.
                         ;        Punctuation   Statement terminator (semicolon).
Line 4: }                }        Punctuation   Closing curly brace indicating the end of the block.
