Fall 2024 - CS606 - 1 (BSCS) .

Uploaded by

bin.azad101

CS606 – Compiler Construction

Total Marks: 20
Assignment No. 01
Due Date: Nov 25, 2024
Semester: Fall 2024

Please read the following instructions carefully before solving & submitting assignment:

Uploading Instructions:
o You are supposed to consult recommended book/s to clarify your concepts as handouts are not sufficient.
o The assignment file must be an MS Word file. Any other software/tool is not allowed.
o The required file format is .doc or .docx. Any other format like scan images, txt, pdf, png or jpeg etc. will
not be accepted.
o Place all solutions in a single MS Word file along with your own Student Id at top.
o Submit the MS Word file at VULMS within the due date.

Rules for Marking:


It should be clear that your assignment will not get any credit if:
o The assignment is submitted after due date.
o The assignment is not submitted in .doc or .docx format.
o The submitted assignment does not open or file is corrupt.
o The assignment is fully or partially copied from another student, or copied verbatim from handouts or the
Internet; strict disciplinary action will be taken in this case.
o The submitted file does not contain your own Student Id, or contains an Id other than yours; zero marks will be
awarded, and no excuse will be accepted in any case.

Note:
o No assignment will be accepted via email after the due date in any case (including load shedding or
internet malfunction). Hence, refrain from uploading the assignment in the last hour before the
deadline.
o It is recommended to upload the solution file at least one day before the closing date.
o Do not post any query on MDB regarding this assignment; if you have a query, email it to
[email protected]

Lectures Covered: This assignment covers Lectures # 1 to 10.

Question 1 (10 Marks)


Explain the role of a lexical analyzer in a compiler. What are the
different phases it involves? Use a small code snippet in C to
illustrate how the lexical analyzer breaks down the source code into
tokens.

In a compiler, the lexical analyzer (also called the lexer or scanner) carries out the first phase of
the compilation process. Its primary function is to take the raw source code and convert it into a
sequence of tokens that are easier for the later phases to process. A token is a categorized unit of
the source code, such as a keyword, identifier, operator, literal, or punctuation symbol. The actual
character sequence matched for a token is called its lexeme.
Role of a Lexical Analyzer:
The lexical analyzer performs the following tasks:
1. Reading the Source Code: It reads the input source code character by character.
2. Tokenization: It groups characters into meaningful sequences and classifies them into categories
(e.g., keywords, operators, variables, numbers).
3. Skipping Whitespace and Comments: It ignores irrelevant whitespace, newline characters, and
comments.
4. Error Handling: It detects invalid sequences of characters and reports lexical errors (e.g.,
unrecognized symbols).
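The four tasks above can be sketched as a small hand-written scanner in C. This is only a minimal illustration under stated assumptions, not a production lexer: the names Token, TokenKind, next_token, skip_ignored and take_while are hypothetical, the keyword table covers only the keywords used in this assignment, operators are treated as single characters, and a lexeme longer than 31 characters is simply split.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Token categories for this sketch (hypothetical names, not a standard API). */
typedef enum {
    TOK_KEYWORD, TOK_IDENTIFIER, TOK_INT_LITERAL, TOK_FLOAT_LITERAL,
    TOK_OPERATOR, TOK_SYMBOL, TOK_ERROR, TOK_EOF
} TokenKind;

typedef struct {
    TokenKind kind;
    char lexeme[32];            /* copy of the matched characters */
} Token;

/* Assumption: only the keywords that appear in this assignment. */
static int is_keyword(const char *s) {
    static const char *const keywords[] = { "int", "float", "if", "return" };
    for (size_t i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(s, keywords[i]) == 0) return 1;
    return 0;
}

/* Task 3: skip whitespace, newlines, and both C comment styles. */
static const char *skip_ignored(const char *p) {
    for (;;) {
        while (isspace((unsigned char)*p)) p++;
        if (p[0] == '/' && p[1] == '/') {
            while (*p != '\0' && *p != '\n') p++;
        } else if (p[0] == '/' && p[1] == '*') {
            p += 2;
            while (*p != '\0' && !(p[0] == '*' && p[1] == '/')) p++;
            if (*p != '\0') p += 2;
        } else {
            return p;
        }
    }
}

static int is_ident_char(int c) { return isalnum(c) || c == '_'; }

/* Copies characters into tok->lexeme while pred() accepts them. */
static const char *take_while(const char *p, Token *tok, size_t *n,
                              int (*pred)(int)) {
    while (pred((unsigned char)*p) && *n < sizeof tok->lexeme - 1)
        tok->lexeme[(*n)++] = *p++;
    return p;
}

/* Tasks 1, 2 and 4: read characters, group them into one token, and flag
   characters that start no token. Returns a pointer just past the token. */
const char *next_token(const char *p, Token *tok) {
    size_t n = 0;
    assert(p != NULL && tok != NULL);
    p = skip_ignored(p);

    if (*p == '\0') {
        tok->kind = TOK_EOF;
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        p = take_while(p, tok, &n, is_ident_char);
        tok->lexeme[n] = '\0';
        tok->kind = is_keyword(tok->lexeme) ? TOK_KEYWORD : TOK_IDENTIFIER;
    } else if (isdigit((unsigned char)*p)) {
        p = take_while(p, tok, &n, isdigit);
        tok->kind = TOK_INT_LITERAL;
        if (p[0] == '.' && isdigit((unsigned char)p[1])
            && n < sizeof tok->lexeme - 2) {
            tok->lexeme[n++] = *p++;            /* the decimal point */
            p = take_while(p, tok, &n, isdigit);
            tok->kind = TOK_FLOAT_LITERAL;
        }
    } else {
        if (strchr("=<>+-*/", *p) != NULL)     tok->kind = TOK_OPERATOR;
        else if (strchr("(){};,", *p) != NULL) tok->kind = TOK_SYMBOL;
        else                                   tok->kind = TOK_ERROR;
        tok->lexeme[n++] = *p++;
    }
    tok->lexeme[n] = '\0';
    return p;
}
```

A caller (for example a main function, or the parser) would call next_token in a loop, feeding the returned pointer back in, until the token kind is TOK_EOF. Returning one token per call mirrors how a parser typically requests tokens on demand rather than receiving the whole list at once.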
Phases Involved in Lexical Analysis:
The process of lexical analysis typically involves several phases:
1. Input Reading: The lexical analyzer reads the source code character by character.
2. Pattern Recognition: The analyzer uses regular expressions or other pattern matching
techniques to identify token types (e.g., keywords, identifiers, literals).
3. Tokenization: Once a match is found, the sequence of characters is grouped into a token and
passed to the next phase of the compiler (the parser).
4. Error Detection: If an illegal sequence of characters is encountered (such as a character that
cannot begin any token, like @ in C, or an unterminated string literal), the lexical analyzer
generates an error message.
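Pattern recognition is commonly implemented as a finite automaton derived from the regular expressions of the token types. The sketch below hand-codes such an automaton for three patterns. The names LexClass, State, step and classify are hypothetical, and the patterns are deliberately simplified: real C also accepts forms such as 20. and 1e3 as floating-point constants, which this sketch reports as invalid.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef enum { LEX_IDENTIFIER, LEX_INTEGER, LEX_FLOAT, LEX_INVALID } LexClass;

/* States of a small DFA for the simplified patterns
     identifier: [A-Za-z_][A-Za-z0-9_]*
     integer:    [0-9]+
     float:      [0-9]+ '.' [0-9]+                      */
typedef enum { S_START, S_IDENT, S_INT, S_DOT, S_FRAC, S_DEAD } State;

/* One transition of the automaton: current state + input character. */
static State step(State s, char c) {
    unsigned char u = (unsigned char)c;
    switch (s) {
    case S_START:
        if (isalpha(u) || c == '_') return S_IDENT;
        if (isdigit(u)) return S_INT;
        return S_DEAD;
    case S_IDENT:
        return (isalnum(u) || c == '_') ? S_IDENT : S_DEAD;
    case S_INT:
        if (isdigit(u)) return S_INT;
        return c == '.' ? S_DOT : S_DEAD;
    case S_DOT:                       /* a digit must follow the point */
        return isdigit(u) ? S_FRAC : S_DEAD;
    case S_FRAC:
        return isdigit(u) ? S_FRAC : S_DEAD;
    default:
        return S_DEAD;                /* no way out of the dead state */
    }
}

/* Runs the DFA over a whole lexeme and maps the final state to a class.
   Only S_IDENT, S_INT and S_FRAC are accepting states. */
LexClass classify(const char *lexeme) {
    State s = S_START;
    assert(lexeme != NULL);
    for (; *lexeme != '\0'; lexeme++)
        s = step(s, *lexeme);
    switch (s) {
    case S_IDENT: return LEX_IDENTIFIER;
    case S_INT:   return LEX_INTEGER;
    case S_FRAC:  return LEX_FLOAT;
    default:      return LEX_INVALID;
    }
}
```

Under these assumptions, classify("20.5") yields LEX_FLOAT, while a sequence such as 1abc ends in the dead state and is reported as LEX_INVALID, which corresponds to the error-detection phase.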
Example Code in C:
Consider the following simple C code snippet:
int main() {
    int x = 10;
    float y = 20.5;
    if (x < y) {
        x = x + 1;
    }
    return 0;
}

Breakdown by Lexical Analyzer:


1. Characters Read: The lexical analyzer reads characters from the source code one by one.
2. Tokenization: The analyzer identifies and classifies sequences of characters into tokens.
o "int" → Keyword
o "main" → Identifier
o "(" → Symbol
o ")" → Symbol
o "{" → Symbol
o "int" → Keyword
o "x" → Identifier
o "=" → Operator
o "10" → Integer literal
o ";" → Symbol
o "float" → Keyword
o "y" → Identifier
o "=" → Operator
o "20.5" → Floating-point literal
o ";" → Symbol
o "if" → Keyword
o "(" → Symbol
o "x" → Identifier
o "<" → Operator
o "y" → Identifier
o ")" → Symbol
o "{" → Symbol
o "x" → Identifier
o "=" → Operator
o "x" → Identifier
o "+" → Operator
o "1" → Integer literal
o ";" → Symbol
o "}" → Symbol
o "return" → Keyword
o "0" → Integer literal
o ";" → Symbol
o "}" → Symbol
Visualized Breakdown:
Here's how the lexical analyzer might break down the given C code snippet:
Token 1: Keyword("int")
Token 2: Identifier("main")
Token 3: Symbol("(")
Token 4: Symbol(")")
Token 5: Symbol("{")
Token 6: Keyword("int")
Token 7: Identifier("x")
Token 8: Operator("=")
Token 9: IntegerLiteral("10")
Token 10: Symbol(";")
Token 11: Keyword("float")
Token 12: Identifier("y")
Token 13: Operator("=")
Token 14: FloatingPointLiteral("20.5")
Token 15: Symbol(";")
Token 16: Keyword("if")
Token 17: Symbol("(")
Token 18: Identifier("x")
Token 19: Operator("<")
Token 20: Identifier("y")
Token 21: Symbol(")")
Token 22: Symbol("{")
Token 23: Identifier("x")
Token 24: Operator("=")
Token 25: Identifier("x")
Token 26: Operator("+")
Token 27: IntegerLiteral("1")
Token 28: Symbol(";")
Token 29: Symbol("}")
Token 30: Keyword("return")
Token 31: IntegerLiteral("0")
Token 32: Symbol(";")
Token 33: Symbol("}")

Each of these tokens represents a meaningful element of the source code. After tokenization, the lexical
analyzer passes the tokens to the parser, which analyzes the syntactic structure of the code.

Question 2 (10 Marks)

Consider the following code snippet. Identify the lexemes and
corresponding tokens for each line.
int x = 20;
if (x > 10) {
    x = x + 5;
}

Let's break down the code snippet line by line and identify the lexemes (the individual
components or substrings that represent a meaningful unit of the source code) and the
corresponding tokens (the category or type that the lexeme belongs to).
Here is the code:
int x = 20;
if (x > 10) {
    x = x + 5;
}

Line 1: int x = 20;

• Lexeme: int, Token: Keyword
  int is a keyword in C/C++ and other languages, used to define an integer type.
• Lexeme: x, Token: Identifier
  x is an identifier representing a variable name.
• Lexeme: =, Token: Assignment Operator
  The = symbol is used for assignment in C/C++.
• Lexeme: 20, Token: Integer Literal
  20 is a literal integer value.
• Lexeme: ;, Token: Semicolon
  The semicolon marks the end of the statement.
Line 2: if (x > 10) {
• Lexeme: if, Token: Keyword
  if is a conditional statement keyword in C/C++.
• Lexeme: (, Token: Left Parenthesis
  The opening parenthesis is used to start the condition in the if statement.
• Lexeme: x, Token: Identifier
  Again, x is an identifier, representing the variable x.
• Lexeme: >, Token: Relational Operator
  The > symbol is a relational operator that means "greater than."
• Lexeme: 10, Token: Integer Literal
  10 is an integer literal.
• Lexeme: ), Token: Right Parenthesis
  The closing parenthesis ends the condition of the if statement.
• Lexeme: {, Token: Left Brace
  The opening brace marks the beginning of the block of code to execute if the condition is true.
Line 3: x = x + 5;
• Lexeme: x, Token: Identifier
  The variable x.
• Lexeme: =, Token: Assignment Operator
  The assignment operator.
• Lexeme: x, Token: Identifier
  Again, the variable x.
• Lexeme: +, Token: Addition Operator
  The + symbol is used for addition.
• Lexeme: 5, Token: Integer Literal
  The literal integer 5.
• Lexeme: ;, Token: Semicolon
  The semicolon ends the statement.
Line 4: }
• Lexeme: }, Token: Right Brace
  The closing brace ends the block of code in the if statement.
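The lexeme-to-token mapping used in this answer can be written down as a small lookup routine in C. This is a sketch under stated assumptions: the function name token_name is hypothetical, the table contains only the fixed lexemes that occur in this snippet, and the input is assumed to be a single lexeme that has already been separated from its neighbours.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns the token name for one already-separated lexeme, using the
   token categories of this answer. Anything else is "Unknown". */
const char *token_name(const char *lexeme) {
    /* Fixed lexemes: keywords, operators and punctuation of the snippet. */
    static const struct { const char *lexeme; const char *token; } fixed[] = {
        { "int", "Keyword" },           { "if", "Keyword" },
        { "=", "Assignment Operator" }, { ">", "Relational Operator" },
        { "+", "Addition Operator" },   { ";", "Semicolon" },
        { "(", "Left Parenthesis" },    { ")", "Right Parenthesis" },
        { "{", "Left Brace" },          { "}", "Right Brace" },
    };
    const char *p;
    size_t i;

    assert(lexeme != NULL);
    for (i = 0; i < sizeof fixed / sizeof fixed[0]; i++)
        if (strcmp(lexeme, fixed[i].lexeme) == 0)
            return fixed[i].token;

    /* Digits only: integer literal. */
    if (isdigit((unsigned char)lexeme[0])) {
        for (p = lexeme; *p != '\0'; p++)
            if (!isdigit((unsigned char)*p)) return "Unknown";
        return "Integer Literal";
    }

    /* Letter or underscore followed by letters, digits, underscores:
       identifier. Keywords were already caught by the table above. */
    if (isalpha((unsigned char)lexeme[0]) || lexeme[0] == '_') {
        for (p = lexeme; *p != '\0'; p++)
            if (!isalnum((unsigned char)*p) && *p != '_') return "Unknown";
        return "Identifier";
    }
    return "Unknown";
}
```

Checking the fixed table before the identifier pattern matters: int and if also match the identifier pattern, so the keyword entries must take priority.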
