CS-342 Compiler Construction

Assignment 1 (CLO-1)
(Lexical Analyzer)

Assigned: October 23, 2024. Due: October 31, 2024 01:00 PM

Lexical analysis
Lexical analysis is the process of reading in the stream of characters making up the source code of a program
and dividing the input into tokens. In this assignment, you will use regular expressions and DFAs to
implement a lexical analyzer for a subset of the C or C++ programming language.

Problem Statement:
You are required to design and implement a simple lexical analyzer for a basic programming language that
consists of the following elements:
1. Keywords: if, else, while, return, int, float
2. Identifiers: Sequences of letters and digits that begin with a letter.
3. Operators: +, -, *, /, =, ==, !=, >, <, >=, <=
4. Delimiters: (, ), {, }, ;
5. Constants: Integer and floating-point numbers.
6. Comments: Single-line comments that begin with //.
Your program may be written in C, C++, Java or any other programming language.
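The six token classes listed above can each be written as a regular expression. The following is a minimal sketch in Python, not a required design. The names `TOKEN_SPEC`, `MASTER`, and `classify` are hypothetical, and the float pattern assumes digits on both sides of the decimal point, which the assignment does not state explicitly.

```python
import re

# Reserved words from the problem statement.
KEYWORDS = {"if", "else", "while", "return", "int", "float"}

# One pattern per token class. Order matters: comments are tried before the
# '/' operator, and two-character operators before one-character ones.
# Assumption: a float is digits '.' digits (so "3." and ".5" are rejected).
TOKEN_SPEC = [
    ("Comment",    r"//[^\n]*"),
    ("Constant",   r"\d+\.\d+|\d+"),
    ("Identifier", r"[A-Za-z][A-Za-z0-9]*"),
    ("Operator",   r"==|!=|>=|<=|[+\-*/=<>]"),
    ("Delimiter",  r"[(){};]"),
]

# Combine the patterns into a single expression with one named group per class.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def classify(lexeme):
    """Return the token class of a complete lexeme, or None if it is invalid."""
    match = MASTER.fullmatch(lexeme)
    if match is None:
        return None
    kind = match.lastgroup
    # Keywords have the same shape as identifiers, so they are separated by lookup.
    if kind == "Identifier" and lexeme in KEYWORDS:
        return "Keyword"
    return kind
```

For instance, `classify("while")` gives `"Keyword"`, `classify("x1")` gives `"Identifier"`, and `classify("1x")` gives `None` because an identifier must begin with a letter.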

Task Requirements:
1. Input:
o The input to your lexical analyzer will be a source code file (input.txt) written in the
language described above.
2. Output:
o The output will be a list of tokens with their corresponding token type and value. For
example:
Class : Lexeme
Keyword : if
Identifier : main
Constant : 10
Delimiter : )
3. Functionalities:
o Tokenize Input: The program should be able to read the input and identify valid tokens
based on the provided specifications.
o Handle Errors: If the lexical analyzer encounters an invalid token, it should return an error
message indicating the position of the error.
o Symbol Table: The analyzer should store each identifier in a symbol table for reference.
4. Modules:
o Keyword Recognition: Write a function to recognize reserved keywords.
o Identifier Recognition: Write a function to distinguish between identifiers and keywords.
o Operator & Delimiter Recognition: Write a function to recognize operators and
delimiters.
o Constant Recognition: Write a function to detect integer and floating-point constants.
o Comment Handling: Implement logic to ignore single-line comments.
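The requirements above (tokenizing, error positions, a symbol table, and ignoring comments) can be sketched together in one scanning loop. This is an illustrative Python sketch under stated assumptions, not the expected solution: the names `TOKEN_RE`, `tokenize`, and `format_tokens` are hypothetical, the symbol table is assumed to map each identifier to the order of its first appearance, and an invalid character is reported and skipped so that scanning can continue.

```python
import re

KEYWORDS = {"if", "else", "while", "return", "int", "float"}

# Alternatives are tried left to right, so comments come before the '/' operator
# and two-character operators before one-character ones.
TOKEN_RE = re.compile(r"""
    (?P<Comment>//[^\n]*)
  | (?P<Constant>\d+\.\d+|\d+)
  | (?P<Identifier>[A-Za-z][A-Za-z0-9]*)
  | (?P<Operator>==|!=|>=|<=|[+\-*/=<>])
  | (?P<Delimiter>[(){};])
  | (?P<Space>\s+)
""", re.VERBOSE)


def tokenize(text):
    """Return (tokens, symbol_table, errors) for the given source text."""
    tokens, symbols, errors = [], {}, []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            # Compute a 1-based line and column for the offending character.
            line = text.count("\n", 0, pos) + 1
            column = pos - (text.rfind("\n", 0, pos) + 1) + 1
            errors.append(
                f"Error: invalid character {text[pos]!r} at line {line}, column {column}"
            )
            pos += 1  # skip the character and keep scanning
            continue
        kind, lexeme = match.lastgroup, match.group()
        if kind == "Identifier":
            if lexeme in KEYWORDS:
                kind = "Keyword"
            else:
                # Record each identifier once, in order of first appearance.
                symbols.setdefault(lexeme, len(symbols))
        if kind not in ("Space", "Comment"):  # whitespace and comments are ignored
            tokens.append((kind, lexeme))
        pos = match.end()
    return tokens, symbols, errors


def format_tokens(tokens):
    """Render tokens in the 'Class : Lexeme' layout shown in the example output."""
    return [f"{kind} : {lexeme}" for kind, lexeme in tokens]
```

To run it on the assignment's input file, one could read the file with `open("input.txt").read()`, pass the text to `tokenize`, and print each line returned by `format_tokens`. Note that this sketch splits an input such as `12abc` into a constant followed by an identifier; whether that should instead be reported as an error is a design decision the assignment leaves open.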

Construction of Deterministic Finite Automata (DFA) or Transition Diagrams
You can construct a single DFA (also called a transition diagram) that recognizes all tokens in this language
by combining the individual DFAs for each type of token. For example, you can construct transition diagrams
for identifiers and numbers and then merge the start states of the two diagrams to create a single transition
diagram that recognizes both numbers and identifiers.
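As an illustration of merging start states, the sketch below encodes one combined transition table for identifiers and numbers and runs it with the longest-match rule. It is one possible encoding in Python, not a prescribed one. The state names, `char_class`, `TRANSITIONS`, `ACCEPTING`, and `longest_match` are all hypothetical, and floats are assumed to need at least one digit after the decimal point.

```python
def char_class(ch):
    """Map a character to the input class used by the transition table."""
    if ch.isascii() and ch.isalpha():
        return "letter"
    if ch.isascii() and ch.isdigit():
        return "digit"
    if ch == ".":
        return "dot"
    return "other"


# Merged DFA: the shared "start" state branches into the identifier diagram
# (on a letter) or the number diagram (on a digit).
TRANSITIONS = {
    ("start", "letter"): "in_id",
    ("start", "digit"): "in_int",
    ("in_id", "letter"): "in_id",
    ("in_id", "digit"): "in_id",
    ("in_int", "digit"): "in_int",
    ("in_int", "dot"): "after_dot",
    ("after_dot", "digit"): "in_frac",
    ("in_frac", "digit"): "in_frac",
}

# Accepting states and the token class each one reports.
# "after_dot" is deliberately not accepting: "7." alone is not a constant here.
ACCEPTING = {"in_id": "Identifier", "in_int": "Constant", "in_frac": "Constant"}


def longest_match(text, pos=0):
    """Run the DFA from text[pos] and return (class, lexeme) for the longest
    accepted prefix, or None if no prefix is accepted."""
    state = "start"
    last_accept = None
    i = pos
    while i < len(text):
        next_state = TRANSITIONS.get((state, char_class(text[i])))
        if next_state is None:
            break  # no transition: stop and fall back to the last accepting state
        state = next_state
        i += 1
        if state in ACCEPTING:
            last_accept = (ACCEPTING[state], text[pos:i])
    return last_accept
```

Remembering the last accepting state lets the scanner back off when it has read too far: on the input `7.x` the DFA reaches `after_dot`, finds no transition on `x`, and falls back to the constant `7`.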
To identify keywords, you can store them in a data structure (a hash map or a string array).
Whenever your program recognizes an identifier token, check whether the lexeme matches an entry in
that structure. If it does, classify the token as a keyword; otherwise, classify it
as an identifier.
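The keyword check described here is a single set lookup once the identifier DFA has accepted a lexeme. A minimal Python sketch follows; the names `KEYWORDS` and `classify_word` are hypothetical, and the comparison is assumed to be case-sensitive, as in C.

```python
# Reserved words from the problem statement, stored in a hash-based set.
KEYWORDS = frozenset({"if", "else", "while", "return", "int", "float"})


def classify_word(lexeme):
    """Classify a lexeme already accepted by the identifier DFA."""
    # Assumption: matching is case-sensitive, so "If" is an ordinary identifier.
    return "Keyword" if lexeme in KEYWORDS else "Identifier"
```

With this approach the DFA needs no separate states for keywords, which keeps the transition diagram small.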

Submission
• Submit a zip file containing your complete work (a single source file and a PDF report) on the
submission portal.
• Name the zip file with your roll number, as follows: YourRollNumber
• You must work on this assignment individually.
