CSC 333-HW02
if cur.char = '='
    read the next character
    if it is '=' return the relational operator token
    else return assign
• Implementation: The code uses regular expressions to detect both the
assignment operator (=) and the relational operator (==):
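As a minimal sketch of this step, assuming a Python lexer built on the re module (the
names OPERATOR_RE and scan_equals are hypothetical and not taken from the submitted
code), the two operators could be told apart by listing the longer pattern first:

```python
import re

# Hypothetical pattern: '==' is listed before '=' so the longer operator wins.
OPERATOR_RE = re.compile(r"(?P<EQ>==)|(?P<ASSIGN>=)")

def scan_equals(text, pos=0):
    """Return (token_name, lexeme, next_pos) for '=' or '==' at pos, else None."""
    match = OPERATOR_RE.match(text, pos)
    if match is None:
        return None
    # lastgroup is the name of the alternative that matched.
    return match.lastgroup, match.group(), match.end()

print(scan_equals("== 3"))  # ('EQ', '==', 2)
print(scan_equals("= 3"))   # ('ASSIGN', '=', 1)
```

Ordering the alternatives plays the role of the one-character lookahead in the
pseudocode: a lone '=' is returned as assign only when '==' does not match.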
if cur.char = '/'
    peek at the next character
    if it is '*' or '/'
        read additional characters until "*/" or newline is seen, respectively
• Implementation: The lexer recognizes the division operator (/) and handles
comments using a regular expression for comments starting with #:
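A possible shape for this step, again assuming Python's re module (COMMENT_RE, DIVIDE_RE
and scan_slash_or_comment are hypothetical names). Because comments here begin with #
rather than with /* or //, a '/' can never start a comment, so no lookahead is needed:

```python
import re

# Hypothetical patterns: a '#' comment runs to the end of the line.
COMMENT_RE = re.compile(r"#[^\n]*")
DIVIDE_RE = re.compile(r"/")

def scan_slash_or_comment(text, pos=0):
    """Return (token_name, lexeme, next_pos) for a comment or '/', else None."""
    for name, pattern in (("COMMENT", COMMENT_RE), ("DIVIDE", DIVIDE_RE)):
        match = pattern.match(text, pos)
        if match:
            return name, match.group(), match.end()
    return None

print(scan_slash_or_comment("/ 2"))        # ('DIVIDE', '/', 1)
print(scan_slash_or_comment("# note\nx"))  # ('COMMENT', '# note', 6)
```

A caller would normally discard COMMENT results and resume scanning at next_pos.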
if cur.char is a digit
    read any additional digits and at most one decimal point
    return number
• Implementation: The lexer checks for integers and floating-point numbers
using two separate regular expressions:
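One way the two patterns might look, as a sketch under the assumption of Python's re
module (FLOAT_RE, INT_RE, scan_number and the token names are hypothetical). The
floating-point pattern is tried first so that "3.14" is not split into the integer "3"
followed by ".14":

```python
import re

# Hypothetical patterns: digits, then at most one decimal point and more digits.
FLOAT_RE = re.compile(r"\d+\.\d*")
INT_RE = re.compile(r"\d+")

def scan_number(text, pos=0):
    """Return (token_name, lexeme, next_pos) for a number at pos, else None."""
    match = FLOAT_RE.match(text, pos)
    if match:
        return "FLOAT", match.group(), match.end()
    match = INT_RE.match(text, pos)
    if match:
        return "INT", match.group(), match.end()
    return None

print(scan_number("42+"))    # ('INT', '42', 2)
print(scan_number("3.14)"))  # ('FLOAT', '3.14', 4)
print(scan_number("1.2.3"))  # ('FLOAT', '1.2', 3)
```

The last call shows the "at most one decimal point" rule: scanning stops before the
second point, which is left for the next token.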
if cur.char is a letter
    read any additional letters and digits
    check to see whether the resulting string is a keyword
        if so, return the corresponding token
        else return id
• Implementation: Identifiers and keywords are handled by a regular expression
that matches letters followed by letters, digits, or underscores. The lexer checks if
the matched token is a keyword:
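A minimal sketch of the keyword check, assuming Python's re module. The names IDENT_RE
and scan_word are hypothetical, and the KEYWORDS set is an assumed example rather than
the language's actual keyword list:

```python
import re

# A letter followed by any number of letters, digits, or underscores.
IDENT_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]*")
KEYWORDS = {"if", "else", "while", "return"}  # assumed keyword set

def scan_word(text, pos=0):
    """Return (token_name, lexeme, next_pos) for a keyword or identifier."""
    match = IDENT_RE.match(text, pos)
    if match is None:
        return None
    lexeme = match.group()
    # The whole lexeme is matched first, then looked up in the keyword set.
    kind = "KEYWORD" if lexeme in KEYWORDS else "ID"
    return kind, lexeme, match.end()

print(scan_word("while x"))      # ('KEYWORD', 'while', 5)
print(scan_word("while_1 = 2"))  # ('ID', 'while_1', 7)
```

Matching the full lexeme before the lookup keeps an identifier such as while_1 from
being mistaken for the keyword while.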
The lexical analyzer follows the logic described in the pseudocode. It processes the
input character by character, using regular expressions to detect tokens and report
errors. Each token is handled according to its type, and the results are printed to the
console and saved to a file.
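The overall loop might be sketched as follows, assuming Python's re module and a single
combined pattern built from a token table. TOKEN_SPEC, MASTER_RE, KEYWORDS, tokenize and
report are all hypothetical names, and the token kinds are illustrative only:

```python
import re

# Hypothetical token table; order matters because earlier patterns win.
TOKEN_SPEC = [
    ("COMMENT", r"#[^\n]*"),
    ("FLOAT", r"\d+\.\d*"),
    ("INT", r"\d+"),
    ("ID", r"[A-Za-z][A-Za-z0-9_]*"),
    ("EQ", r"=="),
    ("ASSIGN", r"="),
    ("DIVIDE", r"/"),
    ("SKIP", r"\s+"),
    ("ERROR", r"."),  # any other single character is reported as an error
]
MASTER_RE = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))
KEYWORDS = {"if", "else", "while", "return"}  # assumed keyword set

def tokenize(source):
    """Return a list of (kind, lexeme) pairs, dropping whitespace and comments."""
    tokens = []
    for match in MASTER_RE.finditer(source):
        kind, lexeme = match.lastgroup, match.group()
        if kind in ("SKIP", "COMMENT"):
            continue
        if kind == "ID" and lexeme in KEYWORDS:
            kind = "KEYWORD"
        tokens.append((kind, lexeme))
    return tokens

def report(tokens, path):
    """Print each token to the console and write the same lines to path."""
    lines = [f"{kind:<8} {lexeme}" for kind, lexeme in tokens]
    for line in lines:
        print(line)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + "\n")
```

A call such as report(tokenize(source), "tokens.txt") would then print the tokens and
save them, where the file name is only an example. In this sketch, supporting a new
token type means adding one entry to TOKEN_SPEC.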
This lexer can easily be extended to support additional language features by adding more
token patterns and modifying the state transitions.