
The document outlines an assignment for implementing a syntax analyzer that checks the correctness of a program's syntax based on context-free grammar (CFG). It details the steps to create a parser using recursive descent parsing for arithmetic expressions and provides a Lexer class for tokenization. The implementation includes error handling and example runs demonstrating both valid and invalid expressions.

Uploaded by

kaishwarya978

PRINCIPLE OF COMPILER DESIGN - 21UCS601

ASSIGNMENT – 2

Submitted by

MAHALAKSHMI M (Reg.no.921722102075)

NOORJAHAN A (Reg.no.921722102113)

BACHELOR OF ENGINEERING

in

COMPUTER SCIENCE AND ENGINEERING

SETHU INSTITUTE OF TECHNOLOGY


(An Autonomous Institution | Accredited with ‘A++’ Grade by NAAC)
PULLOOR, KARIAPATTI-626 115.

ANNA UNIVERSITY: CHENNAI 600 025.


PROBLEM STATEMENT:
Implement a syntax analyzer that takes the token stream from the lexical analyzer and
checks whether the syntax of the program is correct according to the grammar of the
programming language. The analyzer may be based on a context-free grammar (CFG).

SOLUTION:
A syntax analyzer (parser) checks whether the input token stream from the lexical
analyzer follows the syntax rules defined by a context-free grammar (CFG).
Steps to Implement a Syntax Analyzer
1. Define the Grammar: Choose a CFG that represents the structure of the
programming language.
2. Build a Parser: Implement a parsing algorithm such as:
   - Recursive Descent Parsing (Top-Down)
   - LL(1) Parsing (Top-Down)
   - LR Parsing (Bottom-Up)
3. Implement Parsing Logic: Use token sequences from the lexical analyzer to
check if the input follows the grammar.
4. Handle Errors: Detect and report syntax errors.

Implementation: A Simple Recursive Descent Parser


We will implement a syntax analyzer for arithmetic expressions using recursive
descent parsing.
Grammar Definition (CFG)
Expr → Term Expr'
Expr' → ('+' | '-') Term Expr' | ε
Term → Factor Term'
Term' → ('*' | '/') Factor Term' | ε
Factor → '(' Expr ')' | NUMBER
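A recursive descent parser can pick a production by looking at a single token only if the alternatives of each nonterminal start with different tokens. The following is a minimal sketch, not part of the original assignment, that encodes the grammar above as a dictionary and computes FIRST sets to check this. The names GRAMMAR, compute_first and alternatives_are_disjoint are hypothetical, and a complete LL(1) test would also need FOLLOW sets for the ε alternatives.

```python
EPS = "ε"  # marks the empty production

# The grammar from this section, one list of alternatives per nonterminal.
GRAMMAR = {
    "Expr":   [["Term", "Expr'"]],
    "Expr'":  [["+", "Term", "Expr'"], ["-", "Term", "Expr'"], [EPS]],
    "Term":   [["Factor", "Term'"]],
    "Term'":  [["*", "Factor", "Term'"], ["/", "Factor", "Term'"], [EPS]],
    "Factor": [["(", "Expr", ")"], ["NUMBER"]],
}


def first_of_sequence(symbols, first):
    """FIRST set of a sequence of grammar symbols."""
    result = set()
    for sym in symbols:
        # A terminal (or ε) is its own FIRST set.
        sym_first = first[sym] if sym in GRAMMAR else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)  # every symbol in the sequence can derive ε
    return result


def compute_first(grammar):
    """Iterate until no FIRST set grows any further (fixed point)."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, productions in grammar.items():
            for prod in productions:
                new = first_of_sequence(prod, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def alternatives_are_disjoint(grammar, first):
    """True if no two alternatives of a nonterminal share a first token."""
    for productions in grammar.values():
        seen = set()
        for prod in productions:
            prod_first = first_of_sequence(prod, first)
            if seen & prod_first:
                return False
            seen |= prod_first
    return True


FIRST = compute_first(GRAMMAR)
for nonterminal in GRAMMAR:
    print(nonterminal, sorted(FIRST[nonterminal]))
print("Alternatives disjoint:", alternatives_are_disjoint(GRAMMAR, FIRST))
```

For this grammar, FIRST(Expr), FIRST(Term) and FIRST(Factor) all come out as { '(', NUMBER }, and the alternatives of every nonterminal are disjoint, which is why one token of lookahead is enough to choose a rule.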
Lexical Analyzer (Tokenizer)

# Token types
NUMBER, PLUS, MINUS, MUL, DIV, LPAREN, RPAREN, EOF = (
    'NUMBER', 'PLUS', 'MINUS', 'MUL', 'DIV', 'LPAREN', 'RPAREN', 'EOF'
)


# Tokenizer class
class Lexer:
    def __init__(self, text):
        self.text = text
        self.pos = 0
        # Guard against empty input, which would otherwise raise IndexError.
        self.current_char = self.text[0] if self.text else None

    def advance(self):
        """Move to the next character."""
        self.pos += 1
        self.current_char = self.text[self.pos] if self.pos < len(self.text) else None

    def skip_whitespace(self):
        """Ignore spaces."""
        while self.current_char is not None and self.current_char.isspace():
            self.advance()

    def integer(self):
        """Extract a multi-digit number."""
        result = ''
        while self.current_char is not None and self.current_char.isdigit():
            result += self.current_char
            self.advance()
        return int(result)

    def get_next_token(self):
        """Return the next (type, value) token, or (EOF, None) at the end."""
        while self.current_char is not None:
            if self.current_char.isspace():
                self.skip_whitespace()
                continue
            if self.current_char.isdigit():
                return (NUMBER, self.integer())
            if self.current_char == '+':
                self.advance()
                return (PLUS, '+')
            if self.current_char == '-':
                self.advance()
                return (MINUS, '-')
            if self.current_char == '*':
                self.advance()
                return (MUL, '*')
            if self.current_char == '/':
                self.advance()
                return (DIV, '/')
            if self.current_char == '(':
                self.advance()
                return (LPAREN, '(')
            if self.current_char == ')':
                self.advance()
                return (RPAREN, ')')
            raise Exception(f'Invalid character: {self.current_char}')
        return (EOF, None)
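As an alternative to scanning character by character, the same token stream can be produced with Python's re module. The sketch below is an illustration under assumptions, not the assignment's lexer: TOKEN_SPEC and tokenize are hypothetical names, and the function returns the whole token list at once instead of one token per call.

```python
import re

# One named group per token type; the names mirror the token types above.
# SKIP and MISMATCH are helper groups introduced only for this sketch.
TOKEN_SPEC = [
    ("NUMBER",   r"\d+"),
    ("PLUS",     r"\+"),
    ("MINUS",    r"-"),
    ("MUL",      r"\*"),
    ("DIV",      r"/"),
    ("LPAREN",   r"\("),
    ("RPAREN",   r"\)"),
    ("SKIP",     r"\s+"),   # whitespace is discarded
    ("MISMATCH", r"."),     # any other character is an error
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(text):
    """Return a list of (type, value) tokens ending with ('EOF', None)."""
    tokens = []
    for match in TOKEN_RE.finditer(text):
        kind = match.lastgroup      # name of the group that matched
        value = match.group()
        if kind == "SKIP":
            continue
        if kind == "MISMATCH":
            raise Exception(f"Invalid character: {value}")
        tokens.append((kind, int(value) if kind == "NUMBER" else value))
    tokens.append(("EOF", None))
    return tokens


print(tokenize("3 + 5 * (2 - 8)"))
```

Adding a new token type in this style means adding one line to the table, at the cost of hiding the scanning loop that the hand-written Lexer makes explicit.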

Syntax Analyzer (Parser)
The parser will use recursive descent parsing to match the CFG rules.
class Parser:
    def __init__(self, lexer):
        self.lexer = lexer
        self.current_token = self.lexer.get_next_token()

    def eat(self, token_type):
        """Consume a token if it matches the expected type."""
        if self.current_token[0] == token_type:
            self.current_token = self.lexer.get_next_token()
        else:
            raise Exception(
                f'Syntax error: expected {token_type}, got {self.current_token[0]}')

    def factor(self):
        """Factor → '(' Expr ')' | NUMBER"""
        if self.current_token[0] == NUMBER:
            self.eat(NUMBER)
        elif self.current_token[0] == LPAREN:
            self.eat(LPAREN)
            self.expr()
            self.eat(RPAREN)
        else:
            raise Exception(
                f'Syntax error: expected NUMBER or "(", got {self.current_token[0]}')

    def term(self):
        """Term → Factor Term' (the loop plays the role of Term')"""
        self.factor()
        while self.current_token[0] in (MUL, DIV):
            self.eat(self.current_token[0])
            self.factor()

    def expr(self):
        """Expr → Term Expr' (the loop plays the role of Expr')"""
        self.term()
        while self.current_token[0] in (PLUS, MINUS):
            self.eat(self.current_token[0])
            self.term()

    def parse(self):
        """Start parsing from Expr and require EOF afterwards."""
        self.expr()
        if self.current_token[0] != EOF:
            raise Exception('Syntax error: unexpected token at the end')
        print("Parsing successful! The expression is syntactically correct.")
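The term and expr methods use while loops where the grammar has the tail nonterminals Term' and Expr'. As a minimal sketch of the literal transcription, the version below writes one function per nonterminal, including the ε alternatives. It is an illustration under assumptions: is_valid, expr_tail and term_tail are hypothetical names, and the input is assumed to be a list of token-type strings that ends with "EOF".

```python
def is_valid(token_types):
    """Return True if the token types match Expr followed by EOF."""
    pos = 0

    def peek():
        return token_types[pos]

    def eat(expected):
        nonlocal pos
        if peek() != expected:
            raise SyntaxError(f"expected {expected}, got {peek()}")
        pos += 1

    def expr():            # Expr → Term Expr'
        term()
        expr_tail()

    def expr_tail():       # Expr' → ('+' | '-') Term Expr' | ε
        if peek() in ("PLUS", "MINUS"):
            eat(peek())
            term()
            expr_tail()
        # any other token: take the ε alternative and consume nothing

    def term():            # Term → Factor Term'
        factor()
        term_tail()

    def term_tail():       # Term' → ('*' | '/') Factor Term' | ε
        if peek() in ("MUL", "DIV"):
            eat(peek())
            factor()
            term_tail()

    def factor():          # Factor → '(' Expr ')' | NUMBER
        if peek() == "NUMBER":
            eat("NUMBER")
        elif peek() == "LPAREN":
            eat("LPAREN")
            expr()
            eat("RPAREN")
        else:
            raise SyntaxError(f'expected NUMBER or "(", got {peek()}')

    try:
        expr()
        eat("EOF")
        return True
    except SyntaxError:
        return False


# 3 + 5 * (2 - 8)
print(is_valid(["NUMBER", "PLUS", "NUMBER", "MUL", "LPAREN",
                "NUMBER", "MINUS", "NUMBER", "RPAREN", "EOF"]))
# 3 + * 5
print(is_valid(["NUMBER", "PLUS", "MUL", "NUMBER", "EOF"]))
```

Both forms accept the same inputs. The loop form avoids one level of recursion per operator, which matters in Python for very long expressions because of the interpreter's recursion limit.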

Testing the Syntax Analyzer


def main():
    while True:
        try:
            text = input("Enter an expression: ")
        except EOFError:
            break  # end of input (e.g. Ctrl-D): stop instead of looping forever
        if text.strip().lower() == "exit":
            break
        try:
            lexer = Lexer(text)
            parser = Parser(lexer)
            parser.parse()
        except Exception as e:
            print(f"Error: {e}")


if __name__ == "__main__":
    main()

How It Works
1. Lexical Analysis: The Lexer converts an input string (e.g., "3 + 5 * (2 - 8)")
into a sequence of tokens.
2. Parsing: The Parser processes these tokens using recursive descent parsing to
ensure the input follows the given grammar.
3. Error Handling: If there is a syntax error (e.g., 3 + * 5), the parser raises an
exception.

Example Runs
Valid Expressions
Enter an expression: 3 + 5 * (2 - 8)
Parsing successful! The expression is syntactically correct.
Enter an expression: (4 / 2) + 6
Parsing successful! The expression is syntactically correct.
Invalid Expression
Enter an expression: 3 + * 5
Error: Syntax error: expected NUMBER or "(", got MUL

Conclusion
This syntax analyzer checks whether arithmetic expressions are syntactically
correct according to a defined context-free grammar (CFG), using recursive
descent parsing. It can be extended toward a full programming-language syntax
by adding productions to the grammar and a matching parsing method for each.
