
DELHI TECHNOLOGICAL UNIVERSITY

DEPARTMENT OF COMPUTER SCIENCE


AND ENGINEERING

COMPILER DESIGN LAB FILE

CO302

Submitted by: Tushar Kheterpal (2K21/CO/493)

Submitted to: Dr. Rajeev Kumar
INDEX

S. No. Experiment Name Date Signature

1 Write a program to find tokens in a program.

2 Write a program to convert NFA to DFA.

3 Write a program to remove left recursion in a grammar.

4 Write a program to left factor the given grammar.

5 Write a program to compute First and Follow.

6 Write a program to implement Shift Reduce parser.

7 Write a program to construct LL(1) parsing table.

8 Write a program to construct LR(0) parsing table.

9 Write a program to construct SLR(1) parsing table.

10 Write a program to construct LALR(1) parsing table.

11 Write a program to construct CLR(1) parsing table.

12 Write a program to construct operator precedence parser.


EXPERIMENT - 1
AIM: Write a program to find tokens in a program.

THEORY: Lexical analysis is the first phase of a compiler and is also known as the
scanner. It converts the input program into a sequence of tokens. A C program consists of various
tokens, and a token is either a keyword, an identifier, a constant, a string literal, or a symbol.

1) Keywords: Examples - for, while, if, etc.

2) Identifiers: Examples - variable names, function names, etc.

3) Operators: Examples - '+', '++', '-', etc.

ALGORITHM:

1. Open the program file using fstream.

2. If the file is opened successfully, then while it is not EOF:
● Parse the file word by word.
● Check whether the word currently in the buffer is a
i. Keyword ii. Operator iii. Identifier

3. Close the file.

4. Output the number of:

a. Keywords b. Operators c. Identifiers
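For instance, on the input line int x = y + 1 ; the classifier below counts int as a keyword, = and + as operators, and x, y and 1 as identifiers; this simple whitespace-based scanner does not distinguish numeric literals or multi-character operators.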

PROGRAM:

#include <iostream>

#include <string>

#include <sstream>

#include <unordered_set>

using namespace std;

bool isKeyword(const string& str) {
    static const unordered_set<string> keywords = {
        "auto", "break", "case", "char", "const", "continue", "default", "do",
        "double", "else", "enum", "extern", "float", "for", "goto", "if",
        "int", "long", "register", "return", "short", "signed", "sizeof", "static",
        "struct", "switch", "typedef", "union", "unsigned", "void",
        "volatile", "while"
    };
    return keywords.find(str) != keywords.end();
}

bool isOperator(char c) {
    static const string operators = "+-*/%=&|<>^!~?:";
    return operators.find(c) != string::npos;
}

// Splits the program on whitespace and classifies each word as a keyword,
// a single-character operator, or (by default) an identifier.
void countTokens(const string& program, int& keywordsCount,
                 int& operatorsCount, int& identifiersCount) {
    stringstream ss(program);
    string token;
    while (ss >> token) {
        if (isKeyword(token)) {
            keywordsCount++;
        } else if (token.length() == 1 && isOperator(token[0])) {
            operatorsCount++;
        } else {
            identifiersCount++;
        }
    }
}

int main() {

string program;

cout << "Enter the C++ program (end with Ctrl+D or Ctrl+Z):" << endl;

stringstream buffer;

buffer << cin.rdbuf();

program = buffer.str();

int keywordsCount = 0;

int operatorsCount = 0;

int identifiersCount = 0;

countTokens(program, keywordsCount, operatorsCount, identifiersCount);

int totalTokens = keywordsCount + operatorsCount + identifiersCount;

cout << "Number of keywords: " << keywordsCount << endl;

cout << "Number of operators: " << operatorsCount << endl;

cout << "Number of identifiers: " << identifiersCount << endl;

cout << "Total number of tokens: " << totalTokens << endl;

return 0;

}
OUTPUT:
EXPERIMENT - 2
AIM: Write a program to convert NFA to DFA.

THEORY:

An NFA can have zero, one or more than one move from a given state on a given input symbol.
An NFA can also have NULL moves (moves without input symbol). On the other hand, DFA has
one and only one move from a given state on a given input symbol.
Steps for converting NFA to DFA:
Step 1: Convert the given NFA to its equivalent transition table
To convert the NFA to its equivalent transition table, we need to list all the states, input symbols,
and the transition rules. The transition rules are represented in the form of a matrix, where the
rows represent the current state, the columns represent the input symbol, and the cells represent
the next state.
Step 2: Create the DFA’s start state
The DFA’s start state is the set of all possible starting states in the NFA. This set is called the
“epsilon closure” of the NFA’s start state. The epsilon closure is the set of all states that can be
reached from the start state by following epsilon (ε) transitions.
Step 3: Create the DFA’s transition table
The DFA’s transition table is similar to the NFA’s transition table, but instead of individual states,
the rows and columns represent sets of states. For each input symbol, the corresponding cell in
the transition table contains the epsilon closure of the set of states obtained by following the
transition rules in the NFA’s transition table.
Step 4: Create the DFA’s final states
The DFA’s final states are the sets of states that contain at least one final state from the NFA.
Step 5: Simplify the DFA
The DFA obtained in the previous steps may contain unnecessary states and transitions. To
simplify the DFA, we can use the following techniques:
● Remove unreachable states: States that cannot be reached from the start state can be
removed from the DFA.
● Remove dead states: States that cannot lead to a final state can be removed from the
DFA.
● Merge equivalent states: States that have the same transition rules for all input
symbols can be merged into a single state.
Step 6: Repeat steps 3-5 until no further simplification is possible
After simplifying the DFA, we repeat steps 3-5 until no further simplification is possible. The
final DFA obtained is the minimized DFA equivalent to the given NFA.

PROGRAM:

def epsilon_closure(nfa, states):

closure = set(states)

stack = list(states)

while stack:

current_state = stack.pop()

if 'ε' in nfa[current_state]:

for state in nfa[current_state]['ε']:

if state not in closure:

closure.add(state)

stack.append(state)

return closure

def move(nfa, states, symbol):

result = set()

for state in states:

if symbol in nfa[state]:

result |= set(nfa[state][symbol])

return result

def nfa_to_dfa(nfa):
    dfa = {}
    alphabet = set()
    for state in nfa:
        if state == 'initial':  # the 'initial' entry only names the start state
            continue
        alphabet |= set(nfa[state].keys())
    alphabet.discard('ε')  # ε-moves are not input symbols of the DFA
    # frozenset makes the state set hashable, so it can be used as a dict key
    initial_state = frozenset(epsilon_closure(nfa, {nfa['initial']}))
    dfa[initial_state] = {}
    stack = [initial_state]

while stack:

current_states = stack.pop()

for symbol in alphabet:

            next_states = frozenset(epsilon_closure(nfa, move(nfa, current_states, symbol)))
            dfa[current_states][symbol] = next_states
            if next_states and next_states not in dfa:
                dfa[next_states] = {}
                stack.append(next_states)

return dfa

def print_dfa(dfa):

print("DFA Transition Table:")

for state in dfa:

print(state, ": ", end="")

for symbol in dfa[state]:

print(symbol, "->", dfa[state][symbol], ", ", end="")

print()

nfa = {
'initial': 'q0',

'q0': {'0': ['q0', 'q1'], '1': ['q1'], 'ε': ['q2']},

'q1': {'1': ['q2'], 'ε': ['q0']},

    'q2': {'0': ['q2'], '1': ['q2']}
}

dfa = nfa_to_dfa(nfa)

print_dfa(dfa)
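The listing above builds only the transition table; per Step 4 of the theory, the DFA's accepting states are the subsets that contain an NFA final state. A minimal sketch, assuming q2 is the NFA's final state (the original listing does not mark one):

# Assumption: q2 is the NFA's accepting state (not marked in the listing above)
nfa_final_states = {'q2'}
dfa_final_states = [state for state in dfa if state & nfa_final_states]
print("DFA accepting states:", dfa_final_states)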

OUTPUT:
EXPERIMENT - 3
AIM: Write a program to remove left recursion in a grammar.

THEORY:

Left recursion is a phenomenon that occurs in a grammar when a non-terminal can derive a string
that begins with itself. For example, consider a production rule like this:

A → Aα | β

In this production rule, A is left-recursive because it can derive a string starting with A. Left
recursion can cause issues in parsing algorithms because it can lead to infinite recursion or
ambiguity.

To remove left recursion from a grammar, you can use a technique called left recursion
elimination. The basic idea is to rewrite the production rules so that the left recursion is
converted to right recursion: a rule A → Aα | β becomes

A → βA'
A' → αA' | ε

PROGRAM:

def remove_left_recursion(grammar):
    new_grammar = {}
    for non_terminal, productions in grammar.items():
        recursive_productions = []
        non_recursive_productions = []
        for production in productions:
            if production.startswith(non_terminal):
                recursive_productions.append(production)
            else:
                non_recursive_productions.append(production)
        if len(recursive_productions) > 0:
            new_non_terminal = non_terminal + "'"
            # A -> Aα | β  becomes  A -> βA'  and  A' -> αA' | ε
            new_grammar[non_terminal] = [p + new_non_terminal
                                         for p in non_recursive_productions]
            new_grammar[new_non_terminal] = [p[len(non_terminal):] + new_non_terminal
                                             for p in recursive_productions]
            new_grammar[new_non_terminal].append('ε')
        else:
            new_grammar[non_terminal] = productions
    return new_grammar

def print_grammar(grammar):

for non_terminal, productions in grammar.items():

print(non_terminal + " -> " + " | ".join(productions))

grammar = {

'E': ['E+T', 'T'],

'T': ['T*F', 'F'],

    'F': ['(E)', 'id']
}

print("Original Grammar:")

print_grammar(grammar)
new_grammar = remove_left_recursion(grammar)

print("\nGrammar after Left Recursion Removal:")

print_grammar(new_grammar)
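As a cross-check, applying the A → βA' rule by hand gives E → TE', E' → +TE' | ε, T → FT', T' → *FT' | ε, and F → (E) | id unchanged, which matches what the program prints.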

OUTPUT:
EXPERIMENT - 4
AIM: Write a program to left factor the given grammar.

THEORY:

In compiler design, left factorization is a technique that simplifies a grammar and improves parsing. To reduce ambiguity and duplication in the grammar, it entails detecting
common prefixes in productions of the grammar and factoring them out into independent
productions. This lowers the parser's complexity and increases parsing efficiency. The following
outlines the fundamentals of left factorization and how to construct a program for it:

Basic Theory of Left Factorization:

• In a context-free grammar, left factorization is the process of identifying common prefixes in


the right-hand sides of production rules.

• When multiple production rules share the same prefix, it can lead to ambiguity or redundancy
during parsing.

• Left factorization aims to factor out these common prefixes into separate production rules to
simplify the grammar and improve parsing efficiency.

• The resulting grammar should be unambiguous and capable of generating the same language as
the original grammar.
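For example, the productions S → abC | abD share the common prefix ab and are left factored into S → abS' together with S' → C | D.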

PROGRAM:

from os.path import commonprefix

def left_factor(grammar):
    new_grammar = {}
    for non_terminal, productions in grammar.items():
        # Group the alternatives by their first symbol; each group of two or
        # more alternatives shares a common prefix that can be factored out.
        groups = {}
        for production in productions:
            groups.setdefault(production[:1], []).append(production)
        new_productions = []
        for group in groups.values():
            if len(group) == 1:
                new_productions.append(group[0])
                continue
            prefix = commonprefix(group)
            new_non_terminal = non_terminal + "'"
            while new_non_terminal in grammar or new_non_terminal in new_grammar:
                new_non_terminal += "'"
            # A -> prefix A'  and  A' -> the suffixes left after the prefix
            new_productions.append(prefix + new_non_terminal)
            new_grammar[new_non_terminal] = [p[len(prefix):] or 'ε' for p in group]
        new_grammar[non_terminal] = new_productions
    return new_grammar

def print_grammar(grammar):

for non_terminal, productions in grammar.items():

print(non_terminal + " -> " + " | ".join(productions))


grammar = {

'S': ['abC', 'abD', 'abc', 'abd'],

'C': ['ef', 'efg'],

    'D': ['fgh', 'fghi']
}

print("Original Grammar:")

print_grammar(grammar)

new_grammar = left_factor(grammar)

print("\nLeft Factored Grammar:")

print_grammar(new_grammar)

OUTPUT:
EXPERIMENT - 5
AIM: Write a program to compute First and Follow.

THEORY:

First(α) is the set of terminals that begin the strings derived from α. Follow(A) is the set of
terminals that can appear immediately to the right of A in some sentential form. First and Follow
are used in the construction of the parsing table. For example, for A → abc | def | ghi, First(A) = { a, d, g }.

To compute First:

1) If X is a terminal, then First(X) = {X}.

2) If X → ε is a production, add ε to First(X).

3) If X is a non-terminal and X → Y1 Y2 … Yk is a production, place z in First(X) if z is in
First(Yi) for some i and ε is in all of First(Y1) … First(Yi-1).

To compute Follow:

1) Place $ in Follow(S), where S is the start symbol and $ is the end-of-input marker.

2) If there is a production A → αBβ, everything in First(β) except ε is placed in Follow(B).

3) If there is a production A → αB, or a production A → αBβ where First(β) contains ε, then
everything in Follow(A) is placed in Follow(B).
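As a quick check of these rules on the sample grammar used in the program below (S → AB, A → aA | ε, B → bB | c):

First(A) = { a, ε }    (rules 1 and 2)
First(B) = { b, c }
First(S) = { a, b, c }    (A can derive ε, so First(B) is included)

Follow(S) = { $ }    (rule 1)
Follow(A) = First(B) = { b, c }    (rule 2, applied to S → AB)
Follow(B) = Follow(S) = { $ }    (rule 3, applied to S → AB)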

PROGRAM:

def compute_first(grammar):
    first = {non_terminal: set() for non_terminal in grammar}
    updated = True
    while updated:
        updated = False
        for non_terminal, productions in grammar.items():
            for production in productions:
                if production == '':
                    # epsilon production: X -> ε adds ε (here '') to First(X)
                    if '' not in first[non_terminal]:
                        first[non_terminal].add('')
                        updated = True
                    continue
                for symbol in production:
                    if symbol in grammar:
                        old_size = len(first[non_terminal])
                        first[non_terminal] |= first[symbol] - {''}
                        if len(first[non_terminal]) != old_size:
                            updated = True
                        if '' not in first[symbol]:
                            break
                    else:
                        # a terminal begins the rest of the production
                        if symbol not in first[non_terminal]:
                            first[non_terminal].add(symbol)
                            updated = True
                        break
                else:
                    # every symbol in the production can derive ε
                    if '' not in first[non_terminal]:
                        first[non_terminal].add('')
                        updated = True
    return first

def compute_follow(grammar, first, start_symbol):
    follow = {non_terminal: set() for non_terminal in grammar}
    follow[start_symbol].add('$')
    updated = True
    while updated:
        updated = False
        for non_terminal, productions in grammar.items():
            for production in productions:
                for i, symbol in enumerate(production):
                    if symbol not in grammar:
                        continue
                    # Compute First of the suffix beta that follows this symbol
                    follow_set = set()
                    beta_nullable = True
                    for next_symbol in production[i + 1:]:
                        if next_symbol in grammar:
                            follow_set |= first[next_symbol] - {''}
                            if '' not in first[next_symbol]:
                                beta_nullable = False
                                break
                        else:
                            follow_set.add(next_symbol)
                            beta_nullable = False
                            break
                    if beta_nullable:
                        # beta is empty or derives ε: Follow(A) goes into Follow(B)
                        follow_set |= follow[non_terminal]
                    old_size = len(follow[symbol])
                    follow[symbol] |= follow_set
                    if len(follow[symbol]) != old_size:
                        updated = True
    return follow

def print_sets(first, follow):

print("First Set:")

for non_terminal, symbols in first.items():

print(non_terminal + ":", symbols)

print("\nFollow Set:")

for non_terminal, symbols in follow.items():

print(non_terminal + ":", symbols)

grammar = {

'S': ['AB'],

'A': ['aA', ''],

    'B': ['bB', 'c']
}

first_set = compute_first(grammar)

follow_set = compute_follow(grammar, first_set, 'S')

print_sets(first_set, follow_set)
OUTPUT:
EXPERIMENT - 6
AIM: Write a program to implement Shift Reduce parser.

THEORY:

A shift-reduce parser attempts to construct the parse in the same manner as
bottom-up parsing, i.e. the parse tree is constructed from the leaves (bottom) to the root (up). A more
general form of the shift-reduce parser is the LR parser.
This parser requires some data structures:
● An input buffer for storing the input string.
● A stack for storing grammar symbols and applying the production rules. A sample trace is shown below.
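For the ambiguous expression grammar used in the program (E → E + E | E * E | (E) | id), one possible shift-reduce trace of id + id * id is, resolving each conflict in favour of the shown action:

Stack           Input            Action
$               id + id * id $   shift id
$ id            + id * id $      reduce E -> id
$ E             + id * id $      shift +
$ E +           id * id $        shift id
$ E + id        * id $           reduce E -> id
$ E + E         * id $           shift * (higher precedence than +)
$ E + E *       id $             shift id
$ E + E * id    $                reduce E -> id
$ E + E * E     $                reduce E -> E * E
$ E + E         $                reduce E -> E + E
$ E             $                accept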

PROGRAM:

class ShiftReduceParser:

def __init__(self, grammar, start_symbol, parsing_table):

self.grammar = grammar

self.start_symbol = start_symbol

self.stack = []

self.parsing_table = parsing_table

def shift(self, token):

self.stack.append(token)

print("Shifted token:", token)

print("Stack:", self.stack)

def reduce(self, production):

for _ in range(len(production.right)):

self.stack.pop()

self.stack.append(production.left)
print("Reduced by production:", production.left, "->",
production.right)

print("Stack:", self.stack)

def parse(self, input_tokens):

input_index = 0

while input_index < len(input_tokens):

top = self.stack[-1] if self.stack else None

next_token = input_tokens[input_index]

if (top, next_token) in self.parsing_table:

action = self.parsing_table[(top, next_token)]

if action.startswith('S'):

self.shift(next_token)

input_index += 1

                elif action.startswith('R'):
                    # production numbers in the table are 1-based
                    production = self.grammar.productions[int(action[1:]) - 1]
                    self.reduce(production)

elif action == 'ACC':

print("Parsing Successful!")

return True

                else:
                    print("Error: No action defined for current state and input token")
                    return False

        print("Error: Parsing incomplete")
        return False

class Production:

def __init__(self, left, right):

self.left = left

self.right = right

class Grammar:

def __init__(self, productions):

self.productions = productions

productions = [

Production('E', ['E', '+', 'E']),

Production('E', ['E', '*', 'E']),

Production('E', ['(', 'E', ')']),

    Production('E', ['id'])
]

grammar = Grammar(productions)

parsing_table = {

('E', 'id'): 'S3',

('E', '('): 'S3',

('E', '+'): 'S1',

('E', '*'): 'S2',

('E', ')'): 'R4',

('$', '$'): 'ACC',

('E', 'E'): 'R4'


}

input_tokens = ['id', '+', 'id', '*', 'id', '$']

parser = ShiftReduceParser(grammar, 'E', parsing_table)

parser.parse(input_tokens)

OUTPUT:
EXPERIMENT - 7
AIM: Write a program to construct LL(1) parsing table.

THEORY:

LL(1) Parsing: Here the first L represents that the scanning of the input is done in a
left-to-right manner, and the second L shows that this parsing technique uses the
leftmost derivation. Finally, the 1 represents the number of look-ahead symbols, i.e. how many
input symbols the parser examines when making a decision. A reference table for the left-factored expression grammar is shown below.
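For reference, the textbook LL(1) parsing table for the left-factored expression grammar E → TE', E' → +TE' | ε, T → FT', T' → *FT' | ε, F → (E) | id is:

        id        +          *          (        )        $
E       E→TE'                           E→TE'
E'                E'→+TE'                        E'→ε     E'→ε
T       T→FT'                           T→FT'
T'                T'→ε      T'→*FT'              T'→ε     T'→ε
F       F→id                            F→(E)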

PROGRAM:

class LL1Parser:

def __init__(self, grammar, parsing_table):

self.grammar = grammar

self.parsing_table = parsing_table

def parse(self, input_tokens):

stack = ['$']

input_tokens.append('$')

index = 0

stack.append(self.grammar.start_symbol)

while stack:

top = stack[-1]

current_token = input_tokens[index]

if top == current_token == '$':

print("Parsing Successful!")
return True

elif top in self.grammar.terminals:

if top == current_token:

stack.pop()

index += 1

else:

print("Error: Mismatch between stack and input token")

return False

            elif (top, current_token) in self.parsing_table:
                production = self.parsing_table[(top, current_token)]
                stack.pop()
                if production.right != ['ε']:
                    for symbol in reversed(production.right):
                        stack.append(symbol)

            else:
                print("Error: No production rule defined for current stack symbol and input token")
                return False

        print("Error: Parsing incomplete")
        return False

class Production:

def __init__(self, left, right):

self.left = left

self.right = right
class Grammar:

    def __init__(self, start_symbol, terminals, non_terminals, productions):

self.start_symbol = start_symbol

self.terminals = terminals

self.non_terminals = non_terminals

self.productions = productions

start_symbol = 'E'

terminals = {'+', '*', '(', ')', 'id'}

non_terminals = {'E', 'T', 'F'}

productions = [

Production('E', ['E', '+', 'T']),

Production('E', ['T']),

Production('T', ['T', '*', 'F']),

Production('T', ['F']),

Production('F', ['(', 'E', ')']),

    Production('F', ['id'])
]

# Note: this demo grammar is left-recursive, so it is not actually LL(1);
# the left-factored grammar from the table above would be used in practice.
grammar = Grammar(start_symbol, terminals, non_terminals, productions)

parsing_table = {

('E', 'id'): productions[1],

('E', '('): productions[1],

('T', 'id'): productions[3],

('T', '('): productions[3],


('F', 'id'): productions[5],

    ('F', '('): productions[4]
}

input_tokens = ['id', '+', 'id', '*', 'id']

parser = LL1Parser(grammar, parsing_table)

parser.parse(input_tokens)

OUTPUT:
EXPERIMENT - 8
AIM: Write a program to construct LR(0) parsing table.

THEORY:

LR(0) parsing is an efficient bottom-up syntax analysis technique that can be used to parse large
classes of context-free grammars.

L stands for left-to-right scanning

R stands for rightmost derivation in reverse

0 stands for the number of input symbols of lookahead
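An LR(0) item is a production with a dot marking how much of its body has been recognised. For the augmented expression grammar, the closure of { E' → •E } also contains E → •E + T, E → •T, T → •T * F, T → •F, F → •(E) and F → •id, because whenever the dot stands before a non-terminal, all of that non-terminal's productions are added with the dot at the start.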

PROGRAM:

class Production:

    def __init__(self, head, body):
        self.head = head
        self.body = body

    def __repr__(self):
        return f"{self.head} -> {' '.join(self.body)}"

class Item:

    def __init__(self, production, dot_position):
        self.production = production
        self.dot_position = dot_position

    def __eq__(self, other):
        # value equality so that membership tests such as
        # `new_item not in closure` deduplicate items
        return (self.production is other.production
                and self.dot_position == other.dot_position)

    def __repr__(self):
        body = self.production.body[:]
        body.insert(self.dot_position, '•')
        return f"{self.production.head} -> {' '.join(body)}"

class LR0Parser:
def __init__(self, grammar, start_symbol):

self.grammar = grammar

self.start_symbol = start_symbol

self.canonical_collection = []

self.action_table = {}

self.goto_table = {}

    def compute_closure(self, items):
        closure = items[:]
        added = True
        while added:
            added = False
            for item in closure:
                if item.dot_position < len(item.production.body):
                    symbol_after_dot = item.production.body[item.dot_position]
                    if symbol_after_dot in self.grammar.non_terminals:
                        for production in self.grammar.productions:
                            if production.head == symbol_after_dot:
                                new_item = Item(production, 0)
                                if new_item not in closure:
                                    closure.append(new_item)
                                    added = True
        return closure

    def compute_goto(self, items, symbol):
        new_items = []
        for item in items:
            if (item.dot_position < len(item.production.body)
                    and item.production.body[item.dot_position] == symbol):
                new_item = Item(item.production, item.dot_position + 1)
                new_items.append(new_item)
        return self.compute_closure(new_items)

    def construct_canonical_collection(self):
        start_production = Production(self.start_symbol + "'", [self.start_symbol])
        start_item = Item(start_production, 0)
        self.canonical_collection = [self.compute_closure([start_item])]
        added = True
        while added:
            added = False
            for items in self.canonical_collection:
                for symbol in self.grammar.terminals | self.grammar.non_terminals:
                    new_items = self.compute_goto(items, symbol)
                    if new_items and new_items not in self.canonical_collection:
                        self.canonical_collection.append(new_items)
                        added = True

    def construct_parsing_table(self):
        for i, items in enumerate(self.canonical_collection):
            for item in items:
                if (item.dot_position < len(item.production.body)
                        and item.production.body[item.dot_position] in self.grammar.terminals):
                    symbol = item.production.body[item.dot_position]
                    new_state = self.compute_goto(items, symbol)
                    if new_state in self.canonical_collection:
                        self.action_table[(i, symbol)] = \
                            ('S', self.canonical_collection.index(new_state))
                elif item.dot_position == len(item.production.body):
                    if item.production.head == self.start_symbol + "'":
                        self.action_table[(i, '$')] = ('ACC',)
                    else:
                        # reduce by this production's number on every
                        # terminal in Follow(head)
                        production_number = self.grammar.productions.index(item.production)
                        for symbol in self.grammar.follow[item.production.head]:
                            self.action_table[(i, symbol)] = ('R', production_number)
            for symbol in self.grammar.non_terminals:
                new_state = self.compute_goto(items, symbol)
                if new_state in self.canonical_collection:
                    self.goto_table[(i, symbol)] = self.canonical_collection.index(new_state)

    def print_parsing_table(self):
        print("LR(0) Parsing Table:")
        print("Action Table:")
        for (state, symbol), action in self.action_table.items():
            # action is ('S', j), ('R', j) or ('ACC',)
            print(f"  State {state}, Symbol {symbol}: {''.join(str(x) for x in action)}")
        print("Goto Table:")
        for (state, symbol), goto_state in self.goto_table.items():
            print(f"  State {state}, Symbol {symbol}: {goto_state}")

productions = [

Production('E', ['E', '+', 'T']),

Production('E', ['T']),

Production('T', ['T', '*', 'F']),

Production('T', ['F']),

Production('F', ['(', 'E', ')']),

    Production('F', ['id'])
]

start_symbol = 'E'

grammar = Grammar(set(['E', 'T', 'F']), set(['+', '*', '(', ')', 'id']), productions)

grammar.compute_follow()

parser = LR0Parser(grammar, start_symbol)

parser.construct_canonical_collection()

parser.construct_parsing_table()

parser.print_parsing_table()
OUTPUT:
EXPERIMENT - 9
AIM: Write a program to construct SLR(1) parsing table.

THEORY:

SLR Parser:
SLR means simple LR. It handles the smallest class of grammars and has a small number of states. SLR is very
easy to construct and is similar to LR(0) parsing. The only difference between the SLR parser and the
LR(0) parser is that in the LR(0) parsing table there is a chance of a shift-reduce conflict, because
'reduce' is entered for all terminals in a state containing a completed item. We solve this problem by
entering 'reduce' only for the terminals in FOLLOW of the LHS of the production in that state. This
is called the SLR(1) collection of items.
Steps for constructing the SLR parsing table:
1. Write the augmented grammar.
2. Find the LR(0) collection of items.
3. Find FOLLOW of the LHS of each production.
4. Define the two table functions, action [indexed by terminals] and goto [indexed by non-terminals], in the
parsing table. An example of steps 3 and 4 in action is given below.
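For example, in the expression grammar FOLLOW(F) = { +, *, ), $ }, so a state containing the completed item F → id• gets the action reduce F → id in exactly the columns +, *, ) and $, and no others.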

PROGRAM:

class Production:

def __init__(self, head, body):

self.head = head

self.body = body

def __repr__(self):

return f"{self.head} -> {' '.join(self.body)}"

class Item:

    def __init__(self, production, dot_position, lookahead=None):
        self.production = production
        self.dot_position = dot_position
        self.lookahead = lookahead or []

    def __eq__(self, other):
        # value equality so that membership tests deduplicate items
        return (self.production is other.production
                and self.dot_position == other.dot_position
                and self.lookahead == other.lookahead)

    def __repr__(self):
        body = self.production.body[:]
        body.insert(self.dot_position, '•')
        return f"{self.production.head} -> {' '.join(body)}, {', '.join(self.lookahead)}"

class SLRParser:

def __init__(self, grammar, start_symbol):

self.grammar = grammar

self.start_symbol = start_symbol

self.canonical_collection = []

self.action_table = {}

self.goto_table = {}

    def compute_closure(self, items):
        closure = items[:]
        added = True
        while added:
            added = False
            for item in closure:
                if item.dot_position < len(item.production.body):
                    symbol_after_dot = item.production.body[item.dot_position]
                    if symbol_after_dot in self.grammar.non_terminals:
                        for production in self.grammar.productions:
                            if production.head == symbol_after_dot:
                                new_item = Item(production, 0)
                                if new_item not in closure:
                                    closure.append(new_item)
                                    added = True
        return closure

    def compute_goto(self, items, symbol):
        new_items = []
        for item in items:
            if (item.dot_position < len(item.production.body)
                    and item.production.body[item.dot_position] == symbol):
                new_item = Item(item.production, item.dot_position + 1, item.lookahead)
                new_items.append(new_item)
        return self.compute_closure(new_items)

    def construct_canonical_collection(self):
        start_production = Production(self.start_symbol + "'", [self.start_symbol])
        start_item = Item(start_production, 0, ['$'])
        self.canonical_collection = [self.compute_closure([start_item])]
        added = True
        while added:
            added = False
            for items in self.canonical_collection:
                for symbol in self.grammar.terminals | self.grammar.non_terminals:
                    new_items = self.compute_goto(items, symbol)
                    if new_items and new_items not in self.canonical_collection:
                        self.canonical_collection.append(new_items)
                        added = True

    def construct_parsing_table(self):
        for i, items in enumerate(self.canonical_collection):
            for item in items:
                if (item.dot_position < len(item.production.body)
                        and item.production.body[item.dot_position] in self.grammar.terminals):
                    symbol = item.production.body[item.dot_position]
                    new_state = self.compute_goto(items, symbol)
                    if new_state in self.canonical_collection:
                        self.action_table[(i, symbol)] = \
                            ('S', self.canonical_collection.index(new_state))
                elif item.dot_position == len(item.production.body):
                    if item.production.head == self.start_symbol + "'":
                        self.action_table[(i, '$')] = ('ACC',)
                    else:
                        # SLR(1): reduce only on the terminals in Follow(head)
                        for lookahead in self.grammar.follow[item.production.head]:
                            self.action_table[(i, lookahead)] = \
                                ('R', self.grammar.productions.index(item.production))
            for symbol in self.grammar.non_terminals:
                new_state = self.compute_goto(items, symbol)
                if new_state in self.canonical_collection:
                    self.goto_table[(i, symbol)] = self.canonical_collection.index(new_state)

    def print_parsing_table(self):
        print("SLR(1) Parsing Table:")
        print("Action Table:")
        for (state, symbol), action in self.action_table.items():
            print(f"  State {state}, Symbol {symbol}: {''.join(str(x) for x in action)}")
        print("Goto Table:")
        for (state, symbol), goto_state in self.goto_table.items():
            print(f"  State {state}, Symbol {symbol}: {goto_state}")

productions = [

Production('E', ['E', '+', 'T']),

Production('E', ['T']),

Production('T', ['T', '*', 'F']),

Production('T', ['F']),

Production('F', ['(', 'E', ')']),

    Production('F', ['id'])
]

# assumes the Grammar helper class sketched in Experiment 8
grammar = Grammar(set(['E', 'T', 'F']), set(['+', '*', '(', ')', 'id']), productions)

grammar.compute_follow()

parser = SLRParser(grammar, 'E')

parser.construct_canonical_collection()

parser.construct_parsing_table()
parser.print_parsing_table()

OUTPUT:
EXPERIMENT - 10
AIM: Write a program to construct LALR(1) parsing table.

THEORY:

The LALR parser is the lookahead LR parser. It is a powerful parser that can handle large
classes of grammars. The size of the CLR parsing table is quite large compared to other parsing
tables, and LALR reduces this size. LALR works similarly to CLR; the only difference is that it
merges the states of the CLR parsing table that have the same core into one single state.

The general syntax of an item becomes [A → α.B, a]

where A → α.B is a production and a is a terminal or the right end marker $.

LR(1) items = LR(0) items + lookahead
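For instance, two CLR states that differ only in their lookaheads, say [F → id•, +/*/$] and [F → id•, +/*/)], have the same core F → id• and are merged into the single LALR state [F → id•, +/*/$/)].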

PROGRAM:

class Production:

def __init__(self, head, body):

self.head = head

self.body = body

def __repr__(self):

return f"{self.head} -> {' '.join(self.body)}"

class Item:

    def __init__(self, production, dot_position, lookahead=None):
        self.production = production
        self.dot_position = dot_position
        self.lookahead = lookahead or set()

    def __eq__(self, other):
        # value equality so that membership tests deduplicate items
        return (self.production is other.production
                and self.dot_position == other.dot_position
                and self.lookahead == other.lookahead)

    def __repr__(self):
        body = self.production.body[:]
        body.insert(self.dot_position, '•')
        return f"{self.production.head} -> {' '.join(body)}, {', '.join(self.lookahead)}"

class LALRParser:

def __init__(self, grammar, start_symbol):

self.grammar = grammar

self.start_symbol = start_symbol

self.canonical_collection = []

self.action_table = {}

self.goto_table = {}

    def compute_closure(self, items):
        closure = items[:]
        added = True
        while added:
            added = False
            for item in closure:
                if item.dot_position < len(item.production.body):
                    symbol_after_dot = item.production.body[item.dot_position]
                    if symbol_after_dot in self.grammar.non_terminals:
                        for production in self.grammar.productions:
                            if production.head == symbol_after_dot:
                                new_item = Item(production, 0, item.lookahead)
                                if new_item not in closure:
                                    closure.append(new_item)
                                    added = True
        return closure

def compute_lookahead(self, item):

if item.dot_position == len(item.production.body) - 1:

return item.lookahead

else:

symbol_after_dot = item.production.body[item.dot_position + 1]

first_set = self.grammar.compute_first(symbol_after_dot)

lookahead = set()

for symbol in first_set:

if symbol != 'ε':

lookahead.add(symbol)

if 'ε' in first_set or len(first_set) == 0:

for terminal in item.lookahead:

lookahead.add(terminal)

return lookahead

    def compute_goto(self, items, symbol):
        new_items = []
        for item in items:
            if (item.dot_position < len(item.production.body)
                    and item.production.body[item.dot_position] == symbol):
                new_item = Item(item.production, item.dot_position + 1, item.lookahead)
                new_items.append(new_item)
        return self.compute_closure(new_items)

    def construct_canonical_collection(self):
        start_production = Production(self.start_symbol + "'", [self.start_symbol])
        start_item = Item(start_production, 0, {'$'})
        self.canonical_collection = [self.compute_closure([start_item])]
        added = True
        while added:
            added = False
            for items in self.canonical_collection:
                for symbol in self.grammar.terminals | self.grammar.non_terminals:
                    new_items = self.compute_goto(items, symbol)
                    if new_items and new_items not in self.canonical_collection:
                        self.canonical_collection.append(new_items)
                        added = True

    def merge_states(self):
        # LALR(1): merge states whose LR(0) cores are identical by taking the
        # union of the lookahead sets of the matching items
        merged = []
        for items in self.canonical_collection:
            core = set((item.production, item.dot_position) for item in items)
            for existing in merged:
                if core == set((i.production, i.dot_position) for i in existing):
                    for item in items:
                        for other in existing:
                            if (other.production is item.production
                                    and other.dot_position == item.dot_position):
                                other.lookahead = set(other.lookahead) | set(item.lookahead)
                    break
            else:
                merged.append(items)
        self.canonical_collection = merged

    def construct_parsing_table(self):
        for i, items in enumerate(self.canonical_collection):
            for item in items:
                if (item.dot_position < len(item.production.body)
                        and item.production.body[item.dot_position] in self.grammar.terminals):
                    symbol = item.production.body[item.dot_position]
                    new_state = self.compute_goto(items, symbol)
                    if new_state in self.canonical_collection:
                        self.action_table[(i, symbol)] = \
                            ('S', self.canonical_collection.index(new_state))
                elif item.dot_position == len(item.production.body):
                    if item.production.head == self.start_symbol + "'":
                        self.action_table[(i, '$')] = ('ACC',)
                    else:
                        # LALR(1): reduce only on the item's (merged) lookaheads
                        for lookahead in item.lookahead:
                            self.action_table[(i, lookahead)] = \
                                ('R', self.grammar.productions.index(item.production))
            for symbol in self.grammar.non_terminals:
                new_state = self.compute_goto(items, symbol)
                if new_state in self.canonical_collection:
                    self.goto_table[(i, symbol)] = self.canonical_collection.index(new_state)

    def print_parsing_table(self):
        print("LALR(1) Parsing Table:")
        print("Action Table:")
        for (state, symbol), action in self.action_table.items():
            print(f"  State {state}, Symbol {symbol}: {''.join(str(x) for x in action)}")
        print("Goto Table:")
        for (state, symbol), goto_state in self.goto_table.items():
            print(f"  State {state}, Symbol {symbol}: {goto_state}")

productions = [

Production('E', ['E', '+', 'T']),

Production('E', ['T']),

Production('T', ['T', '*', 'F']),

Production('T', ['F']),

Production('F', ['(', 'E', ')']),

    Production('F', ['id'])
]

start_symbol = 'E'

# assumes the Grammar helper class sketched in Experiment 8
grammar = Grammar(set(['E', 'T', 'F']), set(['+', '*', '(', ')', 'id']), productions)

grammar.compute_follow()
parser = LALRParser(grammar, start_symbol)

parser.construct_canonical_collection()

parser.merge_states()

parser.construct_parsing_table()

parser.print_parsing_table()

OUTPUT:
EXPERIMENT - 11
AIM: Write a program to construct CLR(1) parsing table.

THEORY:

The CLR parser stands for canonical LR parser. It is a more powerful LR parser that makes use of
lookahead symbols. This method uses a large set of items called LR(1) items. The main
difference between LR(0) and LR(1) items is that LR(1) items carry more
information in a state, which rules out useless reduction states. This extra information is
incorporated into the state by the lookahead symbol. The general syntax becomes [A → α.B, a],
where A → α.B is the production and a is a terminal or the right end marker $.
LR(1) items = LR(0) items + lookahead

PROGRAM:

class Production:

def __init__(self, head, body):

self.head = head

self.body = body

def __repr__(self):

return f"{self.head} -> {' '.join(self.body)}"

class Item:

    def __init__(self, production, dot_position, lookahead=None):
        self.production = production
        self.dot_position = dot_position
        self.lookahead = lookahead or set()

    def __eq__(self, other):
        # value equality so that membership tests deduplicate items
        return (self.production is other.production
                and self.dot_position == other.dot_position
                and self.lookahead == other.lookahead)

    def __repr__(self):
        body = self.production.body[:]
        body.insert(self.dot_position, '•')
        return f"{self.production.head} -> {' '.join(body)}, {', '.join(self.lookahead)}"

class CLRParser:

def __init__(self, grammar, start_symbol):

self.grammar = grammar

self.start_symbol = start_symbol

self.canonical_collection = []

self.action_table = {}

self.goto_table = {}

    def compute_closure(self, items):
        closure = items[:]
        added = True
        while added:
            added = False
            for item in closure:
                if item.dot_position < len(item.production.body):
                    symbol_after_dot = item.production.body[item.dot_position]
                    if symbol_after_dot in self.grammar.non_terminals:
                        for production in self.grammar.productions:
                            if production.head == symbol_after_dot:
                                new_item = Item(production, 0, self.compute_lookahead(item))
                                if new_item not in closure:
                                    closure.append(new_item)
                                    added = True
        return closure

def compute_lookahead(self, item):

if item.dot_position == len(item.production.body) - 1:

return item.lookahead

else:

symbol_after_dot = item.production.body[item.dot_position + 1]

first_set = self.grammar.compute_first(symbol_after_dot)

lookahead = set()

for symbol in first_set:

if symbol != 'ε':

lookahead.add(symbol)

if 'ε' in first_set or len(first_set) == 0:

for terminal in item.lookahead:

lookahead.add(terminal)

return lookahead

    def compute_goto(self, items, symbol):
        new_items = []
        for item in items:
            if (item.dot_position < len(item.production.body)
                    and item.production.body[item.dot_position] == symbol):
                new_item = Item(item.production, item.dot_position + 1, item.lookahead)
                new_items.append(new_item)
        return self.compute_closure(new_items)

    def construct_canonical_collection(self):
        start_production = Production(self.start_symbol + "'", [self.start_symbol])
        start_item = Item(start_production, 0, {'$'})
        self.canonical_collection = [self.compute_closure([start_item])]
        added = True
        while added:
            added = False
            for items in self.canonical_collection:
                for symbol in self.grammar.terminals | self.grammar.non_terminals:
                    new_items = self.compute_goto(items, symbol)
                    if new_items and new_items not in self.canonical_collection:
                        self.canonical_collection.append(new_items)
                        added = True

    def construct_parsing_table(self):
        for i, items in enumerate(self.canonical_collection):
            for item in items:
                if (item.dot_position < len(item.production.body)
                        and item.production.body[item.dot_position] in self.grammar.terminals):
                    symbol = item.production.body[item.dot_position]
                    new_state = self.compute_goto(items, symbol)
                    if new_state in self.canonical_collection:
                        self.action_table[(i, symbol)] = \
                            ('S', self.canonical_collection.index(new_state))
                elif item.dot_position == len(item.production.body):
                    if item.production.head == self.start_symbol + "'":
                        self.action_table[(i, '$')] = ('ACC',)
                    else:
                        # CLR(1): reduce only on the item's lookahead symbols
                        for lookahead in item.lookahead:
                            self.action_table[(i, lookahead)] = \
                                ('R', self.grammar.productions.index(item.production))
            for symbol in self.grammar.non_terminals:
                new_state = self.compute_goto(items, symbol)
                if new_state in self.canonical_collection:
                    self.goto_table[(i, symbol)] = self.canonical_collection.index(new_state)

    def print_parsing_table(self):
        print("CLR(1) Parsing Table:")
        print("Action Table:")
        for (state, symbol), action in self.action_table.items():
            print(f"  State {state}, Symbol {symbol}: {''.join(str(x) for x in action)}")
        print("Goto Table:")
        for (state, symbol), goto_state in self.goto_table.items():
            print(f"  State {state}, Symbol {symbol}: {goto_state}")

productions = [

Production('E', ['E', '+', 'T']),

Production('E', ['T']),
Production('T', ['T', '*', 'F']),

Production('T', ['F']),

Production('F', ['(', 'E', ')']),

    Production('F', ['id'])
]

start_symbol = 'E'

# assumes the Grammar helper class sketched in Experiment 8
grammar = Grammar(set(['E', 'T', 'F']), set(['+', '*', '(', ')', 'id']), productions)

grammar.compute_follow()

parser = CLRParser(grammar, start_symbol)

parser.construct_canonical_collection()

parser.construct_parsing_table()

parser.print_parsing_table()

OUTPUT:
EXPERIMENT - 12
AIM: Write a program to construct operator precedence parser.

THEORY:

An operator precedence parser is a bottom-up parser that interprets an operator grammar. This
parser is only used for operator grammars. Ambiguous grammars are not allowed in any parser
except the operator precedence parser. There are two methods for determining what precedence
relations should hold between a pair of terminals:
1. Use the conventional associativity and precedence of the operators.
2. The second method of selecting operator-precedence relations is first to construct an
unambiguous grammar for the language, a grammar that reflects the correct
associativity and precedence in its parse trees.
The program below follows the first method: it uses a precedence table to convert the
expression to postfix form and then evaluates the postfix expression.

PROGRAM:

class OperatorPrecedenceParser:

def __init__(self):

self.precedence = {

'+': 1,

'-': 1,

'*': 2,

'/': 2,

            '^': 3
        }

    def parse_expression(self, expression):

stack = []

output = []

for token in expression:


if token.isalnum():

output.append(token)

elif token == '(':

stack.append(token)

elif token == ')':

while stack and stack[-1] != '(':

output.append(stack.pop())

stack.pop()

            else:
                while (stack and stack[-1] != '('
                       and self.precedence[stack[-1]] >= self.precedence[token]):
                    output.append(stack.pop())
                stack.append(token)

while stack:

output.append(stack.pop())

return output

def evaluate_postfix(self, postfix):

stack = []

for token in postfix:

if token.isalnum():

stack.append(int(token))

else:

operand2 = stack.pop()

operand1 = stack.pop()
if token == '+':

stack.append(operand1 + operand2)

elif token == '-':

stack.append(operand1 - operand2)

elif token == '*':

stack.append(operand1 * operand2)

elif token == '/':

stack.append(operand1 / operand2)

elif token == '^':

stack.append(operand1 ** operand2)

return stack[0]

parser = OperatorPrecedenceParser()

expression = "2 + 3 * 4 - ( 5 ^ 2 )"

postfix = parser.parse_expression(expression)

print("Postfix expression:", postfix)

result = parser.evaluate_postfix(postfix)

print("Result:", result)

OUTPUT:
