
Practical File Compiler Construction Code: CSE304: Submitted by

The document is a practical file for Compiler Construction submitted by Aditya Kaushik, detailing various programming assignments related to compiler theory and implementation. It includes aims, software used, theoretical backgrounds, and code implementations for tasks such as checking string acceptance by production rules, converting infix to postfix expressions, counting tokens, and removing left recursion. Each practical section provides a structured approach to understanding compiler construction concepts through coding exercises.

Uploaded by

Yoddha Gaming

Practical File

Compiler Construction
Code: CSE304

Submitted by
Aditya Kaushik
A2305221161
6CSE-7Y

Submitted to
Dr. Juhi Singh
Assistant Professor

Department of Computer Science and Engineering


Amity School of Engineering and Technology
Amity University Uttar Pradesh, Noida
INDEX

S.No. Aim [Date] Signature

1. Write a program in C to check if given string is accepted or rejected by production rules S → aS | Sb | ab. [08/01/24]
2. WAP to convert infix equation to postfix equation using stack implementation. [15/01/24]
3. WAP in C/C++ to count number of tokens in the given program. [29/01/24]
4. WAP in C/C++ to remove left recursion from the given grammar. [05/02/24]
5. WAP in C/C++ to convert infix expression to prefix expression. [12/02/24]
6. WAP in C/C++ to convert regular expression to Finite Automata. [19/02/24]
7. Write a program in C/C++ to convert NFA to DFA. [26/02/24]
8. Write a program in C/C++ to find first and follow of given CFG. [04/03/24]
9. Write a program in C/C++ to design shift reduce parser for: E -> E + E | E * E | (E) | Id. [11/03/24]
10. Write a program which accepts a regular grammar with no left-recursion, and no null production rules, and then it accepts a string and reports whether the string is accepted by the grammar or not. [18/03/24]
11. Write separate programs for each of the regular expressions mentioned. [25/03/24]
12. Write a program for Recursive Descent Calculator. [25/03/24]
13. Consider the following grammar: S --> ABC; A --> abA | ab; B --> b | BC; C --> c | cC. Following any suitable parsing technique (prefer top-down), design a parser which accepts a string and tells whether the string is accepted by above grammar or not. [25/03/24]
14. Design a parser which accepts a mathematical expression (containing integers only). If the expression is valid, then evaluate the expression else report that the expression is invalid. [25/03/24]
15. Open Ended program: Designing of various types of parser. [25/03/24]
PRACTICAL – 1
Aim – Write a program in C to check if given string is accepted or rejected by production rules.
S → aS | Sb | ab.
Software Used – GCC Compiler, Visual Studio Code
Theory –
Regular Expression – A regular expression is a shorthand notation showing how a regular
language is built from the basic regular languages using union, concatenation, and Kleene
star. The same symbols are used to construct the language and the expression, so every
expression has a language associated with it: for each regular expression E, there is a
regular language L(E).
Grammar –
A grammar in the theory of computation is a finite set of formal rules for generating
syntactically correct sentences.
Formally, a grammar is defined as a 4-tuple −
G = (V, T, P, S)
• G is the grammar, which consists of a set of production rules. It is used to generate the
strings of a language.
• T is the finite set of terminal symbols. They are denoted by lowercase letters.
• V is the finite set of non-terminal symbols. They are denoted by capital letters.
• P is a set of production rules, which are used for replacing non-terminal symbols (on the
left side of a production) with strings of terminals and non-terminals (on the right side
of the production).
• S is the start symbol used to derive the string.
Terminal Symbols - Terminal symbols are the components of the sentences that are
generated using the grammar and are denoted using lowercase letters like a, b, c etc.
Non-Terminal Symbols - Non-terminal symbols take part in the generation of the
sentence but are not components of the sentence. These symbols are also called
Auxiliary Symbols or Variables. They are represented using capital letters like A, B, C,
etc.
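As a worked illustration using the grammar from the aim (S → aS | Sb | ab), the derivation below produces the string aabb. Every derivation must end with S → ab, each use of S → aS adds one leading a, and each use of S → Sb adds one trailing b, so the language consists of one or more a's followed by one or more b's. This is a sketch of the reasoning, not a formal proof.

```latex
S \Rightarrow aS \Rightarrow aSb \Rightarrow aabb
\qquad\qquad
L(G) = \{\, a^{m} b^{n} \mid m \ge 1,\ n \ge 1 \,\}
```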
Lexical Analysis –
Lexical analysis is the first phase of a compiler. It takes the modified source code produced
by the language preprocessor, written in the form of sentences. The lexical analyzer breaks
this text into a series of tokens, discarding any whitespace and comments in the source code.
If the lexical analyzer finds an invalid token, it generates an error. The lexical analyzer
works closely with the syntax analyzer: it reads character streams from the source code,
checks for legal tokens, and passes them to the syntax analyzer on demand.
Example –
a=b+c
tokens:
a – identifier
b – identifier
c – identifier
= – assignment operator
+ – addition operator
Token stream –
<id,1> <=> <id,2> <+> <id,3>

Code –

#include <stdio.h>

int main(void) {
    int state = 0, count = 0;
    char str[100];

    printf("The grammar is S->aS, S->Sb, S->ab\n");
    printf("Enter a string: ");
    if (scanf("%99s", str) != 1)    /* bounded read; no input is treated as rejection */
        str[0] = '\0';

    /* DFA for one or more a's followed by one or more b's:
       0 = start, 1 = reading a's, 2 = reading b's (accepting), 3 = dead state */
    while (str[count] != '\0') {
        switch (state) {
        case 0:
            if (str[count] == 'a')
                state = 1;
            else
                state = 3;
            break;
        case 1:
            if (str[count] == 'a')
                state = 1;
            else if (str[count] == 'b')
                state = 2;
            else
                state = 3;
            break;
        case 2:
            if (str[count] == 'b')
                state = 2;
            else
                state = 3;
            break;
        default:                    /* dead state: stay rejected */
            break;
        }
        count++;
    }

    if (state == 2) {
        printf("The string is accepted\n");
    } else {
        printf("The string is rejected\n");
    }
    return 0;
}
Output –
PRACTICAL– 2
Aim – WAP to convert infix equation to postfix equation using stack implementation.
Software Used – GCC (C++) Compiler, VS Code
Theory –
Let X be an arithmetic expression written in infix notation. This algorithm finds the
equivalent postfix expression Y.
1. Push "(" onto Stack, and add ")" to the end of X.
2. Scan X from left to right and repeat Steps 3 to 6 for each element of X until the Stack is
empty.
3. If an operand is encountered, add it to Y.
4. If a left parenthesis is encountered, push it onto Stack.
5. If an operator is encountered, then:
1. Repeatedly pop from Stack and add to Y each operator (on the top of Stack) which
has the same precedence as or higher precedence than the operator.
2. Add the operator to Stack.
[End of If]
6. If a right parenthesis is encountered, then:
1. Repeatedly pop from Stack and add to Y each operator (on the top of Stack) until a
left parenthesis is encountered.
2. Remove the left parenthesis (do not add it to Y).
[End of If]
7. END.

Code –
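#include <ctype.h>   /* isalnum */
#include <stdio.h>
#include <stdlib.h>  /* exit */
#include <string.h>  /* strcat, strcspn */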
#define MAX_SIZE 100
char stack[MAX_SIZE];
int top = -1;
void push(char item) {
if (top >= MAX_SIZE - 1) {
printf("Stack Overflow\n");
return;
}
stack[++top] = item;
}
char pop() {
if (top == -1) {
printf("Stack Underflow\n");
getchar();
exit(1);
}
return stack[top--];
}
int precedence(char ch) {
switch (ch) {
case '^':
return 3;
case '*':
case '/':
return 2;
case '+':
case '-':
return 1;
default:
return 0;
}
}
int isOperator(char ch) {
return (ch == '+' || ch == '-' || ch == '*' || ch == '/' || ch == '^');
}
void infixToPostfix(char infix[], char postfix[]) {
int i, j;
char item, x;
push('(');
strcat(infix, ")");
i = 0;
j = 0;
item = infix[i];
while (item != '\0') {
if (isalnum(item)) {
postfix[j++] = item;
} else if (item == '(') {
push(item);
} else if (isOperator(item)) {
x = pop();
while (precedence(x) >= precedence(item) && x != '(') {
postfix[j++] = x;
x = pop();
}
push(x);
push(item);
} else if (item == ')') {
x = pop();
while (x != '(') {
postfix[j++] = x;
x = pop();
}
} else {
printf("Invalid Infix Expression\n");
postfix[j] = '\0'; /* terminate the partial output so the caller never prints garbage */
return;
}
i++;
item = infix[i];
}
postfix[j] = '\0';
}
int main() {
char infix[MAX_SIZE], postfix[MAX_SIZE];
printf("Enter the infix expression: ");
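/* gets() is unsafe and was removed in C11; fgets() bounds the read and
   leaves one byte free for the ")" appended in infixToPostfix() */
if (fgets(infix, MAX_SIZE - 1, stdin) == NULL) return 1;
infix[strcspn(infix, "\n")] = '\0';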
infixToPostfix(infix, postfix);
printf("Postfix expression: %s\n", postfix);
return 0;
}

Output –
PRACTICAL– 3
Aim – WAP in C/C++ to count number of tokens in the given program.
Software Used – GCC (C++) Compiler, VS Code
Theory –
Lexical analysis is the first phase of a compiler. It takes the modified source code produced
by the language preprocessor, written in the form of sentences. The lexical analyzer breaks
this text into a series of tokens, discarding any whitespace and comments in the source code.
If the lexical analyzer finds an invalid token, it generates an error. The lexical analyzer
works closely with the syntax analyzer: it reads character streams from the source code,
checks for legal tokens, and passes them to the syntax analyzer on demand.
A lexeme is the sequence of characters in the source that forms one token. There are
predefined rules for every lexeme to be identified as a valid token. These rules are defined
by the grammar by means of a pattern. A pattern explains what a token can be, and these
patterns are defined by means of regular expressions.
In a programming language, keywords, constants, identifiers, strings, numbers, operators and
punctuation symbols can be considered as tokens.
For example, in the C language, the variable declaration line
int value = 100;
Tokens –
int (keyword), value (identifier), = (operator), 100 (constant) and ; (symbol).

Code –
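#include <cctype>
#include <iostream>
#include <string>
using namespace std;

int main() {
    string str;
    int count = 0;
    cout << "Enter Sequence: ";
    getline(cin, str);              // read the whole line, spaces included
    size_t i = 0;
    while (i < str.size()) {
        unsigned char c = str[i];
        if (isspace(c)) {
            i++;                    // whitespace only separates tokens
        } else if (isalnum(c) || c == '_') {
            // keyword, identifier or number: consume the whole run as one token
            while (i < str.size() &&
                   (isalnum((unsigned char)str[i]) || str[i] == '_'))
                i++;
            count++;
        } else {
            i++;                    // each operator or punctuation character is one token
            count++;
        }
    }
    cout << "Number of Tokens: " << count << endl;
    return 0;
}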
Output –
PRACTICAL – 4
Aim – WAP in C/C++ to remove left recursion from the given grammar.
Software Used – GCC (C++) Compiler, VS Code
Theory –
Left recursion is a common issue in context-free grammars: a top-down parser (for example a
recursive descent or LL parser) can recurse forever on a left-recursive rule without consuming
any input. It occurs when a non-terminal derives, directly or indirectly, a string that begins
with that same non-terminal. To eliminate left recursion, the grammar is rewritten in a way
that preserves its original language but removes the left recursion. This typically involves
introducing a new non-terminal and turning the left-recursive productions into
right-recursive ones.
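The standard rewrite for immediate left recursion, followed by a worked instance on the familiar expression rule E → E + T | T (chosen here only as an illustration), can be written as below. The new non-terminal A' (E' in the example) is introduced by the transformation, and ε denotes the empty string.

```latex
A \to A\alpha \mid \beta
\quad\Longrightarrow\quad
A \to \beta A', \qquad A' \to \alpha A' \mid \varepsilon

E \to E + T \mid T
\quad\Longrightarrow\quad
E \to T E', \qquad E' \to + T E' \mid \varepsilon
```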

Code –

#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#define SIZE 20
int main(){
char pro[SIZE], alpha[SIZE], beta[SIZE];
int nont_terminal,i,j, index=3;
printf("Enter the Production (e.g. E->E+T|T): ");
scanf("%19s", pro); /* bounded read: pro holds at most SIZE-1 characters */
nont_terminal=pro[0];
if(nont_terminal==pro[index]) {
for(i=++index,j=0;pro[i]!='|';i++,j++){
alpha[j]=pro[i];
if(pro[i+1]==0){
printf("This Grammar CAN'T BE REDUCED.\n");
exit(0); //Exit the Program
}
}
alpha[j]='\0';
if(pro[++i]!=0) {
for(j=i,i=0;pro[j]!='\0';i++,j++){
beta[i]=pro[j];
}
beta[i]='\0';
printf("\nGrammar Without Left Recursion: \n\n");
printf(" %c->%s%c'\n", nont_terminal,beta,nont_terminal);
printf(" %c'->%s%c'|#\n", nont_terminal,alpha,nont_terminal);
}
else
printf("This Grammar CAN'T be REDUCED.\n");
}
else
printf("\n This Grammar is not LEFT RECURSIVE.\n");
}

Output –
PRACTICAL– 5
Aim – WAP in C/C++ to convert infix expression to prefix expression.
Software Used – GCC (C++) Compiler, VS Code
Theory –
Infix expressions are expressions where operators are placed between operands, such as a + b
* c. Prefix expressions (also known as Polish notation) are expressions where operators precede
their operands, such as + a * b c. Converting an infix expression to a prefix expression involves
rearranging the expression so that the operators precede their operands. One way to achieve
this is to reverse the infix expression (swapping the parentheses), run a stack-based
infix-to-postfix pass in the style of the shunting-yard algorithm, and reverse the result.

Code –

#include <algorithm>
#include <cctype>
#include <iostream>
#include <stack>
#include <string>
using namespace std;

bool isOperator(char c) {
    return c == '+' || c == '-' || c == '*' || c == '/' || c == '^';
}

int precedence(char c) {
    if (c == '^') return 3;
    if (c == '*' || c == '/') return 2;
    if (c == '+' || c == '-') return 1;
    return -1;
}

string InfixToPrefix(string infix) {
    stack<char> s;
    string prefix;

    // Step 1: reverse the expression and swap the parentheses
    reverse(infix.begin(), infix.end());
    for (char &c : infix) {
        if (c == '(') c = ')';
        else if (c == ')') c = '(';
    }

    // Step 2: postfix-style pass over the reversed expression
    for (char c : infix) {
        if (isalpha((unsigned char)c)) {
            prefix += c;                        // operands go straight to the output
        } else if (c == '(') {
            s.push(c);
        } else if (c == ')') {
            while (!s.empty() && s.top() != '(') {
                prefix += s.top();
                s.pop();
            }
            if (!s.empty()) s.pop();            // discard the matching '('
        } else if (isOperator(c)) {
            // Pop operators of strictly higher precedence; for the
            // right-associative '^', also pop on equal precedence.
            while (!s.empty() && s.top() != '(' &&
                   (precedence(c) < precedence(s.top()) ||
                    (precedence(c) == precedence(s.top()) && c == '^'))) {
                prefix += s.top();
                s.pop();
            }
            s.push(c);
        }
    }
    while (!s.empty()) {
        prefix += s.top();
        s.pop();
    }

    // Step 3: reverse the result to obtain the prefix form
    reverse(prefix.begin(), prefix.end());
    return prefix;
}

int main() {
    string infix, prefix;
    cout << "Enter an Infix Expression:" << endl;
    cin >> infix;
    cout << "INFIX EXPRESSION: " << infix << endl;
    prefix = InfixToPrefix(infix);
    cout << endl << "PREFIX EXPRESSION: " << prefix;
    return 0;
}

Output-
PRACTICAL– 6
Aim – WAP in C/C++ to convert regular expression to Finite Automata.
Software Used – GCC (C++) Compiler, VS Code
Theory –
A Finite Automaton (FA) is a computational model used to recognize patterns in strings or
sequences. It consists of a finite set of states, a finite set of input symbols (alphabet), a transition
function that describes how the automaton moves from one state to another based on input
symbols, and a set of accepting states that define when the automaton accepts or recognizes an
input string.
Converting a regular expression to a Finite Automaton (FA) involves transforming the abstract
representation of a regular language into a concrete, computational model. Regular expressions
describe patterns of strings in a language, while Finite Automata are computational models
capable of recognizing or generating strings in a language.

Code –

#include <stdio.h>
#include <string.h>

#define MAX_STATES 64   /* headroom so a 19-character input cannot index past the table */

int main(void)
{
    /* q[s][0] = move on a, q[s][1] = move on b, q[s][2] = epsilon move(s).
       Two epsilon targets x and y are packed as x*10 + y, so the printed
       table is only meaningful while state numbers stay below 10. */
    char reg[20] = "";
    int q[MAX_STATES][3], i = 0, j = 1, len, a, b;

    for (a = 0; a < MAX_STATES; a++)
        for (b = 0; b < 3; b++)
            q[a][b] = 0;
    scanf("%19s", reg);
    printf("Given regular expression: %s\n", reg);
    len = strlen(reg);

    while (i < len) {
        /* single symbols not followed by '|' or '*' */
        if (reg[i] == 'a' && reg[i+1] != '|' && reg[i+1] != '*') { q[j][0] = j+1; j++; }
        if (reg[i] == 'b' && reg[i+1] != '|' && reg[i+1] != '*') { q[j][1] = j+1; j++; }
        if (reg[i] == 'e' && reg[i+1] != '|' && reg[i+1] != '*') { q[j][2] = j+1; j++; }
        /* union a|b */
        if (reg[i] == 'a' && reg[i+1] == '|' && reg[i+2] == 'b') {
            q[j][2] = ((j+1)*10) + (j+3); j++;
            q[j][0] = j+1; j++;
            q[j][2] = j+3; j++;
            q[j][1] = j+1; j++;
            q[j][2] = j+1; j++;
            i = i + 2;
        }
        /* union b|a */
        if (reg[i] == 'b' && reg[i+1] == '|' && reg[i+2] == 'a') {
            q[j][2] = ((j+1)*10) + (j+3); j++;
            q[j][1] = j+1; j++;
            q[j][2] = j+3; j++;
            q[j][0] = j+1; j++;
            q[j][2] = j+1; j++;
            i = i + 2;
        }
        /* closure a* */
        if (reg[i] == 'a' && reg[i+1] == '*') {
            q[j][2] = ((j+1)*10) + (j+3); j++;
            q[j][0] = j+1; j++;
            q[j][2] = ((j+1)*10) + (j-1); j++;
        }
        /* closure b* */
        if (reg[i] == 'b' && reg[i+1] == '*') {
            q[j][2] = ((j+1)*10) + (j+3); j++;
            q[j][1] = j+1; j++;
            q[j][2] = ((j+1)*10) + (j-1); j++;
        }
        /* closure of a parenthesised group */
        if (reg[i] == ')' && reg[i+1] == '*') {
            q[0][2] = ((j+1)*10) + 1;
            q[j][2] = ((j+1)*10) + 1;
            j++;
        }
        i++;
    }

    printf("\n\tTransition Table \n");
    printf("_____________________________________\n");
    printf("Current State |\tInput |\tNext State");
    printf("\n_____________________________________\n");
    for (i = 0; i <= j; i++) {
        if (q[i][0] != 0) printf("\n q[%d]\t | a | q[%d]", i, q[i][0]);
        if (q[i][1] != 0) printf("\n q[%d]\t | b | q[%d]", i, q[i][1]);
        if (q[i][2] != 0) {
            if (q[i][2] < 10) printf("\n q[%d]\t | e | q[%d]", i, q[i][2]);
            else printf("\n q[%d]\t | e | q[%d] , q[%d]", i, q[i][2]/10, q[i][2]%10);
        }
    }
    printf("\n_____________________________________\n");
    return 0;
}
Output-
PRACTICAL– 7
Aim – Write a program in C/C++ to convert NFA to DFA

Software Used – GCC (C++) Compiler, VS Code


Theory –

An NFA (Non-deterministic Finite Automaton) is a finite automaton in which, for some state and
input symbol, the machine may move to more than one state, i.e. some of the moves cannot be
uniquely determined by the present state and the present input symbol. A DFA (Deterministic
Finite Automaton) is a finite automaton in which, for every state and input symbol, the machine
moves to exactly one state, i.e. all the moves of the machine are uniquely determined by the
present state and the present input symbol.
Non-determinism does not add any more power to the machine. NFAs and DFAs accept the same set
of languages, namely the regular languages. An NFA can be converted to an equivalent DFA using
the powerset (subset) construction, in which each DFA state stands for a set of NFA states.

Code –
#include <stdio.h>
int main()
{
int nfa[5][3] = {0}; /* columns 1 and 2 hold the a- and b-moves; row 0 and column 0 are unused */
nfa[1][1]=12;
nfa[1][2]=1;
nfa[2][1]=0;
nfa[2][2]=3;
nfa[3][1]=0;
nfa[3][2]=4;
nfa[4][1]=0;
nfa[4][2]=0;
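int dfa[10][3];      /* column index 2 is used below, so three columns are needed */
int dstate[10] = {0}; /* zero-filled so the while loop stops at the first unused slot */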
int i=1,n,j,k,flag=0,m,q,r;
dstate[i++]=1;
n=i;

dfa[1][1]=nfa[1][1];
dfa[1][2]=nfa[1][2];
printf("\nf(%d,a)=%d",dstate[1],dfa[1][1]);
printf("\nf(%d,b)=%d",dstate[1],dfa[1][2]);

for(j=1;j<n;j++)
{
if(dfa[1][1]!=dstate[j])
flag++;
}
if(flag==n-1)
{
dstate[i++]=dfa[1][1];
n++;
}
flag=0;
for(j=1;j<n;j++)
{
if(dfa[1][2]!=dstate[j])
flag++;
}
if(flag==n-1)
{
dstate[i++]=dfa[1][2];
n++;
}
k=2;
while(dstate[k]!=0)
{
m=dstate[k];
if(m>10)
{
q=m/10;
r=m%10;
}
else
{
q=m; /* single NFA state: row 0 of nfa is all zeros, so r=0 contributes nothing */
r=0;
}
if(nfa[r][1]!=0)
dfa[k][1]=nfa[q][1]*10+nfa[r][1];
else
dfa[k][1]=nfa[q][1];
if(nfa[r][2]!=0)
dfa[k][2]=nfa[q][2]*10+nfa[r][2];
else
dfa[k][2]=nfa[q][2];

printf("\nf(%d,a)=%d",dstate[k],dfa[k][1]);
printf("\nf(%d,b)=%d",dstate[k],dfa[k][2]);

flag=0;
for(j=1;j<n;j++)
{
if(dfa[k][1]!=dstate[j])
flag++;
}
if(flag==n-1)
{
dstate[i++]=dfa[k][1];
n++;
}
flag=0;
for(j=1;j<n;j++)
{
if(dfa[k][2]!=dstate[j])
flag++;
}
if(flag==n-1)
{
dstate[i++]=dfa[k][2];
n++;
}
k++;
}
return 0;
}

Output-
PRACTICAL-8
AIM: Write a program to calculate First and Follow of a grammar.

SOFTWARE USED: VS Code

THEORY:

FIRST and FOLLOW are two functions associated with a grammar that help us fill in the entries
of a predictive parsing table (M-table).
FIRST(α) − the set of terminals that begin the strings derived from α.
A symbol c is in FIRST(α) if and only if α ⇒* cβ for some sequence β of grammar symbols.
A terminal symbol a is in FOLLOW(N) if and only if there is a derivation from the start
symbol S of the grammar such that S ⇒* αNaβ, where α and β are (possibly empty) sequences
of grammar symbols. In other words, a terminal a is in FOLLOW(N) if a can follow N at
some point in a derivation.
Benefits of FIRST( ) and FOLLOW( )
• They can be used to prove the LL(k) characteristic of a grammar.
• They aid in the construction of predictive parsing tables.
• They provide selection information for recursive descent parsers.
FOLLOW(A) is defined as the collection of terminal symbols that can occur directly to the
right of A in some sentential form.
FOLLOW(A) = {a | S ⇒* αAaβ where α, β can be any strings}
Rules to find FOLLOW
• If S is the start symbol, then $ is in FOLLOW(S).
• If there is a production of the form A → αBβ, β ≠ ε:
(a) If FIRST(β) does not contain ε, then everything in FIRST(β) is in FOLLOW(B).
Or
(b) If FIRST(β) contains ε (i.e., β ⇒* ε), then everything in
(FIRST(β) − {ε}) ∪ FOLLOW(A) is in FOLLOW(B),
∵ when β derives ε, any terminal that follows A will also follow B.

• If there is a production of the form A → αB, then everything in FOLLOW(A) is in FOLLOW(B).
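A small worked example may help. The grammar S → AB, A → a | ε, B → b is an assumption chosen for illustration and does not come from the practical itself. Because A can derive ε, FIRST(S) also picks up FIRST(B), and FOLLOW(A) is taken from FIRST(B) since B immediately follows A in S → AB.

```latex
\begin{aligned}
\mathrm{FIRST}(A) &= \{a, \varepsilon\} & \mathrm{FOLLOW}(S) &= \{\$\} \\
\mathrm{FIRST}(B) &= \{b\}              & \mathrm{FOLLOW}(A) &= \mathrm{FIRST}(B) = \{b\} \\
\mathrm{FIRST}(S) &= \{a, b\}           & \mathrm{FOLLOW}(B) &= \mathrm{FOLLOW}(S) = \{\$\}
\end{aligned}
```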


Code:
#include<iostream>
#include<string.h>
#define max 20

using namespace std;

char prod[max][10];
char ter[10],nt[10];
char first[10][10],follow[10][10];
int eps[10];
int count_var=0;

int findpos(char ch) {


int n;
for(n=0;nt[n]!='\0';n++)
if(nt[n]==ch) break;
if(nt[n]=='\0') return 1;
return n;
}

int IsCap(char c) {
if(c >= 'A' && c<= 'Z')
return 1;
return 0;
}

void add(char *arr,char c) {


int i,flag=0;
for(i=0;arr[i]!='\0';i++) {
if(arr[i] == c) {
flag=1;
break;
}
}
if(flag!=1) arr[strlen(arr)] = c;
}

void addarr(char *s1,char *s2) {


int i,j,flag=99;
for(i=0;s2[i]!='\0';i++) {
flag=0;
for(j=0;;j++) {
if(s2[i]==s1[j]) {
flag=1;
break;
}
if(j==strlen(s1) && flag!=1) {
s1[strlen(s1)] = s2[i];
break;
}
}
}
}

void addprod(char *s) {


int i;
prod[count_var][0] = s[0];
for(i=3;s[i]!='\0';i++) {
if(!IsCap(s[i])) add(ter,s[i]);
prod[count_var][i-2] = s[i];
}
prod[count_var][i-2] = '\0';
add(nt,s[0]);
count_var++;
}

void findfirst() {
int i,j,n,k,e,n1;
for(i=0;i<count_var;i++) {
for(j=0;j<count_var;j++) {
n = findpos(prod[j][0]);
if(prod[j][1] == (char)238) eps[n] = 1;
else {
for(k=1,e=1;prod[j][k]!='\0' && e==1;k++) {
if(!IsCap(prod[j][k])) {
e=0;
add(first[n],prod[j][k]);
}
else {
n1 = findpos(prod[j][k]);
addarr(first[n],first[n1]);
if(eps[n1]==0)
e=0;
}
}
if(e==1) eps[n]=1;
}
}
}
}

void findfollow() {
int i,j,k,n,e,n1;
n = findpos(prod[0][0]);
add(follow[n],'#');
for(i=0;i<count_var;i++) {
for(j=0;j<count_var;j++) {
k = strlen(prod[j])-1;
for(;k>0;k--) {
if(IsCap(prod[j][k])) {
n=findpos(prod[j][k]);
if(prod[j][k+1] == '\0')
{
n1 = findpos(prod[j][0]);
addarr(follow[n],follow[n1]);
}
if(IsCap(prod[j][k+1]))
{
n1 = findpos(prod[j][k+1]);
addarr(follow[n],first[n1]);
if(eps[n1]==1)
{
n1=findpos(prod[j][0]);
addarr(follow[n],follow[n1]);
}
}
else if(prod[j][k+1] != '\0')
add(follow[n],prod[j][k+1]);
}
}
}
}
}

int main() {
char s[max],i;
cout<<"Enter the productions\n";
cin>>s;
while(strcmp("end",s)) {
addprod(s);
cin>>s;
}
findfirst();
findfollow();
for(i=0;i<strlen(nt);i++) {
cout<<nt[i]<<"\t";
cout<<first[i];
if(eps[i]==1) cout<<((char)238)<<"\t";
else cout<<"\t";
cout<<follow[i]<<"\n";
}
return 0;
}

Output:
PRACTICAL-9
AIM: Write the Program to design Shift Reduce Parser for E -> E+E | E*E | (E) | id .

SOFTWARE USED: VS Code

THEORY:
A shift-reduce parser builds the parse tree bottom-up, i.e. the parse tree is constructed from
the leaves (bottom) to the root (up). A more general form of the shift-reduce parser is the
LR parser.
This parser requires two data structures:
• An input buffer for storing the input string.
• A stack for storing the grammar symbols that have been shifted or reduced so far.
Basic Operations –
• Shift: This involves moving the next symbol from the input buffer onto the stack.
• Reduce: If a handle appears on top of the stack, it is reduced using the appropriate
production rule, i.e. the RHS of the production rule is popped off the stack and the LHS of
the production rule is pushed onto the stack.
• Accept: If only the start symbol is present on the stack and the input buffer is empty, the
parsing action is called accept. Reaching the accept action means parsing has completed
successfully.
• Error: This is the situation in which the parser can perform neither a shift action, nor a
reduce action, nor an accept action.
Code:

#include<stdio.h>
#include<string.h>
int k=0,z=0,i=0,j=0,c=0;
char a[16],ac[20],stk[15],act[10];
void check();
int main()
{

puts("GRAMMAR is E->E+E \n E->E*E \n E->(E) \n E->id");


puts("enter input string ");
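if (fgets(a, sizeof a, stdin) == NULL) return 1; /* gets() is unsafe and was removed in C11 */
a[strcspn(a, "\n")] = '\0';                      /* drop the trailing newline */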
c=strlen(a);
strcpy(act,"SHIFT->");
puts("stack \t input \t action");
for(k=0,i=0; j<c; k++,i++,j++)
{
if(a[j]=='i' && a[j+1]=='d')
{
stk[i]=a[j];
stk[i+1]=a[j+1];
stk[i+2]='\0';
a[j]=' ';
a[j+1]=' ';
printf("\n$%s\t%s$\t%sid",stk,a,act);
check();
}
else
{
stk[i]=a[j];
stk[i+1]='\0';
a[j]=' ';
printf("\n$%s\t%s$\t%ssymbols",stk,a,act);
check();
}
}

}
void check()
{
strcpy(ac,"REDUCE TO E");
for(z=0; z<c; z++)
if(stk[z]=='i' && stk[z+1]=='d')
{
stk[z]='E';
stk[z+1]='\0';
printf("\n$%s\t%s$\t%s",stk,a,ac);
j++;
}
for(z=0; z<c; z++)
if(stk[z]=='E' && stk[z+1]=='+' && stk[z+2]=='E')
{
stk[z]='E';
stk[z+1]='\0';
stk[z+2]='\0';
printf("\n$%s\t%s$\t%s",stk,a,ac);
i=i-2;
}
for(z=0; z<c; z++)
if(stk[z]=='E' && stk[z+1]=='*' && stk[z+2]=='E')
{
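stk[z]='E';
stk[z+1]='\0';
stk[z+2]='\0'; /* clear both popped symbols, as in the E+E case */
printf("\n$%s\t%s$\t%s",stk,a,ac);
i=i-2;
}
for(z=0; z<c; z++)
if(stk[z]=='(' && stk[z+1]=='E' && stk[z+2]==')')
{
stk[z]='E';
stk[z+1]='\0';
stk[z+2]='\0'; /* clear both popped symbols, as in the E+E case */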
printf("\n$%s\t%s$\t%s",stk,a,ac);
i=i-2;
}
}
Output:
PRACTICAL-10
AIM: Write a program which accepts a regular grammar with no left-recursion, and no null-production
rules, and then it accepts a string and reports whether the string is accepted by the grammar or not.

SOFTWARE USED: VS Code

THEORY: The program checks if a string is accepted by a specific regular grammar, which is defined
without left recursion and null-production rules. It models the grammar as a collection of production
rules, each associating a non-terminal symbol with a production string composed of terminal and/or
non-terminal symbols. The main function iterates over the input string and recursively applies these
rules to see if the string can be generated from the grammar's start symbol. This approach relies on the
deterministic nature of the provided grammar to either accept or reject the input string based on whether
it matches the patterns defined by the grammar's rules. The simplicity of the grammar's structure—
specifically the absence of left recursion and null productions—facilitates a straightforward
implementation of this checking process.
Code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_RULES 100


#define MAX_RULE_LEN 100

typedef struct {
char nonTerminal;
char production[MAX_RULE_LEN];
} GrammarRule;

int numRules = 0;
GrammarRule rules[MAX_RULES];

// Function to add a rule to the grammar


void addRule(char nonTerminal, const char* production) {
rules[numRules].nonTerminal = nonTerminal;
strcpy(rules[numRules].production, production);
numRules++;
}

// Function to check if a string is accepted by the grammar


int isAccepted(const char* str, char currentSymbol) {
if (*str == '\0') {
return 0; // Empty string is not accepted
}

for (int i = 0; i < numRules; i++) {


if (rules[i].nonTerminal == currentSymbol) {
const char* production = rules[i].production;

// Every production must begin with the current input character
if (production[0] != *str) {
continue;
}
if (production[1] == '\0') {
// Terminal-only production A -> a: accept only at the end of the input
if (*(str + 1) == '\0') {
return 1; // String accepted
}
} else if (isAccepted(str + 1, production[1])) { // A -> aB: recurse on B
return 1;
}
}
}

return 0;
}

// Main function to demonstrate usage


int main() {
// Example grammar
addRule('S', "aA");
addRule('A', "bB");
addRule('B', "c");
char input[100];
printf("Enter a string: ");
scanf("%99s", input);

if (isAccepted(input, 'S')) {
printf("String is accepted by the grammar.\n");
} else {
printf("String is not accepted by the grammar.\n");
}

return 0;
}
Output:
PRACTICAL-11
AIM: Consider the following regular expressions:
a) (0 + 1)* + 0*1*
b) (ab*c + (def)+ + a*d+e)+
c) ((a + b)*(c + d)*)+ + ab*c*d

Write separate programs for each of the regular expressions mentioned above.

SOFTWARE USED: VS Code

THEORY:
Regular Expression
a) (0 + 1)* + 0*1*
This regular expression matches any string of 0s and 1s, including the empty string. It combines two
patterns: any sequence of 0s and 1s, (0 + 1)*, and any number of 0s followed by any number of 1s, 0*1*.
Because every string matched by 0*1* is also matched by (0 + 1)*, the whole expression is equivalent to
(0 + 1)*.
b) (ab*c + (def)+ + a*d+e)+
This expression is more complex. It matches strings made of one or more repetitions of any of three
segments: ab*c; one or more def sequences; or any number of as, followed by one or more ds and
ending in e.
Implementing a matcher for this without using a regex library requires manually parsing and
matching each segment of the pattern. The alternatives overlap (both ab*c and a*d+e can start with a,
and both def and a*d+e can start with de), so the matcher must either look ahead or be prepared to
back up and try another alternative. State machines or recursion handle this naturally.
c) ((a + b)*(c + d)*)+ + ab*c*d
This regular expression matches strings that are one or more combinations of any number of as and bs
followed by any number of cs and ds, or strings that match ab*c*d.
Every string over a, b, c and d, including the empty string, already matches the first alternative, and
ab*c*d is a special case of it, so the whole expression is equivalent to (a + b + c + d)*. A matcher
therefore only has to check that the entire input consists of these four characters.

Code:

a)(0 + 1)* + 0*1*

#include <stdio.h>
#include <string.h>

int match_a(const char *str) {


// Since everything matches, we just verify characters are valid
while (*str) {
if (*str != '0' && *str != '1') return 0; // False on invalid character
str++;
}
return 1; // True if all characters are 0 or 1
}

int main() {
char input[1024];
printf("Enter string for regex a): ");
scanf("%1023s", input);
if (match_a(input))
printf("Matches a)\n");
else
printf("Does not match a)\n");
return 0;
}
Output:

b) (ab*c + (def)+ + a*d+e)+


#include <stdio.h>
#include <string.h>

// Each segment function returns a pointer just past the matched
// segment, or NULL if the segment does not match at str.

// Segment "ab*c"
const char *match_ab_star_c(const char *str) {
    if (*str != 'a') return NULL;       // Must start with 'a'
    str++;
    while (*str == 'b') str++;          // Skip over 'b's
    if (*str != 'c') return NULL;       // Must end with 'c'
    return str + 1;
}

// Segment "def" (the outer + already allows it to repeat)
const char *match_def(const char *str) {
    if (strncmp(str, "def", 3) != 0) return NULL;
    return str + 3;
}

// Segment "a*d+e"
const char *match_a_star_d_plus_e(const char *str) {
    while (*str == 'a') str++;          // Any number of 'a's
    if (*str != 'd') return NULL;       // At least one 'd'
    while (*str == 'd') str++;
    if (*str != 'e') return NULL;       // Must end with 'e'
    return str + 1;
}

// Matcher for "(ab*c + (def)+ + a*d+e)+": try every alternative at the
// current position and recurse on the remainder of the string.
int match_pattern(const char *str) {
    const char *(*segments[])(const char *) = {
        match_ab_star_c, match_def, match_a_star_d_plus_e
    };
    for (int i = 0; i < 3; ++i) {
        const char *rest = segments[i](str);
        if (rest != NULL && (*rest == '\0' || match_pattern(rest)))
            return 1;                   // Whole input consumed by segments
    }
    return 0;
}

int main() {
    char input[1024];
    printf("Enter a string: ");
    if (scanf("%1023s", input) != 1) return 1;
    if (match_pattern(input))
        printf("The string matches the pattern.\n");
    else
        printf("The string does NOT match the pattern.\n");
    return 0;
}

Output:

c)((a + b)*(c + d)*)+ + ab*c*d

#include <stdio.h>
#include <string.h>
#include <stdbool.h>

bool match_ab_star_cd(const char **str) {


if (**str != 'a') return false;
(*str)++;
while (**str == 'b') (*str)++;
if (**str != 'c') return false;
(*str)++;
while (**str == 'd') (*str)++;
return true;
}

bool match_complex_pattern(const char *str) {
    while (*str != '\0') {
        const char *temp = str;
        // Match (a + b)*(c + d)* part
        while (*temp == 'a' || *temp == 'b') temp++;
        while (*temp == 'c' || *temp == 'd') temp++;
        if (temp != str) { // If we advanced, there was a match
            str = temp;
            continue;
        }

        // Match ab*c*d
        if (match_ab_star_cd(&str)) {
            continue;
        }

        // If no part of the pattern matches, stop
        break;
    }

    // Accept only if the whole input was consumed. The empty string is
    // accepted too, because (a + b)*(c + d)* can match nothing.
    return *str == '\0';
}

int main() {
char input[1024];
printf("Enter a string: ");
scanf("%1023s", input);

if (match_complex_pattern(input))
printf("The string matches the pattern.\n");
else
printf("The string does NOT match the pattern.\n");

return 0;
}
Output:
PRACTICAL-12
AIM: Write a program for Recursive Descent Parser

SOFTWARE USED: VS Code

THEORY:
A recursive descent parser is a top-down parser built from a set of mutually recursive procedures, one
for each non-terminal in the grammar. In its predictive form it needs no backtracking: the first symbol
of the right-hand side of each production uniquely determines which alternative to choose. Each
procedure reads the sequence of input characters that its non-terminal can produce and, in a full
parser, returns a pointer to the root of the parse tree for that non-terminal. The structure of the
procedure follows directly from the productions for that non-terminal. Such procedures are simple to
write and reasonably efficient in any language that implements procedure calls efficiently. A global
variable lookahead holds the current input token, and a procedure match(ExpectedToken) recognizes
the next token and advances the input pointer so that lookahead points to the next token to be parsed.
match() is effectively a call to the lexical analyzer to get the next token.

Code:

#include <stdio.h>
#include <string.h>

#define SUCCESS 1
#define FAILED 0

int E(), Edash(), T(), Tdash(), F();

const char *cursor;


char string[64];

int main()
{
puts("Enter the string");
scanf("%63s", string); // Limit input to the size of the buffer
//i+(i+i)*i
cursor = string;
puts("");
puts("Input Action");
puts("--------------------------------");

if (E() && *cursor == '\0') {


puts("--------------------------------");
puts("String is successfully parsed");
return 0;
} else {
puts("--------------------------------");
puts("Error in parsing String");
return 1;
}
}

int E()
{
printf("%-16s E -> T E'\n", cursor);
if (T()) {
if (Edash())
return SUCCESS;
else
return FAILED;
} else
return FAILED;
}

int Edash()
{
if (*cursor == '+') {
printf("%-16s E' -> + T E'\n", cursor);
cursor++;
if (T()) {
if (Edash())
return SUCCESS;
else
return FAILED;
} else
return FAILED;
} else {
printf("%-16s E' -> $\n", cursor);
return SUCCESS;
}
}

int T()
{
printf("%-16s T -> F T'\n", cursor);
if (F()) {
if (Tdash())
return SUCCESS;
else
return FAILED;
} else
return FAILED;
}

int Tdash()
{
if (*cursor == '*') {
printf("%-16s T' -> * F T'\n", cursor);
cursor++;
if (F()) {
if (Tdash())
return SUCCESS;
else
return FAILED;
} else
return FAILED;
} else {
printf("%-16s T' -> $\n", cursor);
return SUCCESS;
}
}

int F()

{
if (*cursor == '(') {
printf("%-16s F -> ( E )\n", cursor);
cursor++;
if (E()) {

if (*cursor == ')') {
cursor++;
return SUCCESS;
} else
return FAILED;
} else
return FAILED;
} else if (*cursor == 'i') {
printf("%-16s F -> i\n", cursor);
cursor++;
return SUCCESS;
} else
return FAILED;
}

Output:
PRACTICAL-13
AIM- Consider the following grammar:
S --> ABC
A --> abA | ab
B --> b | BC
C --> c | cC
Following any suitable parsing technique (prefer top-down), design a parser which accepts a string
and tells whether the string is accepted by above grammar or not.

SOFTWARE USED: VS Code

THEORY: A parser for the given grammar can be designed top-down using recursive descent parsing,
which builds the parse tree from the start symbol down to the leaves (terminals). Each non-terminal (S,
A, B, C) is represented by a function, and the functions call each other according to the production
rules. A matches "ab", optionally followed by further "ab" pairs. B matches "b"; its second alternative
B --> BC is left-recursive, so it cannot be coded as a direct recursive call (the function for B would call
itself without consuming any input) and has to be rewritten first, for example as B --> bB',
B' --> CB' | ε. C matches one or more "c"s. If the functions succeed in order and the whole input is
consumed, the string is accepted by the grammar.
1. Start with the S non-terminal, which leads to ABC. This means our main function will start
parsing by calling the functions to parse A, B, and C in order.
2. For A, B, and C, implement recursive functions that try to match the rules defined in the
grammar. Since A can lead to abA or ab, the function for A should look for the "ab" prefix and
then decide whether to call itself again (for abA) or return success (for ab). Similarly, implement
B and C with their respective rules.
3. End condition and input management: We will be reading the string character by character, so
we need a global variable or a pointer that keeps track of our current position in the string. If
we reach the end of the string and successfully matched all parts, the string is accepted by the
grammar.

Code:
#include <stdio.h>
#include <string.h>

const char* str; // Input string


int pos = 0; // Current position in the string

// Function prototypes
int parseS();
int parseA();
int parseB();
int parseC();

int parseS() {
if (parseA() && parseB() && parseC()) return 1;
else return 0;
}

int parseA() {
if (str[pos] == 'a' && str[pos+1] == 'b') {
pos += 2; // Matched "ab"
if (str[pos] == 'a' && str[pos+1] == 'b') // Check if another "ab" follows
return parseA(); // Recurse for abA
return 1; // Success for "ab"
}
return 0; // No match
}

int parseB() {
    // B --> b | BC is left-recursive: B derives 'b' followed by zero or
    // more C's. The input contains only terminals, so there is never a
    // literal 'B' or 'C' to look for. Any 'c's that B could absorb can be
    // left to the C in S --> ABC, so matching a single 'b' is sufficient.
    if (str[pos] == 'b') {
        pos++; // Matched "b"
        return 1;
    }
    return 0; // No match
}

int parseC() {
if (str[pos] == 'c') {
pos++; // Matched "c"
while (str[pos] == 'c') pos++; // Consume all 'c's for cC
return 1; // Success
}
return 0; // No match
}
int main() {
printf("Enter a string to parse: ");
char input[100];
scanf("%99s", input);
str = input;

if (parseS() && str[pos] == '\0') // Check if whole string is parsed
printf("String is accepted by the grammar.\n");
else
printf("String is not accepted by the grammar.\n");

return 0;
}

Output:
PRACTICAL-14
AIM- Design a parser which accepts a mathematical expression (containing integers only). If the
expression is valid, then evaluate the expression else report that the expression is invalid.

SOFTWARE USED: VS Code

Theory- Designing a parser in C to evaluate mathematical expressions containing only integers involves
handling basic arithmetic operations (addition, subtraction, multiplication, and division) and managing
operator precedence and parentheses. A common approach for such a task is to use the Shunting Yard
algorithm for parsing the expression and converting it into Reverse Polish Notation (RPN), which can
then be easily evaluated. However, for simplicity, let's focus on a basic version that supports addition
and subtraction without parentheses. This approach can be extended to include more operations and
parentheses.
Steps:
1. Tokenize the Input: Break down the expression into numbers (operands) and operators.
2. Check for Validity: Ensure that tokens follow a valid pattern (operand-operator-operand) and
that there are no invalid characters.
3. Evaluate the Expression: Process the tokens to calculate the result following operator
precedence.

Code:
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <string.h>

// Function to evaluate the expression (left to right, '+' and '-' only)
int evaluateExpression(const char* expr) {
    int result = 0;
    int num = 0;
    char op = '+';
    size_t len = strlen(expr);

    for (size_t i = 0; i < len; i++) {
        char curr = expr[i];

        if (isdigit((unsigned char)curr)) {
            num = num * 10 + (curr - '0'); // Build the current number
        }
        // At an operator, or at the last character, apply the pending operation
        if ((!isdigit((unsigned char)curr) && !isspace((unsigned char)curr)) || i == len - 1) {
            switch (op) {
                case '+': result += num; break;
                case '-': result -= num; break;
                // Extend cases for '*', '/' as needed
            }

            // Update operation and reset num
            op = curr;
            num = 0;
        }
    }

    return result;
}

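// Function to check if the expression is valid: only digits, spaces,
// '+' and '-' are allowed, and tokens must alternate
// number, operator, number, ... starting and ending with a number
int isValidExpression(const char* expr) {
    int expectNumber = 1; // 1 when the next token must be a number

    for (size_t i = 0; expr[i] != '\0'; ) {
        unsigned char c = (unsigned char)expr[i];
        if (isspace(c)) {
            i++;
        } else if (isdigit(c)) {
            if (!expectNumber) return 0; // Two numbers in a row
            while (isdigit((unsigned char)expr[i])) i++;
            expectNumber = 0;
        } else if (c == '+' || c == '-') {
            // Extend checks for '*', '/' as needed
            if (expectNumber) return 0; // Operator without a left operand
            expectNumber = 1;
            i++;
        } else {
            return 0; // Invalid character found
        }
    }

    return !expectNumber; // Rejects empty input and a trailing operator
}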

int main() {
char expression[256];
printf("Enter a mathematical expression: ");
fgets(expression, 256, stdin); // Reading input

if (isValidExpression(expression)) {
int result = evaluateExpression(expression);
printf("Result: %d\n", result);
} else {
printf("Invalid expression.\n");
}

return 0;
}

Output:
PRACTICAL-15
AIM- Open Ended program: Designing of various types of parser.

SOFTWARE USED: VS Code

THEORY: Designing parsers for various types of grammars is a fundamental task in compilers and
interpreters, enabling the translation of high-level language constructs into machine-understandable
code or intermediate representations. Parsers are critical in understanding the syntax of programming
languages, data formats (like JSON, XML), and protocols.

Types of Parsers

Top-Down Parsers: Start parsing from the start symbol and try to transform it into the input string by
applying production rules.

Recursive Descent Parser: A straightforward implementation of top-down parsing in which each
non-terminal in the grammar has a corresponding function in the parser code. In its predictive form it
needs no backtracking. It is easy to implement, but left-recursive grammars must first be rewritten to
remove the left recursion.

Predictive Parser (LL Parser): A non-backtracking top-down parser that uses look-ahead tokens to
predict which production to use. Table-driven LL parsers replace the recursive calls with an explicit
stack and a parsing table, which makes them more systematic than hand-written recursive descent parsers.

Bottom-Up Parsers: Start with the input string and attempt to reach the start symbol by reducing strings
to non-terminals using production rules.

Shift-Reduce Parser: A simple form of bottom-up parsing where the parser shifts input tokens onto a
stack and reduces them to non-terminals when they match the right side of a production rule.

LR Parser: A table-driven shift-reduce parser that uses a state machine and a parsing table to decide
when to shift or reduce. Variants include SLR, LALR, and Canonical LR, in increasing order of power;
Canonical LR needs the largest tables.

Earley Parser: A dynamic programming (chart) parser that can parse all context-free grammars,
including ambiguous and left-recursive ones. It runs in cubic time in the worst case and in linear time
for most deterministic grammars.
CODE:

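#include <stdio.h>
#include <ctype.h>

const char* input;   // Current position in the expression
int error = 0;       // Set on a syntax error or division by zero

// Function prototypes
int expr();
int term();
int factor();

// expr -> term (('+' | '-') term)*
int expr() {
    int result = term();
    while (*input == '+' || *input == '-') {
        if (*input == '+') {
            input++;
            result += term();
        } else {
            input++;
            result -= term();
        }
    }
    return result;
}

// term -> factor (('*' | '/') factor)*
int term() {
    int result = factor();
    while (*input == '*' || *input == '/') {
        if (*input == '*') {
            input++;
            result *= factor();
        } else {
            input++;
            int divisor = factor();
            if (divisor == 0) {
                error = 1; // Division by zero
                return 0;
            }
            result /= divisor;
        }
    }
    return result;
}

// factor -> number | '(' expr ')'
int factor() {
    if (isdigit((unsigned char)*input)) {
        int result = 0;
        while (isdigit((unsigned char)*input)) { // Read a multi-digit number
            result = result * 10 + (*input - '0');
            input++;
        }
        return result;
    } else if (*input == '(') {
        input++; // Skip '('
        int result = expr();
        if (*input == ')') {
            input++; // Skip ')'
            return result;
        }
    }
    error = 1; // Error case
    return 0;
}

int main() {
    char expression[256];
    printf("Enter an arithmetic expression: ");
    if (scanf("%255s", expression) != 1) return 1;
    input = expression;
    int result = expr();
    if (error || *input != '\0') // Reject errors and unparsed trailing input
        printf("Invalid expression.\n");
    else
        printf("Result: %d\n", result);
    return 0;
}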
This parser can evaluate simple arithmetic expressions with numbers, addition, subtraction,
multiplication, and division, including handling parentheses for precedence. It's a direct implementation
of top-down parsing, where each function corresponds to a grammar rule, recursively parsing the input
based on the structure of the expression.

Expanding this parser to handle more complex grammars or different parsing strategies requires deeper
understanding of parsing techniques and possibly incorporating additional data structures, such as parse
trees or state machines, depending on the parser type.

Output:
