Compiler Lab
(Code 18CSC304J)
B. Tech (CSE) – 3rd year/6th Semester
Name:
Registration No:
(Dr. R. P. Mahapatra)
EXPERIMENT-1
Implementation of Lexical Analyzer
Theory:
Lexical analysis is the first phase of a compiler. The lexical analyzer (lexer) takes the
preprocessed source code, which arrives as a stream of characters, and converts it into a
sequence of tokens. While doing so, it removes extra whitespace and comments from the
source code. Programs that perform lexical analysis are called lexical analyzers or lexers;
a lexer contains a tokenizer or scanner. If the lexical analyzer detects an invalid token,
it reports an error. Its role in the compiler is to read the character stream of the source
code, check for legal tokens, and pass them to the syntax analyzer on demand.
Program:
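#include <iostream>
#include <string>
#include <cstdio>
#include <cctype>
using namespace std;
// Keyword list (cut off after "main" in the source; only the visible entries are kept)
string arr[] = { "void", "using", "namespace", "int", "include", "iostream", "std", "main" };
const int keywordCount = sizeof(arr) / sizeof(arr[0]);
// Returns true if a is one of the keywords in arr
bool isKeyword(string a) {
    for (int i = 0; i < keywordCount; i++) {
        if (arr[i] == a) {
            return true;
        }
    }
    return false;
}
// Returns true if every character of s is a digit
bool isNumber(const string& s) {
    for (size_t x = 0; x < s.size(); x++)
        if (!isdigit((unsigned char)s[x])) return false;
    return true;
}
int main() {
    printf("Aditya Saxena\n");
    string input;
    getline(cin, input);
    string s;
    // Go one position past the end so that the last token is also classified
    for (size_t i = 0; i <= input.size(); i++) {
        char c = (i < input.size()) ? input[i] : ' ';
        if (!isalnum((unsigned char)c)) {
            // A non-alphanumeric character ends the current token
            if (s != "") {
                // The output messages are assumed; the source omits them
                if (isKeyword(s)) {
                    cout << s << " is a keyword" << endl;
                } else if (isdigit((unsigned char)s[0])) {
                    if (isNumber(s)) {
                        cout << s << " is a number" << endl;
                    } else {
                        cout << s << " is an invalid token" << endl;
                    }
                } else {
                    cout << s << " is an identifier" << endl;
                }
                s = "";
            }
        } else {
            s += c;
        }
    }
    return 0;
}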
EXPERIMENT-2
Regular Expression to NFA
Theory:
In an NDFA (non-deterministic finite automaton), for a particular input symbol, the machine
can move to any combination of the states in the machine. In other words, the exact state to
which the machine moves cannot be determined. Hence, it is called a non-deterministic
automaton. Regular expressions are a concise way to represent a set of strings in formal
languages and automata theory. They are a notation for describing regular languages, which
can be recognized by finite state automata. In the expressions below, + denotes alternation
(union) and * denotes zero or more repetitions.
1. (0+1)*011(0+1)*
2. (0+1)*1(0+1)
3. (a+b)*a
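To illustrate how such an NFA is run, the sketch below simulates a two-state NFA for the
third expression, (a+b)*a, by tracking every state the machine could be in after each input
symbol. The state numbering and the names Nfa, makeEndsWithA and accepts are assumptions
made for this example; they do not come from the lab record.

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>

// Hypothetical NFA for (a+b)*a:
//   state 0 --a,b--> 0   (loop for (a+b)*)
//   state 0 --a----> 1   (guess that this 'a' is the last symbol)
//   state 1 is the only accepting state.
struct Nfa {
    std::map<std::pair<int, char>, std::set<int>> delta;
    int start;
    std::set<int> accepting;
};

Nfa makeEndsWithA() {
    Nfa n;
    n.start = 0;
    n.accepting = {1};
    n.delta[{0, 'a'}] = {0, 1};
    n.delta[{0, 'b'}] = {0};
    return n;
}

// Runs the NFA by keeping the set of all states it could currently be in.
bool accepts(const Nfa& n, const std::string& input) {
    std::set<int> current = {n.start};
    for (char c : input) {
        std::set<int> next;
        for (int s : current) {
            auto it = n.delta.find({s, c});
            if (it != n.delta.end())
                next.insert(it->second.begin(), it->second.end());
        }
        current = next;
    }
    for (int s : current)
        if (n.accepting.count(s)) return true;
    return false;
}
```

Calling accepts(makeEndsWithA(), "abba") returns true because the string ends in a, while
"ab" and the empty string are rejected.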
EXPERIMENT-3
NFA to DFA
Theory:
An NFA can have zero, one or more than one move from a given state on a given input
symbol. An NFA can also have NULL moves (moves without an input symbol). On the other
hand, a DFA has one and only one move from a given state on a given input symbol. The
conversion may be applied to any nondeterministic FA and, when it completes, yields an
equivalent DFA. The conversion used is the standard canonical method (subset construction)
of creating an equivalent DFA from an NFA, that is: each state in the DFA being built
corresponds to a nonempty set of states in the original NFA. Therefore, for an NFA
with n states, there are potentially 2^n - 1 states in the DFA, though realistically this upper
bound is rarely met.
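The construction described above can be sketched as follows. This is a minimal version that
assumes the NFA has no NULL moves; the type names StateSet, NfaDelta and Dfa and the
function subsetConstruction are hypothetical names chosen for the example.

```cpp
#include <map>
#include <queue>
#include <set>
#include <utility>
#include <vector>

using StateSet = std::set<int>;
using NfaDelta = std::map<std::pair<int, char>, StateSet>;

struct Dfa {
    std::vector<StateSet> states;               // DFA state i = this set of NFA states
    std::map<std::pair<int, char>, int> delta;  // (DFA state, symbol) -> DFA state
};

// Subset construction for an NFA without NULL moves (an assumption of this sketch).
Dfa subsetConstruction(const NfaDelta& nfa, int start,
                       const std::vector<char>& alphabet) {
    Dfa dfa;
    std::map<StateSet, int> id;   // NFA state set -> DFA state number
    std::queue<StateSet> work;    // sets whose outgoing moves are not yet computed

    StateSet startSet = {start};
    id[startSet] = 0;
    dfa.states.push_back(startSet);
    work.push(startSet);

    while (!work.empty()) {
        StateSet cur = work.front();
        work.pop();
        int from = id[cur];
        for (char c : alphabet) {
            // Union of the NFA moves of every state in the current set.
            StateSet next;
            for (int s : cur) {
                auto it = nfa.find({s, c});
                if (it != nfa.end())
                    next.insert(it->second.begin(), it->second.end());
            }
            if (next.empty()) continue;  // only nonempty sets become DFA states
            auto found = id.find(next);
            if (found == id.end()) {
                found = id.emplace(next, (int)dfa.states.size()).first;
                dfa.states.push_back(next);
                work.push(next);
            }
            dfa.delta[{from, c}] = found->second;
        }
    }
    return dfa;
}
```

Only state sets reachable from the start set are generated, which is why the 2^n - 1 bound
is rarely reached in practice. A DFA state is accepting if its set contains at least one
accepting NFA state; that check is left out of the sketch.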
1. Input NFA
Output DFA
2. Input NFA
Output DFA
Result: We converted the given NFA to DFA.
EXPERIMENT-4
Aim: Write a program in C/C++ to eliminate ambiguity and left recursion and to perform left
factoring for a given set of production rules of a grammar.
Theory:
Left Recursion
A production is left-recursive if the leftmost symbol on its right side is the same as the
non-terminal on its left side. For example, expr → expr + term. If one were to code this
production in a recursive-descent parser, the parser would go into an infinite loop. Immediate
left recursion of the form A → Aα | β is removed by rewriting it as A → βA', A' → αA' | ε.
Left Factoring
Program
Left Recursion
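#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#define SIZE 20
int main()
{
    char pro[SIZE], alpha[SIZE], beta[SIZE];
    int nont_terminal, i, j, index = 3;
    printf("Aditya Saxena\n");
    /* Prompt and message texts are assumed; the source omits them. */
    /* Assumed input format: A->Aa|b (the right side starts at index 3) */
    printf("Enter the production as A->Aa|b: ");
    if(scanf("%19s", pro) != 1 || strlen(pro) < 4){
        printf("Invalid production.\n");
        exit(0);
    }
    nont_terminal=pro[0];
    if(nont_terminal==pro[index]) /* left-recursive: right side starts with A */
    {
        /* alpha = symbols between the leading non-terminal and '|' */
        for(i=++index,j=0;pro[i]!='|';i++,j++){
            if(pro[i]==0){ /* no '|' found, so there is no beta */
                printf("This grammar cannot be reduced.\n");
                exit(0);
            }
            alpha[j]=pro[i];
        }
        alpha[j]='\0';
        if(pro[++i]!=0) /* something follows '|' */
        {
            /* beta = the alternative after '|' */
            for(j=i,i=0;pro[j]!='\0';i++,j++){
                beta[i]=pro[j];
            }
            beta[i]='\0';
            /* A -> beta A' and A' -> alpha A' | # (# denotes epsilon) */
            printf("%c->%s%c'\n", nont_terminal, beta, nont_terminal);
            printf("%c'->%s%c'|#\n", nont_terminal, alpha, nont_terminal);
        }
        else
            printf("This grammar cannot be reduced.\n");
    }
    else
        printf("This grammar is not left-recursive.\n");
    return 0;
}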
Left Factoring
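#include<stdio.h>
#include<string.h>
int main()
{
    printf("Aditya Saxena\n");
    char gram[20],part1[20],part2[20],modifiedGram[20],newGram[20];
    int i,j=0,k=0,pos=0;
    /* Prompt text and input format are assumed: right-hand side only, e.g. abC|abD */
    printf("Enter Production : A->");
    if(scanf("%19s",gram)!=1 || strchr(gram,'|')==NULL){
        printf("Expected two alternatives separated by '|'\n");
        return 1;
    }
    /* Split the right-hand side at '|' into part1 and part2 */
    for(i=0;gram[i]!='|';i++,j++)
        part1[j]=gram[i];
    part1[j]='\0';
    for(j=++i,i=0;gram[j]!='\0';j++,i++)
        part2[i]=gram[j];
    part2[i]='\0';
    /* Copy the longest common prefix; stop at the first mismatch */
    for(i=0;part1[i]!='\0'&&part1[i]==part2[i];i++){
        modifiedGram[k]=part1[i];
        k++;
        pos=i+1;
    }
    /* The remainders of both alternatives become the alternatives of X */
    for(i=pos,j=0;part1[i]!='\0';i++,j++){
        newGram[j]=part1[i];
    }
    newGram[j++]='|';
    for(i=pos;part2[i]!='\0';i++,j++){
        newGram[j]=part2[i];
    }
    modifiedGram[k]='X';
    modifiedGram[++k]='\0';
    newGram[j]='\0';
    printf(" A->%s",modifiedGram);
    printf("\n X->%s\n",newGram);
    return 0;
}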
EXPERIMENT-5
Aim: Write a program in C/C++ to find the FIRST and FOLLOW sets from a given set
of production rules.
Theory:-
FIRST and FOLLOW are two functions associated with a grammar that help us fill in the
entries of the predictive parsing table (M-table).
FIRST(α) − It is a function that gives the set of terminals that begin the strings derived
from α.
A symbol c is in FIRST(α) if and only if α ⇒* cβ for some sequence β of grammar symbols.
A terminal symbol a is in FOLLOW(N) if and only if there is a derivation from the start
symbol S of the grammar such that S ⇒* αNaβ, where α and β are (possibly empty)
sequences of grammar symbols. In other words, a terminal a is in FOLLOW(N) if a can
follow N at some point in a derivation.
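The FIRST definition can be turned into a short fixed-point computation. The sketch below is
an independent illustration, not the lab program: it assumes productions written as "X=TnS",
upper-case letters for non-terminals, and '#' for the empty string. The function name
computeFirst is hypothetical.

```cpp
#include <cctype>
#include <map>
#include <set>
#include <string>
#include <vector>

// Computes FIRST for every non-terminal by repeating until no set grows.
// Productions use the form "X=TnS"; '#' stands for the empty string.
std::map<char, std::set<char>> computeFirst(const std::vector<std::string>& prods) {
    std::map<char, std::set<char>> first;
    bool changed = true;
    while (changed) {
        changed = false;
        for (const std::string& p : prods) {
            std::set<char>& target = first[p[0]];
            std::size_t before = target.size();
            bool allNullable = true;          // every symbol seen so far can derive '#'
            for (std::size_t i = 2; i < p.size() && allNullable; ++i) {
                char sym = p[i];
                if (sym == '#') continue;     // explicit empty right-hand side
                if (!std::isupper(static_cast<unsigned char>(sym))) {
                    target.insert(sym);       // a terminal begins the derived string
                    allNullable = false;
                } else {
                    std::set<char> f = first[sym];   // copy: sym may equal p[0]
                    for (char c : f)
                        if (c != '#') target.insert(c);
                    allNullable = f.count('#') > 0;
                }
            }
            if (allNullable) target.insert('#');
            if (target.size() != before) changed = true;
        }
    }
    return first;
}
```

For the grammar X=TnS, X=Rm, T=q, T=#, S=p, S=#, R=om, R=ST this yields
FIRST(T) = {q, #}, FIRST(S) = {p, #}, FIRST(R) = {o, p, q, #} and FIRST(X) = {m, n, o, p, q}.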
Program:-
#include <ctype.h>
#include <stdio.h>
#include <string.h>
int count, n = 0;
char calc_first[10][100];
char calc_follow[10][100];
int m = 0;
char production[10][10];
int k;
char ck;
int e;
printf("Aditya Saxena\n\n");
int jm = 0;
int km = 0;
int i, choice;
char c, ch;
count = 8;
strcpy(production[0], "X=TnS");
strcpy(production[1], "X=Rm");
strcpy(production[2], "T=q");
strcpy(production[3], "T=#");
strcpy(production[4], "S=p");
strcpy(production[5], "S=#");
strcpy(production[6], "R=om");
strcpy(production[7], "R=ST");
int kay;
char done[count];
calc_first[k][kay] = '!';
c = production[k][0];
point2 = 0;
xxx = 0;
if (c == done[kay])
xxx = 1;
if (xxx == 1)
continue;
findfirst(c, 0, 0);
ptr += 1;
done[ptr] = c;
calc_first[point1][point2++] = c;
if (first[i] == calc_first[point1][lark]) {
chk = 1;
break;
if (chk == 0) {
calc_first[point1][point2++] = first[i];
printf("}\n");
jm = n;
point1++;
printf("\n");
printf("-----------------------------------------------"
"\n\n");
char donee[count];
ptr = -1;
calc_follow[k][kay] = '!';
point1 = 0;
int land = 0;
ck = production[e][0];
point2 = 0;
xxx = 0;
if (ck == donee[kay])
xxx = 1;
if (xxx == 1)
continue;
land += 1;
follow(ck);
ptr += 1;
donee[ptr] = ck;
calc_follow[point1][point2++] = ck;
if (f[i] == calc_follow[point1][lark]) {
chk = 1;
break;
if (chk == 0) {
calc_follow[point1][point2++] = f[i];
printf(" }\n\n");
km = m;
point1++;
}
void follow(char c)
int i, j;
if (production[0][0] == c) {
f[m++] = '$';
if (production[i][j] == c) {
if (production[i][j + 1] != '\0') {
followfirst(production[i][j + 1], i,
(j + 2));
if (production[i][j + 1] == '\0'
&& c != production[i][0]) {
follow(production[i][0]);
int j;
if (!(isupper(c))) {
first[n++] = c;
if (production[j][0] == c) {
if (production[j][2] == '#') {
if (production[q1][q2] == '\0')
first[n++] = '#';
findfirst(production[q1][q2], q1,
(q2 + 1));
else
first[n++] = '#';
else if (!isupper(production[j][2])) {
first[n++] = production[j][2];
else {
findfirst(production[j][2], j, 3);
int k;
if (!(isupper(c)))
f[m++] = c;
else {
int i = 0, j = 1;
if (calc_first[i][0] == c)
break;
}
while (calc_first[i][j] != '!') {
if (calc_first[i][j] != '#') {
f[m++] = calc_first[i][j];
else {
if (production[c1][c2] == '\0') {
follow(production[c1][0]);
else {
followfirst(production[c1][c2], c1,
c2 + 1);
j++;
EXPERIMENT-6
Theory:
A -> A1 | A2 | ... | An
If the non-terminal to be expanded next is ‘A’, one of these alternatives is selected based
on the current input symbol ‘a’ only.
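The selection rule above is what a table-driven predictive parser implements. The following
sketch assumes a parsing table that is already filled in, upper-case letters for
non-terminals, and '$' as the end marker; the function name predictiveParse and the small
grammar in the usage note are hypothetical and not taken from the lab program.

```cpp
#include <cctype>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Table-driven predictive parse. table[{A, a}] holds the right-hand side to use
// when non-terminal A is on top of the stack and 'a' is the current input symbol.
// An empty string as right-hand side represents an epsilon production.
bool predictiveParse(const std::map<std::pair<char, char>, std::string>& table,
                     char startSymbol, std::string input) {
    input += '$';                                   // end-of-input marker
    std::vector<char> stack = {'$', startSymbol};
    std::size_t pos = 0;
    while (!stack.empty()) {
        char top = stack.back();
        stack.pop_back();
        char look = input[pos];
        if (std::isupper(static_cast<unsigned char>(top))) {
            auto it = table.find({top, look});
            if (it == table.end()) return false;    // empty table cell: syntax error
            // Push the right-hand side in reverse so its first symbol ends up on top.
            for (auto r = it->second.rbegin(); r != it->second.rend(); ++r)
                stack.push_back(*r);
        } else {
            if (top != look) return false;          // terminal mismatch
            ++pos;
        }
    }
    return pos == input.size();
}
```

For the hypothetical grammar S -> aSb | c, the table has two entries, M[S, a] = "aSb" and
M[S, c] = "c". With that table the input "aacbb" is accepted and "aacb" is rejected.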
Program:-
#include <stdio.h>
#include <string.h>
char table[5][6][10];
int numr(char c)
switch (c)
case 'S':
return 0;
case 'A':
return 1;
case 'B':
return 2;
case 'C':
return 3;
case 'a':
return 0;
case 'b':
return 1;
case 'c':
return 2;
case 'd':
return 3;
case '$':
return 4;
return (2);
int main(){
int i, j, k;
printf("%s\n", prod[i]);
fflush(stdin);
k = strlen(first[i]);
if (first[i][j] != '@')
if (pror[i][0] == '@'){
k = strlen(follow[i]);
strcpy(table[0][1], "a");
strcpy(table[0][2], "b");
strcpy(table[0][3], "c");
strcpy(table[0][4], "d");
strcpy(table[0][5], "$");
strcpy(table[1][0], "S");
strcpy(table[2][0], "A");
strcpy(table[3][0], "B");
strcpy(table[4][0], "C");
printf("\n--------------------------------------------------------\n");
printf("%-10s", table[i][j]);
if (j == 5)
printf("\n--------------------------------------------------------\n");
}
}
EXPERIMENT-7
Theory:-
A shift-reduce parser constructs the parse tree bottom-up, i.e. from the leaves (bottom) to
the root (up). It shifts input symbols onto a stack and reduces the top of the stack whenever
it matches the right side of a production. A more general form of the shift-reduce parser is
the LR parser.
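A minimal sketch of the shift and reduce steps is given below. It assumes the grammar
E -> E+E | E*E | (E) | i, with the single character i standing for an identifier, and it
reduces greedily after every shift, so operator precedence is ignored. The grammar and the
names reduceOnce and shiftReduce are assumptions made for this example.

```cpp
#include <string>
#include <vector>

// Tries the handles of the assumed grammar E -> E+E | E*E | (E) | i
// against the top of the stack; returns true if one reduction was made.
bool reduceOnce(std::string& stack) {
    static const std::vector<std::string> handles = {"E+E", "E*E", "(E)", "i"};
    for (const std::string& h : handles) {
        if (stack.size() >= h.size() &&
            stack.compare(stack.size() - h.size(), h.size(), h) == 0) {
            stack.replace(stack.size() - h.size(), h.size(), "E");
            return true;
        }
    }
    return false;
}

// Shifts one symbol at a time and reduces greedily after every shift.
// Returns the list of actions; the last entry is "ACCEPT" or "REJECT".
std::vector<std::string> shiftReduce(const std::string& input) {
    std::vector<std::string> actions;
    std::string stack;
    for (char c : input) {
        stack += c;
        actions.push_back(std::string("SHIFT ") + c);
        while (reduceOnce(stack))
            actions.push_back("REDUCE -> E  (stack: " + stack + ")");
    }
    actions.push_back(stack == "E" ? "ACCEPT" : "REJECT");
    return actions;
}
```

For the input i+i*i the parser ends with E alone on the stack and reports ACCEPT, while the
incomplete input i+ leaves E+ on the stack and is rejected.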
Program:-
#include<stdio.h>
#include<string.h>
int k=0,z=0,i=0,j=0,c=0;
int main(){
printf("Aditya Saxena\n");
c=strlen(a);
strcpy(act,"SHIFT->");
printf("$ \t%s$\n",a);
stk[i]=a[j];
stk[i+1]=a[j+1];
stk[i+2]='\0';
a[j]=' ';
a[j+1]=' ';
}
else{
stk[i]=a[j];
stk[i+1]='\0';
a[j]=' ';
printf("\n$%s\t%s$\t%ssymbols",stk,a,act);
check();
}
}
void check(){
strcpy(ac,"REDUCE TO E");
stk[z]='E';
stk[z+1]='\0';
printf("\n$%s\t%s$\t%s",stk,a,ac);
j++;
}
stk[z]='E';
stk[z+1]='\0';
stk[z+2]='\0';
printf("\n$%s\t%s$\t%s",stk,a,ac);
i=i-2;
}
stk[z]='E';
stk[z+1]='\0';
stk[z+2]='\0';
printf("\n$%s\t%s$\t%s",stk,a,ac);
i=i-2;
}
stk[z]='E';
stk[z+1]='\0';
stk[z+2]='\0';
printf("\n$%s\t%s$\t%s",stk,a,ac);
i=i-2;
}