Compiler Design Lab
LAB MANUAL
Computer Science and Engineering
Vision & Mission
Vision
Mission
Lab Floor Plans & Layout
[Floor plan figure: lab layout with the entry marked]
PREFACE
This lab is part of the III B. Tech I semester for CSE students. The compiler design lab
covers a complete translator for a mini language, including error detection and recovery. It
includes lexical, syntax, and semantic analysis as the front end, and code generation,
following the university-prescribed textbooks. The expected outcomes from the
students are:
2. This will enable the student to work in the development phase of new computer
languages in industry.
SYLLABUS
Objectives:
Intel-based desktop PC with a 166 MHz or faster processor, at least 64
MB RAM and 100 MB free disk space
C++ compiler and JDK
Consider the following mini Language, a simple procedural high-level language, only
operating on integer data, with a syntax looking vaguely like a simple C crossed with Pascal.
The syntax of the language is defined by the following BNF grammar:
<ifstatement> ::= if <bexpression> then <slist> else <slist> endif | if <bexpression> then <slist>
endif
<bexpression> ::= <expression> <relop> <expression>
<addingop> ::= + | -
<multop> ::= * | /
Comments (zero or more characters enclosed between the standard C / Java style comment
brackets /*...*/) can be inserted. The language has rudimentary support for 1-dimensional
arrays. The declaration
int a[3] declares an array of three elements, referenced as a[0], a[1] and a[2]. Note also that
you should worry about the scoping of names.
{
int a[3], t1, t2;
t1 = 2;
a[0] = 1; a[1] = 2; a[t1] = 3;
t2 = -(a[2] + t1 * 6)/(a[2] - t1);
if t2 > 5 then
print(t2);
else
{
int t3;
t3 = 99;
t2 = -25;
print(-t1 + t2 * t3); /* this is a comment on 2 lines */
}
endif
}
1. Design a Lexical analyzer for the above language. The lexical analyzer should ignore
redundant spaces, tabs and newlines. It should also ignore comments. Although the
syntax specification states that identifiers can be arbitrarily long, you may restrict the
length to some reasonable value.
2. Implement the lexical analyzer using JLex, flex or lex or other lexical analyzer
generating tools.
3. Design Predictive parser for the given language.
4. Design LALR bottom up parser for the above language.
5. Convert the BNF rules into Yacc form and write code to generate abstract syntax tree.
6. Write program to generate machine code from the abstract syntax tree generated by
the parser. The following instruction set may be considered as target code.
In the description of the individual instructions below, instruction argument types are
specified as follows:
R specifies a register in the form R0, R1, R2, R3, R4, R5, R6 or R7 (or r0, r1, etc).
So, for example an A-type argument could have the form 4 (variable number 4), #4 (the
constant value 4), r4 (register 4) or @r4 (the contents of register 4 identifies the variable
location to be accessed).
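The four A-type forms listed above can be told apart by their first character. The following is a minimal sketch of such a classifier, not part of the prescribed solution: the names ArgKind, parse_register, parse_number and classify_arg are hypothetical, and constants and variable numbers are assumed to be non-negative decimal integers.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical names: the manual does not define these types. */
typedef enum {
    ARG_VARIABLE,   /* 4   : variable number 4                      */
    ARG_CONSTANT,   /* #4  : the constant value 4                   */
    ARG_REGISTER,   /* r4  : register 4                             */
    ARG_INDIRECT,   /* @r4 : variable whose location is held in r4  */
    ARG_INVALID
} ArgKind;

/* Returns 1 if s is R0..R7 or r0..r7 and stores the register number. */
static int parse_register(const char *s, int *num)
{
    if ((s[0] == 'R' || s[0] == 'r') &&
        s[1] >= '0' && s[1] <= '7' && s[2] == '\0') {
        *num = s[1] - '0';
        return 1;
    }
    return 0;
}

/* Returns 1 if s is a non-empty string of decimal digits (assumption:
   no sign is allowed) and stores its value. */
static int parse_number(const char *s, int *num)
{
    const char *p = s;
    if (*p == '\0')
        return 0;
    for (; *p != '\0'; p++)
        if (!isdigit((unsigned char)*p))
            return 0;
    *num = atoi(s);
    return 1;
}

/* Classifies one A-type argument and stores its number in *value. */
ArgKind classify_arg(const char *s, int *value)
{
    assert(s != NULL && value != NULL);
    if (s[0] == '@')
        return parse_register(s + 1, value) ? ARG_INDIRECT : ARG_INVALID;
    if (s[0] == '#')
        return parse_number(s + 1, value) ? ARG_CONSTANT : ARG_INVALID;
    if (parse_register(s, value))
        return ARG_REGISTER;
    if (parse_number(s, value))
        return ARG_VARIABLE;
    return ARG_INVALID;
}
```

A code generator could call classify_arg on each operand before emitting an instruction and reject anything that comes back as ARG_INVALID.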
List Of Experiments
Experiment 1: Design a Lexical analyzer for the above language. The lexical
analyzer should ignore redundant spaces, tabs and newlines. It should also
ignore comments.
Experiment 2: Implement the lexical analyzer using JLex, flex or lex or other
lexical analyzer generating tools.
Experiment 5: Convert the BNF rules into Yacc form and write code to
generate abstract syntax tree.
COMPILER DESIGN LAB EQUIPMENT SPECIFICATIONS
S. No.: 4
Lab: DATABASE MANAGEMENT SYSTEMS LAB / COMPILER DESIGN LAB
Systems: 30 No.s
Equipment: 30 Zenith computers (Intel Pentium dual core processor with 3.5 GHz speed, RAM 2 GB, Hard Disk 320 GB), 15 KVA UPS, 24 port switches - 3 No.s, 16 port switch - 1 No.s, 8 port switch - 1 No.s, Batteries - 19 No.s, AC's - 2 No.s, Projector - 1 No.s, LAN speed with 100 Mbps, HP Printer - 1 No.s, Amplifier - 1 No.s, Speakers - 2 No.s, chairs - 30 No.s, Table - 1 No.s
Course: UG, II B. Tech - II Sem / III B. Tech - I Sem
Programmer: Ms. A. Mounika (B. Tech)
CMR ENGINEERING COLLEGE
Laboratory Name :Compiler Design Experiment No: 1
PROGRAM:
#include<stdio.h>
#include<string.h>
int main(void)
{
char exp[20],id[20],dig[20],ch;
int i,j,len;
printf("enter expression:");
/* read at most 19 characters so that exp cannot overflow */
if(scanf("%19s",exp)!=1)
return 1;
len=(int)strlen(exp);
for(i=0;i<len;)
{
ch=exp[i];
j=0;
if(ch>='a'&&ch<='z')
{
/* identifier: a lower-case letter followed by letters or digits */
id[j++]=ch;
i++;
while((exp[i]>='a'&&exp[i]<='z')||(exp[i]>='0'&&exp[i]<='9'))
{
id[j++]=exp[i++];
}
id[j]='\0';
printf("\nidentifier:%s",id);
}
else if(ch=='+'||ch=='-'||ch=='*'||ch=='/'||ch=='%'||ch=='=')
{
printf("\noperator:%c",ch);
i++;
}
else if(ch>='0'&&ch<='9')
{
/* constant: a run of digits */
dig[j++]=ch;
i++;
while(exp[i]>='0'&&exp[i]<='9')
dig[j++]=exp[i++];
dig[j]='\0';
printf("\nconstant:%s",dig);
}
else
i++; /* skip any other character so that the loop always advances */
}//for
printf("\n");
return 0;
}
Output:
LAB VIVA QUESTIONS & ANSWERS
3. Define Passes?
In an implementation of a compiler, portions of one or more phases are
combined into a module called a pass. A pass reads the source program or the
output of the previous pass, makes the transformations specified by its phases
and writes output into an intermediate file, which is read by the subsequent pass.
7. Define optimization?
Certain compilers apply transformations to the output of the intermediate code
generator. It is used to produce an intermediate-language from which a faster or
smaller object program can be produced. This phase is called optimization
phase. Types of optimization are local optimization and loop optimization.
8. What is a cross compiler?
A compiler that runs on one machine and produces object code for another
machine is called a cross compiler.
CMR ENGINEERING COLLEGE
Laboratory Name :Compiler Design Experiment No: 2
PROGRAM:
\} {if(!COMMENT) printf("\n BLOCK ENDS");}
{identifier}(\[[0-9]*\])? {if(!COMMENT) printf("\n %s IDENTIFIER",yytext);}
\".*\" {if(!COMMENT) printf("\n\t%s is a STRING",yytext);}
[0-9]+ {if(!COMMENT) printf("\n\t%s is a NUMBER",yytext);}
\)(\;)? {if(!COMMENT) printf("\n\t");ECHO;printf("\n");}
\( ECHO;
= {if(!COMMENT)printf("\n\t%s is an ASSIGNMENT OPERATOR",yytext);}
\<= |
\>= |
\< |
== |
\> {if(!COMMENT) printf("\n\t%s is a RELATIONAL OPERATOR",yytext);}
%%
int main(int argc,char **argv)
{
if (argc > 1)
{
FILE *file;
file = fopen(argv[1],"r");
if(!file)
{
printf("could not open %s \n",argv[1]);
exit(0);
}
yyin = file;
}
yylex();
printf("\n\n");
return 0;
}
int yywrap()
{
return 1; /* no further input files: tell the scanner to stop at end of input */
}
Output:
$lex lex.l
$cc lex.yy.c
#include<stdio.h>
main()
{
int a,b;
}
$./a.out var.c
#include<stdio.h> is a PREPROCESSOR DIRECTIVE
FUNCTION
main (
)
BLOCK BEGINS
int is a KEYWORD
a IDENTIFIER
b IDENTIFIER
BLOCK ENDS
1. Define binding?
The act of associating attributes to a name is referred to as binding the attributes
to the name. Most binding is done at compile time and is called static binding. Some
languages, such as SNOBOL, allow dynamic binding, that is, binding done at run time.
3. What is meant by loaders and link-editors?
A program called a loader performs the two functions of loading and
link-editing. The process of loading consists of taking relocatable machine
code, altering the relocatable addresses, and placing the altered instructions and
data in memory at the proper locations.
7. Write the regular expression denoting the set containing the string a
and all strings consisting of zero or more a's followed by a b.
a|a*b
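To see which strings this expression accepts, a hand-written recognizer can help. The sketch below is only an illustration (the function name matches_a_or_astar_b is hypothetical): it consumes the leading a's, then accepts either a single trailing b (the a*b branch) or the one-character string a.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if s belongs to the language of a|a*b, otherwise 0. */
int matches_a_or_astar_b(const char *s)
{
    size_t i = 0;
    assert(s != NULL);
    while (s[i] == 'a')                 /* consume the leading a's */
        i++;
    if (s[i] == 'b' && s[i + 1] == '\0')
        return 1;                       /* a*b: zero or more a's, then one b */
    return i == 1 && s[i] == '\0';      /* the single string "a" */
}
```

With this function, "a", "b", "ab" and "aaab" are accepted, while the empty string, "aa" and "abb" are rejected.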
9. What is a regular definition?
If Σ is an alphabet of basic symbols, then a regular definition is a sequence of
definitions of the form
d1 → r1
d2 → r2
….
dn → rn
where each di is a distinct name, and each ri is a regular expression over the
symbols in Σ ∪ {d1, d2, …, di-1}.
CMR ENGINEERING COLLEGE
Laboratory Name :Compiler Design Experiment No: 3
PROGRAM:
#include<stdio.h>
#include<string.h>
char prol[7][10]={"S","A","A","B","B","C","C"};
char pror[7][10]={"A","Bb","Cd","aB","@","Cc","@"};
char prod[7][10]={"S-->A","A-->Bb","A-->Cd","B-->aB","B-->@","C-->Cc","C-->@"};
char first[7][10]={"abcd","ab","cd","a@","@","c@","@"};
char follow[7][10]={"$","$","$","a$","b$","c$","d$"};
char table[5][6][10];
/* map a grammar symbol to its row (non-terminal) or column (terminal) index */
int numr(char c)
{
switch(c)
{
case 'S':return 0;
case 'A':return 1;
case 'B':return 2;
case 'C':return 3;
case 'a':return 0;
case 'b':return 1;
case 'c':return 2;
case 'd':return 3;
case '$':return 4;
}
return 2;
}
int main(void)
{
int i,j,k;
for(i=0;i<5;i++)
for(j=0;j<6;j++)
strcpy(table[i][j]," ");
printf("\n The following is the predictive parsing table for the following grammar:\n");
for(i=0;i<7;i++)
printf("%s\n",prod[i]);
printf("\n Predictive parsing table is:\n ");
/* enter each production under every terminal in its FIRST set */
for(i=0;i<7;i++)
{
k=strlen(first[i]);
for(j=0;j<k;j++)
if(first[i][j]!='@')
strcpy(table[numr(prol[i][0])+1][numr(first[i][j])+1],prod[i]);
}
/* enter each epsilon production under every terminal in its FOLLOW set */
for(i=0;i<7;i++)
{
if(strlen(pror[i])==1)
{
if(pror[i][0]=='@')
{
k=strlen(follow[i]);
for(j=0;j<k;j++)
strcpy(table[numr(prol[i][0])+1][numr(follow[i][j])+1],prod[i]);
}
}
}
strcpy(table[0][0]," ");
strcpy(table[0][1],"a");
strcpy(table[0][2],"b");
strcpy(table[0][3],"c");
strcpy(table[0][4],"d");
strcpy(table[0][5],"$");
strcpy(table[1][0],"S");
strcpy(table[2][0],"A");
strcpy(table[3][0],"B");
strcpy(table[4][0],"C");
printf("\n-----------------------------------------------------------------------------\n");
for(i=0;i<5;i++)
for(j=0;j<6;j++)
{
printf("%-10s",table[i][j]);
if(j==5)
printf("\n-----------------------------------------------------------------------------\n");
}
return 0;
}
Output:
[examuser56@localhost ~]$ gcc predictive.c
[examuser56@localhost ~]$ ./a.out
The following is the predictive parsing table for the following grammar:
S->A
A->Bb
A->Cd
B->aB
B->@
C->Cc
C->@
LAB VIVA QUESTIONS & ANSWERS
2. Define LEX?
LEX is a tool for automatically generating lexical analyzers. A LEX source
program is a specification of a lexical analyzer, consisting of a set of regular
expressions together with an action for each regular expression. The output of
LEX is a lexical analyzer program.
9. Define ambiguity?
A grammar that produces more than one parse tree for some sentence is said to
be ambiguous. An ambiguous grammar is one that produces more than one
leftmost or more than one rightmost derivation for some sentence.
CMR ENGINEERING COLLEGE
Laboratory Name :Compiler Design Experiment No: 4
PROGRAM:
<parser.l>
%{
#include<stdio.h>
#include<stdlib.h> /* atof */
#include "y.tab.h"
%}
%%
[0-9]+ {yylval.dval=atof(yytext);
return DIGIT;
}
\n|. return yytext[0];
%%
<parser.y>
%{
/*This YACC specification file generates the LALR parser for the program
considered in experiment 4.*/
#include<stdio.h>
int yylex(void);
int yyerror(char *s);
%}
%union
{
double dval;
}
%token <dval> DIGIT
%type <dval> expr
%type <dval> term
%type <dval> factor
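%%
line: expr '\n' {
printf("%g\n",$1);
}
;
/* The expr, term and factor rules below are reconstructed from the %type
declarations above; the original listing is truncated at this point. */
expr: expr '+' term {$$=$1+$3;}
| term
;
term: term '*' factor {$$=$1*$3;}
| factor
;
factor: '(' expr ')' {$$=$2;}
| DIGIT
;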
%%
int main()
{
yyparse();
}
int yyerror(char *s)
{
printf("%s",s);
return 0;
}
Output:
$lex parser.l
$yacc -d parser.y
$cc lex.yy.c y.tab.c -ll -lm
$./a.out
2+3
5
LAB VIVA QUESTIONS & ANSWERS
2. Define parser?
A parser for grammar G is a program that takes as input a string w and
produces as output either a parse tree for w, if w is a sentence of G, or an error
message indicating that w is not a sentence of G.
4. Define Handles?
A handle of a right-sentential form γ is a production A → β and a position of
γ where the string β may be found and replaced by A to produce the previous
right-sentential form in a rightmost derivation of γ.
CMR ENGINEERING COLLEGE
Laboratory Name :Compiler Design Experiment No: 5
AIM: Convert The BNF rules into YACC form and write code to
generate abstract syntax tree.
PROGRAM:
<int.l>
%{
#include"y.tab.h"
#include<stdio.h>
#include<string.h>
int LineNo=1;
%}
identifier [a-zA-Z][_a-zA-Z0-9]*
number [0-9]+|([0-9]*\.[0-9]+)
%%
main\(\) return MAIN;
if return IF;
else return ELSE;
while return WHILE;
int |
char |
float return TYPE;
{identifier} {strcpy(yylval.var,yytext);
return VAR;}
{number} {strcpy(yylval.var,yytext);
return NUM;}
\< |
\> |
\>= |
\<= |
== {strcpy(yylval.var,yytext);
return RELOP;}
[ \t] ;
\n LineNo++;
. return yytext[0];
%%
<int.y>
%{
#include<string.h>
#include<stdio.h>
#include<stdlib.h> /* exit */
struct quad
{
char op[5];
char arg1[10];
char arg2[10];
char result[10];
}QUAD[30];
struct stack
{
int items[100];
int top;
}stk;
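int Index=0,tIndex=0,StNo,Ind,tInd;
extern int LineNo;
/* prototypes for the helpers defined after the grammar and for the lex-generated scanner */
void push(int data);
int pop(void);
void AddQuadruple(char op[5],char arg1[10],char arg2[10],char result[10]);
int yylex(void);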
%}
%union
{
char var[10];
}
%token <var> NUM VAR RELOP
%token MAIN IF ELSE WHILE TYPE
%type <var> EXPR ASSIGNMENT CONDITION IFST ELSEST WHILELOOP
%left '-' '+'
%left '*' '/'
%%
CODE: BLOCK
| STATEMENT CODE
| STATEMENT
;
STATEMENT: DESCT ';'
| ASSIGNMENT ';'
| CONDST
| WHILEST
;
BLOCK {
strcpy(QUAD[Index].op,"GOTO");
strcpy(QUAD[Index].arg1,"");
strcpy(QUAD[Index].arg2,"");
strcpy(QUAD[Index].result,"-1");
push(Index);
Index++;
}
;
ELSEST: ELSE{
tInd=pop();
Ind=pop();
push(tInd);
sprintf(QUAD[Ind].result,"%d",Index);
}
BLOCK{
Ind=pop();
sprintf(QUAD[Ind].result,"%d",Index);
}
;
CONDITION: VAR RELOP VAR {AddQuadruple($2,$1,$3,$$);
StNo=Index-1;
}
| VAR
| NUM
;
WHILEST: WHILELOOP{
Ind=pop();
sprintf(QUAD[Ind].result,"%d",StNo);
Ind=pop();
sprintf(QUAD[Ind].result,"%d",Index);
}
;
WHILELOOP: WHILE '(' CONDITION ')' {
strcpy(QUAD[Index].op,"==");
strcpy(QUAD[Index].arg1,$3);
strcpy(QUAD[Index].arg2,"FALSE");
strcpy(QUAD[Index].result,"-1");
push(Index);
Index++;
}
BLOCK {
strcpy(QUAD[Index].op,"GOTO");
strcpy(QUAD[Index].arg1,"");
strcpy(QUAD[Index].arg2,"");
strcpy(QUAD[Index].result,"-1");
push(Index);
Index++;
}
;
%%
extern FILE *yyin;
int main(int argc,char *argv[])
{
FILE *fp;
int i;
stk.top=-1; /* start with an empty stack; pop() treats -1 as empty */
if(argc>1)
{
fp=fopen(argv[1],"r");
if(!fp)
{
printf("\n File not found");
exit(0);
}
yyin=fp;
}
yyparse();
printf("\n\n\t\t ----------------------------"
"\n\t\t Pos Operator Arg1 Arg2 Result"
"\n\t\t ----------------------------");
for(i=0;i<Index;i++)
{
printf("\n\t\t %d\t %s\t %s\t %s\t %s",i,QUAD[i].op,QUAD[i].arg1,QUAD[i].arg2,QUAD[i].result);
}
printf("\n\t\t -----------------------");
printf("\n\n");
return 0;
}
void push(int data)
{
stk.top++;
if(stk.top==100)
{
printf("\n Stack overflow\n");
exit(0);
}
stk.items[stk.top]=data;
}
int pop()
{
int data;
if(stk.top==-1)
{
printf("\n Stack underflow\n");
exit(0);
}
data=stk.items[stk.top--];
return data;
}
void AddQuadruple(char op[5],char arg1[10],char arg2[10],char result[10])
{
strcpy(QUAD[Index].op,op);
strcpy(QUAD[Index].arg1,arg1);
strcpy(QUAD[Index].arg2,arg2);
sprintf(QUAD[Index].result,"t%d",tIndex++);
strcpy(result,QUAD[Index++].result);
}
int yyerror(const char *s)
{
printf("\n Error on line no:%d",LineNo);
return 0;
}
Input:
$vi test.c
main()
{
int a,b,c;
if(a<b)
{
a=a+b;
}
while(a<b)
{
a=a+b;
}
if(a<=b)
{
c=a-b;
}
else
{
c=a+b;
}
}
Output:
$lex int.l
$yacc -d int.y
$gcc lex.yy.c y.tab.c -ll -lm
$./a.out test.c
1. Define LR grammar?
A grammar for which we can construct a parsing table in which every entry is
uniquely defined is said to be an LR grammar.
5. What are the various kinds of intermediate representations for intermediate
code generation?
a) Syntax trees
b) Postfix notation
c) Three address code
9. Write the three address code for the assignment statement a := b * -c + b * -c
t1 := -c
t2 := b * t1
t3 := -c
t4 := b * t3
t5 := t2 + t4
a := t5
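The answer above can be reproduced mechanically by a post-order walk of the syntax tree that allocates a fresh temporary for every operator node. The sketch below is one possible way to do this, not the manual's method: the Node type and the functions gen and tac_for_example are hypothetical names, and the tree for b * -c + b * -c is built by hand rather than by a parser.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical syntax-tree node: op is '+', '*', 'u' (unary minus),
   or 0 for a leaf that only carries a name. */
typedef struct Node {
    char op;
    const char *name;
    const struct Node *left, *right;
} Node;

static int temp_count;   /* number of temporaries allocated so far */
static char out[512];    /* generated three-address code */

/* Emits code for n and writes the name that holds its value to place. */
static void gen(const Node *n, char *place, size_t size)
{
    char l[16], r[16], line[64];
    assert(n != NULL);
    if (n->op == 0) {                    /* leaf: no code needed */
        snprintf(place, size, "%s", n->name);
        return;
    }
    gen(n->left, l, sizeof l);
    if (n->op == 'u') {
        snprintf(place, size, "t%d", ++temp_count);
        snprintf(line, sizeof line, "%s := -%s\n", place, l);
    } else {
        gen(n->right, r, sizeof r);
        snprintf(place, size, "t%d", ++temp_count);
        snprintf(line, sizeof line, "%s := %s %c %s\n", place, l, n->op, r);
    }
    strncat(out, line, sizeof out - strlen(out) - 1);
}

/* Builds the tree for a := b * -c + b * -c and returns its code. */
const char *tac_for_example(void)
{
    static const Node b = {0, "b", NULL, NULL};
    static const Node c = {0, "c", NULL, NULL};
    static const Node neg1 = {'u', NULL, &c, NULL};
    static const Node neg2 = {'u', NULL, &c, NULL};
    static const Node mul1 = {'*', NULL, &b, &neg1};
    static const Node mul2 = {'*', NULL, &b, &neg2};
    static const Node sum = {'+', NULL, &mul1, &mul2};
    char place[16], line[64];

    temp_count = 0;
    out[0] = '\0';
    gen(&sum, place, sizeof place);
    snprintf(line, sizeof line, "a := %s\n", place);
    strncat(out, line, sizeof out - strlen(out) - 1);
    return out;
}
```

Because the two occurrences of b * -c are separate subtrees, the walk emits -c twice (t1 and t3), which is exactly the unoptimized sequence given in the answer.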
CMR ENGINEERING COLLEGE
Laboratory Name :Compiler Design Experiment No: 6
PROGRAM:
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
int label[20];
int no=0;
int check_label(int k);
int main()
{
FILE *fp1,*fp2;
char fname[10],op[10];
char operand1[8],operand2[8],result[8];
int i=0,c;
printf("\n Enter filename of the intermediate code");
scanf("%9s",fname);
fp1=fopen(fname,"r");
fp2=fopen("target.txt","w");
if(fp1==NULL || fp2==NULL)
{
printf("\n Error opening the file");
exit(0);
}
/* stop as soon as no further operator can be read, rather than testing feof() first */
while(fscanf(fp1,"%9s",op)==1)
{
fprintf(fp2,"\n");
i++;
if(check_label(i))
fprintf(fp2,"\nlabel#%d",i);
if(strcmp(op,"print")==0)
{
fscanf(fp1,"%s",result);
fprintf(fp2,"\n\t OUT %s",result);
}
if(strcmp(op,"goto")==0)
{
fscanf(fp1,"%s %s",operand1,operand2);
fprintf(fp2,"\n\t JMP %s,label#%s",operand1,operand2);
label[no++]=atoi(operand2);
}
if(strcmp(op,"[]=")==0)
{
fscanf(fp1,"%s %s %s",operand1,operand2,result);
fprintf(fp2,"\n\t STORE %s[%s],%s",operand1,operand2,result);
}
if(strcmp(op,"uminus")==0)
{
fscanf(fp1,"%s %s",operand1,result);
fprintf(fp2,"\n\t LOAD -%s,R1",operand1);
fprintf(fp2,"\n\t STORE R1,%s",result);
}
switch(op[0])
{
case '*': fscanf(fp1,"%s %s %s",operand1,operand2,result);
fprintf(fp2,"\n \t LOAD %s,R0",operand1);
fprintf(fp2,"\n \t LOAD %s,R1",operand2);
fprintf(fp2,"\n \t MUL R1,R0");
fprintf(fp2,"\n \t STORE R0,%s",result);
break;
case '+': fscanf(fp1,"%s %s %s",operand1,operand2,result);
fprintf(fp2,"\n \t LOAD %s,R0",operand1);
fprintf(fp2,"\n \t LOAD %s,R1",operand2);
fprintf(fp2,"\n \t ADD R1,R0");
fprintf(fp2,"\n \t STORE R0,%s",result);
break;
case '-': fscanf(fp1,"%s %s %s",operand1,operand2,result);
fprintf(fp2,"\n \t LOAD %s,R0",operand1);
fprintf(fp2,"\n \t LOAD %s,R1",operand2);
fprintf(fp2,"\n \t SUB R1,R0");
fprintf(fp2,"\n \t STORE R0,%s",result);
break;
/* The '/' and '=' cases are not in the truncated listing; they are inferred
from the DIV and STORE lines of the sample output. */
case '/': fscanf(fp1,"%s %s %s",operand1,operand2,result);
fprintf(fp2,"\n \t LOAD %s,R0",operand1);
fprintf(fp2,"\n \t LOAD %s,R1",operand2);
fprintf(fp2,"\n \t DIV R1,R0");
fprintf(fp2,"\n \t STORE R0,%s",result);
break;
case '=': fscanf(fp1,"%s %s",operand1,result);
fprintf(fp2,"\n\t STORE %s,%s",operand1,result);
break;
}
}
fclose(fp2);
fclose(fp1);
fp2=fopen("target.txt","r");
if(fp2==NULL)
{
printf("Error opening the file\n");
exit(0);
}
/* echo the generated target code; c is an int so that EOF is detected reliably */
while((c=fgetc(fp2))!=EOF)
putchar(c);
fclose(fp2);
return 0;
}
int check_label(int k)
{
int i;
for(i=0;i<no;i++)
{
if(k==label[i])
return 1;
}
return 0;
}
Input:
$vi int.txt
= t1 2
[]= a 0 1
[]= a 1 2
[]= a 2 3
* t1 6 t2
+ a[2] t2 t3
- a[2] t1 t2
/ t3 t2 t2
uminus t2 t2
print t2
goto t2 t3
= t3 99
uminus 25 t2
* t2 t3 t3
uminus t1 t1
+ t1 t3 t4
print t4
Output:
STORE t1,2
STORE a[0],1
STORE a[1],2
STORE a[2],3
LOAD t1,R0
LOAD 6,R1
ADD R1,R0
STORE R0,t3
LOAD a[2],R0
LOAD t2,R1
ADD R1,R0
STORE R0,t3
LOAD a[t2],R0
LOAD t1,R1
SUB R1,R0
STORE R0,t2
LOAD t3,R0
LOAD t2,R1
DIV R1,R0
STORE R0,t2
LOAD t2,R1
STORE R1,t2
LOAD t2,R0
JGT 5,label#11
Label#11: OUT t2
JMP t2,label#13
Label#13: STORE t3,99
LOAD 25,R1
STORE R1,t2
LOAD t2,R0
LOAD t3,R1
MUL R1,R0
STORE R0,t3
LOAD t1,R1
STORE R1,t1
LOAD t1,R0
LOAD t3,R1
ADD R1,R0
STORE R0,t4
OUT t4
special and are called formal parameters of the procedure. Arguments, known as
actual parameters, may be passed to a called procedure; they are substituted for the
formals in the body.
8. What is an activation record?
Information needed by a single execution of a procedure is managed using a
contiguous block of storage called an activation record or frame, consisting of
the collection of fields such as
a) Return value
b) Actual parameters
c) Optional control link
d) Optional access link
e) Saved machine status
f) Local data
g) Temporaries
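The fields listed above map naturally onto a C structure. The following is only an illustrative sketch: the type ActivationRecord, the function enter_procedure, the fixed array sizes and the use of a single saved program counter as the "saved machine status" are all assumptions made for this example, and a real compiler lays the record out on the run-time stack instead.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PARAMS 4   /* sizes chosen arbitrarily for the sketch */
#define MAX_LOCALS 8
#define MAX_TEMPS  8

/* One record per procedure activation, with fields in the order a) to g). */
typedef struct ActivationRecord {
    int return_value;                        /* a) return value          */
    int actual_params[MAX_PARAMS];           /* b) actual parameters     */
    struct ActivationRecord *control_link;   /* c) caller's record       */
    struct ActivationRecord *access_link;    /* d) enclosing scope       */
    size_t saved_pc;                         /* e) saved machine status  */
    int locals[MAX_LOCALS];                  /* f) local data            */
    int temporaries[MAX_TEMPS];              /* g) temporaries           */
} ActivationRecord;

/* Prepares the callee's record: links it to the caller, saves the return
   address and copies the actual parameters in. */
void enter_procedure(ActivationRecord *callee, ActivationRecord *caller,
                     const int *args, size_t nargs, size_t return_pc)
{
    size_t i;
    assert(callee != NULL && nargs <= MAX_PARAMS);
    assert(nargs == 0 || args != NULL);
    callee->return_value = 0;
    callee->control_link = caller;
    callee->access_link = NULL;   /* no nested procedures in this sketch */
    callee->saved_pc = return_pc;
    for (i = 0; i < nargs; i++)
        callee->actual_params[i] = args[i];
}
```

On return, the caller would read return_value from the callee's record and resume at saved_pc, following control_link back to its own record.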