CS3501-Compiler Lab-2021R-Updated-19-7-2023
AIM:
To write a C program to implement a symbol table.
INTRODUCTION:
A symbol table is a data structure used by a language translator, such as a compiler or
interpreter, in which each identifier in a program’s source code is associated with information
relating to its declaration or appearance in the source.
Possible entries in a symbol table:
Name : a string
Attribute:
1. Reserved word
2. Variable name
3. Type Name
4. Procedure name
5. Constant name
Data type
Scope information: where it can be used.
Storage allocation
SYMBOL TABLE
ALGORITHM:
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#include<conio.h>
int cnt=0;
struct symtab
{
char label[20];
int addr;
}sy[50];
void insert();
int search(char*);
void display();
void modify();
void main()
{
int ch,val;
char lab[10];
clrscr();
do
{
printf("\n1.Insert \n2.Display \n3.Search \n4.Modify \n5.Exit\n");
scanf("%d",&ch);
switch(ch)
{
case 1:
insert();
break;
case 2:
display();
break;
case 3:
printf("Enter the label: ");
scanf("%s",lab);
val=search(lab);
if(val==1)
printf("Label is found");
else
printf("Label is not found");
break;
case 4:
modify();
break;
case 5:
exit(0);
break;
}
}
while(ch<6);
}
void insert()
{
int val;
char lab[10];
printf("Enter the label: ");
scanf("%s",lab);
val=search(lab);
if(val==1)
printf("Duplicate symbol");
else
{
strcpy(sy[cnt].label,lab);
printf("Enter the address: ");
scanf("%d",&sy[cnt].addr);
cnt++;
}
}
int search(char *s)
{
int flag=0,i;
for(i=0;i<cnt;i++)
{
if(strcmp(sy[i].label,s)==0)
flag=1;
}
return flag;
}
void modify()
{
int val,ad,i;
char lab[10];
printf("Enter the label: ");
scanf("%s",lab);
val=search(lab);
if(val==0)
printf("No such symbol");
else
{
printf("Label is found\n");
printf("Enter the address: ");
scanf("%d",&ad);
for(i=0;i<cnt;i++)
{
if(strcmp(sy[i].label,lab)==0)
sy[i].addr=ad;
}
}
}
void display()
{
int i;
for(i=0;i<cnt;i++)
printf("%s \t %d \n",sy[i].label,sy[i].addr);
}
OUTPUT:
1.insert
2.display
3.search
4.modify
5.exit
1
enter the label A enter the address 2000
1.insert
2.display
3.search
4.modify
5.exit
1
enter the label SUB enter the address 3000
1.insert
2.display
3.search
4.modify
5.exit
1
1.insert
2.display
3.search
4.modify
5.exit
2
1.insert
2.display
3.search
4.modify
5.exit
5
RESULT:
Thus the C program for symbol table is implemented and executed successfully.
EX.NO:1B
AIM:
To write and execute a C program to implement the lexical analyzer.
INTRODUCTION:
Lexical analysis is the process of converting a sequence of characters (such as in a
computer program or web page) into a sequence of tokens (strings with an identified
“meaning”). A program that performs lexical analysis may be called a lexer, tokenizer, or
scanner.
TOKEN
The process of forming tokens from an input stream of characters is called tokenization.
Consider this expression in the C programming language: Sum=3 + 2;
Tokenized and represented by the following table:
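#include<stdio.h>
#include<conio.h>
#include<stdlib.h>
#include<string.h>
#include<ctype.h>
/* Returns 1 if buffer is a keyword, otherwise 0.
   The keyword list is an assumed small subset of the C keywords. */
int isKeyword(char buffer[]){
char keywords[8][10] = {"int","void","char","float","if","else","while","return"};
int i, flag = 0;
for(i = 0; i < 8; ++i){
if(strcmp(keywords[i], buffer) == 0){
flag = 1;
break;
}
}
return flag;
}
int main(){
char buffer[15], operators[] = "+-*/%=";
int ch; /* int, so that EOF can be told apart from a character */
FILE *fp;
int i,j=0;
clrscr();
fp = fopen("input1.txt","r");
if(fp == NULL){
printf("error while opening the file\n");
exit(0);
}
/* Read the file one character at a time until end of file. */
while((ch = fgetc(fp)) != EOF){
if(isalnum(ch)){
if(j < 14) /* leave room for the terminating '\0' */
buffer[j++] = ch;
continue;
}
/* Any non-alphanumeric character ends the current word. */
if(j != 0){
buffer[j] = '\0';
j = 0;
if(isKeyword(buffer) == 1)
printf("%s is keyword\n", buffer);
else
printf("%s is identifier\n", buffer);
}
/* Report the character if it is one of the operators. */
for(i = 0; i < 6; ++i){
if(ch == operators[i])
printf("%c is operator\n", ch);
}
}
fclose(fp);
getch();
return 0;
}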
input1.txt
void main()
{
int a, b, c ;
a=b+c;
}
OUTPUT:
RESULT:
Thus the C program for a lexical analyzer that recognizes a few patterns is implemented and
executed successfully.
EX.NO:2
AIM:
To implement the lexical analyzer using lex tool for a subset of C language.
INTRODUCTION:
THEORY:
Lex is a language for specifying lexical analyzers.
There is a wide range of tools for constructing lexical analyzers. The majority of these
tools are based on regular expressions.
One of the traditional tools of this kind is lex.
LEX:
Lex is used in the manner depicted. A specification of the lexical analyzer is prepared
by creating a program lex.l in the lex language.
Then lex.l is run through the lex compiler to produce a C program lex.yy.c.
The program lex.yy.c consists of a tabular representation of a transition diagram
constructed from the regular expressions of lex.l, together with a standard routine that uses
the table to recognize lexemes.
lex.yy.c is run through the C compiler to produce an object program a.out, which is the
lexical analyzer that transforms an input stream into a sequence of tokens.
LEX SOURCE:
ALGORITHM:
Step1: Start the program.
Step2: Declare the necessary variables and create token representations using regular expressions.
Step3: Print the preprocessor directives and keywords found by analysing the input program.
Step4: In the program, check whether there are arguments.
Step5: Declare a file and open it in read mode.
Step6: Read the file; if any token in the source program matches a regular expression, it is
returned as an integer value.
Step7: Print the tokens identified using the yylex() function.
Step8: Stop the program.
PROGRAM
identifier [a-zA-Z][a-zA-Z0-9]*
number [0-9]+
%%
int |
float |
main |
void |
include |
stdio.h |
switch |
case |
long |
struct |
const |
typedef |
return |
else |
goto {printf("\n \t %s is a keyword",yytext);}
{identifier} {printf("\n \t %s is an identifier",yytext);}
{number} {printf("\n \t %s is a number",yytext);}
\, |
\; |
\. {printf("\n \t %s is a symbol",yytext);}
\<= |
\>= |
\> |
\< |
\= |
\{ |
\} |
\( |
\# |
\) {printf("\n \t %s is a operator",yytext);}
%%
int main(int argc,char** argv)
{
FILE *f=fopen(argv[1],"r");
yyin=f;
yylex();
return 0;
}
int yywrap()
{
/* Return 1 to tell the scanner that there is no more input at end of file. */
return 1;
}
Input.c
#include<stdio.h>
main()
{
int a,b;
}
OUTPUT:
# is a operator
include is a keyword
< is a operator
stdio.h is a keyword
> is a operator
main is a keyword
( is a operator
) is a operator
{ is a operator
int is a keyword
a is an identifier
, is a symbol
b is an identifier
; is a symbol
} is a operator
RESULT:
Thus the program to implement lexical analyzer using lex tool is executed and implemented
successfully.
EX.NO:3A
AIM:
To write a YACC program to recognize a valid arithmetic expression that uses the operators
+, -, * and /.
INTRODUCTION:
YACC (Yet Another Compiler-Compiler) is a program designed to compile an LALR(1) grammar
and to produce the source code of the syntactic analyzer (parser) for the language
generated by that grammar.
ALGORITHM:
%{
#include<stdio.h>
#include<ctype.h>
#include<stdlib.h>
int yylex();
int yyerror(char *s);
%}
%token num let
%left '+' '-'
%left '*' '/'
%%
stmt: expr '\n' {printf("\n..valid Expression..\n"); exit(0);}
| error '\n' {printf("\n..Invalid..\n"); exit(0);}
;
expr: num
| let
| expr '+' expr
| expr '-' expr
| expr '*' expr
| expr '/' expr
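| '(' expr ')'
;
%%
int main()
{
printf("Enter an expression to validate :");
yyparse();
return 0;
}
/* Hand-written scanner: skips blanks, returns num for a digit,
   let for a letter, and any other character as itself. */
int yylex()
{
int ch;
while((ch=getchar())==' ');
if(isdigit(ch))
return num;
if(isalpha(ch))
return let;
return ch;
}
int yyerror(char *s)
{
printf("%s",s);
return 0;
}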
OUTPUT
$ yacc -d y1.y
$ cc y.tab.c -ll
$ ./a.out
Enter an expression to validate : a+b
valid Expression
$ ./a.out
Enter an expression to validate : (a+b
Invalid
RESULT:
Thus the program for validating arithmetic expressions using Yacc is implemented and executed
successfully.
EX.NO:3B
ALGORITHM:
$ yacc -d y4.y
$ cc y.tab.c -ll
$ ./a.out total30
Accepted
$ ./a.out 40a
Syntax error Rejected
RESULT:
Thus the program to recognize a valid variable, which starts with a letter followed by any
number of letters or digits, is executed successfully.
EX.NO:3D
AIM:
To write a program that implements a calculator using lex and yacc.
ALGORITHM:
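%{
#include<stdio.h>
#include<ctype.h>
#include<stdlib.h>
int yylex();
int yyerror(char *s);
%}
%token num let
%left '+' '-'
%left '*' '/'
%%
stmt: expr '\n' { printf("Answer: %d", $1); }
;
/* The expr rules below are a reconstruction (assumed): each action
   computes the value of the expression from its operands. */
expr: num { $$ = $1; }
| expr '+' expr { $$ = $1 + $3; }
| expr '-' expr { $$ = $1 - $3; }
| expr '*' expr { $$ = $1 * $3; }
| expr '/' expr { $$ = $1 / $3; }
| '(' expr ')' { $$ = $2; }
;
%%
int main()
{
printf("Enter an expression to validate :");
yyparse();
return 0;
}
int yylex()
{
int ch;
while((ch=getchar())==' ');
if(isdigit(ch))
{
/* Pass the value of the digit to the parser. */
yylval=ch-'0';
return num;
}
if(isalpha(ch))
return let;
return ch;
}
int yyerror(char *s)
{
printf("%s",s);
return 0;
}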
OUTPUT
$yacc cal.y
$cc y.tab.c
$./a.out
Enter an expression to validate :4*3
Answer:12
RESULT:
Thus the program to implement calculator using LEX and YACC tool is executed
successfully and output is verified.
EX.No:4
GENERATE THREE ADDRESS CODE FOR A SIMPLE PROGRAM USING LEX AND
YACC.
AIM:
To convert the BNF rules into Yacc form and write code to generate an abstract syntax tree.
INTRODUCTION:
BNF (Backus-Naur Form) is a formal notation for encoding grammars intended for human
consumption. Many programming languages, protocols, and formats have a BNF description in their
specification.
ALGORITHM:
Step1: Start the program.
Step2: In int code.l, declare the variable LineNo as an integer and initialize it to 1.
Step3: In the translation rules section, define keywords, data types, and integers along with
their actions.
Step4: Start the main block. In the main block, check the type of statement:
1. declarative 2. assignment 3. conditional 4. if and else 5. while assignment.
Step5: Perform the actions of that particular block.
Step6: In the main program, declare the parameters argc as int and *argv[] as char, and open the
file in read mode.
Step7: Print the output in a file.
Step8: End the program.
PROGRAM:
Lex<Bnf.L>
%{
#include"y.tab.h"
#include<stdio.h>
#include<string.h>
int LineNo=1;
%}
identifier [a-zA-Z][_a-zA-Z0-9]*
number [0-9]+|([0-9]*\.[0-9]+)
%%
main\(\) return MAIN;
int |
char |
float return TYPE;
{identifier} {strcpy(yylval.var,yytext);
return VAR;}
{number} {strcpy(yylval.var,yytext);
return NUM;}
[ \t] ;
\n LineNo++;
. return yytext[0];
%%
Yacc <Bnf.Y>
%{
#include<string.h>
#include<stdio.h>
struct quad
{
char op[5];
char arg1[10];
char arg2[10];
char result[10];
}QUAD[30];
int Index=0,tIndex=0,StNo,Ind,tInd;
int LineNo;
%}
%union
{
char var[10];
}
%token <var> NUM VAR RELOP
%token MAIN TYPE
%type <var> EXPR ASSIGNMENT
%left '-' '+'
yyparse();
printf("\n\n\t\t ----------------------------\n\t\t Pos Operator Arg1 Arg2 Result\n\t\t--------------------");
for(i=0;i<Index;i++)
{
printf("\n\t\t %d\t %s\t %s\t %s\t %s",i,QUAD[i].op,QUAD[i].arg1,QUAD[i].arg2,QUAD[i].result);
}
printf("\n\t\t -----------------------");
printf("\n\n");
return 0;
}
void AddQuadruple(char op[5],char arg1[10],char arg2[10],char result[10])
{
strcpy(QUAD[Index].op,op);
strcpy(QUAD[Index].arg1,arg1);
strcpy(QUAD[Index].arg2,arg2);
sprintf(QUAD[Index].result,"t%d",tIndex++);
strcpy(result,QUAD[Index++].result);
}
yyerror()
{
printf("\n Error on line no:%d",LineNo);
}
----------------------------------------------------
Pos    Operator    Arg1    Arg2    Result
----------------------------------------------------
0      +           a       b       t1
1      =           t1              c
2      -           c       a       t2
3      =           t2              b
----------------------------------------------------
RESULT:
Thus the program to convert the BNF rules into Yacc form is implemented and executed
successfully.
EX.NO:5
AIM:
To write a C program to implement type checking.
INTRODUCTION:
Type analysis and type checking are important activities performed in the semantic
analysis phase. The need for type checking is
[Figure: source program -> scanner -> type checker or semantic analyser -> intermediate code; labelled "Returns type info"]
ALGORITHM:
Step 1: Start the program.
Step 2: Declare the necessary variables a, b, mess, and mess1.
Step 3: Read a value for a, find its length l, and copy the result of type(a,l) into mess.
Step 4: Read a value for b, find its length, copy the result of type(b,l) into mess1, and print it.
Step 5: Compare mess and mess1 with "int"; if strcmp returns 0 for both, print that there is no type error.
Step 6: Otherwise, print the type error.
Step 7: In the function type(), declare x, m, and mes.
Step 8: If any character x[i] is alphabetic, copy "AlphaNumeric" to mes; otherwise, if x[i]=='.',
copy "float" to mes; if neither occurs, copy "int" to mes.
Step 9: Finally, return the mes value.
Step 10: Stop the program.
PROGRAM:
#include <stdio.h>
#include <conio.h>
#include <ctype.h>
#include <string.h>
#include <stdlib.h>
char* type(char[],int);
void main() {
char a[10],b[10],mess[20],mess1[20];
int i,l;
clrscr();
printf( "\n\n int a,b;\n\n int c=a+b\n");
printf( "\n\n Enter a value for a\n");
scanf("%s", a);
l=strlen(a);
printf(" \n a is :");
strcpy(mess,type(a,l));
printf("%s",mess);
printf( "\n\n Enter a value for b\n\n");
scanf("%s",b);
l=strlen(b);
printf(" \n b is :");
strcpy(mess1,type(b,l));
printf("%s",mess1);
if(strcmp(mess,"int")==0&&strcmp(mess1,"int")==0) {
printf("\n\n No Type Error");
}
else
{
printf("\n\n Type Error");
}
getch();
}
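char* type(char x[],int m)
{
int i;
/* static, so that the returned pointer stays valid after the function returns */
static char mes[20];
for(i=0;i<m;i++)
{
if(isalpha(x[i]))
{
strcpy(mes,"AlphaNumeric");
return mes;
}
else if(x[i]=='.')
{
strcpy(mes,"float");
return mes;
}
}
/* no letter and no decimal point: treat the value as an integer */
strcpy(mes,"int");
return mes;
}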
OUTPUT:
Enter a value for a
10
a is :int
Enter a value for b
10
b is :int
No Type Error
EX.NO:6
AIM:
To write a program for implementation of Code Optimization
INTRODUCTION:
Code Optimization
Optimization is a program transformation technique that tries to improve the code by making
it consume fewer resources (i.e. CPU, memory) and run faster.
Loop-invariant code movement or code motion:
Loop-invariant code consists of statements or expressions (in an imperative programming
language) which can be moved outside the body of a loop without affecting the semantics of the
program. Loop-invariant code motion (also called hoisting or scalar promotion) is a compiler
optimization which performs this movement automatically.
Strength reduction:
Strength reduction is a compiler optimization where expensive operations are replaced with
equivalent but less expensive operations.
ALGORITHM:
Step1: Start the program
Step2: Write the program without code movement and get the output
Step3: Implement the above program with code movement and get the output.
Step4: Write the program without strength reduction and get the output
Step5: Implement the above program with strength reduction and get the output.
Step6: Stop the program
PROGRAM:
Program1:
#include<stdio.h>
#include<conio.h>
#define max 6
void main()
{
int n=1,s=0;
clrscr();
printf("Output without Code movement technique:\n");
while(n<=max-1)
{
s=s+n;
n++;
}
printf("Sum of First 5 Numbers:%d",s);
getch();
}
Output:
Output without Code movement Technique
Sum of First 5 Numbers:15
#include<stdio.h>
#include<conio.h>
#define max 6
void main()
{
int n=1,s=0,z;
clrscr();
printf("Output with Code movement technique:\n");
z=max-1;
while(n<=z)
{
s=s+n;
n++;
}
printf("Sum of First 5 Numbers:%d",s);
getch();
}
Output:
Output with Code movement Technique
Sum of First 5 Numbers:15
Program2:
#include<stdio.h>
#include<conio.h>
void main()
{
int i,s;
clrscr();
printf("Output without strength reduction:\n");
for(i=1;i<=10;i++)
{
s=i*2;
printf("%d ",s);
}
getch();
}
Output:
Output without strength reduction:
2 4 6 8 10 12 14 16 18 20
#include<stdio.h>
#include<conio.h>
void main()
{
int i,s;
clrscr();
printf("Output with strength reduction:\n");
for(i=1;i<=10;i++)
{
s=i+i;
printf("%d ",s);
}
getch();
}
Output:
Output with strength reduction:
2 4 6 8 10 12 14 16 18 20
RESULT:
Thus the program for implementation of Code Optimization technique is executed and verified.
Ex.No:7
IMPLEMENT THE BACK END OF THE COMPILER
AIM
To implement the back end of the compiler which takes the three address code and
produces the 8086 assembly language instructions that can be assembled and run using
an 8086 assembler. The target assembly instructions can be simple move, add, sub,
jump. Also simple addressing modes are used.
INTRODUCTION:
A compiler is a computer program that implements a programming language
specification to “translate” programs, usually as a set of files which constitute the
source code written in source language, into their equivalent machine readable
instructions (the target language, often having a binary form known as object code).
This translation process is called compilation.
BACK END:
Some local optimization
Register allocation
Peep-hole optimization
Code generation
Instruction scheduling
The main phases of the back end include the following:
Analysis: This is the gathering of program information from the intermediate
representation derived from the input; data-flow analysis is used to build use-define chains,
together with dependence analysis, alias analysis, pointer analysis, escape analysis, etc.
Optimization: The intermediate language representation is transformed into functionally
equivalent but faster (or smaller) forms. Popular optimizations are inline expansion, dead
code elimination, constant propagation, loop transformation, register allocation, and even
automatic parallelization.
Code generation: The transformed language is translated into the output language, usually
the native machine language of the system. This involves resource and storage decisions,
such as deciding which variables to fit into registers and memory, and the selection
and scheduling of appropriate machine instructions along with their associated addressing modes.
Debug data may also need to be generated to facilitate debugging.
ALGORITHM:
Step1: Start the program
Step2: Read the intermediate code from input file
Step3: For each instruction, move one of the operands to a register using a MOV statement
Step4: Then perform the required arithmetic operation
Step5: print the assembly language for each statement
Step6: Stop the program
PROGRAM:
#include<stdio.h>
#include<conio.h>
#include<string.h>
char op[2],arg1[5],arg2[5],result[5];
void main()
{
FILE *fp1,*fp2;
fp1=fopen("input.txt","r");
fp2=fopen("output.txt","w");
if(fp1==NULL||fp2==NULL)
{
printf("error while opening the file\n");
return;
}
/* Read one quadruple (op arg1 arg2 result) per iteration until end of file. */
while(fscanf(fp1,"%s%s%s%s",op,arg1,arg2,result)==4)
{
if(strcmp(op,"+")==0)
{
fprintf(fp2,"\nMOV R0,%s",arg1);
fprintf(fp2,"\nADD R0,%s",arg2);
fprintf(fp2,"\nMOV %s,R0",result);
}
if(strcmp(op,"*")==0)
{
fprintf(fp2,"\nMOV R0,%s",arg1);
fprintf(fp2,"\nMUL R0,%s",arg2);
fprintf(fp2,"\nMOV %s,R0",result);
}
if(strcmp(op,"-")==0)
{
fprintf(fp2,"\nMOV R0,%s",arg1);
fprintf(fp2,"\nSUB R0,%s",arg2);
fprintf(fp2,"\nMOV %s,R0",result);
}
if(strcmp(op,"/")==0)
{
fprintf(fp2,"\nMOV R0,%s",arg1);
fprintf(fp2,"\nDIV R0,%s",arg2);
fprintf(fp2,"\nMOV %s,R0",result);
}
/* Assignment: copy arg1 to result through R0; arg2 is ignored. */
if(strcmp(op,"=")==0)
{
fprintf(fp2,"\nMOV R0,%s",arg1);
fprintf(fp2,"\nMOV %s,R0",result);
}
}
fclose(fp1);
fclose(fp2);
getch();
}
input.txt
+ a b t1
* c d t2
- t1 t2 t
= t ? x
OUTPUT:
output.txt
MOV R0,a
ADD R0,b
MOV t1,R0
MOV R0,c
MUL R0,d
MOV t2,R0
MOV R0,t1
SUB R0,t2
MOV t,R0
MOV R0,t
MOV x,R0
RESULT:
Thus the back end of the compiler, which produces machine instructions from three address
code, was implemented and executed successfully.