Compiler Design Lab

The document describes a compiler design lab manual for a computer science course. It includes the vision, mission and goals of the computer science department. It then describes the syllabus for a Compiler Design Lab course which focuses on designing a compiler for a sample mini programming language. The lab experiments include designing a lexical analyzer, implementing it, designing predictive and LALR parsers, generating an abstract syntax tree from the parsers and generating machine code from the abstract syntax tree.

Uploaded by

Pranove AB

CMR ENGINEERING COLLEGE

Kandlakoya, Medchal, Hyderabad-501401


R13 REGULATION

III - I CSE (2015-2016)

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

COMPILER DESIGN LABORATORY

LAB MANUAL

Computer Science and Engineering
Vision & Mission
Vision

To produce globally competent and industry-ready graduates in Computer
Science & Engineering by imparting quality education with know-how
of cutting-edge technology and a holistic personality.

Mission

To offer high-quality education in Computer Science & Engineering in order
to build core competence in students by laying a solid foundation in
Applied Mathematics and program framework, with a focus on concept
building.

The department promotes excellence in teaching, research, and
collaborative activities to prepare students for a professional career or higher
studies.

Creating an intellectual environment for developing logical skills and
problem-solving strategies, and thus developing able and proficient computer
engineers to compete in the current global scenario.

Lab Floor Plans & Layout
[Figure: lab floor plan, with the ENTRY marked]

PREFACE

This lab is part of the III B. Tech I semester for CSE students. Compiler design
principles provide an in-depth view of the translation and optimization process. This
lab enables students to practice basic translation mechanisms by designing a
complete translator for a mini language, including error detection and recovery. It
covers lexical, syntax, and semantic analysis as the front end, and code generation
and optimization as the back end, with the recommended system/software requirements,
following the university-prescribed textbooks. The expected outcomes for the
students are:

1. Through this laboratory, students will understand the practical approach
to how a compiler works.

2. This will enable them to work in the development phase of new computer
languages in industry.

SYLLABUS

(A50587) COMPILER DESIGN LAB

Objectives:

• To provide an understanding of the language translation peculiarities by designing a
complete translator for a mini language.

Recommended System / Software Requirements:

• Intel based desktop PC with a 166 MHz or faster processor, at least 64
MB RAM and 100 MB free disk space
• C++ compiler and JDK

Consider the following mini Language, a simple procedural high-level language, only
operating on integer data, with a syntax looking vaguely like a simple C crossed with Pascal.
The syntax of the language is defined by the following BNF grammar:

<program> ::= <block>

<block> ::= { <variabledefinition> <slist> } | { <slist> }

<variabledefinition> ::= int <vardeflist> ;

<vardeflist> ::= <vardec> | <vardec>, <vardeflist>

<vardec> ::= <identifier> | <identifier> [ <constant> ]

<slist> ::= <statement> | <statement>; <slist>

<statement> ::= <assignment> | <ifstatement> | <whilestatement> | <block> | <printstatement> | <empty>

<assignment> ::= <identifier> = <expression> | <identifier> [ <expression> ] = <expression>

<ifstatement> ::= if <bexpression> then <slist> else <slist> endif | if <bexpression> then <slist> endif

<whilestatement> ::= while <bexpression> do <slist> enddo

<printstatement> ::= print ( <expression> )

<expression> ::= <expression> <addingop> <term> | <term> | <addingop> <term>

<bexpression> ::= <expression> <relop> <expression>

<relop> ::= < | <= | == | >= | > | !=

<addingop> ::= + | -

<term> ::= <term> <multop> <factor> | <factor>

<multop> ::= * | /

<factor> ::= <constant> | <identifier> | <identifier> [ <expression> ] | ( <expression> )

<constant> ::= <digit> | <digit> <constant>

<identifier> ::= <identifier> <letterordigit> | <letter>

<letterordigit> ::= <letter> | <digit>

<letter> ::= a|b|c|d|e|f|g|h|i|j|k|l|m|n|o|p|q|r|s|t|u|v|w|x|y|z

<digit> ::= 0|1|2|3|4|5|6|7|8|9

<empty> has the obvious meaning

Comments (zero or more characters enclosed between the standard C / Java style comment
brackets /*...*/) can be inserted. The language has rudimentary support for 1-dimensional
arrays. The declaration

int a[3] declares an array of three elements, referenced as a[0], a[1] and a[2]. Note also that
you should worry about the scoping of names.

A simple program written in this language is:

{
int a[3], t1, t2;
t1 = 2;
a[0] = 1; a[1] = 2; a[t1] = 3;
t2 = -(a[2] + t1 * 6)/(a[2] - t1);
if t2 > 5 then
print(t2);
else
{
int t3;
t3 = 99;
t2 = -25;
print(-t1 + t2 * t3); /* this is a comment on 2 lines */
}
endif
}

1. Design a Lexical analyzer for the above language. The lexical analyzer should ignore
redundant spaces, tabs and newlines. It should also ignore comments. Although the
syntax specification states that identifiers can be arbitrarily long, you may restrict the
length to some reasonable value.
2. Implement the lexical analyzer using JLex, flex or lex or other lexical analyzer
generating tools.
3. Design Predictive parser for the given language.
4. Design LALR bottom up parser for the above language.
5. Convert the BNF rules into Yacc form and write code to generate abstract syntax tree.
6. Write program to generate machine code from the abstract syntax tree generated by
the parser. The following instruction set may be considered as target code.

The following is a simple register-based machine, supporting a total of 17 instructions. It has
three distinct internal storage areas. The first is the set of 8 registers, used by the individual
instructions as detailed below, the second is an area used for the storage of variables and the
third is an area used for the storage of program. The instructions can be preceded by a label.
This consists of an integer in the range 1 to 9999 and the label is followed by a colon to
separate it from the rest of the instruction. The numerical label can be used as the argument to
a jump instruction, as detailed below.

In the description of the individual instructions below, instruction argument types are
specified as follows:

R specifies a register in the form R0, R1, R2, R3, R4, R5, R6 or R7 (or r0, r1, etc).

L specifies a numerical label (in the range 1 to 9999).

V specifies a "variable location" (a variable number, or a variable location pointed to by a
register - see below).

A specifies a constant value, a variable location, a register or a variable location pointed to by
a register (an indirect address). Constant values are specified as an integer value, optionally
preceded by a minus sign, preceded by a # symbol. An indirect address is specified by an @
followed by a register.

So, for example an A-type argument could have the form 4 (variable number 4), #4 (the
constant value 4), r4 (register 4) or @r4 (the contents of register 4 identifies the variable
location to be accessed).

The instruction set is defined as follows:


LOAD A, R
loads the integer value specified by A into register R.
STORE R, V
stores the value in register R to variable V.
OUT R
outputs the value in register R.
NEG R
negates the value in register R.
ADD A, R
adds the value specified by A to register R, leaving the result in register R.
SUB A, R
subtracts the value specified by A from register R, leaving the result in register R.
MUL A, R
multiplies the value specified by A by register R, leaving the result in register R.
DIV A, R
divides register R by the value specified by A, leaving the result in register R.
JMP L
causes an unconditional jump to the instruction with the label L.
JEQ R, L
jumps to the instruction with the label L if the value in register R is zero.
JNE R, L
jumps to the instruction with the label L if the value in register R is not zero.
JGE R, L
jumps to the instruction with the label L if the value in register R is greater than or equal to
zero.
JGT R, L
jumps to the instruction with the label L if the value in register R is greater than zero.
JLE R, L
jumps to the instruction with the label L if the value in register R is less than or equal to zero.
JLT R, L
jumps to the instruction with the label L if the value in register R is less than zero.
NOP
is an instruction with no effect. It can be tagged by a label.
STOP
stops execution of the machine. All programs should terminate by executing a STOP
instruction.
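
To make the semantics of the instruction set concrete, the following is a minimal sketch of an interpreter for the straight-line subset (no labels or jumps). The in-memory encoding (struct instr, the OP_ and ARG_ names, the 64-entry variable area) and the functions run and run_demo are assumptions made for illustration; the manual only defines the textual form of the instructions. Division by zero and out-of-range variable numbers are not checked.

```c
#include <stdio.h>

/* Hypothetical encoding of the textual instructions; jumps, labels and NOP are omitted. */
enum opcode { OP_LOAD, OP_STORE, OP_ADD, OP_SUB, OP_MUL, OP_DIV, OP_NEG, OP_OUT, OP_STOP };
enum argkind { ARG_CONST, ARG_VAR, ARG_REG, ARG_INDIRECT };

struct instr {
    enum opcode op;
    enum argkind kind;  /* form of the A or V argument; unused by NEG, OUT and STOP */
    int arg;            /* constant value, variable number, or register number */
    int r;              /* register operand R */
};

static int reg[8];      /* R0..R7 */
static int var[64];     /* variable storage; the size is an assumption */

/* Value of an A-type argument written as #n, n, rn or @rn. */
static int value_of(const struct instr *in)
{
    switch (in->kind) {
    case ARG_CONST:    return in->arg;
    case ARG_VAR:      return var[in->arg];
    case ARG_REG:      return reg[in->arg];
    case ARG_INDIRECT: return var[reg[in->arg]];
    }
    return 0;
}

/* Execute instructions until STOP; return the value most recently printed by OUT. */
int run(const struct instr *pc)
{
    int last_out = 0;
    for (; pc->op != OP_STOP; pc++) {
        switch (pc->op) {
        case OP_LOAD:  reg[pc->r] = value_of(pc); break;
        case OP_STORE: /* V is either a variable number or @register */
            if (pc->kind == ARG_INDIRECT) var[reg[pc->arg]] = reg[pc->r];
            else                          var[pc->arg] = reg[pc->r];
            break;
        case OP_ADD:   reg[pc->r] += value_of(pc); break;
        case OP_SUB:   reg[pc->r] -= value_of(pc); break;
        case OP_MUL:   reg[pc->r] *= value_of(pc); break;
        case OP_DIV:   reg[pc->r] /= value_of(pc); break;
        case OP_NEG:   reg[pc->r] = -reg[pc->r]; break;
        case OP_OUT:   last_out = reg[pc->r]; printf("%d\n", last_out); break;
        case OP_STOP:  break;  /* not reached: the loop ends at STOP */
        }
    }
    return last_out;
}

/* Hand translation of: t1 = 2; t2 = -25; t3 = 99; print(-t1 + t2 * t3);
   with t1, t2 and t3 placed in variables 1, 2 and 3. */
int run_demo(void)
{
    static const struct instr code[] = {
        { OP_LOAD,  ARG_CONST,   2, 0 },  /* LOAD #2, R0   */
        { OP_STORE, ARG_VAR,     1, 0 },  /* STORE R0, 1   */
        { OP_LOAD,  ARG_CONST, -25, 0 },  /* LOAD #-25, R0 */
        { OP_STORE, ARG_VAR,     2, 0 },  /* STORE R0, 2   */
        { OP_LOAD,  ARG_CONST,  99, 0 },  /* LOAD #99, R0  */
        { OP_STORE, ARG_VAR,     3, 0 },  /* STORE R0, 3   */
        { OP_LOAD,  ARG_VAR,     1, 0 },  /* LOAD 1, R0    */
        { OP_NEG,   ARG_CONST,   0, 0 },  /* NEG R0        */
        { OP_LOAD,  ARG_VAR,     2, 1 },  /* LOAD 2, R1    */
        { OP_MUL,   ARG_VAR,     3, 1 },  /* MUL 3, R1     */
        { OP_ADD,   ARG_REG,     1, 0 },  /* ADD R1, R0    */
        { OP_OUT,   ARG_CONST,   0, 0 },  /* OUT R0        */
        { OP_STOP,  ARG_CONST,   0, 0 }   /* STOP          */
    };
    return run(code);
}
```

Calling run_demo() from a main function prints -2477, the value of -t1 + t2 * t3 for the values used in the sample program. Supporting the jump instructions would additionally need a table from label numbers to instruction positions.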

List Of Experiments

Experiment 1: Design a Lexical analyzer for the above language. The lexical
analyzer should ignore redundant spaces, tabs and newlines. It should also
ignore comments.

Experiment 2: Implement the lexical analyzer using JLex, flex or lex or other
lexical analyzer generating tools.

Experiment 3: Design Predictive parser for the given language.

Experiment 4: Design LALR bottom up parser for the above language.

Experiment 5: Convert the BNF rules into Yacc form and write code to
generate abstract syntax tree.

Experiment 6: Write program to generate machine code from the abstract
syntax tree generated by the parser.

COMPILER DESIGN LAB EQUIPMENT SPECIFICATIONS

ADEQUATE AND WELL EQUIPPED LABORATORIES AND TECHNICAL MANPOWER.

Sl. No: 4
Name of the Laboratory: DATABASE MANAGEMENT SYSTEMS LAB / COMPILER DESIGN LAB
No. of students per setup (Batch size): 30 Nos.
Name of the important equipment: 30 Zenith computers (Intel Pentium dual core
processor with 3.5 GHz speed, RAM 2 GB, Hard Disk 320 GB), 15 KVA UPS,
24-port switches - 3 Nos., 16-port switch - 1 No., 8-port switch - 1 No.,
Batteries - 19 Nos., ACs - 2 Nos., Projector - 1 No., LAN speed 100 Mbps,
HP Printer - 1 No., Amplifier - 1 No., Speakers - 2 Nos., Chairs - 30 Nos.,
Table - 1 No.
Weekly utilization status (all the courses for which the lab is utilized):
UG: II B. Tech - II Sem / III B. Tech - I Sem
Technical Manpower support:
Name of the Technical Staff: Ms. A. Mounika
Designation: Programmer
Qualification: B. Tech

CMR ENGINEERING COLLEGE
Laboratory Name :Compiler Design Experiment No: 1

AIM: A Program to Design Lexical Analyzer.

PROGRAM:

[examuser56@localhost ~]$ vi lexical.c

#include<stdio.h>
#include<string.h>

int main(void)
{
    char exp[20],id[20],dig[20],ch;
    int i,j;
    printf("Enter the expression: ");
    /* read one token without spaces, at most 19 characters */
    if(scanf("%19s",exp)!=1)
        return 1;

    for(i=0;i<(int)strlen(exp);)
    {
        ch=exp[i];
        j=0;
        if(ch>='a'&&ch<='z')
        {
            /* identifier: a letter followed by letters or digits */
            id[j++]=ch;
            i++;
            while((exp[i]>='a'&&exp[i]<='z')||(exp[i]>='0'&&exp[i]<='9'))
            {
                id[j++]=exp[i++];
            }
            id[j]='\0';
            printf("identifier : %s\n",id);
        }
        else if(ch=='+'||ch=='-'||ch=='*'||ch=='/'||ch=='%'||ch=='=')
        {
            printf("operator: %c\n",ch);
            i++;
        }
        else if(ch>='0'&&ch<='9')
        {
            /* constant: a run of digits */
            dig[j++]=ch;
            i++;
            while(exp[i]>='0'&&exp[i]<='9')
                dig[j++]=exp[i++];
            dig[j]='\0';
            printf("constant: %s\n",dig);
        }
        else
            i++; /* skip any other character so that the loop always advances */
    }//for
    return 0;
}

Output:

[examuser56@localhost ~]$ gcc lexical.c


[examuser56@localhost ~]$ ./a.out
Enter the expression: a=b+c
identifier : a
operator: =
identifier : b
operator: +
identifier : c
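
The experiment statement asks the analyzer to ignore redundant spaces, tabs, newlines and comments, which the program above does not attempt because it reads a single space-free token. A minimal sketch of that step is given below; the function name skip_blanks_and_comments is hypothetical, and treating an unterminated comment as running to the end of the input is an assumption.

```c
#include <ctype.h>

/* Return a pointer to the next character that is neither white space
   nor part of a comment enclosed in the C-style brackets. */
const char *skip_blanks_and_comments(const char *p)
{
    for (;;) {
        while (isspace((unsigned char)*p))   /* spaces, tabs, newlines */
            p++;
        if (p[0] == '/' && p[1] == '*') {
            p += 2;                          /* step over the opening bracket */
            while (*p != '\0' && !(p[0] == '*' && p[1] == '/'))
                p++;
            if (*p != '\0')
                p += 2;                      /* step over the closing bracket */
        } else {
            return p;                        /* start of a token, or end of input */
        }
    }
}
```

A scanner loop could call this function before recognizing each token, so that the token rules never see blanks or comment text.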

LAB VIVA QUESTIONS & ANSWERS

1. Define compilers and translators?


A translator is a program that takes as input a program written in one
programming language and produces as output a program in another language.
If the source language is a high level language and the object language is a low-
level language then such a translator is called a compiler.

2. What are the phases of a compiler?


i) Lexical analysis.
ii) Syntax analysis.
iii) Semantic analysis.
iv) Intermediate code generation.
v) Code optimization.
vi) Code generation.

3. Define Passes?
In an implementation of a compiler, portions of one or more phases are
combined into a module called a pass. A pass reads the source program or the
output of the previous pass, makes the transformations specified by its phases,
and writes its output into an intermediate file, which is read by the subsequent pass.

4. Define Lexical Analysis?


The lexical analyzer reads the source program one character at a time, carving
the source program into a sequence of atomic units called tokens. Identifiers,
keywords, constants, operators and punctuation symbols are typical tokens.

5. Write notes on syntax analysis?


Syntax analysis is also called parsing. It involves grouping the tokens of the
source program into grammatical phrases that are used by the compiler to
synthesize output.

6. What is meant by semantic analysis?


The semantic analysis phase checks the source program for semantic errors and
gathers type information for the subsequent code generation phase. It uses the
hierarchical structure determined by the syntax-analysis phase to identify the
operators and operand of expressions and statements.

7. Define optimization?
Certain compilers apply transformations to the output of the intermediate code
generator in order to produce an intermediate-language version from which a faster or
smaller object program can be produced. This phase is called the optimization
phase. Types of optimization include local optimization and loop optimization.

8. What is cross compiler?
A compiler that runs on one machine and produces object code for another
machine is called a cross compiler.

9. Define semantics of a programming language?


The rules that tell whether a string is a valid program or not are called syntax of
the language. The rules that give meaning to programs are called the semantics
of a programming language.

10.What are the data elements of a programming language?


a) Numerical data.
b) Logical data.
c) Character data.
d) Pointers.
e) Labels.

CMR ENGINEERING COLLEGE
Laboratory Name :Compiler Design Experiment No: 2

AIM: Implement the Lexical Analyzer Using LEX Tool.

PROGRAM:

/* program name is lexp.l */

%{
/* program to recognize a C program */
#include<stdio.h>
#include<stdlib.h>
int COMMENT=0; /* nonzero while the scanner is inside a comment */
%}
identifier [a-zA-Z][a-zA-Z0-9]*
%%
#.* { printf("\n%s is a PREPROCESSOR DIRECTIVE",yytext);}
int |
float |
char |
double |
while |
for |
do |
if |
break |
continue |
void |
switch |
case |
long |
struct |
const |
typedef |
return |
else |
goto {if(!COMMENT) printf("\n\t%s is a KEYWORD",yytext);}
"/*" {COMMENT = 1;}
"*/" {COMMENT = 0;}
{identifier}\( {if(!COMMENT)printf("\n\nFUNCTION\n\t%s",yytext);}
\{ {if(!COMMENT) printf("\n BLOCK BEGINS");}
\} {if(!COMMENT) printf("\n BLOCK ENDS");}
{identifier}(\[[0-9]*\])? {if(!COMMENT) printf("\n %s IDENTIFIER",yytext);}
\".*\" {if(!COMMENT) printf("\n\t%s is a STRING",yytext);}
[0-9]+ {if(!COMMENT) printf("\n\t%s is a NUMBER",yytext);}
\)(\;)? {if(!COMMENT) printf("\n\t");ECHO;printf("\n");}
\( ECHO;
= {if(!COMMENT)printf("\n\t%s is an ASSIGNMENT OPERATOR",yytext);}
\<= |
\>= |
\< |
== |
\> {if(!COMMENT) printf("\n\t%s is a RELATIONAL OPERATOR",yytext);}
%%
int main(int argc,char **argv)
{
if (argc > 1)
{
FILE *file;
file = fopen(argv[1],"r");
if(!file)
{
printf("could not open %s \n",argv[1]);
exit(0);
}
yyin = file;
}
yylex();
printf("\n\n");
return 0;
}
int yywrap()
{
/* returning 1 tells the scanner that there is no further input */
return 1;
}

Output:

$lex lexp.l
$cc lex.yy.c
#include<stdio.h>
main()
{
int a,b;
}

$./a.out var.c
#include<stdio.h> is a PREPROCESSOR DIRECTIVE
FUNCTION
main (
)
BLOCK BEGINS
int is a KEYWORD
a IDENTIFIER
b IDENTIFIER
BLOCK ENDS

LAB VIVA QUESTIONS & ANSWERS

1. Define binding?
The act of associating attributes with a name is referred to as binding the attributes
to the name. Most binding is done at compile time and is called static binding. Some
languages, such as SNOBOL, allow dynamic binding, that is, binding done at run time.

2. What is coercion of types?

The translation of the operator, which the compiler must provide, includes any
necessary conversion from one type to another, and this implied change in type
is called coercion.

3. What is meant by loaders and link-editors?
A program called a loader performs the two functions of loading and
link-editing. The process of loading consists of taking relocatable machine
code, altering the relocatable addresses, and placing the altered instructions and
data in memory at the proper locations.

4.Write down the various compiler construction tools?


Some of the useful compiler construction tools are
a) Parser generator
b) Scanner generators
c) Syntax-directed translation engines
d) Automatic code generators
e) Data-flow engines

5.What are the possible error recovery actions in lexical analysis:


a) Deleting an extraneous character
b) Inserting a missing character
c) Replacing an incorrect character by a correct character
d) Transposing two adjacent characters

6. Define regular expressions?


Regular expressions are the notation we shall use to define the class of
languages known as regular sets. It is used to describe tokens. In regular
expression notation we could write
identifier = letter ( letter | digit )*

7. Write the regular expression denoting the set containing the string a
and all strings consisting of zero or more a's followed by a b.
a|a*b

8. Describe the language generated by the regular expressions?


a) 0(0|1)*0
The set of all strings over {0, 1} of length at least two that begin with 0 and
end with 0.

9. What is a regular definition?
If Σ is an alphabet of basic symbols, then a regular definition is a sequence of
definitions of the form
d1 → r1
d2 → r2
…
dn → rn
where each di is a distinct name, and each ri is a regular expression over the
symbols in Σ ∪ {d1, d2, …, di-1}.

10. Define finite automata?


A better way to convert a regular expression to a recognizer is to construct a
generalized transition diagram from the expression. This diagram is called a
finite automaton.
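
As an illustration of a finite automaton used as a recognizer, the following sketch hard-codes a deterministic automaton for the expression 0(0|1)*0 from question 8. The state numbering, the transition table and the function name matches_0_any_0 are choices made for this example, not part of the manual.

```c
/* States: 0 = start, 1 = began with 0 and the last symbol read does not
   complete a match, 2 = began with 0 and last symbol was 0 (accepting),
   3 = dead state (the string began with 1). */
int matches_0_any_0(const char *s)
{
    static const int next[4][2] = {
        /* on '0', on '1' */
        { 1, 3 },   /* state 0 */
        { 2, 1 },   /* state 1 */
        { 2, 1 },   /* state 2 */
        { 3, 3 }    /* state 3 */
    };
    int state = 0;
    for (; *s != '\0'; s++) {
        if (*s != '0' && *s != '1')
            return 0;                 /* symbol outside the alphabet */
        state = next[state][*s - '0'];
    }
    return state == 2;
}
```

Each input symbol causes exactly one table lookup, which is why a deterministic automaton recognizes a string in time proportional to its length.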

CMR ENGINEERING COLLEGE
Laboratory Name :Compiler Design Experiment No: 3

AIM: write a program for predictive parser

PROGRAM:

#include<stdio.h>
#include<string.h>

/* Production i is prol[i] -> pror[i]; '@' stands for epsilon. */
char prol[7][10]={"S","A","A","B","B","C","C"};
char pror[7][10]={"A","Bb","Cd","aB","@","Cc","@"};
char prod[7][10]={"S->A","A->Bb","A->Cd","B->aB","B->@","C->Cc","C->@"};
/* FIRST set of each right-hand side and the FOLLOW set used with each production. */
char first[7][10]={"abcd","ab","cd","a@","@","c@","@"};
char follow[7][10]={"$","$","$","a$","b$","c$","d$"};
char table[5][6][10];

/* Map a nonterminal to its row index, or a terminal to its column index. */
int numr(char c)
{
    switch(c)
    {
    case 'S':return 0;
    case 'A':return 1;
    case 'B':return 2;
    case 'C':return 3;
    case 'a':return 0;
    case 'b':return 1;
    case 'c':return 2;
    case 'd':return 3;
    case '$':return 4;
    }
    return 2;
}

int main(void)
{
    int i,j,k;
    for(i=0;i<5;i++)
        for(j=0;j<6;j++)
            strcpy(table[i][j]," ");
    printf("\n The following is the predictive parsing table for the following grammar:\n");
    for(i=0;i<7;i++)
        printf("%s\n",prod[i]);
    printf("\n Predictive parsing table is:\n ");

    /* A production is entered under every terminal in FIRST of its right-hand side. */
    for(i=0;i<7;i++)
    {
        k=strlen(first[i]);
        for(j=0;j<k;j++)
            if(first[i][j]!='@')
                strcpy(table[numr(prol[i][0])+1][numr(first[i][j])+1],prod[i]);
    }
    /* An epsilon production is also entered under every symbol of its FOLLOW set. */
    for(i=0;i<7;i++)
    {
        if(strlen(pror[i])==1)
        {
            if(pror[i][0]=='@')
            {
                k=strlen(follow[i]);
                for(j=0;j<k;j++)
                    strcpy(table[numr(prol[i][0])+1][numr(follow[i][j])+1],prod[i]);
            }
        }
    }
    /* Row and column headings. */
    strcpy(table[0][0]," ");
    strcpy(table[0][1],"a");
    strcpy(table[0][2],"b");
    strcpy(table[0][3],"c");
    strcpy(table[0][4],"d");
    strcpy(table[0][5],"$");
    strcpy(table[1][0],"S");
    strcpy(table[2][0],"A");
    strcpy(table[3][0],"B");
    strcpy(table[4][0],"C");
    printf("\n------------------------------------------------------------\n");
    for(i=0;i<5;i++)
        for(j=0;j<6;j++)
        {
            printf("%-10s",table[i][j]);
            if(j==5)
                printf("\n------------------------------------------------------------\n");
        }
    return 0;
}

Output:
[examuser56@localhost ~]$ gcc predictive.c
[examuser56@localhost ~]$ ./a.out

 The following is the predictive parsing table for the following grammar:
S->A
A->Bb
A->Cd
B->aB
B->@
C->Cc
C->@

 Predictive parsing table is:

------------------------------------------------------------
          a         b         c         d         $
------------------------------------------------------------
S         S->A      S->A      S->A      S->A
------------------------------------------------------------
A         A->Bb     A->Bb     A->Cd     A->Cd
------------------------------------------------------------
B         B->aB     B->@                          B->@
------------------------------------------------------------
C                             C->Cc     C->@      C->@
------------------------------------------------------------

LAB VIVA QUESTIONS & ANSWERS

1.What is Deterministic Automata?


A finite automaton is deterministic if
a. It has no transitions on input ε.
b. For each state s and input symbol a, there is at most one edge labeled a
leaving s.

2. Define LEX?
LEX is a tool for automatically generating lexical analyzers. A LEX source
program is a specification of a lexical analyzer, consisting of a set of regular
expressions together with an action for each regular expression. The output of
LEX is a lexical analyzer program.

3. Define context-free grammar?


The syntactic specification of a programming language can be formed by a
notation called a context-free grammar, which is also called a BNF (Backus-
Naur form ) description. Context-free grammars are capable of describing most,
but not all, of the syntax of programming languages.

4. Define parse trees?


A parse tree is a graphical representation of a derivation that filters out the
choice regarding replacement order. It represents the hierarchical syntactic
structure of sentences that is implied by the grammar.

5.What are the various types of errors in program?


a) Lexical, such as misspelling an identifier, keyword, or operator.
b) Syntactic, such as an arithmetic expression with unbalanced parentheses.
c) Semantic, such as an operator applied to an incompatible operand.
d) Logical, such as an infinitely recursive call.

6. What are the various error-recovery strategies?

a) Panic mode - On discovering an error, the parser discards input symbols
one at a time until one of a designated set of synchronizing tokens is found.
b) Phrase level - On discovering an error, the parser performs local correction on
the remaining input; that is, it may replace a prefix of the remaining input by
some string that allows the parser to continue.
c) Error productions - If we have a good idea of the common errors that might be
encountered, we can augment the grammar with productions that generate the
erroneous constructs.
d) Global correction - Use algorithms that make as few changes as possible in
processing an incorrect input string.

7.Write a grammar to define simple arithmetic expression?


expr expr op expr
expr (expr)
expr - expr
expr id
op + | - | * | / | ^

8. Define context-free language?


Given a grammar G with start symbol S, we can use the ==> relation to define
L(G), the language generated by G. We say a string of terminals w is in L(G)
if and only if S ==>* w, that is, w can be derived from S in zero or more steps.
The string w is called a sentence of G. A language that can be generated by a
grammar is said to be a context-free language.

9. Define ambiguity?
A grammar that produces more than one parse tree for some sentence is said to
be ambiguous. An ambiguous grammar is one that produces more than one
leftmost or more than one right most derivation for some sentence.

10.What is meant by left recursion?


A grammar is left recursive if it has a nonterminal A such that there is a
derivation A ==>+ Aα for some string α. Top-down parsing methods cannot
handle left-recursive grammars, so a transformation that eliminates left
recursion is needed.
Ex:-
E → E + T | T
T → T * F | F
F → ( E ) | id

CMR ENGINEERING COLLEGE
Laboratory Name :Compiler Design Experiment No: 4

AIM: Design LALR Bottom up Parser .

PROGRAM:

<parser.l>
%{
#include<stdio.h>
#include<stdlib.h>
#include "y.tab.h"
%}
%%
[0-9]+ {yylval.dval=atof(yytext);
return DIGIT;
}
\n|. return yytext[0];
%%
<parser.y>
%{
/*This YACC specification file generates the LALR parser for the program
considered in experiment 4.*/
#include<stdio.h>
int yylex(void);
int yyerror(const char *s);
%}
%union
{
double dval;
}
%token <dval> DIGIT
%type <dval> expr
%type <dval> term
%type <dval> factor
%%
line: expr '\n' {
printf("%g\n",$1);
}
;

expr: expr '+' term {$$=$1 + $3 ;}
| term
;

term: term '*' factor {$$=$1 * $3 ;}
| factor
;

factor: '(' expr ')' {$$=$2 ;}
| DIGIT
;
%%
int main()
{
yyparse();
return 0;
}
int yyerror(const char *s)
{
printf("%s",s);
return 0;
}

Output:

$lex parser.l
$yacc -d parser.y
$cc lex.yy.c y.tab.c -ll -lm
$./a.out
2+3
5

LAB VIVA QUESTIONS & ANSWERS

1.What is meant by left factoring?


Left factoring is a grammar transformation that is useful for producing a
grammar suitable for predictive parsing. The basic idea is that when it is not
clear which of two alternative productions to use to expand a nonterminal A,
we may be able to rewrite the A production to defer the decision until we have
seen enough of the input to make the right choice.

2. Define parser?
A parser for grammar G is a program that takes as input a string w and
produces as output either a parse tree for w, if w is a sentence of G, or an error
message indicating that w is not a sentence of G.

3.What is shift_reduce parsing?


The bottom_up style of parsing is called shift_reduce parsing. This parsing
method is bottom_up because it attempts to construct a parse tree for an input
string beginning at the leaves and working up towards the root.

4. Define Handles?
A handle of a right-sentential form γ is a production A → β and a position of
γ where the string β may be found and replaced by A to produce the previous
right-sentential form in a rightmost derivation of γ.

5.What are the four possible action of a shift_reduce parser?


a) Shift action – the next input symbol is shifted to the top of the stack.
b) Reduce action – replace handle.
c) Accept action – successful completion of parsing.
d) Error action- find syntax error.

6.What is an operator grammar?


A grammar with the property that no production right side is ε or has two
adjacent nonterminals is called an operator grammar.

7.What are the problems in top down parsing?


a) Left recursion.
b) Backtracking.
c) The order in which alternates are tried can affect the language accepted.

8. Define recursive-descent parser?


A parser that uses a set of recursive procedures to recognize its input without
backtracking is called a recursive-descent parser. The recursive procedures
can be quite easy to write.

9. Define predictive parsers?


A predictive parser is an efficient way of implementing recursive-descent
parsing by handling the stack of activation records explicitly. The predictive
parser has an input, a stack, a parsing table and an output.

10.What is LL(1) grammar?


A grammar whose parsing table has no multiply-defined entries is said to be
LL(1).
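
The answers above describe a predictive parser as an input, a stack and a parsing table with no multiply-defined entries. The sketch below shows how these fit together for the LL(1) grammar E → T X, X → + T X | ε, T → F Y, Y → * F Y | ε, F → ( E ) | i. The grammar, the single-character symbol names (X and Y standing for E' and T', i for an identifier token) and the functions entry and ll1_accepts are assumptions chosen for the example, not taken from the experiments.

```c
#include <string.h>

/* Parsing table M[nonterminal, lookahead]. Returns the right-hand side to
   expand, "" for an epsilon production, or NULL for an error entry. */
static const char *entry(char nt, char la)
{
    switch (nt) {
    case 'E': return (la == 'i' || la == '(') ? "TX" : NULL;
    case 'X': if (la == '+') return "+TX";
              return (la == ')' || la == '$') ? "" : NULL;
    case 'T': return (la == 'i' || la == '(') ? "FY" : NULL;
    case 'Y': if (la == '*') return "*FY";
              return (la == '+' || la == ')' || la == '$') ? "" : NULL;
    case 'F': if (la == 'i') return "i";
              return (la == '(') ? "(E)" : NULL;
    }
    return NULL;
}

/* Table-driven predictive parse; returns 1 if input is a sentence, else 0. */
int ll1_accepts(const char *input)
{
    char stack[128];
    size_t top = 0;
    const char *p = input;

    stack[top++] = '$';                       /* bottom-of-stack marker */
    stack[top++] = 'E';                       /* start symbol */
    while (top > 0) {
        char x = stack[--top];                /* symbol on top of the stack */
        char a = (*p != '\0') ? *p : '$';     /* lookahead; '$' marks end of input */
        if (x == '$')
            return *p == '\0';                /* accept only if the input is used up */
        if (strchr("EXTYF", x) != NULL) {     /* nonterminal: expand by the table */
            const char *rhs = entry(x, a);
            size_t n;
            if (rhs == NULL)
                return 0;                     /* empty table entry: syntax error */
            n = strlen(rhs);
            if (top + n > sizeof stack)
                return 0;                     /* input nested too deeply for this sketch */
            while (n > 0)
                stack[top++] = rhs[--n];      /* push the right side in reverse */
        } else {                              /* terminal: must match the input */
            if (x != a)
                return 0;
            p++;
        }
    }
    return 0;
}
```

Because every table entry holds at most one production, the parser never has to choose between alternatives or backtrack; that is the practical meaning of the LL(1) condition.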

CMR ENGINEERING COLLEGE
Laboratory Name :Compiler Design Experiment No: 5

AIM: Convert The BNF rules into YACC form and write code to
generate abstract syntax tree.

PROGRAM:

<int.l>
%{
#include"y.tab.h"
#include<stdio.h>
#include<string.h>
int LineNo=1;
%}
identifier [a-zA-Z][_a-zA-Z0-9]*
number [0-9]+|([0-9]*\.[0-9]+)
%%
main\(\) return MAIN;
if return IF;
else return ELSE;
while return WHILE;
int |
char |
float return TYPE;
{identifier} {strcpy(yylval.var,yytext);
return VAR;}
{number} {strcpy(yylval.var,yytext);
return NUM;}

\< |
\> |
\>= |
\<= |
== {strcpy(yylval.var,yytext);
return RELOP;}

[ \t] ;
\n LineNo++;

. return yytext[0];
%%

<int.y>

%{
#include<string.h>
#include<stdio.h>
#include<stdlib.h>
/* One quadruple: result = arg1 op arg2, or a jump whose target is held in result. */
struct quad
{
char op[5];
char arg1[10];
char arg2[10];
char result[10];
}QUAD[30];
/* Stack of quadruple indices whose jump targets are still to be filled in. */
struct stack
{
int items[100];
int top;
}stk;
int Index=0,tIndex=0,StNo,Ind,tInd;
extern int LineNo;
int yylex(void);
int yyerror(const char *msg);
void push(int data);
int pop(void);
void AddQuadruple(char op[5],char arg1[10],char arg2[10],char result[10]);
%}
%union
{
char var[10];
}
%token <var> NUM VAR RELOP
%token MAIN IF ELSE WHILE TYPE
%type <var> EXPR ASSIGNMENT CONDITION IFST ELSEST WHILELOOP
%left '-' '+'
%left '*' '/'
%%

PROGRAM : MAIN BLOCK
;

BLOCK: '{' CODE '}'
;

CODE: BLOCK
| STATEMENT CODE
| STATEMENT
;
STATEMENT: DESCT ';'
| ASSIGNMENT ';'
| CONDST
| WHILEST
;

DESCT: TYPE VARLIST
;
VARLIST: VAR ',' VARLIST
| VAR
;
ASSIGNMENT: VAR '=' EXPR{
strcpy(QUAD[Index].op,"=");
strcpy(QUAD[Index].arg1,$3);
strcpy(QUAD[Index].arg2,"");
strcpy(QUAD[Index].result,$1);
strcpy($$,QUAD[Index++].result);
}
;
EXPR: EXPR '+' EXPR {AddQuadruple("+",$1,$3,$$);}
| EXPR '-' EXPR {AddQuadruple("-",$1,$3,$$);}
| EXPR '*' EXPR {AddQuadruple("*",$1,$3,$$);}
| EXPR '/' EXPR {AddQuadruple("/",$1,$3,$$);}
| '-' EXPR {AddQuadruple("UMIN",$2,"",$$);}
| '(' EXPR ')' {strcpy($$,$2);}
| VAR
| NUM
;
CONDST: IFST{
/* no else part: both pending jumps go to the next quadruple */
Ind=pop();
sprintf(QUAD[Ind].result,"%d",Index);
Ind=pop();
sprintf(QUAD[Ind].result,"%d",Index);
}
| IFST ELSEST
;
IFST: IF '(' CONDITION ')' {
/* jump taken when the condition is false; target filled in later */
strcpy(QUAD[Index].op,"==");
strcpy(QUAD[Index].arg1,$3);
strcpy(QUAD[Index].arg2,"FALSE");
strcpy(QUAD[Index].result,"-1");
push(Index);
Index++;
}
BLOCK {
/* jump over the else part; target filled in later */
strcpy(QUAD[Index].op,"GOTO");
strcpy(QUAD[Index].arg1,"");
strcpy(QUAD[Index].arg2,"");
strcpy(QUAD[Index].result,"-1");
push(Index);
Index++;
}
;
ELSEST: ELSE{
tInd=pop();
Ind=pop();
push(tInd);
sprintf(QUAD[Ind].result,"%d",Index);
}
BLOCK{
Ind=pop();
sprintf(QUAD[Ind].result,"%d",Index);
}
;
CONDITION: VAR RELOP VAR {AddQuadruple($2,$1,$3,$$);
StNo=Index-1;
}
| VAR
| NUM
;
WHILEST: WHILELOOP{
/* the GOTO returns to the condition; the false jump leaves the loop */
Ind=pop();
sprintf(QUAD[Ind].result,"%d",StNo);
Ind=pop();
sprintf(QUAD[Ind].result,"%d",Index);
}
;
WHILELOOP: WHILE '(' CONDITION ')' {
strcpy(QUAD[Index].op,"==");
strcpy(QUAD[Index].arg1,$3);
strcpy(QUAD[Index].arg2,"FALSE");
strcpy(QUAD[Index].result,"-1");
push(Index);
Index++;
}
BLOCK {
strcpy(QUAD[Index].op,"GOTO");
strcpy(QUAD[Index].arg1,"");
strcpy(QUAD[Index].arg2,"");
strcpy(QUAD[Index].result,"-1");
push(Index);
Index++;
}
;
%%
extern FILE *yyin;
int main(int argc,char *argv[])
{
FILE *fp;
int i;
stk.top=-1; /* start with an empty stack so that pop() can detect underflow */
if(argc>1)
{
fp=fopen(argv[1],"r");
if(!fp)
{
printf("\n File not found");
exit(0);
}
yyin=fp;
}
yyparse();
printf("\n\n\t\t ----------------------------"
"\n\t\t Pos Operator Arg1 Arg2 Result"
"\n\t\t --------------------");
for(i=0;i<Index;i++)
{
printf("\n\t\t %d\t %s\t %s\t %s\t %s",i,QUAD[i].op,QUAD[i].arg1,QUAD[i].arg2,QUAD[i].result);
}
printf("\n\t\t -----------------------");
printf("\n\n");
return 0;
}
void push(int data)
{
stk.top++;
if(stk.top==100)
{
printf("\n Stack overflow\n");
exit(0);
}
stk.items[stk.top]=data;
}
int pop(void)
{
int data;
if(stk.top==-1)
{
printf("\n Stack underflow\n");
exit(0);
}
data=stk.items[stk.top--];
return data;
}
/* Append the quadruple (op, arg1, arg2, tN) and copy the new temporary's name into result. */
void AddQuadruple(char op[5],char arg1[10],char arg2[10],char result[10])
{
strcpy(QUAD[Index].op,op);
strcpy(QUAD[Index].arg1,arg1);
strcpy(QUAD[Index].arg2,arg2);
sprintf(QUAD[Index].result,"t%d",tIndex++);
strcpy(result,QUAD[Index++].result);
}
int yyerror(const char *msg)
{
printf("\n Error on line no:%d",LineNo);
return 0;
}

Input:
$vi test.c
main()
{
int a,b,c;
if(a<b)
{
a=a+b;
}
while(a<b)
{
a=a+b;
}
if(a<=b)
{
c=a-b;
}
else
{
c=a+b;
}
}

Output:
$lex int.l
$yacc -d int.y
$gcc lex.yy.c y.tab.c -ll -lm
$./a.out test.c
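
The program above records quadruples rather than building an explicit tree. For comparison, the following is a minimal sketch of an abstract syntax tree for arithmetic expressions, in which each leaf holds an operand and each interior node an operator. The node layout and the functions leaf, binary, eval and free_tree are hypothetical names introduced for this example; they are not part of int.y.

```c
#include <stdio.h>
#include <stdlib.h>

struct node {
    char op;                    /* '+', '-', '*' or '/' for an interior node, 0 for a leaf */
    int value;                  /* constant held by a leaf */
    struct node *left, *right;  /* operands of an interior node */
};

static struct node *new_node(char op, int value, struct node *l, struct node *r)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL) {
        fprintf(stderr, "out of memory\n");
        exit(EXIT_FAILURE);
    }
    n->op = op;
    n->value = value;
    n->left = l;
    n->right = r;
    return n;
}

/* Leaf for a constant operand. */
struct node *leaf(int value)
{
    return new_node(0, value, NULL, NULL);
}

/* Interior node applying op to two subtrees. */
struct node *binary(char op, struct node *l, struct node *r)
{
    return new_node(op, 0, l, r);
}

/* Post-order walk: evaluate both operands, then apply the operator.
   Division by zero is not checked in this sketch. */
int eval(const struct node *n)
{
    int l, r;
    if (n->op == 0)
        return n->value;
    l = eval(n->left);
    r = eval(n->right);
    switch (n->op) {
    case '+': return l + r;
    case '-': return l - r;
    case '*': return l * r;
    default:  return l / r;     /* '/' */
    }
}

void free_tree(struct node *n)
{
    if (n != NULL) {
        free_tree(n->left);
        free_tree(n->right);
        free(n);
    }
}
```

In a Yacc specification, an action such as { $$ = binary('+', $1, $3); } could build the tree bottom-up, provided the %union declares a struct node * member for the expression nonterminals. A code generator for the target machine would then walk the tree in the same post-order fashion as eval.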

LAB VIVA QUESTIONS & ANSWERS

1. Define LR grammar?
A grammar for which we can construct a parsing table in which every entry is
uniquely defined is said to be an LR grammar.

2. What is augmented grammar?
If G is a grammar with start symbol S, then G’, the augmented grammar for G,
is G with a new start symbol S’ and the production S’ → S. It tells the
parser when to stop and announce acceptance of the input.

3. Define intermediate code?


In many compilers the source code is translated into a language which is
intermediate in complexity between a high-level programming language and
machine code. Such a language is therefore called intermediate code or
intermediate text.

4. What are the benefits of using a machine-independent intermediate form?
a) Retargeting is facilitated; a compiler for a different machine can be created
by attaching a back end for the new machine to an existing front end.
b) A machine-independent code optimizer can be applied to the intermediate
representation.

5. What are the various kinds of intermediate representations for intermediate
code generation?
a) Syntax trees
b) Postfix notation
c) Three address code

6. What is a syntax-directed translation scheme?
A syntax-directed translation scheme is a context-free grammar in which a
program fragment called an output action (or sometimes a semantic action or
semantic rule) is associated with each production.

7. Define parse trees and syntax trees.
The parse tree itself is a useful intermediate-language representation for a
source program. A parse tree, however, often contains redundant information
that can be eliminated. A variant of the parse tree is the syntax tree, a tree
in which each leaf represents an operand and each interior node an operator.
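
The definition above can be sketched in C. The type Node and the functions node, to_postfix and free_tree are hypothetical names chosen for this illustration; they do not come from the lab programs. A post-order walk of the syntax tree also yields the postfix form of the expression.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* A leaf holds an operand name; an interior node holds an operator. */
typedef struct Node {
    char label;                 /* operand name or operator symbol */
    struct Node *left, *right;  /* both NULL for a leaf */
} Node;

/* Allocate a node; returns NULL if memory is exhausted. */
Node *node(char label, Node *left, Node *right)
{
    Node *n = malloc(sizeof *n);
    if (n != NULL) {
        n->label = label;
        n->left = left;
        n->right = right;
    }
    return n;
}

/* Post-order walk: children first, then the operator. Appending each
   label to out yields the postfix form of the expression. */
void to_postfix(const Node *n, char *out, size_t cap)
{
    size_t len;
    if (n == NULL)
        return;
    to_postfix(n->left, out, cap);
    to_postfix(n->right, out, cap);
    len = strlen(out);
    assert(len + 1 < cap);      /* caller must supply a large enough buffer */
    out[len] = n->label;
    out[len + 1] = '\0';
}

/* Release every node of the tree. */
void free_tree(Node *n)
{
    if (n == NULL)
        return;
    free_tree(n->left);
    free_tree(n->right);
    free(n);
}
```

For the expression a + b * c the tree has + at the root, the leaf a on the left and the subtree for b * c on the right; walking it with to_postfix into an empty buffer produces abc*+.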

8. What is a three-address code?
Three-address code is a sequence of statements, typically of the general form
A := B op C, where A, B and C are either programmer-defined names, constants
or compiler-generated temporary names; op stands for any operator, such as a
fixed- or floating-point arithmetic operator, or a logical operator on
Boolean-valued data.

9. Write the three address code for the assignment statement a := b * -c + b * -c
t1 := -c
t2 := b * t1
t3 := -c
t4 := b * t3
t5 := t2 + t4
a := t5
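
A sequence like this can be produced mechanically by allocating a fresh temporary for every operator. The following is a minimal sketch, not the method used by the lab programs; tac_out, tac_temp, tac_append, emit and gen_example are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

char tac_out[512];   /* hypothetical buffer collecting the generated code */
int tac_temp = 0;    /* number of temporaries allocated so far */

/* Append one line of text to tac_out, truncating if the buffer is full. */
void tac_append(const char *line)
{
    strncat(tac_out, line, sizeof tac_out - strlen(tac_out) - 1);
}

/* Emit "tN := arg1 op arg2", or "tN := op arg1" when arg2 is NULL,
   and store the name of the new temporary tN in result. */
void emit(const char *op, const char *arg1, const char *arg2, char result[16])
{
    char line[64];
    assert(op != NULL && arg1 != NULL && result != NULL);
    snprintf(result, 16, "t%d", ++tac_temp);
    if (arg2 != NULL)
        snprintf(line, sizeof line, "%s := %s %s %s\n", result, arg1, op, arg2);
    else
        snprintf(line, sizeof line, "%s := %s%s\n", result, op, arg1);
    tac_append(line);
}

/* Translate a := b * -c + b * -c without sharing the common subexpression. */
const char *gen_example(void)
{
    char t1[16], t2[16], t3[16], t4[16], t5[16], line[64];
    tac_out[0] = '\0';
    tac_temp = 0;
    emit("-", "c", NULL, t1);   /* t1 := -c      */
    emit("*", "b", t1, t2);     /* t2 := b * t1  */
    emit("-", "c", NULL, t3);   /* t3 := -c      */
    emit("*", "b", t3, t4);     /* t4 := b * t3  */
    emit("+", t2, t4, t5);      /* t5 := t2 + t4 */
    snprintf(line, sizeof line, "a := %s\n", t5);
    tac_append(line);
    return tac_out;
}
```

Calling gen_example() returns the six statements of the answer, one per line. A real code generator would drive emit from the parser's semantic actions instead of from hand-written calls.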

10. Name any four types of three-address statements?
a) Assignment statements of the form x := y op z
b) Assignment instructions of the form x := op y
c) Copy statements of the form x := y
d) The unconditional jump goto L

CMR ENGINEERING COLLEGE
Laboratory Name :Compiler Design Experiment No: 6

AIM: A Program to Generate Machine Code.

PROGRAM:

#include<stdio.h>
#include<stdlib.h>
#include<string.h>
int label[20];
int no=0;
int check_label(int k); /* prototype: the function is defined after main() */
int main()
{
FILE *fp1,*fp2;
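char fname[10],op[10];
int ch; /* int, so that EOF can be told apart from valid characters */
char operand1[8],operand2[8],result[8];
int i=0,j=0;
printf("\n Enter filename of the intermediate code");
scanf("%9s",fname); /* fname is already a pointer; limit input to the buffer size */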
fp1=fopen(fname,"r");
fp2=fopen("target.txt","w");
if(fp1==NULL || fp2==NULL)
{
printf("\n Error opening the file");
exit(0);
}
while(fscanf(fp1,"%9s",op)==1) /* read the next operator; stop at end of file */
{
fprintf(fp2,"\n");
i++;
if(check_label(i))
fprintf(fp2,"\nlabel#%d",i);
if(strcmp(op,"print")==0)
{
fscanf(fp1,"%s",result);
fprintf(fp2,"\n\t OUT %s",result);
}
if(strcmp(op,"goto")==0)
{
fscanf(fp1,"%s %s",operand1,operand2);
fprintf(fp2,"\n\t JMP %s,label#%s",operand1,operand2);
label[no++]=atoi(operand2);
}
if(strcmp(op,"[]=")==0)
{
fscanf(fp1,"%s %s %s",operand1,operand2,result);
fprintf(fp2,"\n\t STORE %s[%s],%s",operand1,operand2,result);
}
if(strcmp(op,"uminus")==0)
{
fscanf(fp1,"%s %s",operand1,result);
fprintf(fp2,"\n\t LOAD -%s,R1",operand1);
fprintf(fp2,"\n\t STORE R1,%s",result);
}
switch(op[0])
{
case '*': fscanf(fp1,"%s %s %s",operand1,operand2,result);
fprintf(fp2,"\n \t LOAD %s,R0",operand1);
fprintf(fp2,"\n \t LOAD %s,R1",operand2);
fprintf(fp2,"\n \t MUL R1,R0");
fprintf(fp2,"\n \t STORE R0,%s",result);
break;
case '+': fscanf(fp1,"%s %s %s",operand1,operand2,result);
fprintf(fp2,"\n \t LOAD %s,R0",operand1);
fprintf(fp2,"\n \t LOAD %s,R1",operand2);
fprintf(fp2,"\n \t ADD R1,R0");
fprintf(fp2,"\n \t STORE R0,%s",result);
break;
case '-': fscanf(fp1,"%s %s %s",operand1,operand2,result);
fprintf(fp2,"\n \t LOAD %s,R0",operand1);
fprintf(fp2,"\n \t LOAD %s,R1",operand2);
fprintf(fp2,"\n \t SUB R1,R0");
fprintf(fp2,"\n \t STORE R0,%s",result);
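break;
/* '/' and '=' occur in the sample input; these two cases are inferred from the sample output */
case '/': fscanf(fp1,"%s %s %s",operand1,operand2,result);
fprintf(fp2,"\n \t LOAD %s,R0",operand1);
fprintf(fp2,"\n \t LOAD %s,R1",operand2);
fprintf(fp2,"\n \t DIV R1,R0");
fprintf(fp2,"\n \t STORE R0,%s",result);
break;
case '=': fscanf(fp1,"%s %s",operand1,result);
fprintf(fp2,"\n\t STORE %s,%s",operand1,result);
break;
}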
}
fclose(fp2);
fclose(fp1);
fp2=fopen("target.txt","r");
if(fp2==NULL)
{
printf("Error opening the file\n");
exit(0);
}
while((ch=fgetc(fp2))!=EOF) /* test before printing, so EOF itself is never printed */
{
printf("%c",ch);
}
fclose(fp2);
return 0;
}
int check_label(int k)
{
int i;
for(i=0;i<no;i++)
{
if(k==label[i])
return 1;
}
return 0;
}

Input:

$vi int.txt
= t1 2
[]= a 0 1
[]= a 1 2
[]= a 2 3
* t1 6 t2
+ a[2] t2 t3
- a[2] t1 t2
/ t3 t2 t2
uminus t2 t2
print t2
goto t2 t3
= t3 99
uminus 25 t2
* t2 t3 t3
uminus t1 t1
+ t1 t3 t4
print t4

Output:

Enter filename of the intermediate code: int.txt

STORE t1,2
STORE a[0],1
STORE a[1],2
STORE a[2],3

LOAD t1,R0
LOAD 6,R1
ADD R1,R0
STORE R0,t3

LOAD a[2],R0
LOAD t2,R1
ADD R1,R0
STORE R0,t3

LOAD a[t2],R0
LOAD t1,R1
SUB R1,R0
STORE R0,t2

LOAD t3,R0
LOAD t2,R1
DIV R1,R0
STORE R0,t2

LOAD t2,R1
STORE R1,t2
LOAD t2,R0
JGT 5,label#11

Label#11: OUT t2
JMP t2,label#13
Label#13: STORE t3,99
LOAD 25,R1
STORE R1,t2

LOAD t2,R0
LOAD t3,R1
MUL R1,R0
STORE R0,t3

LOAD t1,R1
STORE R1,t1

LOAD t1,R0
LOAD t3,R1
ADD R1,R0
STORE R0,t4
OUT t4

LAB VIVA QUESTIONS & ANSWERS

1. What are the representations of three-address statements?
A three-address statement is an abstract form of intermediate code. Three
representations are available:
a) Quadruples
b) Triples
c) Indirect triples
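
The difference between the first two representations can be sketched in C. A quadruple names its result explicitly, while a triple has no result field and refers to an earlier result by the position of the statement that computed it. The types Quad and Triple and the functions link_operand and quads_to_triples are hypothetical; the sketch also assumes that every result field names a temporary that is assigned only once.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct { char op[8], arg1[8], arg2[8], result[8]; } Quad;
typedef struct { char op[8], arg1[8], arg2[8]; } Triple;

/* Copy src into dst, replacing the name of a temporary defined by an
   earlier quadruple with the position of that quadruple, e.g. "(0)". */
void link_operand(const Quad *q, int upto, const char *src,
                  char *dst, size_t cap)
{
    int j;
    for (j = upto - 1; j >= 0; j--) {
        if (src[0] != '\0' && strcmp(q[j].result, src) == 0) {
            snprintf(dst, cap, "(%d)", j);
            return;
        }
    }
    snprintf(dst, cap, "%s", src);   /* ordinary name or constant */
}

/* Rewrite n quadruples as n triples. */
void quads_to_triples(const Quad *q, int n, Triple *t)
{
    int i;
    assert(q != NULL && t != NULL && n >= 0);
    for (i = 0; i < n; i++) {
        snprintf(t[i].op, sizeof t[i].op, "%s", q[i].op);
        link_operand(q, i, q[i].arg1, t[i].arg1, sizeof t[i].arg1);
        link_operand(q, i, q[i].arg2, t[i].arg2, sizeof t[i].arg2);
    }
}
```

For the quadruples (uminus, c, , t1) and (*, b, t1, t2), the triples become (0) uminus c and (1) * b (0). Indirect triples would add a separate array of pointers to the triples, so that statements can be reordered without renumbering the operands.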

2. Define procedure definition?
A procedure definition is a declaration that, in its simplest form, associates
an identifier with a statement. The identifier is the procedure name, and the
statement is the procedure body. Some of the identifiers appearing in a
procedure definition are special and are called formal parameters of the
procedure. Arguments, known as actual parameters, may be passed to a called
procedure; they are substituted for the formals in the body.

3. Define activation trees?


A recursive procedure p need not call itself directly; p may call another
procedure q, which may then call p through some sequence of procedure calls.
We can use a tree, called an activation tree, to depict the way control enters
and leaves activations. In an activation tree
a) Each node represents an activation of a procedure,
b) The root represents the activation of the main program,
c) The node for a is the parent of the node for b if and only if control flows
from activation a to b, and
d) The node for a is to the left of the node for b if and only if the lifetime of
a occurs before the lifetime of b.
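
The shape of an activation tree can be observed by recording every activation together with its depth. The sketch below is an illustration under assumptions: the buffer trace, the helper enter and the extra depth parameter of fib are hypothetical additions made only for tracing.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

char trace[1024];   /* hypothetical buffer that collects the tree */

/* Record one activation, indented by its depth in the activation tree. */
void enter(const char *name, int arg, int depth)
{
    size_t len = strlen(trace);
    assert(len < sizeof trace);
    snprintf(trace + len, sizeof trace - len, "%*s%s(%d)\n",
             depth * 2, "", name, arg);
}

/* Fibonacci with tracing. The two recursive calls are made in separate
   statements so that the left child is always activated first. */
int fib(int n, int depth)
{
    int a, b;
    enter("fib", n, depth);
    if (n < 2)
        return n;
    a = fib(n - 1, depth + 1);
    b = fib(n - 2, depth + 1);
    return a + b;
}
```

After fib(2, 0), trace holds three lines: fib(2) at the root, followed by its children fib(1) and fib(0), each indented by two spaces. The indentation shows the parent relation of rule c), and the top-to-bottom order of siblings shows the left-to-right order of rule d).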

4. Write notes on control stack?


A control stack is used to keep track of live procedure activations. The idea
is to push the node for an activation onto the control stack as the activation
begins and to pop the node when the activation ends.
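
The push-on-entry, pop-on-exit discipline can be sketched as follows. All names here (control_stack, cs_top, cs_max_live, activation_begin, activation_end) are hypothetical; a real compiler emits the equivalent bookkeeping in the calling sequence instead of calling helper functions.

```c
#include <assert.h>

#define MAX_DEPTH 64

const char *control_stack[MAX_DEPTH];  /* names of live activations */
int cs_top = -1;                       /* index of the newest activation */
int cs_max_live = 0;                   /* most activations live at once */

/* Push a node when an activation begins. */
void activation_begin(const char *name)
{
    assert(cs_top + 1 < MAX_DEPTH);    /* sketch: fixed size, guarded */
    control_stack[++cs_top] = name;
    if (cs_top + 1 > cs_max_live)
        cs_max_live = cs_top + 1;
}

/* Pop the node when the activation ends. */
void activation_end(void)
{
    assert(cs_top >= 0);
    cs_top--;
}

/* Example procedure instrumented with the two calls. */
int factorial(int n)
{
    int r;
    activation_begin("factorial");
    r = (n <= 1) ? 1 : n * factorial(n - 1);
    activation_end();
    return r;
}
```

While factorial(4) runs, four activations are live at the deepest point, and the stack is empty again once the outermost call returns.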

5. What is the scope of a declaration?
The portion of the program to which a declaration applies is called the scope
of that declaration. An occurrence of a name in a procedure is said to be local
to the procedure if it is in the scope of a declaration within the procedure;
otherwise the occurrence is said to be nonlocal.
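
A short C illustration of local and nonlocal occurrences follows; the function names read_nonlocal, read_local and scope_demo are made up for this example.

```c
#include <assert.h>

int x = 10;   /* declared outside every function */

/* The occurrence of x here is nonlocal: no declaration of x
   inside the function covers it, so the outer one applies. */
int read_nonlocal(void)
{
    return x;
}

/* The occurrence of x here is local: it lies in the scope of
   the declaration inside the function, which hides the outer x. */
int read_local(void)
{
    int x = 20;
    return x;
}

/* Each assert states which declaration the occurrence refers to. */
void scope_demo(void)
{
    assert(read_nonlocal() == 10);
    assert(read_local() == 20);
}
```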

6. Define binding of names?


When an environment associates storage location s with a name x, we say that x
is bound to s; the association itself is referred to as a binding of x. A binding is
the dynamic counterpart of a declaration.

7. What is the use of run time storage?


The run time storage might be subdivided to hold
a) The generated target code
b) Data objects, and
c) A counterpart of the control stack to keep track of procedure activation.

8. What is an activation record?
Information needed by a single execution of a procedure is managed using a
contiguous block of storage called an activation record or frame, consisting of
the collection of fields such as
a) Return value
b) Actual parameters
c) Optional control link
d) Optional access link
e) Saved machine status
f) Local data
g) Temporaries
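
The fields listed above can be pictured as a C structure. This is an illustrative layout only: the type Frame, the field sizes, the field order and the helper live_activations are assumptions, since the real layout is decided by the compiler and the target machine.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Frame {
    int return_value;             /* a) value handed back to the caller   */
    int actuals[2];               /* b) actual parameters                 */
    struct Frame *control_link;   /* c) activation record of the caller   */
    struct Frame *access_link;    /* d) record holding nonlocal names     */
    const void *saved_status;     /* e) stands in for saved registers     */
    int locals[2];                /* f) local data                        */
    int temporaries[2];           /* g) compiler-generated temporaries    */
} Frame;

/* Count live activations by following control links back to the root. */
int live_activations(const Frame *f)
{
    int n = 0;
    while (f != NULL) {
        n++;
        f = f->control_link;
    }
    return n;
}

/* Two records: the main program and one procedure called from it. */
void frame_demo(void)
{
    Frame main_frame = {0};
    Frame callee = {0};
    callee.control_link = &main_frame;
    assert(live_activations(&callee) == 2);
    assert(live_activations(&main_frame) == 1);
}
```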

9. What are the storage allocation strategies?


a) Static allocation lays out storage for all data objects at compile time.
b) Stack allocation manages the run-time storage as a stack.
c) Heap allocation allocates and deallocates storage as needed at run time from
a data area known as the heap.
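
The three strategies are all visible in ordinary C. The sketch below uses hypothetical function names (next_static, next_stack, new_counter, storage_demo) and shows how the lifetime of an object differs in each case.

```c
#include <assert.h>
#include <stdlib.h>

/* Static allocation: one copy of count exists for the whole run,
   so its value survives between calls. */
int next_static(void)
{
    static int count = 0;
    return ++count;
}

/* Stack allocation: count is created afresh in every activation. */
int next_stack(void)
{
    int count = 0;
    return ++count;
}

/* Heap allocation: the object outlives the call that created it and
   must be released with free(). Returns NULL if allocation fails. */
int *new_counter(int start)
{
    int *p = malloc(sizeof *p);
    if (p != NULL)
        *p = start;
    return p;
}

/* Each assert states the behaviour expected of the strategy. */
void storage_demo(void)
{
    int *p = new_counter(5);
    assert(next_static() == 1 && next_static() == 2);
    assert(next_stack() == 1 && next_stack() == 1);
    assert(p != NULL && *p == 5);
    free(p);
}
```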

10. What is static allocation?


In static allocation, names are bound to storage as the program is compiled, so
there is no need for a run-time support package. Since the bindings do not
change at run time, every time a procedure is activated, its names are bound to
the same storage location.
