
LP IV Compiler Manual


PVG’s

COLLEGE OF ENGINEERING
Nashik

Department of Computer Engineering

LABORATORY MANUAL
2018-2019

LABORATORY PRACTICE-IV
[Compiler]
BE-COMPUTER ENGINEERING
SEMESTER-II
Subject Code: 410255

TEACHING SCHEME
Practical: 4 Hrs/Week

EXAMINATION SCHEME
Oral: 50 Marks
Term Work: 50 Marks

-: Name of Faculty:-

Prof. A.R.Jain

Asst. Professor, Department of Computer Engineering.


Laboratory Practice-IV B.E.C.E(Sem-II) [2018-19]

Assignment No. 1
Title: Implement a Lexical Analyzer using LEX for a subset of C. Cross check your output with Stanford LEX.

Roll No.
Class B.E. (C.E.)
Date
Subject Laboratory Practice-IV

Signature

1 Department of Computer Engineering, PVGCOE,Nashik



Assignment No: 1
Title: Implement a Lexical Analyzer using LEX for a subset of C. Cross check your output with
Stanford LEX.
Aim:
To understand the syntax of LEX specifications, built-in functions and variables.
Objectives:

• To understand the first phase of a compiler: lexical analysis.

• To learn and use compiler writing tools.

• To understand the importance and usage of the LEX automated tool.

Theory:
Introduction:

LEX is a lexical analyzer generator: a UNIX utility that generates scanners. Scanners are programs that recognize lexical patterns in text. These patterns are written as regular expressions in a particular syntax, and each regular expression may have an associated action, which may include returning a token. When the generated scanner receives input from a file or as text, it reads the input one character at a time and tries to match it against the regular expressions. When a pattern matches, the scanner performs the associated action. If several patterns match, the longest match is chosen, and among matches of equal length the rule listed first wins. Text that matches no pattern is, by default, copied unchanged to the output.

Lex and C are tightly coupled. A Lex file (with the .l extension, e.g. first.l) is passed through the lex utility, which produces an output file in C (lex.yy.c). The program lex.yy.c essentially consists of a tabular representation of a transition diagram constructed from the regular expressions of first.l. This file is then compiled into the object program a.out, and the resulting lexical analyzer transforms an input stream into a sequence of tokens, as shown in Fig. 1.1.

To generate a lexical analyzer, two things are needed: a precise specification of the tokens of the language, and a specification of the action to be performed on identifying each token.
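
To make the idea of "pattern plus action" concrete, the following is a minimal hand-written sketch in C of the kind of work a generated scanner does for a tiny subset of C. It is only an illustration under stated assumptions: the names TokenKind, is_keyword and next_token are hypothetical, the keyword list is arbitrary, and the real lex.yy.c is table-driven rather than written as if/else branches.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token classes for a tiny subset of C. */
typedef enum { TOK_EOF, TOK_KEYWORD, TOK_IDENT, TOK_NUMBER, TOK_OP } TokenKind;

/* Return 1 if the lexeme is one of a few C keywords (illustrative list). */
static int is_keyword(const char *s)
{
    static const char *kw[] = { "int", "if", "else", "for", "while", "return" };
    for (size_t i = 0; i < sizeof kw / sizeof kw[0]; i++)
        if (strcmp(s, kw[i]) == 0)
            return 1;
    return 0;
}

/*
 * Scan one token starting at *src, copy its text into lexeme
 * (at most cap-1 characters) and advance *src past it.
 */
TokenKind next_token(const char **src, char *lexeme, size_t cap)
{
    const char *p = *src;
    size_t n = 0;
    TokenKind kind;

    assert(cap > 1);
    while (isspace((unsigned char)*p))        /* skip white space */
        p++;

    if (*p == '\0') {                         /* end of input */
        lexeme[0] = '\0';
        *src = p;
        return TOK_EOF;
    }

    if (isalpha((unsigned char)*p) || *p == '_') {
        /* [A-Za-z_][A-Za-z0-9_]* : take the longest match */
        while (isalnum((unsigned char)*p) || *p == '_') {
            if (n < cap - 1) lexeme[n++] = *p;
            p++;
        }
        lexeme[n] = '\0';
        kind = is_keyword(lexeme) ? TOK_KEYWORD : TOK_IDENT;
    } else if (isdigit((unsigned char)*p)) {
        /* [0-9]+ */
        while (isdigit((unsigned char)*p)) {
            if (n < cap - 1) lexeme[n++] = *p;
            p++;
        }
        lexeme[n] = '\0';
        kind = TOK_NUMBER;
    } else {
        /* any other single character is treated as an operator */
        lexeme[n++] = *p++;
        lexeme[n] = '\0';
        kind = TOK_OP;
    }
    *src = p;
    return kind;
}
```

Calling next_token repeatedly on the text "int x1 = 42;" yields the keyword int, the identifier x1, the operator =, the number 42, the operator ; and finally TOK_EOF.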


1. LEX Specifications:

A LEX program consists of three parts, separated by lines containing only %%:

Definition Section:
The definition section includes declarations of variables, start conditions, regular definitions, and manifest constants (a manifest constant is an identifier declared to represent a constant, e.g. #define PIE 3.14).

• C code: Any code enclosed between %{ and %} is copied verbatim to the C file. This is typically used for defining file-scope variables and for prototypes of routines that are defined in the code segment.

• Definitions: A definition is very much like a #define cpp directive. For example
letter [a-zA-Z]+
digit [0-9]+
These definitions can be used in the rules section; for example, one could write the rule
{letter} { printf("\n Word is = %s", yytext); }

• State definitions: If a rule depends on context, it is possible to introduce states and incorporate them in the rules. A state definition looks like %s STATE, and by default a state INITIAL is already given.


Rule Section:

The second section holds the translation rules, each consisting of a regular expression and the action associated with it. The translation rules of a Lex program are statements of the form:
p1 {action 1}
p2 {action 2}
p3 {action 3}
... ...
pn {action n}
where each p is a regular expression and each action is a program fragment describing what the lexical analyzer should do when pattern p matches a lexeme. In Lex the actions are written in C.

Auxiliary Functions (User Subroutines):

The third section holds whatever auxiliary procedures are needed by the actions. If the Lex program is to be used on its own, this section will contain a main program. If you leave this section empty and link with the Lex library (-ll), you get the default main, as follows:
int main()
{
    yylex();
    return 0;
}
Writing user subroutines in this section is optional. The function yylex() is generated by Lex. It is not called automatically by the compiler; it is called from main(), either the default one shown above or one written in the subroutine section.


2. Built-in Functions:

No.  Function       Meaning
1    yylex()        The function that starts the analysis. It is automatically generated by Lex.
2    yywrap()       Called when end of file (or input) is encountered. If yywrap() returns 0, the scanner continues scanning; if it returns 1, the scanner returns a zero token to report the end of file.
3    yyless(int n)  Pushes back all but the first n characters of the token just read.
4    yymore()       Tells the lexer to append the next matched text to the current contents of yytext instead of replacing it.
5    yyerror()      Used for displaying any error message.
3. Built-in Variables:

No.  Variable  Meaning
1    yyin      Of type FILE*. Points to the current file being scanned by the lexer, i.e. the input source program. By default it points to standard input.
2    yyout     Of type FILE*. Points to the location where the output of the lexer is written. By default it points to standard output.
3    yytext    The text of the matched pattern (char*). When the lexer recognizes a token, the lexeme is stored in this null-terminated global string.
4    yyleng    The length (number of characters) of the matched text. Its value is the same as strlen(yytext).
5    yylineno  The current line number. (May or may not be supported by the lexer.)
6    yylval    A global variable used to pass the value of a token to the parser.

4. Regular Expressions:

No.  RE         Meaning
1    a          Matches a
2    abc        Matches abc
3    [abc]      Matches a or b or c
4    [a-f]      Matches a, b, c, d, e or f
5    [0-9]      Matches any digit
6    x+         Matches one or more of x
7    x*         Matches zero or more of x
8    [0-9]+     Matches any unsigned integer
9    (…)        Groups an expression into a single unit
10   |          Alternation (or)
11   (a|b|c)*   Is equivalent to [a-c]*
12   x?         x is optional (0 or 1 occurrence)
13   if(def)?   Matches if or ifdef
14   [A-Za-z]   Matches any alphabetical character
15   .          Matches any character except newline
16   \.         Matches the . character
17   \n         Matches the newline character
18   \t         Matches the tab character
19   \\         Matches the \ character
20   [ \t]      Matches either a space or a tab character
21   [^a-d]     Matches any character other than a, b, c and d
22   $          Matches the end of the line
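
A few of the patterns in the table can be tried out without writing a Lex file. The sketch below is an illustration only, not part of LEX: it uses the POSIX regcomp() and regexec() functions with extended regular expressions, whose syntax agrees with Lex for the simple patterns shown. The helper name full_match and the 256-byte buffer are assumptions made for this example. Lex escapes such as \n and \t are not interpreted by POSIX regular expressions, so they are not exercised here.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>

/*
 * Return 1 if the whole string `text` matches the extended regular
 * expression `pattern`, 0 if it does not, and -1 if the pattern is
 * invalid or too long for the local buffer.
 */
int full_match(const char *pattern, const char *text)
{
    char anchored[256];
    regex_t re;
    int n, matched;

    assert(pattern != NULL && text != NULL);

    /* Wrap the pattern as ^(pattern)$ so that it must cover all of text. */
    n = snprintf(anchored, sizeof anchored, "^(%s)$", pattern);
    if (n < 0 || (size_t)n >= sizeof anchored)
        return -1;

    if (regcomp(&re, anchored, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    matched = (regexec(&re, text, 0, NULL, 0) == 0);
    regfree(&re);
    return matched;
}
```

For example, full_match("if(def)?", "ifdef") and full_match("if(def)?", "if") both return 1, while full_match("[0-9]+", "20x9") returns 0 because the string contains a character that is not a digit.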

5. Steps to Execute the Program:


$ lex filename.l (e.g. first.l)
$ cc lex.yy.c -ll   or   $ gcc lex.yy.c -ll
$ ./a.out

Algorithm:

1. Start the program.

2. A Lex program consists of three parts:

a. Declarations, followed by %%

b. Translation rules, followed by %%

c. Auxiliary procedures

3. The declaration section includes declarations of variables, manifest constants and regular definitions.

4. The translation rules of a Lex program are statements of the form

a. P1 {action}

b. P2 {action}

c. …

d. Pn {action}

5. Write the program in the vi editor and save it with the .l extension.

6. Compile the Lex program with the lex compiler to produce the output file lex.yy.c, e.g. $ lex filename.l

7. Compile lex.yy.c with the C compiler, e.g. $ cc lex.yy.c -ll, then run ./a.out and verify the output.

Conclusion:

LEX is a tool that accepts regular expressions as input and generates C code to recognize the corresponding tokens. For each token identified, LEX lets us write a user-defined routine to be executed. When we give an input specification file to LEX, it generates the file lex.yy.c as output. This file contains the function yylex(), which holds the C code to recognize the tokens and to carry out the associated actions.


Assignment No. 2
Title: Implement a parser for an expression grammar using YACC and LEX for the subset of C. Cross check your output with Stanford LEX and YACC.

Roll No.
Class B.E. (C.E.)
Date
Subject Laboratory Practice-IV

Signature


Assignment No: 2

Title: Implement a parser for an expression grammar using YACC and LEX for the subset of C. Cross check your output with Stanford LEX and YACC.

Aim: To understand the basic syntax of YACC specifications, built-in functions and variables.

Objectives:

• To understand the second phase of a compiler: syntax analysis.

• To learn and use compiler writing tools.

• To understand the importance and usage of the YACC automated tool.

Theory:

A parser generator facilitates the construction of the front end of a compiler. YACC is an LALR parser generator that has been used to implement hundreds of compilers. YACC is a command (utility) of the UNIX system. YACC stands for “Yet Another Compiler-Compiler”.

The parser specification is written in a file with the .y extension, e.g. parser.y, which contains the YACC specification of the translator. The UNIX command yacc transforms parser.y into a C program called y.tab.c using the LALR method. The program y.tab.c is generated automatically. We can use the command with the -d option as

yacc -d parser.y

With the -d option two files are generated, namely y.tab.c and y.tab.h. The header file y.tab.h stores all the token information, so you need not create y.tab.h explicitly.

The program y.tab.c is a representation of an LALR parser written in C, along with other C routines that the user may have prepared. By compiling y.tab.c with the ly library, which contains the LR parsing program, using the command

cc y.tab.c -ly

we obtain the desired object program a.out that performs the translation specified by the original program. If other procedures are needed, they can be compiled or loaded with y.tab.c, just as with any C program.


LEX recognizes regular expressions, whereas YACC recognizes entire grammar. LEX divides the
input stream into tokens, while YACC uses these tokens and groups them together logically. LEX
and YACC work together to analyze the program syntactically. The YACC can report conflicts or
ambiguities (if at all) in the form of error messages.

1. YACC Specifications:

The Structure of YACC programs consists of three parts:


Definition Section:

The definition and program sections are optional. The definition section handles control information for the YACC-generated parser and generally sets up the execution environment in which the parser will operate.

Declaration part:

In the declaration section, C declarations are enclosed between %{ and %}. This section is also used to define tokens, the union, types, the start symbol, and the associativity and precedence of operators. Tokens declared in this section can then be used in the second and third parts of the Yacc specification.

Translation Rule Section:

In the part of the Yacc specification after the first %% pair, we put the translation rules. Each rule consists of a grammar production and the associated semantic action. A set of productions that we have been writing as

<left side> → <alt 1> | <alt 2> | … | <alt n>

would be written in YACC as

<left side> : <alt 1> {action 1}
            | <alt 2> {action 2}
            …
            | <alt n> {action n}
            ;

In a Yacc production, unquoted strings of letters and digits not declared to be tokens are taken to be nonterminals. A quoted single character, e.g. 'c', is taken to be the terminal symbol c, as well as the integer code for the token represented by that character (i.e., Lex would return the character code for 'c' to the parser, as an integer). Alternative bodies can be separated by a vertical bar, and a semicolon follows each head with its alternatives and their semantic actions. The first head is taken to be the start symbol.

A Yacc semantic action is a sequence of C statements. In a semantic action, the symbol $$ refers to the attribute value associated with the nonterminal of the head, while $i refers to the value associated with the ith grammar symbol (terminal or nonterminal) of the body. The semantic action is performed whenever we reduce by the associated production, so normally the semantic action computes a value for $$ in terms of the $i's. In the Yacc specification, we have written the two E-productions

E → E + T | T

and their associated semantic actions as:

exp : exp '+' term {$$ = $1 + $3;}
    | term
    ;

In the above production, exp is $1, '+' is $2 and term is $3. The semantic action associated with the first production adds the values of exp and term and copies the result into $$ (the exp on the left-hand side). For the second production we have omitted the semantic action, since it just copies the value. In general, {$$ = $1;} is the default semantic action.
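
The way $$, $1 and $3 carry values can be imitated by hand. The following C sketch evaluates sums of unsigned integers according to the two productions above. It is a hand-written illustration, not the LALR parser that YACC generates; the names parse_term, parse_exp, dollar1 and dollar3 are hypothetical, a term is assumed to be a plain number, and the left-recursive production is realised as a loop.

```c
#include <assert.h>
#include <ctype.h>

/* term : NUMBER   (the value of the number becomes the value of term) */
static int parse_term(const char **s)
{
    int value = 0;

    while (**s == ' ')
        (*s)++;
    assert(isdigit((unsigned char)**s));    /* sketch: no error recovery */
    while (isdigit((unsigned char)**s)) {
        value = value * 10 + (**s - '0');
        (*s)++;
    }
    return value;
}

/*
 * exp : exp '+' term   { $$ = $1 + $3; }
 *     | term           { $$ = $1; }   (default action)
 */
int parse_exp(const char *s)
{
    int result = parse_term(&s);            /* exp : term, so $$ = $1 */

    for (;;) {
        while (*s == ' ')
            s++;
        if (*s != '+')
            break;
        s++;                                /* consume '+', which is $2 */
        int dollar1 = result;               /* $1: value of exp so far  */
        int dollar3 = parse_term(&s);       /* $3: value of term        */
        result = dollar1 + dollar3;         /* $$ = $1 + $3             */
    }
    return result;
}
```

With this sketch, parse_exp("1 + 2 + 3") performs the addition twice, first for 1 + 2 and then for 3 + 3, which mirrors the two reductions by exp : exp '+' term.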

Supporting C-Routines Section:

The third part of a Yacc specification consists of supporting C routines. YACC generates a single function called yyparse(). This function takes no parameters and returns 0 on success or 1 on failure, i.e. when a syntax error is encountered. The special function yyerror() is called when YACC encounters invalid syntax. yyerror() is passed a single string (char *) argument. This function just prints a user-defined message, like:

void yyerror(char *err)
{
    printf("Divide by zero");
}

When LEX and YACC work together, the lexical analyzer, through yylex(), produces pairs consisting of a token and its associated attribute value. If a token such as DIGIT is returned, the value associated with the token is communicated to the parser through the YACC-defined variable yylval. Tokens are returned from LEX to YACC, and they are declared in YACC. To link the two, the LEX program includes the file y.tab.h, which is generated when YACC processes the specification with the -d option.


2. Built-in Functions:

Function    Meaning
yyparse()   The standard parse routine, used to call the syntax analyzer for the given translation rules. When yyparse() is called, the parser attempts to parse an input stream.
yyerror()   Used for displaying an error message when yacc detects a syntax error.

3. Built-in Types:

Type       Meaning
%token     Used to declare the tokens used in the grammar.
           E.g. %token NUMBER
%start     Used to declare the start symbol of the grammar.
           E.g. %start S, where S is the start symbol
%left      Used to assign left associativity to operators.
           E.g. %left '+' '-'   assigns left associativity to + and - with lowest precedence (declared first).
                %left '*' '/'   assigns left associativity to * and / with highest precedence (declared last).
%right     Used to assign right associativity to operators.
           E.g. %right '+' '-'  assigns right associativity to + and - with lowest precedence.
                %right '*' '/'  assigns right associativity to * and / with highest precedence.
%nonassoc  Used to declare operators that do not associate, or to give a precedence level to a token such as a unary operator.
           E.g. %nonassoc UMINUS
%prec      Used to tell the parser to use the precedence of the given token for a rule.
           E.g. %prec UMINUS
%type      Used to declare the type of a nonterminal.
           E.g. %type <name of a union member> exp
%union     Token data types are declared in YACC using the %union declaration, like this:
           %union
           {
               char *str;
               int num;
           }


4. Special Characters:

Character  Meaning
%          A line with two percent signs (%%) separates the parts of a yacc grammar. All declarations in the definition section start with %, including %{ %}, %start, %token, %type, %left, %right, %nonassoc and %union.
$          In an action, a dollar sign introduces a value reference, e.g. $3 is the value of the third symbol on the rule's right-hand side.
' '        Literal tokens are enclosed in single quotes, e.g. '+' or '-' or '*' or '/' etc.
< >        In a value reference in an action, you can override the value's default type by enclosing the type name in angle brackets.
{ }        The C code in an action is enclosed in curly braces.
;          Each rule in the rules section should end with a semicolon, except those that are immediately followed by a rule that starts with a vertical bar.
|          When two consecutive rules have the same left-hand side, the second rule is separated by a vertical bar.
:          In the rules section, a colon separates the left-hand side from the right-hand side.

5. Steps to Execute the Program

$ lex filename.l (e.g. cal.l)

$ yacc -d filename.y (e.g. cal.y)

$ cc lex.yy.c y.tab.c -ll -ly -lm

$ ./a.out


Algorithm:

Write a program to implement YACC for a subset of C (for loop statement).

LEX program:

1. Within %{ and %}, include the header file y.tab.h, which contains the token information, and declare the variable yylval.

2. End the declaration section with %%.

3. Write the regular expressions for: FOR, OB, CB, SM, CON, EQ, ID, NUM, INC, DEC.

4. If a match is found for a regular expression, write an action that stores the token in yylval.p, where p is a pointer declared in YACC, and returns the value of the token.

5. End the rule-action section with %%.

6. The subroutines section is optional.

YACC program:

1. Declare the header files and set flag = 0.

2. Declare the tokens FOR, OB, CB, SM, CON, EQ, ID, NUM, INC, DEC.

3. End the declaration section with %%.

4. State the context-free grammar for the FOR loop in the rules section and write the appropriate action for it. The names OPBR, CLBR, SEMIC and RELOP used in the grammar below correspond to the tokens OB, CB, SM and CON declared above; use one set of names consistently in the actual program.

S : FOR OPBR E1 SEMIC E2 SEMIC E3 CLBR { printf("Accepted!"); flag = 1; }
  ;

E1 : ID EQ NUM
   ;

E2 : ID RELOP ID
   | ID RELOP NUM
   ;

E3 : ID INC
   | ID DEC
   ;

5. End the translation rule section with %%.

6. Define the main() function, which calls yyparse() to parse an input stream.

7. Define the yyerror() function to display an error message when yacc detects a syntax error:
yyerror(const char *msg) { if (flag == 0) printf("\n\n Syntax is Wrong"); }
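
The grammar for the for loop can also be checked by hand with a small recursive-descent recognizer in C. This is only a sketch for understanding the productions, not the LALR parser produced by YACC. The token codes (T_FOR, T_OB and so on) and the function names are hypothetical; in the real program the codes come from y.tab.h and the tokens are delivered by yylex(). It is assumed that OB and CB are the brackets, SM the semicolon and CON the relational operator, and that the token array ends with T_END.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token codes; T_END marks the end of the token list. */
enum { T_END, T_FOR, T_OB, T_CB, T_SM, T_CON, T_EQ, T_ID, T_NUM, T_INC, T_DEC };

/* Consume the expected token if it is next; return 1 on success. */
static int expect(const int **t, int tok)
{
    if (**t != tok)
        return 0;
    (*t)++;
    return 1;
}

/* E1 : ID EQ NUM */
static int parse_e1(const int **t)
{
    return expect(t, T_ID) && expect(t, T_EQ) && expect(t, T_NUM);
}

/* E2 : ID CON ID | ID CON NUM */
static int parse_e2(const int **t)
{
    return expect(t, T_ID) && expect(t, T_CON) &&
           (expect(t, T_ID) || expect(t, T_NUM));
}

/* E3 : ID INC | ID DEC */
static int parse_e3(const int **t)
{
    return expect(t, T_ID) && (expect(t, T_INC) || expect(t, T_DEC));
}

/* S : FOR OB E1 SM E2 SM E3 CB ; returns 1 if the token list is accepted. */
int accept_for(const int *tokens)
{
    const int *t = tokens;

    assert(tokens != NULL);
    return expect(&t, T_FOR) && expect(&t, T_OB) && parse_e1(&t) &&
           expect(&t, T_SM) && parse_e2(&t) && expect(&t, T_SM) &&
           parse_e3(&t) && expect(&t, T_CB) && *t == T_END;
}
```

For a statement such as for(i=0;i<10;i++) the token list FOR OB ID EQ NUM SM ID CON NUM SM ID INC CB is accepted, while the same list with one SM removed is rejected.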

Conclusion:

The yacc command accepts a language that is used to define a grammar for a target language to
be parsed by the tables and code generated by yacc. The language accepted by yacc as a grammar for
the target language is described below using the yacc input language itself.
The input grammar includes rules describing the input structure of the target language and code
to be invoked when these rules are recognized to provide the associated semantic action. The code to be
executed will appear as bodies of text that are intended to be C-language code. The C-language
inclusions are presumed to form a correct function when processed by yacc into its output files.

FAQ’s

1. For which phase of compilation is YACC used?

2. What is the role of a parser? Which kind of parser does YACC generate?

3. How are the tokens generated by LEX passed to YACC?

4. How is y.tab.h generated? What are its contents?

5. Explain the grammar defined in the yacc file.


Assignment No. 5
Title: Implement the front end of a compiler that generates the three address code for a simple language.

Roll No.
Class B.E. (C.E.)
Date
Subject Laboratory Practice-IV

Signature


Assignment No: 5
Title: Implement the front end of a compiler that generates the three address code for a simple language.

Aim: Write an attributed translation grammar to recognize declarations of simple variables, "for",
assignment, if, if-else statements as per syntax of C or Pascal and generate equivalent three address
code for the given input made up of constructs mentioned above using LEX and YACC. Write a code
to store the identifiers from the input in a symbol table and also to record other relevant information
about the identifiers. Display all records stored in the symbol table.

Objectives: To learn the function of a compiler by:

• Understanding the fourth phase of a compiler: intermediate code generation.

• Learning and using compiler writing tools.

• Learning how to write three address code for a given statement.

Theory:

Introduction:

In the analysis-synthesis model of a compiler, the front end analyzes a source program and creates an intermediate representation, from which the back end generates target code. Ideally, details of the source language are confined to the front end, and details of the target machine to the back end. With a suitably defined intermediate representation, a compiler for language i and machine j can then be built by combining the front end for language i with the back end for machine j. This approach to creating a suite of compilers can save a considerable amount of effort: m x n compilers can be built by writing just m front ends and n back ends.

Benefits of using a machine-independent intermediate form are:


1. Retargeting is facilitated. That is, a compiler for a different machine can be created by attaching
a back end for the new machine to an existing front end.
2. A machine-independent code optimizer can be applied to the intermediate representation


Intermediate Languages:
Three ways of intermediate representation:
• Syntax tree
• Postfix notation
• Three address code
The semantic rules for generating three-address code from common programming language constructs are similar to those for constructing syntax trees or for generating postfix notation.
Graphical Representations:
1. Syntax tree:
A syntax tree depicts the natural hierarchical structure of a source program. A DAG (Directed Acyclic Graph) gives the same information but in a more compact way, because common subexpressions are identified. A syntax tree and DAG for the assignment statement a := b * -c + b * -c are as follows:

[Figure: syntax tree and DAG for a := b * -c + b * -c]

2. Postfix notation:
Postfix notation is a linearized representation of a syntax tree; it is a list of the nodes of the tree in which a node appears immediately after its children. The postfix notation for the syntax tree given above is
a b c uminus * b c uminus * + assign

3. Three-Address Code:
Three-address code is a sequence of statements of the general form
x := y op z
where x, y and z are names, constants, or compiler-generated temporaries; op stands for any operator, such as a fixed- or floating-point arithmetic operator, or a logical operator on Boolean-valued data. Thus a source language expression like x + y * z might be translated into the sequence
t1 := y * z
t2 := x + t1
where t1 and t2 are compiler-generated temporary names.
The reason for the term “three-address code” is that each statement usually contains three addresses, two for the operands and one for the result.
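
The translation of x + y * z shown above can be reproduced by a short C sketch that walks an expression and emits one three-address statement per operator. This is an illustration under simplifying assumptions, not the LEX and YACC program asked for in the assignment: operands are single letters or digits, only + - * / are handled, there are no parentheses, and the names Gen, emit and gen_tac are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    const char *src;   /* remaining input */
    char *out;         /* output buffer for the generated code */
    size_t cap;        /* capacity of out */
    int next_temp;     /* counter for t1, t2, ... */
} Gen;

static void skip_ws(Gen *g) { while (*g->src == ' ') g->src++; }

/* Append "tN := a op b" to the output and store the name tN in result. */
static void emit(Gen *g, char *result, const char *a, char op, const char *b)
{
    size_t used = strlen(g->out);

    sprintf(result, "t%d", ++g->next_temp);
    snprintf(g->out + used, g->cap - used, "%s := %s %c %s\n", result, a, op, b);
}

/* factor : a single letter or digit (kept tiny on purpose) */
static void factor(Gen *g, char *place)
{
    skip_ws(g);
    assert(isalnum((unsigned char)*g->src));   /* sketch: no error recovery */
    place[0] = *g->src++;
    place[1] = '\0';
}

/* term : factor (('*' | '/') factor)* */
static void term(Gen *g, char *place)
{
    char rhs[16], tmp[16];

    factor(g, place);
    for (skip_ws(g); *g->src == '*' || *g->src == '/'; skip_ws(g)) {
        char op = *g->src++;
        factor(g, rhs);
        emit(g, tmp, place, op, rhs);
        strcpy(place, tmp);                    /* result feeds the next step */
    }
}

/* expr : term (('+' | '-') term)* ; the generated code is written to out */
void gen_tac(const char *src, char *out, size_t cap)
{
    Gen g = { src, out, cap, 0 };
    char place[16], rhs[16], tmp[16];

    assert(cap > 0);
    out[0] = '\0';
    term(&g, place);
    for (skip_ws(&g); *g.src == '+' || *g.src == '-'; skip_ws(&g)) {
        char op = *g.src++;
        term(&g, rhs);
        emit(&g, tmp, place, op, rhs);
        strcpy(place, tmp);
    }
}
```

Calling gen_tac("x + y*z", buf, sizeof buf) fills buf with the two statements t1 := y * z and t2 := x + t1, one per line. Because * and / are handled in term, they are translated before + and -.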

Advantages of three-address code:

• The unraveling of complicated arithmetic expressions and of nested flow-of-control statements makes three-address code desirable for target code generation and optimization.

• The use of names for the intermediate values computed by a program allows three-address code to be easily rearranged, unlike postfix notation.

Three-address code is a linearized representation of a syntax tree or a DAG in which explicit names correspond to the interior nodes of the graph. Both the syntax tree and the DAG can therefore be represented by three-address code sequences. Variable names can appear directly in three-address statements.

Types of Three-Address Statements:

The common three-address statements are:

• Assignment statements of the form x := y op z, where op is a binary arithmetic or logical operation.

• Assignment instructions of the form x := op y, where op is a unary operation. Essential unary operations include unary minus, logical negation, shift operators, and conversion operators that, for example, convert a fixed-point number to a floating-point number.

• Copy statements of the form x := y, where the value of y is assigned to x.

• The unconditional jump goto L. The three-address statement with label L is the next to be executed.

• Conditional jumps such as if x relop y goto L. This instruction applies a relational operator (<, =, >=, etc.) to x and y, and executes the statement with label L next if x stands in relation relop to y. If not, the three-address statement following if x relop y goto L is executed next, as in the usual sequence.

• param x and call p, n for procedure calls, and return y, where y, representing a returned value, is optional. For example,

param x1
param x2
...
param xn
call p, n

is generated as part of a call of the procedure p(x1, x2, …, xn).

• Indexed assignments of the form x := y[i] and x[i] := y.

• Address and pointer assignments of the form x := &y, x := *y, and *x := y.

Implementation of Three-Address Statements:

A three-address statement is an abstract form of intermediate code. In a compiler, these statements can
be implemented as records with fields for the operator and the operands. Three such representations
are: Quadruples, Triples, Indirect triples.

A. Quadruples:

• A quadruple is a record structure with four fields: op, arg1, arg2 and result.

• The op field contains an internal code for the operator. The three-address statement x := y op z is represented by placing y in arg1, z in arg2 and x in result.

• The contents of the fields arg1, arg2 and result are normally pointers to the symbol-table entries for the names represented by these fields. If so, temporary names must be entered into the symbol table as they are created.

• Fig. a) shows the quadruples for the assignment a := b * -c + b * -c.
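
As a sketch of how quadruples might be stored, the C fragment below holds the quadruples for the assignment a := b * -c + b * -c used earlier in this section and prints any one of them back as a three-address statement. The type name Quad, the array quads and the function quad_to_tac are hypothetical, and plain strings stand in for the symbol-table pointers mentioned above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One quadruple: operator, two arguments and the result name. */
typedef struct {
    const char *op;
    const char *arg1;
    const char *arg2;    /* "" when the operator takes one argument */
    const char *result;
} Quad;

/* Quadruples for  a := b * -c + b * -c  */
const Quad quads[] = {
    { "uminus", "c",  "",   "t1" },
    { "*",      "b",  "t1", "t2" },
    { "uminus", "c",  "",   "t3" },
    { "*",      "b",  "t3", "t4" },
    { "+",      "t2", "t4", "t5" },
    { ":=",     "t5", "",   "a"  },
};
enum { NQUADS = sizeof quads / sizeof quads[0] };

/* Render one quadruple as a three-address statement into buf. */
void quad_to_tac(const Quad *q, char *buf, size_t cap)
{
    assert(q != NULL && buf != NULL && cap > 0);
    if (strcmp(q->op, ":=") == 0)               /* copy statement   */
        snprintf(buf, cap, "%s := %s", q->result, q->arg1);
    else if (q->arg2[0] == '\0')                /* unary operation  */
        snprintf(buf, cap, "%s := %s %s", q->result, q->op, q->arg1);
    else                                        /* binary operation */
        snprintf(buf, cap, "%s := %s %s %s", q->result, q->arg1, q->op, q->arg2);
}
```

Looping over the NQUADS entries and calling quad_to_tac on each one prints the six statements from t1 := uminus c to a := t5 in order.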


B. Triples:

• To avoid entering temporary names into the symbol table, we might refer to a temporary value by the position of the statement that computes it.

• If we do so, three-address statements can be represented by records with only three fields: op, arg1 and arg2.

• The fields arg1 and arg2, for the arguments of op, are either pointers to the symbol table or pointers into the triple structure (for temporary values).

• Since three fields are used, this intermediate code format is known as triples.

• Fig. b) shows the triples for the assignment statement a := b * -c + b * -c.

C. Indirect triples:

• Indirect triple representation lists pointers to triples rather than listing the triples themselves.

• We use an array, statement, to list pointers to triples in the desired order.

• Fig. c) shows the indirect triple representation.
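
The difference between triples and indirect triples can be sketched in C as follows, again for the assignment a := b * -c + b * -c. The type names Arg and Triple, the macros NAME, REF and NONE, and the function format_arg are hypothetical, and the array statement holds indices rather than real pointers to keep the example short.

```c
#include <assert.h>
#include <stdio.h>

/* An argument is either a name or a reference to an earlier triple. */
typedef struct {
    int is_ref;          /* 1: ref holds a triple index; 0: name is used */
    int ref;
    const char *name;
} Arg;

typedef struct {
    const char *op;
    Arg arg1;
    Arg arg2;
} Triple;

#define NAME(s) { 0, 0, s }
#define REF(i)  { 1, i, "" }
#define NONE    { 0, 0, "" }

/* Triples for  a := b * -c + b * -c ; temporaries are triple positions. */
const Triple triples[] = {
    /* (0) */ { "uminus", NAME("c"), NONE   },
    /* (1) */ { "*",      NAME("b"), REF(0) },
    /* (2) */ { "uminus", NAME("c"), NONE   },
    /* (3) */ { "*",      NAME("b"), REF(2) },
    /* (4) */ { "+",      REF(1),    REF(3) },
    /* (5) */ { "assign", NAME("a"), REF(4) },
};

/* Indirect triples: a separate list gives the order of execution. */
const int statement[] = { 0, 1, 2, 3, 4, 5 };

/* Format one argument as "name" or "(index)". */
void format_arg(const Arg *a, char *buf, size_t cap)
{
    assert(a != NULL && buf != NULL && cap > 0);
    if (a->is_ref)
        snprintf(buf, cap, "(%d)", a->ref);
    else
        snprintf(buf, cap, "%s", a->name);
}
```

With plain triples, moving a statement changes its position and therefore every reference to it. With the statement array, the order of execution can be changed by rearranging the entries of statement while the triples themselves stay where they are.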


Steps to Execute the Program

$ lex filename.l (e.g. comp.l)

$ yacc -d filename.y (e.g. comp.y)

$ cc lex.yy.c y.tab.c -ll -ly -lm

$ ./a.out

Algorithm:

Write a LEX and YACC program to generate intermediate code for an arithmetic expression.

LEX program:

1. Declare the header files, in particular y.tab.h, which contains the declarations for Letter, Digit and expr.

2. End the declaration section with %%.

3. Match the regular expressions.

4. If a match is found, convert it into char and store it in yylval.p, where p is a pointer declared in YACC.

5. Return the token.

6. If the input contains a newline character (\n), return 0.

7. If the input matches the pattern . (any other character), return yytext[0].

8. End the rule-action section with %%.

9. Declare the main function:

a. open the file given on the command line

b. if any error occurs, print the error and exit

c. assign the file pointer fp to yyin

d. call the function yylex until the file ends

10. End

YACC program:

1. Declare the header files.

2. Declare a structure for the three address code representation with the fields argument1, argument2, operator and result.

3. Declare a pointer of char type in the union.

4. Declare expr to be of the type of pointer p.

5. Give precedence to '*', '/'.

6. Give precedence to '+', '-'. In YACC, operators declared later have higher precedence, so the declaration for '+', '-' must appear before the one for '*', '/'.

7. End the declaration section with %%.

8. When the final expression has been evaluated, add it to the table of three address code.

9. If the input is an expression of the form

a. exp '+' exp, add argument1, argument2 and the operator to the table.

b. exp '-' exp, add argument1, argument2 and the operator to the table.

c. exp '*' exp, add argument1, argument2 and the operator to the table.

d. exp '/' exp, add argument1, argument2 and the operator to the table.

e. (exp), assign $2 to $$.

f. Digit or Letter, assign $1 to $$.

10. End the section with %%.

11. Declare FILE *yyin externally.

12. Declare the main function and call the yyparse function until yyin ends.

13. Declare yyerror to handle any error that occurs.

14. Declare a char pointer s to hold the error message.

15. Print the error message.

16. End of the program.

The Addtotable function adds argument1, argument2, the operator and the temporary variable to the structure array of three address code.

The three address code function prints the values from the structure in the order: temporary variable, argument1, operator, argument2.

The Quadruple function prints the values from the structure in the order: operator, argument1, argument2, result.

The Triple function prints the values from the structure in the order: argument1, argument2, operator. The temporary variables in this form are integer indices instead of variable names.

Conclusion:

Thus we have studied how to generate intermediate code. A three-address statement is an abstract form of intermediate code. In a compiler, these statements can be implemented as records with fields for the operator and the operands. The three such representations that we have studied are quadruples, triples and indirect triples.

FAQ’s

1. What are the different forms of intermediate code generation (ICG)?

2. What is the difference between a syntax tree and a DAG?

3. What are the advantages of three-address code?

4. Which representation of three-address code is better than the others, and why? Justify.

5. What is the role of intermediate code in a compiler?
