Unit 4
Intermediate code
Intermediate code is used to translate the source code into the machine code.
Intermediate code lies between the high-level language and the machine language.
• With intermediate code, only the second phase of the compiler, the synthesis phase,
has to be changed for each target machine.
• If the compiler directly translates source code into the machine code without
generating intermediate code then a full native compiler is required for each new
machine.
• The intermediate code keeps the analysis portion the same for all target machines,
which is why a full compiler is not needed for every unique machine.
Syntax Trees
A syntax tree is a graphical representation of the source program. Each interior node
represents an operator, and the children of the node represent its operands. It is a
hierarchical structure that is constructed according to the syntax rules of the language.
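A syntax tree of this kind can be sketched in a few lines. The following is only an illustration under stated assumptions: the class name Node, the helper show, and the expression a + b * c are made up for this example and are not part of any particular compiler.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    label: str                       # operator symbol or operand name
    left: Optional["Node"] = None    # first operand (None for a leaf)
    right: Optional["Node"] = None   # second operand (None for a leaf)

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def show(node: Node, depth: int = 0) -> str:
    """Return an indented rendering of the tree, one node per line."""
    lines = ["  " * depth + node.label]
    for child in (node.left, node.right):
        if child is not None:
            lines.append(show(child, depth + 1))
    return "\n".join(lines)


# Syntax tree for a + b * c: '*' binds tighter, so it sits below '+'.
tree = Node("+", Node("a"), Node("*", Node("b"), Node("c")))
print(show(tree))
```

The operator + is the root, its operands are the leaf a and the subtree for b * c, which mirrors the hierarchy imposed by operator precedence.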
Postfix Notation
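Postfix notation is a linear representation of a syntax tree. It is obtained by
traversing the tree in post order, so that each node appears immediately after its
children. The edges of the syntax tree do not appear explicitly in postfix notation;
they can be recovered from the order in which the nodes appear and the number of
operands that each operator expects.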
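As a small sketch of the post-order traversal, assume a tree written as nested tuples (op, left, right) with plain strings as operands. This encoding and the function names postfix and rebuild are assumptions made for the example. Because every operator here takes exactly two operands, a stack is enough to rebuild the tree from the postfix form even though the edges are not written down.

```python
def postfix(tree):
    """Post-order traversal: both operands first, then the operator."""
    if isinstance(tree, str):            # leaf: an operand
        return [tree]
    op, left, right = tree
    return postfix(left) + postfix(right) + [op]


def rebuild(tokens, operators=("+", "-", "*", "/")):
    """Recover the tree from postfix, assuming every operator is binary."""
    stack = []
    for tok in tokens:
        if tok in operators:
            right = stack.pop()          # operands come off in reverse order
            left = stack.pop()
            stack.append((tok, left, right))
        else:
            stack.append(tok)
    return stack.pop()


tree = ("+", "a", ("*", "b", "c"))       # a + b * c
tokens = postfix(tree)
print(" ".join(tokens))                  # a b c * +
```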
Three address code
Three address code is a type of intermediate code that is easy to generate and can
be easily converted to machine code. Each instruction uses at most three addresses and
one operator, and the value computed by each instruction is stored in a temporary
variable generated by the compiler. The order of the operations is decided by the
compiler and is made explicit by the sequence of three address instructions.
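To make this concrete, here is a minimal sketch that turns an expression tree into three address instructions. The nested-tuple tree encoding, the function name gen_tac, and the temporary names t1, t2, ... are assumptions chosen for the example.

```python
import itertools


def gen_tac(tree):
    """Return (code, place): the instruction list and the name holding the result."""
    counter = itertools.count(1)
    code = []

    def walk(node):
        if isinstance(node, str):        # operand: already has a name
            return node
        op, left, right = node
        l = walk(left)                   # operands are evaluated first
        r = walk(right)
        temp = f"t{next(counter)}"       # fresh compiler-generated temporary
        code.append(f"{temp} = {l} {op} {r}")
        return temp

    place = walk(tree)
    return code, place


# a + b * c - d
tree = ("-", ("+", "a", ("*", "b", "c")), "d")
code, place = gen_tac(tree)
for line in code:
    print(line)
# t1 = b * c
# t2 = a + t1
# t3 = t2 - d
```

Each instruction has one operator and at most three addresses (two operands and one result), and the order of the list fixes the order of evaluation.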
1. Quadruple – It is a structure which consists of 4 fields, namely op, arg1, arg2 and
result. op denotes the operator, arg1 and arg2 denote the two operands, and
result is used to store the result of the expression.
3. Indirect Triples – This representation uses a list of pointers into a listing of all
computations, which is built and stored separately. It is similar in utility to the
quadruple representation but requires less space.
Temporaries are implicit, and it is easier to rearrange the code.
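The two representations can be compared on one statement, x = (a + b) * (c - d). This is a sketch under assumptions: the names Quad, Triple, Ref, and run are illustrative, and the triple table uses the usual form (op, arg1, arg2) in which an argument may refer to an earlier computation by its position instead of by a temporary name.

```python
import operator
from typing import NamedTuple, Optional, Union


class Quad(NamedTuple):
    op: str
    arg1: str
    arg2: Optional[str]
    result: str


class Ref(NamedTuple):
    index: int                     # position of another entry in the table


class Triple(NamedTuple):
    op: str
    arg1: Union[str, Ref]
    arg2: Union[str, Ref, None]


# Quadruples: every intermediate value gets an explicit temporary.
quads = [
    Quad("+", "a", "b", "t1"),
    Quad("-", "c", "d", "t2"),
    Quad("*", "t1", "t2", "t3"),
    Quad("=", "t3", None, "x"),
]

# Indirect triples: a table of computations plus a separate list of pointers.
table = [
    Triple("+", "a", "b"),
    Triple("-", "c", "d"),
    Triple("*", Ref(0), Ref(1)),   # temporaries are implicit: refer by position
    Triple("=", "x", Ref(2)),
]
order = [0, 1, 2, 3]               # statement list: execution order

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def run(order, table, env):
    """Evaluate the table entries in the given order; return the final variables."""
    env = dict(env)
    values = {}                    # table index -> computed value

    def value_of(arg):
        return values[arg.index] if isinstance(arg, Ref) else env[arg]

    for i in order:
        t = table[i]
        if t.op == "=":
            env[t.arg1] = value_of(t.arg2)
        else:
            values[i] = OPS[t.op](value_of(t.arg1), value_of(t.arg2))
    return env


env = {"a": 1, "b": 2, "c": 7, "d": 3}
print(run(order, table, env)["x"])          # 12
print(run([1, 0, 2, 3], table, env)["x"])   # 12, table itself is untouched
```

Swapping the two independent computations only changes the pointer list; the table entries and the positions they refer to stay as they are, which is what makes rearranging code easier.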
A directed acyclic graph (DAG) is a data structure that is used to apply
transformations to basic blocks.
A DAG is an efficient way of identifying common sub-expressions.
It also shows how the value computed by a statement is used in subsequent
statements.
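How a DAG exposes common sub-expressions can be sketched as follows. The statement format (result, op, arg1, arg2), the function name build_dag, and the four-statement block are assumptions for illustration. A node is reused whenever the same operator is applied to the same child nodes.

```python
def build_dag(block):
    """block: list of (result, op, arg1, arg2). Returns (nodes, current)."""
    nodes = []        # node id -> ("leaf", name) or (op, left_id, right_id)
    index = {}        # node key -> node id, so identical nodes are shared
    current = {}      # variable name -> id of the node holding its current value

    def node_for(key):
        if key not in index:
            index[key] = len(nodes)
            nodes.append(key)
        return index[key]

    def operand(name):
        if name not in current:               # first use: value on block entry
            current[name] = node_for(("leaf", name))
        return current[name]

    for result, op, a1, a2 in block:
        key = (op, operand(a1), operand(a2))
        current[result] = node_for(key)       # reuse the node if it exists
    return nodes, current


block = [
    ("a", "+", "b", "c"),
    ("b", "-", "a", "d"),
    ("c", "+", "b", "c"),   # not the same as statement 1: b has changed
    ("d", "-", "a", "d"),   # same as statement 2: a and d are unchanged
]
nodes, current = build_dag(block)
print(len(nodes))                      # 6
print(current["b"] == current["d"])    # True
```

Here b and d end up attached to the same node, so a - d needs to be computed only once, while the two b + c statements get different nodes because b was redefined in between.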
Q3. Backpatching
Backpatching is the process of filling in information that was left unspecified,
namely the target labels of jumps. It is carried out by suitable semantic actions
during code generation. When three address code is produced for an expression, the
address of the label in a goto statement may not be known yet, and assigning the
positions of these labels in a single pass is difficult. The jump addresses are
therefore left unspecified when the jumps are first generated and are filled in
later, once the targets are known.
The main problem with generating code for boolean expressions and flow-of-control
statements in a single pass is that, at the time the jump statements are generated,
we may not yet know the labels that control must go to.
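The idea can be sketched for the statement if (a < b or c < d) x = 1. The class Emitter and the helpers rel and translate_if_or are hypothetical names for this example. Jumps are emitted with an empty target, their indices are kept in lists, and the lists are patched as soon as the target index becomes known.

```python
class Emitter:
    def __init__(self):
        self.code = []                  # entries: [text, target]; None = unfilled

    def next_index(self):
        return len(self.code)

    def emit(self, text):
        self.code.append([text, None])
        return len(self.code) - 1       # index of the new instruction

    def backpatch(self, jumps, target):
        for i in jumps:                 # fill the target of every listed jump
            self.code[i][1] = target

    def listing(self):
        return [f"{i}: {text}" + ("" if target is None else f" {target}")
                for i, (text, target) in enumerate(self.code)]


def rel(em, cond):
    """Emit a test with two unfilled jumps; return (truelist, falselist)."""
    truelist = [em.emit(f"if {cond} goto")]
    falselist = [em.emit("goto")]
    return truelist, falselist


def translate_if_or(em, cond1, cond2, stmt):
    t1, f1 = rel(em, cond1)
    em.backpatch(f1, em.next_index())         # cond1 false: try cond2 next
    t2, f2 = rel(em, cond2)
    truelist = t1 + t2                        # either test true enters the body
    em.backpatch(truelist, em.next_index())   # body starts here
    em.emit(stmt)
    em.backpatch(f2, em.next_index())         # both false: skip past the body


em = Emitter()
translate_if_or(em, "a < b", "c < d", "x = 1")
print("\n".join(em.listing()))
# 0: if a < b goto 4
# 1: goto 2
# 2: if c < d goto 4
# 3: goto 5
# 4: x = 1
```

Target 5 is the index of whatever instruction follows the if statement; it is known only after the body has been emitted, which is exactly the situation backpatching handles.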
Q4. CODE GENERATION
The final phase in the compiler model is the code generator. It takes as input an
intermediate representation of the source program and produces as output an
equivalent target program. The code generation techniques presented below can be
used whether or not an optimizing phase occurs before code generation.
The first step is to divide a sequence of three-address instructions into basic blocks. A
new basic block always begins with the first instruction and keeps adding
instructions until it reaches a jump or a label. If there are no jumps or labels,
control flows from one instruction to the next in sequential order.
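A minimal sketch of this partitioning is given below. It assumes instructions are plain strings, that a label is a line ending in ":", and that any instruction containing "goto" is a jump. These conventions and the function name split_basic_blocks are assumptions for the example.

```python
def split_basic_blocks(instructions):
    """Split a list of three-address instructions into basic blocks."""
    blocks, current = [], []
    for instr in instructions:
        if instr.endswith(":") and current:   # a label starts a new block
            blocks.append(current)
            current = []
        current.append(instr)
        if "goto" in instr:                   # a jump ends the current block
            blocks.append(current)
            current = []
    if current:                               # flush the last open block
        blocks.append(current)
    return blocks


tac = [
    "i = 1",
    "L1:",
    "t1 = i * 4",
    "i = i + 1",
    "if i <= 10 goto L1",
    "x = t1",
]
for n, block in enumerate(split_basic_blocks(tac), start=1):
    print(f"B{n}: {block}")
# B1: ['i = 1']
# B2: ['L1:', 't1 = i * 4', 'i = i + 1', 'if i <= 10 goto L1']
# B3: ['x = t1']
```

Within each block, control enters at the first instruction and leaves at the last, with no jumps in or out in between.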