Unit 4 and 5
1. Parse tree:
● A parse tree is the graphical representation of a derivation. Each node is labelled
with a grammar symbol, which can be a terminal or a non-terminal.
● In parsing, the string is derived using the start symbol. The root of the parse tree
is that start symbol.
● A parse tree follows the precedence of operators. The deepest sub-tree is traversed
first, so the operator in a deeper sub-tree is applied before the operator in its parent node.
Example:
Production rules:
1. S → S + S | S * S
2. S → a | b | c
Input: a*b+c
Steps 1 to 5: (figures showing the parse tree being built one step at a time)
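The derivation of a*b+c can be sketched in code. This is a minimal illustration, not a general parser: the grammar above is ambiguous, so the sketch assumes that * binds tighter than + and that both are left-associative. The function names parse and leaves are hypothetical.

```python
# Parse tree for the grammar  S -> S + S | S * S | a | b | c.
# Assumption: '*' has higher precedence than '+', both left-associative.

def parse(text):
    """Return a parse tree as nested tuples ('S', child, ...)."""
    tokens = list(text.replace(" ", ""))
    pos = 0

    def operand():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] not in "abc":
            raise SyntaxError("expected a, b or c")
        tok = tokens[pos]
        pos += 1
        return ("S", tok)                       # S -> a | b | c

    def product():
        nonlocal pos
        node = operand()
        while pos < len(tokens) and tokens[pos] == "*":
            pos += 1
            node = ("S", node, "*", operand())  # S -> S * S
        return node

    def total():
        nonlocal pos
        node = product()
        while pos < len(tokens) and tokens[pos] == "+":
            pos += 1
            node = ("S", node, "+", product())  # S -> S + S
        return node

    tree = total()
    if pos != len(tokens):
        raise SyntaxError("unexpected trailing input")
    return tree


def leaves(node):
    """Read the leaves left to right; they spell out the input string."""
    if isinstance(node, str):
        return node
    return "".join(leaves(child) for child in node[1:])


tree = parse("a*b+c")
print(tree)
print(leaves(tree))
```

The root is the start symbol S, and the sub-tree for a*b sits deeper than the + node, which matches the rule that the deepest sub-tree is handled first.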
2. Syntax Tree:
Example-
Rules for constructing a syntax tree:
● mknode(op, left, right) − Creates an operator node with label op and two
fields holding pointers to the left and right children.
● mkleaf(id, entry) − Creates an identifier node with label id and a field
holding entry, a pointer to the symbol table entry for the identifier.
● mkleaf(num, val) − Creates a number node with label num and a field
holding val, the value of the number.
The tree is built bottom-up. The calls mkleaf(id, entry a) and mkleaf(num, 4)
construct the leaves for a and 4, and the pointers to these nodes are stored
in p1 and p2. The call mknode('−', p1, p2) then builds the operator node with these
two leaves as children. The remaining nodes are built in the same way.
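The construction functions above can be sketched as follows. The expression a − 4 + c, the Node class, the toy symbol table, and the show helper are assumptions made for illustration; only mknode and mkleaf come from the notes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    label: str                       # operator symbol, 'id', or 'num'
    left: Optional["Node"] = None    # pointer to left child (operator nodes)
    right: Optional["Node"] = None   # pointer to right child (operator nodes)
    value: object = None             # symbol-table entry or numeric value


def mknode(op, left, right):
    """Operator node with pointers to its two children."""
    return Node(op, left, right)


def mkleaf(label, value):
    """Leaf for an identifier ('id', entry) or a number ('num', val)."""
    return Node(label, value=value)


# Toy symbol table (assumption): one entry per identifier.
symtab = {"a": {"name": "a"}, "c": {"name": "c"}}

# Bottom-up construction for the assumed expression a - 4 + c.
p1 = mkleaf("id", symtab["a"])
p2 = mkleaf("num", 4)
p3 = mknode("-", p1, p2)
p4 = mkleaf("id", symtab["c"])
p5 = mknode("+", p3, p4)


def show(node):
    """Print the tree back as a fully parenthesized expression."""
    if node.label == "id":
        return node.value["name"]
    if node.label == "num":
        return str(node.value)
    return f"({show(node.left)} {node.label} {show(node.right)})"


print(show(p5))  # ((a - 4) + c)
```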
3. Parse Trees Vs Syntax Trees-
Parse Tree | Syntax Tree
Parse trees are comparatively less dense than syntax trees. | Syntax trees are comparatively more dense than parse trees.
● Every non-terminal can have zero, one, or more attributes, depending on the
type of the attribute.
● The values of these attributes are evaluated by the semantic rules associated with
the production rule.
● In a semantic rule, the attribute is written VAL. An attribute may hold anything, such as a
string, a number, a memory location, or a complex record.
Example
5. Attribute:
● Semantic information is stored in attributes associated with terminal and
non-terminal symbols.
● The attributes are divided into two groups: synthesized attributes and inherited
attributes.
● A → XY
1. Synthesized attributes:
● A Synthesized attribute is an attribute of the non-terminal on the left-hand
side of a production.
● Synthesized attributes represent information that is being passed up the
parse tree.
● The attribute can take value only from its children.
● For example, if A → BC is a production of a grammar and A's attribute
depends on B's or C's attributes, then it is a synthesized
attribute.
● To illustrate, assume the following production:
S → ABC
● If S takes values from its child nodes (A, B, C), then it is said to be a
synthesized attribute.
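A synthesized attribute can be sketched as a bottom-up evaluation in which each node computes its value from its children only. The tuple-based tree, the attribute name val, and the sample expression 2 + 3 * 4 are assumptions for illustration.

```python
# Synthesized attribute 'val': a node's value depends only on its children.

def val(node):
    if isinstance(node, (int, float)):   # leaf: value supplied by the lexer
        return node
    op, left, right = node
    left_val = val(left)                 # evaluate the children first
    right_val = val(right)
    if op == "+":
        return left_val + right_val      # E.val = E1.val + E2.val
    if op == "*":
        return left_val * right_val      # E.val = E1.val * E2.val
    raise ValueError(f"unknown operator {op!r}")


tree = ("+", 2, ("*", 3, 4))             # 2 + 3 * 4
print(val(tree))                         # 14
```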
2. Inherited attributes:
● An attribute of a non-terminal on the right-hand side of a production is
called an inherited attribute.
● The attribute can take its value either from its parent or from its siblings.
● For example, if A → BC is a production of a grammar and B's
attribute depends on A's or C's attributes, then it is an
inherited attribute.
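A common illustration of inherited attributes is a declaration such as int a, b, c, where the type is passed from the type specifier to each identifier in the list. The grammar D → T L, L → id , L1 | id and the names declare and pass_down are assumptions chosen for this sketch; they do not appear in the notes.

```python
def declare(type_name, names, symtab):
    """D -> T L : L.in = T.type (L inherits from its sibling T)."""
    pass_down(names, type_name, symtab)


def pass_down(names, inherited_type, symtab):
    """L -> id , L1 : L1.in = L.in (L1 inherits from its parent L)."""
    if not names:
        return
    head, *rest = names
    symtab[head] = inherited_type        # record the type for this id
    pass_down(rest, inherited_type, symtab)


symtab = {}
declare("int", ["a", "b", "c"], symtab)
print(symtab)  # {'a': 'int', 'b': 'int', 'c': 'int'}
```

The type flows downward and sideways through the tree, never upward from the identifiers, which is what distinguishes it from a synthesized attribute.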
grammar rules. Syntax trees are parsed top-down and left to right. Whenever reduction
L-attributed SDT:
● If an SDT uses both synthesized attributes and inherited attributes, with the
restriction that an inherited attribute can inherit values from left siblings only, it
is called an L-attributed SDT.
● In L-attributed SDTs, a non-terminal can get values from its parent, child, and
sibling nodes, as in the following production:
● S → ABC
● S can take values from A, B, and C (synthesized). A can take values from S only.
B can take values from S and A. C can get values from S, A, and B. No
non-terminal can get values from the sibling to its right.
● For example,
A → XYZ {Y.S = A.S, Y.S = X.S, Y.S = Z.S}
is not an L-attributed grammar. Y.S = A.S and Y.S = X.S are allowed,
but Y.S = Z.S violates the L-attributed SDT definition because the attribute
inherits its value from its right sibling.
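The left-sibling restriction can be checked mechanically. The sketch below is a simplified check that follows the definition as stated in these notes; the function name violations and the (target, source) pair encoding of semantic rules are assumptions, and the body symbols of a production are assumed to be distinct.

```python
def violations(head, body, rules):
    """Return the rules in which a body symbol reads from a right sibling.

    head  : left-hand side symbol, e.g. "A"
    body  : right-hand side symbols in order, e.g. ["X", "Y", "Z"]
    rules : (target, source) pairs such as ("Y.S", "Z.S")
    """
    bad = []
    for target, source in rules:
        target_sym = target.split(".")[0]
        source_sym = source.split(".")[0]
        if target_sym == head:
            continue                     # synthesized: may use any child
        if source_sym == head:
            continue                     # inherited from the parent: allowed
        if body.index(source_sym) > body.index(target_sym):
            bad.append((target, source)) # value comes from a right sibling
    return bad


rules = [("Y.S", "A.S"), ("Y.S", "X.S"), ("Y.S", "Z.S")]
print(violations("A", ["X", "Y", "Z"], rules))  # [('Y.S', 'Z.S')]
```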
7. Three Address code:
● Three-address code is an intermediate code used by optimizing
compilers.
● In three-address code, the given expression is broken down into several separate
instructions. These instructions translate easily into assembly language.
● Each three-address code instruction has at most three operands; some have only
two. A typical instruction combines an assignment with a binary operator.
Example−
t1 = b + c
t2 = t1 + d
a = t2
where t1 and t2 are temporary variables generated by the compiler. Most of the time a
statement includes fewer than three references, but it is still known as a three-address
statement.
x = y op z and x = op y
Here, the result of evaluating the expression on the right side of the assignment
is assigned to x.
x = y: the value of y is assigned to x.
2. Unconditional Jump-
3. Conditional Jump-
If x relop y goto X
Here,
4. Procedure Call-
Param x1
Param x2
Param xn
5. Array Statements −
Problem-01:
Write Three Address Code for the following expression-
a=b+c+d
Three Address Code for the given expression is-
(1) T1 = b + c
(2) T2 = T1 + d
(3) a = T2
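The solution to Problem-01 can be reproduced with a small generator. This is a sketch under stated assumptions: it borrows Python's own ast module as the parser, handles only names, constants, and the four binary operators listed in OPS, and the function name three_address is hypothetical.

```python
import ast

# Operators this sketch understands (assumption: arithmetic only).
OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}


def three_address(source):
    """Translate one assignment such as 'a = b + c + d' into three-address code."""
    code = []
    counter = 0

    def newtemp():
        nonlocal counter
        counter += 1
        return f"T{counter}"

    def gen(node):
        """Emit code for node and return the name that holds its value."""
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return str(node.value)
        if isinstance(node, ast.BinOp):
            left = gen(node.left)
            right = gen(node.right)
            temp = newtemp()
            code.append(f"{temp} = {left} {OPS[type(node.op)]} {right}")
            return temp
        raise NotImplementedError(type(node).__name__)

    statement = ast.parse(source).body[0]
    if not isinstance(statement, ast.Assign):
        raise ValueError("expected a single assignment")
    place = gen(statement.value)
    code.append(f"{statement.targets[0].id} = {place}")
    return code


for number, line in enumerate(three_address("a = b + c + d"), start=1):
    print(f"({number}) {line}")
```

Each binary operator gets its own temporary, so every emitted instruction has at most three addresses.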
Problem-02:
Write Three Address Code for the following expression-
Solution-
Three Address Code for the given expression is-
(2) T1 = 0
(4) T1 = 1
(5)
Problem-03:
Write Three Address Code for the following expression-
Solution-
Three Address Code for the given expression is-
(1) If (A < B) goto (3)
(4) t = 0
(6) t = 1
(7)
There are three implementations used for three address code statements which are as
follows −
● Quadruples
● Triples
● Indirect Triples
Quadruples
A quadruple is a structure that contains at most four fields: operator, argument 1,
argument 2, and result.
a=b+c*d
t1 = c ∗ d
t2 = b + t1
a = t2.
representation as follows−
Quadruple
       Operator   arg 1   arg 2   Result
(0)    *          c       d       t1
(1)    +          b       t1      t2
(2)    =          t2              a
The contents of the fields arg 1, arg 2, and Result are pointers to the symbol table entries for
the names they represent.
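The quadruple table above can be held in a simple record type. In this sketch the class name Quad and the helper to_text are assumptions, and plain strings stand in for the symbol table pointers.

```python
from typing import NamedTuple, Optional


class Quad(NamedTuple):
    op: str
    arg1: str
    arg2: Optional[str]   # empty for a copy statement
    result: str


# Quadruples for a = b + c * d, in the order shown in the table.
quads = [
    Quad("*", "c", "d", "t1"),
    Quad("+", "b", "t1", "t2"),
    Quad("=", "t2", None, "a"),
]


def to_text(quad):
    """Render one quadruple back as a three-address statement."""
    if quad.op == "=":
        return f"{quad.result} = {quad.arg1}"
    return f"{quad.result} = {quad.arg1} {quad.op} {quad.arg2}"


for index, quad in enumerate(quads):
    print(f"({index}) {to_text(quad)}")
```

Because every result has an explicit name, quadruples can be reordered without rewriting the other entries.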
Triples
This three-address code representation contains three fields: one for the operator
and two for the arguments.
In this representation, temporary variables are not used. Instead of temporary variables,
symbol table.
a=b+c*d
∴ t1 = c ∗ d
t2 = b + t1
a = t2
Triple for this Three Address Code will be −
Triple
       Operator   arg 1   arg 2
(0)    ∗          c       d
(1)    +          b       (0)
(2)    =          a       (1)
Here (0) represents a pointer that refers to the result c * d, which can be used in further
statements, i.e., when c * d is added with b. This result will be saved at the position
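To show how a reference such as (0) is resolved, the sketch below stores the triples as tuples and evaluates them in order. Representing a reference as an integer index, the function name run, and the sample values for b, c, and d are assumptions for illustration.

```python
# Triples for a = b + c * d. An int argument refers to an earlier triple.
triples = [
    ("*", "c", "d"),
    ("+", "b", 0),     # 0 refers to the result of triple (0)
    ("=", "a", 1),     # 1 refers to the result of triple (1)
]


def run(triples, env):
    """Evaluate the triples in order; results[i] holds the value of triple (i)."""
    results = []

    def value(arg):
        return results[arg] if isinstance(arg, int) else env[arg]

    for op, arg1, arg2 in triples:
        if op == "=":
            env[arg1] = value(arg2)
            results.append(env[arg1])
        elif op == "*":
            results.append(value(arg1) * value(arg2))
        elif op == "+":
            results.append(value(arg1) + value(arg2))
        else:
            raise ValueError(f"unknown operator {op!r}")
    return env


env = run(triples, {"b": 2, "c": 3, "d": 4})
print(env["a"])  # 14
```

No temporary names appear anywhere: the position of a triple is its name, which is why moving a triple would require renumbering every reference to it.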
Indirect Triples
The indirect triple representation uses an extra array to list the pointers to the triples in
pointers(11), (12), (13) respectively & then pointers (11), (12), (13) point to triples that is
The front end translates a source program into an intermediate representation from
which the back end generates target code.
2. Three-Address Code –
A statement involving no more than three references (two for operands and one for the
result) is known as a three-address statement.
Example – The three address code for the expression a + b * c + d :
T1 = b * c
T2 = a + T1
T3 = T2 + d
T1, T2, and T3 are temporary variables.
3. Syntax Tree –
The operator and keyword nodes of the parse tree are moved to their parents, and a
chain of single productions is replaced by a single link. In a syntax tree, the internal
nodes are operators and the child nodes are operands. To form a syntax tree, put
parentheses in the expression; this makes it easy to recognize which operand should
come first.
Example –
x = (a + b * c) / (a – b * c)
expressions. The expression can be of type real, integer, array and records.
1. S → id := E
2. E → E1 + E2
3. E → E1 * E2
4. E → (E1)
5. E → id
For this given grammar SDT= Production rule + Semantic action
The translation scheme of above grammar is given below:
S → id := E {
    p = look_up(id.name);
    if p ≠ nil then
        Emit(p = E.place)    // GEN()
    else
        error;
}
E → E1 + E2 {
    E.place = newtemp();
    Emit(E.place = E1.place '+' E2.place)
}
E → E1 * E2 {
    E.place = newtemp();
    Emit(E.place = E1.place '*' E2.place)
}
E → (E1) { E.place = E1.place }
E → id {
    p = look_up(id.name);
    if p ≠ nil then
        E.place = p
    else
        error;
}
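The translation scheme can be sketched as a small recursive-descent translator. Several choices here are assumptions rather than part of the scheme: the class name Translator, the tokenizer, the use of a set of declared names in place of a symbol table, and giving * higher precedence than + to resolve the ambiguity of the grammar. The names newtemp, emit, and look_up mirror the semantic actions.

```python
import re


class Translator:
    """Toy translator for S -> id := E, E -> E + E | E * E | (E) | id."""

    def __init__(self, declared):
        self.declared = set(declared)   # stands in for the symbol table
        self.code = []                  # emitted three-address statements
        self.temps = 0
        self.tokens = []
        self.pos = 0

    def newtemp(self):
        self.temps += 1
        return f"t{self.temps}"

    def emit(self, statement):
        self.code.append(statement)

    def look_up(self, name):
        return name if name in self.declared else None

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        token = self.peek()
        if token is None or (expected is not None and token != expected):
            raise SyntaxError(f"expected {expected or 'a token'}, got {token!r}")
        self.pos += 1
        return token

    def statement(self, source):        # S -> id := E
        self.tokens = re.findall(r":=|[A-Za-z_]\w*|[+*()]", source)
        self.pos = 0
        p = self.look_up(self.take())
        if p is None:
            raise NameError("undeclared assignment target")
        self.take(":=")
        place = self.expr()
        if self.peek() is not None:
            raise SyntaxError("unexpected trailing input")
        self.emit(f"{p} = {place}")
        return self.code

    def expr(self):                     # E -> E1 + E2
        place = self.term()
        while self.peek() == "+":
            self.take("+")
            right = self.term()
            temp = self.newtemp()
            self.emit(f"{temp} = {place} + {right}")
            place = temp
        return place

    def term(self):                     # E -> E1 * E2
        place = self.factor()
        while self.peek() == "*":
            self.take("*")
            right = self.factor()
            temp = self.newtemp()
            self.emit(f"{temp} = {place} * {right}")
            place = temp
        return place

    def factor(self):                   # E -> (E1) | id
        if self.peek() == "(":
            self.take("(")
            place = self.expr()         # E.place = E1.place
            self.take(")")
            return place
        name = self.take()
        p = self.look_up(name)
        if p is None:
            raise NameError(f"undeclared identifier {name!r}")
        return p                        # E.place = p


code = Translator(["a", "b", "c", "d"]).statement("a := b + c * d")
print("\n".join(code))
```

Note that the action for E → id emits no code; it only records where the value of the identifier lives.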
3. Boolean expressions
1. E → E OR E
2. E → E AND E
3. E → NOT E
4. E → (E)
5. E → id relop id
6. E → TRUE
7. E → FALSE
The comparison operators <, <=, =, !=, >, and >= are represented by relop.
We also assume that || and && are left-associative. || has the lowest precedence and
The E → id relop id2 contains the next_state and it gives the index of next three
E → E1 OR E2 {E.place = newtemp();
Emit(E.place = E1.place OR E2.place)
}
Numerical Representation
t1 := not c
t2 := b and t1
t3 := a or t2
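The three statements above correspond to the expression a or b and not c (an inference from the emitted code, since the expression itself is not shown). A minimal sketch of the numerical representation follows; it assumes Python's ast module as the parser, which already applies the usual precedence of not over and over or, and the function name bool_tac is hypothetical.

```python
import ast


def bool_tac(source):
    """Translate a boolean expression into three-address statements."""
    code = []
    count = 0

    def newtemp():
        nonlocal count
        count += 1
        return f"t{count}"

    def gen(node):
        """Emit code for node and return the name that holds its value."""
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.Not):
            operand = gen(node.operand)
            temp = newtemp()
            code.append(f"{temp} := not {operand}")
            return temp
        if isinstance(node, ast.BoolOp):
            word = "or" if isinstance(node.op, ast.Or) else "and"
            place = gen(node.values[0])
            for operand in node.values[1:]:   # left-associative chain
                right = gen(operand)
                temp = newtemp()
                code.append(f"{temp} := {place} {word} {right}")
                place = temp
            return place
        raise NotImplementedError(type(node).__name__)

    gen(ast.parse(source, mode="eval").body)
    return code


print("\n".join(bool_tac("a or b and not c")))
```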
3. Procedures call
compiler.
Calling sequence:
The translation for a call includes a sequence of actions taken on entry and exit
● Save the state of the calling procedure so that it can resume execution after
the call.
● Also save the return address. It is the address of the location to which the
called routine must transfer after it is finished.
● Finally generate a jump to the beginning of the code for the called procedure
Syntax:
param x1
param x2
………..
param xn
call p, n
Here, param refers to a parameter, and call p, n calls procedure p with n
arguments.
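The param/call sequence can be generated from an argument list in a few lines. The function name gen_call is an assumption, and the arguments are assumed to be already evaluated into names.

```python
def gen_call(procedure, arguments):
    """Emit one param statement per argument, then the call itself."""
    code = [f"param {argument}" for argument in arguments]
    code.append(f"call {procedure}, {len(arguments)}")
    return code


print("\n".join(gen_call("p", ["x1", "x2", "x3"])))
```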
GEN (param p)
E.PLACE
4. Switch Statement
multiple cases.
● Once the case match is found, a block of statements associated with that
referred to as an identifier.
● The value provided by the user is compared with all the cases inside the
● If a case match is NOT found, then the default statement is executed, and the
default: Sn .
end .
goto NEXT
Ln : code for Sn
goto NEXT
TEST: if T = V1 goto L1
if T = V2 goto L2
goto Ln
NEXT:
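The layout above, with the case bodies first and the tests collected at TEST, can be produced by a small generator. This is a sketch under assumptions: the function name gen_switch is hypothetical, the selector expression is assumed to be evaluated into T before an initial goto TEST, and case bodies are represented by placeholder names rather than real code.

```python
def gen_switch(selector, cases, default_body):
    """Emit the switch layout: bodies first, then the tests at label TEST.

    cases is a list of (value, body) pairs; default_body is the default case.
    """
    code = [f"T = {selector}", "goto TEST"]
    tests = []
    for number, (value, body) in enumerate(cases, start=1):
        code += [f"L{number}: code for {body}", "goto NEXT"]
        tests.append(f"if T = {value} goto L{number}")
    last = len(cases) + 1                      # label of the default case
    code += [f"L{last}: code for {default_body}", "goto NEXT"]
    tests.append(f"goto L{last}")              # no match: run the default
    tests[0] = "TEST: " + tests[0]
    code += tests
    code.append("NEXT:")
    return code


print("\n".join(gen_switch("E", [("V1", "S1"), ("V2", "S2")], "S3")))
```

Collecting the tests in one place lets a compiler later replace the chain of comparisons with a jump table without touching the case bodies.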