SE Compiler Chapter 4-SDT
4. Semantic Analysis
Semantic analysis is the phase in which the compiler adds semantic information to the parse tree and
builds the symbol table. It also performs semantic checks, such as type checking. The parse tree is the
input to the semantic analyser, which produces an abstract syntax tree (AST) and passes it to the
intermediate code generator. From this AST, the intermediate code generator produces three-address
code. This completes the front end of the compiler.
We have seen how a parser constructs parse trees in the syntax analysis phase. A plain parse tree
built in that phase is of little use to a compiler on its own, because it carries no information about how
to evaluate the tree. The productions of a context-free grammar, which make up the rules of the language,
do not say how those rules are to be interpreted.
For example:
E→E+T
The above CFG production has no semantic rule associated with it, so on its own it says nothing about
what the production means.
Semantics of a language provide meaning to its constructs, like tokens and syntax structure. Semantics help
interpret symbols, their types, and their relations with each other. Semantic analysis judges whether the
syntax structure constructed in the source program derives any meaning or not.
For example:
int a = "value";
should not raise an error in the lexical or syntax analysis phases, as it is lexically and structurally correct,
but it should raise a semantic error, because a string value is assigned to an integer variable. Such rules
are part of the definition of the language and are checked during semantic analysis. The following tasks
are performed in semantic analysis:
· Scope resolution
· Type checking
· Array-bound checking
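The type mismatch in the declaration above can be caught by a check along the following lines. This is only a minimal sketch: the function name check_declaration, the LITERAL_TYPES table, and the way a declaration is represented are assumptions made for illustration, not part of any particular compiler.

```python
# Toy type checker for declarations such as:  int a = "value";
# The mapping from Python literal types to source-language type names is an
# assumption made for illustration only.
LITERAL_TYPES = {int: "int", float: "float", str: "string"}


def check_declaration(declared_type, name, literal):
    """Return None if the initializer matches the declared type, else an error message."""
    actual_type = LITERAL_TYPES[type(literal)]
    if actual_type != declared_type:
        return (f"semantic error: cannot initialize '{name}' of type "
                f"{declared_type} with a value of type {actual_type}")
    return None


print(check_declaration("int", "a", "value"))  # a type mismatch is reported
print(check_declaration("int", "b", 42))       # None: the types agree
```

The check runs after parsing succeeds, which mirrors the point made above: the statement is syntactically valid and only the semantic phase rejects it.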
Semantic Errors
We have mentioned some of the semantic errors that the semantic analyzer is expected to recognize:
PRODUCTION          SEMANTIC RULE
E → E1 + T          E.code = E1.code || T.code || '+'
This production has two non-terminals, E and T; the subscript in E1 distinguishes the occurrence of E in the
production body from the occurrence of E as the head. Both E and T have a string-valued attribute code.
The semantic rule specifies that the string E.code is formed by concatenating E1.code, T.code, and the
character '+'. While the rule makes it explicit that the translation of E is built up from the translations of E1,
T, and '+', it may be inefficient to implement the translation directly by manipulating strings.
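The rule for the string-valued attribute code can be sketched as a recursive function over a small tree. The tuple-based tree encoding, and the choice that a leaf's code is simply its own text, are assumptions of this sketch rather than something the notes specify.

```python
# Synthesized string attribute `code` for the production  E -> E + T.
# A tree node is either a string (a leaf whose code is the string itself,
# an assumption for this sketch) or a tuple ("+", left, right).

def code(node):
    if isinstance(node, str):
        return node
    op, left, right = node
    # E.code = (code of the E child) || T.code || '+'
    return code(left) + code(right) + op


tree = ("+", ("+", "9", "5"), "2")   # tree shape for 9 + 5 + 2
print(code(tree))                    # 95+2+
```

Each call builds a new string from the strings of its children, which illustrates why direct string manipulation can be costly for long inputs.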
Attribute Grammar: An SDD without side effects is sometimes called an attribute grammar. The rules in
an attribute grammar define the value of an attribute purely in terms of the values of other attributes and
constants.
PRODUCTION          SEMANTIC RULE
E → E1 + T          { printf("+") }
By convention, semantic actions are enclosed within curly braces. (If curly braces occur as grammar
symbols, we enclose them within single quotes, as in '{' and '}'.) The position of a semantic action in a
production body determines the order in which the action is executed.
Between the two notations, syntax-directed definitions can be more readable, and hence more useful for
specifications. However, translation schemes can be more efficient, and hence more useful for
implementations.
Attributes:
We associate additional information with every grammar symbol (terminal or non-terminal) of the
context-free grammar (CFG). This additional information is called an attribute. An attribute can be of any
type, for example a string, an integer, a table, a reference, or even a fragment of code. There are two types
of attributes: synthesized and inherited.
Synthesized Attributes:
An attribute of a non-terminal at a parse-tree node is synthesized if its value is computed only from the
attribute values of the node's children and of the node itself. To illustrate, assume the following
production:
S → ABC
If S takes values from its child nodes (A, B, C), the attribute is said to be synthesized, as the values
of A, B, and C are synthesized into S.
Inherited Attributes:
An attribute of a non-terminal at a parse-tree node is inherited if its value is computed only from the
attribute values of the node's parent, its siblings, and the node itself. As in the following production,
S → ABC
A can get values from S, B, and C. B can take values from S, A, and C. Likewise, C can take values from
S, A, and B.
S-attributed SDT
If an SDT uses only synthesized attributes, it is called an S-attributed SDT. In an S-attributed SDT, the
semantic actions are written at the end of the production body (the right-hand side), so each action runs
when the production is reduced.
Attributes in S-attributed SDTs are evaluated during bottom-up parsing, as the values of the
parent nodes depend upon the values of the child nodes.
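Bottom-up evaluation of synthesized attributes is often pictured as a value stack that runs alongside the parse stack. The sketch below assumes the expression grammar E → E + T | T, T → T * F | F, F → id and writes the shift/reduce sequence for 2 + 3 * 4 out by hand; the names STEPS and evaluate are hypothetical.

```python
# Value stack kept alongside a bottom-up (shift-reduce) parse of 2 + 3 * 4.
# The action sequence is written out by hand; a real parser would produce it.
STEPS = [
    ("shift", 2), ("reduce", "F->id"), ("reduce", "T->F"), ("reduce", "E->T"),
    ("shift", None),                       # '+'
    ("shift", 3), ("reduce", "F->id"), ("reduce", "T->F"),
    ("shift", None),                       # '*'
    ("shift", 4), ("reduce", "F->id"),
    ("reduce", "T->T*F"),
    ("reduce", "E->E+T"),
]


def evaluate(steps):
    values = []                            # one entry per symbol on the parse stack
    for action, arg in steps:
        if action == "shift":
            values.append(arg)             # id.lexval, or None for an operator
        elif arg in ("E->E+T", "T->T*F"):
            right = values.pop()
            values.pop()                   # placeholder pushed for the operator
            left = values.pop()
            values.append(left + right if arg == "E->E+T" else left * right)
        # E->T, T->F and F->id only copy a value, so the stack is unchanged
    return values[0]


print(evaluate(STEPS))  # 14
```

Every attribute is computed at the moment its production is reduced, so no parse tree needs to be stored.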
L-attributed SDT
This form of SDT uses both synthesized and inherited attributes, with the restriction that an inherited
attribute may not take values from right siblings.
S → ABC
S can take values from A, B, and C (synthesized). A can take values from S only. B can take values from S
and A. C can get values from S, A, and B. No non-terminal can get values from the sibling to its right.
Attributes in L-attributed SDTs are evaluated in a depth-first, left-to-right traversal of the parse tree.
We may conclude that if a definition is S-attributed, then it is also L-attributed, since the L-attributed
definitions include the S-attributed definitions.
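The restriction on right siblings can be checked mechanically for a single production. The following is a rough sketch that works at the level of whole symbols rather than individual attributes; the function name is_l_attributed and the deps encoding are assumptions, and the body symbols are assumed to be distinct.

```python
# Check the L-attributed restriction for one production (sketch).
# `body` lists the symbols of the production body, left to right.
# `deps` maps a body symbol to the symbols its inherited attributes read.

def is_l_attributed(head, body, deps):
    for position, symbol in enumerate(body):
        allowed = {head} | set(body[:position])   # parent and left siblings only
        if not set(deps.get(symbol, ())) <= allowed:
            return False
    return True


body = ["A", "B", "C"]
ok = {"A": ["S"], "B": ["S", "A"], "C": ["S", "A", "B"]}
bad = {"A": ["S", "B"]}                  # A reads from its right sibling B

print(is_l_attributed("S", body, ok))    # True
print(is_l_attributed("S", body, bad))   # False
```

An S-attributed production has no inherited attributes at all, so its deps mapping is empty and the check passes trivially, which matches the conclusion above.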
Reduction: the replacement of the right-hand side of a production (a string of terminals and
non-terminals) by its corresponding non-terminal, according to the grammar rules.
Syntax trees are parsed top-down and left to right. Whenever a reduction occurs, we apply its
corresponding semantic rules (actions).
Semantic analysis uses syntax-directed translation to perform the above tasks.
The semantic analyzer receives the AST (abstract syntax tree) from the previous stage (syntax analysis)
and attaches attribute information to it; the result is called an attributed AST.
A parse tree helps us visualize the translation specified by an SDD. The rules of an SDD are applied by
first constructing a parse tree and then using the rules to evaluate all of the attributes at each of the nodes
of the parse tree. A parse tree showing the value(s) of its attribute(s) is called an annotated parse tree.
"Dependency graphs" are a useful tool for determining an evaluation order for the attribute instances in a
given parse tree. While an annotated parse tree shows the values of attributes, a dependency graph helps us
determine how those values can be computed.
Dependency Graphs
A dependency graph depicts the flow of information among the attribute instances in a particular parse tree;
an edge from one attribute instance to another means that the value of the first is needed to compute the
second. Edges express constraints implied by the semantic rules. In more detail:
• For each parse-tree node, say a node labeled by grammar symbol X, the dependency graph has a
node for each attribute associated with X.
• Suppose that a semantic rule associated with a production p defines the value of synthesized
attribute A.b in terms of the value of X.c (the rule may define A.b in terms of other attributes in
addition to X.c). Then, the dependency graph has an edge from X.c to A.b. More precisely, at every
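node N labeled A where production p is applied, create an edge to attribute b at N, from the
attribute c at the child of N corresponding to this instance of the symbol X in the body of the production.
For example, consider a production such as E → E1 + T with the semantic rule E.val = E1.val + T.val.
At every node N labeled E, with children corresponding to the body of this production, the synthesized
attribute val at N is computed using the values of val at the two children, labeled E and T. Thus, a portion
of the dependency graph for every parse tree in which this production is used looks like the figure below.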
As a convention, we shall show the parse tree edges as dotted lines, while the edges of the dependency
graph are solid.
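An evaluation order can be read off a dependency graph by topological sorting. The sketch below uses Python's standard graphlib module and assumes the rule E.val = E1.val + T.val for a production E → E1 + T; the node labels are names chosen for this illustration.

```python
from graphlib import TopologicalSorter

# Dependency graph for one use of  E -> E1 + T  with  E.val = E1.val + T.val.
# Keys are attribute instances; each value is the set of instances it needs.
# Node names such as "E1.val" are labels chosen for this sketch.
needs = {
    "E.val": {"E1.val", "T.val"},
    "E1.val": set(),
    "T.val": set(),
}

order = list(TopologicalSorter(needs).static_order())
print(order)   # "E.val" is listed after both of its predecessors
```

If the graph contained a cycle, static_order would raise graphlib.CycleError, which corresponds to a set of semantic rules that has no valid evaluation order.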
Example of expression evaluation: Consider the following S-attributed grammar (SDT) and draw the
annotated parse tree for the input 2 + 3 * 4.
Solution:
Step 1: draw the parse tree top-down.
Step 2: draw the dependency graph.
Step 3: compute the values.
Annotated parse tree (each level of indentation is one level of the tree):
E.val = 14
    E.val = 2
        T.val = 2
            F.val = 2
                id.lexval = 2
    +
    T.val = 12
        T.val = 3
            F.val = 3
                id.lexval = 3
        *
        F.val = 4
            id.lexval = 4
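The values in this example can be reproduced with a short recursive evaluator. Since the grammar itself is not reproduced here, the sketch assumes the usual expression grammar E → E + T | T, T → T * F | F, F → id with the attribute val; the tuple-based tree encoding and the helper leaf are hypothetical.

```python
# Compute the synthesized attribute val bottom-up on the parse tree of 2 + 3 * 4.
# Assumed grammar: E -> E + T | T,  T -> T * F | F,  F -> id.
# A node is (label, children); an id leaf is ("id", lexval).

def val(node):
    label, rest = node
    if label == "id":
        return rest                                  # id.lexval
    values = [val(child) for child in rest if isinstance(child, tuple)]
    if len(values) == 1:
        return values[0]                             # E -> T, T -> F, F -> id
    if "+" in rest:
        return values[0] + values[1]                 # E -> E + T
    return values[0] * values[1]                     # T -> T * F


def leaf(n):
    return ("F", [("id", n)])


tree = ("E", [("E", [("T", [leaf(2)])]),
              "+",
              ("T", [("T", [leaf(3)]), "*", leaf(4)])])

print(val(tree))  # 14
```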
Ins: Fkrezgy Yohannes Compiler Design
Example of infix to postfix conversion: Consider the following S-attributed grammar (SDT) and draw the
annotated parse tree and the output postfix for the input 2 + 3 * 4.
E → E1 + T    { printf("+") }
E → T         { }
T → T1 * F    { printf("*") }
T → F         { }
F → ( E )     { }
F → id        { printf(id.lexval) }
Solution:
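Annotated parse tree (each level of indentation is one level of the tree; the action executed at a node is
shown next to it):
E    printf("+")
    E
        T
            F    printf(id.lexval)
                id (2)
    +
    T    printf("*")
        T
            F    printf(id.lexval)
                id (3)
        *
        F    printf(id.lexval)
            id (4)
Postfix: 234*+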
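The order in which the print actions fire can be simulated by a depth-first, left-to-right walk of the parse tree. This is a sketch only: the tree encoding and the names emit and leaf are assumptions, and the output is collected in a list instead of being printed directly.

```python
# Execute the print actions of the postfix scheme during a depth-first,
# left-to-right walk of the parse tree for 2 + 3 * 4.
# A node is (label, children); an id leaf is ("id", lexval).

def emit(node, out):
    label, rest = node
    if label == "id":
        out.append(str(rest))          # F -> id  { printf(id.lexval) }
        return
    operator = None
    for child in rest:
        if isinstance(child, tuple):
            emit(child, out)           # visit sub-trees left to right
        else:
            operator = child           # '+' or '*' terminal in the body
    if operator is not None:
        out.append(operator)           # action sits at the end of the body


def leaf(n):
    return ("F", [("id", n)])


tree = ("E", [("E", [("T", [leaf(2)])]),
              "+",
              ("T", [("T", [leaf(3)]), "*", leaf(4)])])

output = []
emit(tree, output)
print("".join(output))  # 234*+
```

Because every action is placed at the end of its production body, an operator is emitted only after both of its operands, which is exactly the postfix order.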
Example of generating an abstract syntax tree: Consider the following S-attributed grammar (SDT) and
draw the annotated parse tree and the abstract syntax tree for the input 2 + 3 * 4.
Solution:
Annotated parse tree (each level of indentation is one level of the tree):
E.nptr = 500
    E.nptr = 100
        T.nptr = 100
            F.nptr = 100
                id.lexval = 2
    +
    T.nptr = 400
        T.nptr = 200
            F.nptr = 200
                id.lexval = 3
        *
        F.nptr = 300
            id.lexval = 4
Abstract syntax tree
Here 100, 200, 300, 400 and 500 are the addresses of the nodes, and \0 marks a null pointer.
Address    Left    Value    Right
500        100     +        400
400        200     *        300
100        \0      2        \0
200        \0      3        \0
300        \0      4        \0
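The node records of this example can be built in the same order in which a bottom-up parse would create them. In the sketch below, mknode, mkleaf, and the dictionary that stands in for memory are hypothetical helpers; the addresses 100 to 500 are chosen to follow the example, with None playing the role of \0.

```python
# Build the abstract syntax tree for 2 + 3 * 4 with explicit node records.
# mkleaf/mknode and the dictionary-based "memory" are assumptions of this
# sketch; addresses are handed out in steps of 100, starting at 100.

memory = {}
next_address = 100


def mknode(left, value, right):
    """Store a (left, value, right) record and return its address."""
    global next_address
    address = next_address
    memory[address] = (left, value, right)
    next_address += 100
    return address


def mkleaf(value):
    return mknode(None, value, None)   # None plays the role of \0


two = mkleaf(2)                        # address 100
three = mkleaf(3)                      # address 200
four = mkleaf(4)                       # address 300
product = mknode(three, "*", four)     # address 400
root = mknode(two, "+", product)       # address 500

print(root, memory[root])              # 500 (100, '+', 400)
```

Only operators and operands get nodes; the chain productions such as E → T simply pass the pointer upward, which is why the abstract syntax tree is much smaller than the parse tree.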
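Example of a declaration: Consider the following L-attributed grammar (SDT) for a C++
declaration and draw the annotated parse tree for the input float id1, id2, id3.
Solution:
[Annotated parse tree: the type float is passed down the list L as the inherited attribute
L.type = float, so that each identifier receives it, e.g. id2.type = float.]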
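The way the declared type reaches each identifier can be sketched as follows. The grammar is not reproduced here, so a grammar of the form D → T L, L → L , id | id is assumed, along with a dictionary used as the symbol table; the function name declare is hypothetical.

```python
# Propagate the declared type to every identifier in  float id1, id2, id3.
# The grammar D -> T L, L -> L , id | id and the dictionary symbol table are
# assumptions; only the annotated result is given in the example.

def declare(type_name, identifiers):
    symbol_table = {}
    inherited = type_name                  # L.type receives the type from T
    for name in identifiers:               # each L node passes the type along
        symbol_table[name] = inherited     # id.type = L.type
    return symbol_table


print(declare("float", ["id1", "id2", "id3"]))
```

The type flows from the parent and from the left, never from a right sibling, so the definition satisfies the L-attributed restriction.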