CD Unit 3

1. Syntax-directed translation associates attributes with grammar symbols and rules to perform semantic analysis during parsing. Attributes can represent values, types, references, etc. 2. There are two kinds of attributes - synthesized attributes depend on child nodes, inherited attributes depend on parent/sibling nodes. 3. Syntax-directed definitions and translation schemes associate semantic rules with productions to compute attribute values during parsing.

Uploaded by

TAYYAB ANSARI
Prof Anil Kumar 2023

Syntax-Directed Translation

• A syntax-directed definition (SDD) is a context-free grammar (CFG) together with attributes and rules.
• Attributes are associated with grammar symbols, and rules with productions.
• Attributes may be of many kinds: numbers, types, table references, strings, etc.
• Synthesized attribute: at node N, defined only in terms of attribute values at the children of N and at N itself.
• Inherited attribute: at node N, defined only in terms of attribute values at N's parent, N itself, and N's siblings.
• Each grammar symbol has two kinds of associated attributes:
• Synthesized: available from the children (RHS) of a grammar rule;
• e.g., for E -> E1 + T, we have E.val = E1.val + T.val
• Inherited: available from the siblings or parent of RHS symbols;
• e.g., for an ML declaration id : type, id gets its type from type.
• Attributes of terminals are supplied by the lexical analyzer.

Approaches to implement Syntax-Directed Translation


1. Basic idea
– Translation is guided by the context-free grammar (translating while parsing).
– Attributes are attached to the grammar symbols that represent program constructs.
– Attribute values are computed by "semantic rules" associated with the grammar productions.
2. Two notations for associating semantic rules with productions
– Syntax-directed definitions
• High-level specifications for translations
• Hide implementation details
• No need to specify explicitly the order in which translation takes place
E -> E1 + E2 { E.val = E1.val + E2.val }
– Translation schemes
• Indicate the order in which semantic rules are to be evaluated
• Allow some implementation details to be shown
S -> { B.ps = 10 } B { S.ht = B.ht }
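The evaluation order that a translation scheme makes explicit can be sketched with a recursive-descent procedure. The Python snippet below is only an illustration of S -> { B.ps = 10 } B { S.ht = B.ht }: the body of parse_B is a hypothetical stand-in (the notes give no production for B), and the log list exists purely to show the order of the actions.

```python
def parse_B(ps, log):
    """Hypothetical stand-in for nonterminal B.

    Receives the inherited attribute B.ps and returns the synthesized
    attribute B.ht. The rule "height equals point size" is an assumption.
    """
    log.append(f"B sees ps={ps}")
    return ps


def parse_S(log):
    # Action placed BEFORE B in the scheme: set the inherited attribute.
    b_ps = 10
    log.append("action 1: B.ps = 10")
    # Process B; its synthesized attribute comes back as the return value.
    b_ht = parse_B(b_ps, log)
    # Action placed AFTER B in the scheme: compute S's synthesized attribute.
    s_ht = b_ht
    log.append(f"action 2: S.ht = {s_ht}")
    return s_ht


log = []
s_ht = parse_S(log)
```

Passing inherited attributes as parameters and synthesized attributes as return values is a common convention in hand-written top-down translators, not something the notes prescribe.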


Note: Special cases of syntax-directed definitions can be implemented in a single pass by evaluating semantic rules during parsing, without explicitly constructing a parse tree or a graph showing the dependencies between attributes.
Role of SDT
• To associate actions with productions
• To associate attributes with non-terminals
• To create implicit or explicit syntax tree
• To perform semantic analysis

Syntax-Directed Definitions (SDD)


1. Definitions
1) Syntax-directed definition
• A generalization of a context-free grammar in which each grammar symbol has an associated set of attributes
2) Attribute
• Can represent anything we choose: a string, a number, a type, a memory location, etc.
Note: The value of an attribute at a parse-tree node is defined by a semantic rule associated with the production used at that node.
• Synthesized attributes
• A synthesized attribute at node N is defined only in terms of attribute values at the children of N and at N itself.
• Inherited attributes
• An inherited attribute at node N is defined only in terms of attribute values at N's parent, N itself, and N's siblings.


Evaluating the expression 3 * 5 + 4 n (n marks the end of the input line)

Production          Semantic rule
L -> E n            L.val = E.val
E -> E1 + T         E.val = E1.val + T.val
E -> T              E.val = T.val
T -> T1 * F         T.val = T1.val * F.val
T -> F              T.val = F.val
F -> ( E )          F.val = E.val
F -> digit          F.val = digit.lexval

A parse tree showing the values of its attributes is called an annotated parse tree.
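To make the idea of an annotated parse tree concrete, here is a minimal Python sketch that builds the parse tree for 3 * 5 + 4 n by hand and fills in the val attribute with a post-order traversal. The Node class, the RULES table, and the helper digit_f are illustrative names, not part of the notes; the semantic rules themselves are the ones in the table above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    symbol: str                      # grammar symbol, e.g. "E", "T", "digit", "+"
    children: List["Node"] = field(default_factory=list)
    lexval: Optional[int] = None     # supplied by the lexer for digit leaves
    val: Optional[int] = None        # synthesized attribute, filled in by annotate()


# One semantic rule per production, keyed by (head, body symbols).
RULES = {
    ("L", ("E", "n")): lambda c: c[0].val,
    ("E", ("E", "+", "T")): lambda c: c[0].val + c[2].val,
    ("E", ("T",)): lambda c: c[0].val,
    ("T", ("T", "*", "F")): lambda c: c[0].val * c[2].val,
    ("T", ("F",)): lambda c: c[0].val,
    ("F", ("(", "E", ")")): lambda c: c[1].val,
    ("F", ("digit",)): lambda c: c[0].lexval,
}


def annotate(node):
    """Post-order traversal: children are annotated before their parent."""
    for child in node.children:
        annotate(child)
    if node.children:  # interior node: apply the rule of its production
        key = (node.symbol, tuple(c.symbol for c in node.children))
        node.val = RULES[key](node.children)
    return node


def digit_f(value):
    """Build the subtree F -> digit for a single digit token."""
    return Node("F", [Node("digit", lexval=value)])


# Parse tree for 3 * 5 + 4 n, built by hand.
t = Node("T", [Node("T", [digit_f(3)]), Node("*"), digit_f(5)])
e = Node("E", [Node("E", [t]), Node("+"), Node("T", [digit_f(4)])])
root = annotate(Node("L", [e, Node("n")]))
```

Because every attribute here is synthesized, a single bottom-up (post-order) pass is enough; root.val ends up holding the value of the whole expression.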


Construction of Syntax Trees


• Parse trees may be too concrete: e.g., the keywords if, then, else are not needed once we know the construct is a conditional.
• A syntax tree (a.k.a. abstract syntax tree) encodes only the essential information.

Application: syntax tree construction


• One useful application of SDDs is the construction of syntax trees.
• Syntax trees are useful for representing programming language constructs like expressions and statements.


• They help compiler design by decoupling parsing from translation.


Syntax Tree
• Leaf nodes for operators and keywords are removed.
• Internal nodes corresponding to uninformative non-terminals are replaced by the more meaningful operators.

SDD for syntax tree construction


• We need some functions to help us build the syntax tree:
• mknode(op,left,right) constructs an operator node with label op, and two children, left and right
• mkleaf(id,entry) constructs a leaf node with label id and a pointer to a symbol table entry
• mkleaf(num,val) constructs a leaf node with label num and the token’s numeric value val
• Use these functions to build a syntax tree for a-4+c:
• P1 := mkleaf( id, st_entry_for_a )
• P2 := …


SDD for syntax tree construction

Production          Semantic rules
E -> E1 + T         E.nptr := mknode('+', E1.nptr, T.nptr)
E -> E1 - T         E.nptr := mknode('-', E1.nptr, T.nptr)
E -> T              E.nptr := T.nptr
T -> ( E )          T.nptr := E.nptr
T -> id             T.nptr := mkleaf(id, id.entry)
T -> num            T.nptr := mkleaf(num, num.val)

Note that this is an S-attributed definition.
Try to derive the annotated parse tree for a-4+c.


nptr is a synthesized attribute: a pointer to a node in the syntax tree.

Operations:
• mknode(op, left, right)
• mkleaf(id, entry)
• mkleaf(num, val)
For example: a - 4 + c

P1 := mkleaf(id, entry-a)
P2 := mkleaf(num, 4)
P3 := mknode('-', P1, P2)
P4 := mkleaf(id, entry-c)
P5 := mknode('+', P3, P4)
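The five construction steps for a - 4 + c can be tried out directly. The following Python sketch is one possible rendering of mknode and mkleaf; the Node class and the to_infix helper are assumptions added for illustration, and a symbol-table entry is represented simply by the variable's name.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Node:
    label: str                        # operator, or "id" / "num" for leaves
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    value: Any = None                 # symbol-table entry (id) or numeric value (num)


def mknode(op, left, right):
    """Build an operator node with two children."""
    return Node(op, left, right)


def mkleaf(label, value):
    """Build a leaf for an identifier (value = entry) or a number (value = val)."""
    return Node(label, value=value)


def to_infix(node):
    """Print the tree back as a fully parenthesized expression (illustrative helper)."""
    if node.left is None and node.right is None:
        return str(node.value)
    return f"({to_infix(node.left)} {node.label} {to_infix(node.right)})"


# Assumption: the symbol-table entry of a variable is just its name.
p1 = mkleaf("id", "a")
p2 = mkleaf("num", 4)
p3 = mknode("-", p1, p2)
p4 = mkleaf("id", "c")
p5 = mknode("+", p3, p4)
```

Calling to_infix(p5) shows the left-associative grouping the grammar imposes: the subtraction node sits below the addition node.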

Bottom-up evaluation of S-attributed definitions:


• An SDD that uses synthesized attributes exclusively is called an S-attributed definition.
• A parse tree for an S-attributed definition can be annotated with a single bottom-up traversal.
• How can we build a translator for a given SDD?
• For S-attributed definitions, it is straightforward.
• A bottom-up shift-reduce parser can evaluate the (synthesized) attributes as the input is parsed.
• We store the computed attributes with the grammar symbols and states on the stack.
• When a reduction is made, we calculate the values of any synthesized attributes using the already-computed attributes from the stack.
Example:
Production          Semantic values (code)
L -> E $            Print(Val[Top])
E -> E + T          Val[nTop] := Val[Top - 2] + Val[Top]
E -> T
T -> T * F          Val[nTop] := Val[Top - 2] * Val[Top]
T -> F
F -> ( E )          Val[nTop] := Val[Top - 1]
F -> digit
Here Top is the top-of-stack index before the reduction and nTop is the index after it (nTop = Top - r + 1 for a right-hand side of length r).

Input       Stack     Attribute    Production used
3*5+4$      -         -
*5+4$       3         3
*5+4$       F         3            F -> digit
*5+4$       T         3            T -> F
5+4$        T*        3
+4$         T*5       3 * 5
+4$         T*F       3 * 5        F -> digit
+4$         T         15           T -> T * F
+4$         E         15           E -> T
4$          E+        15
$           E+4       15 + 4
$           E+F       15 + 4       F -> digit
$           E+T       15 + 4       T -> F
$           E         19           E -> E + T
            E         19
            L         19           L -> E $
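The trace can be replayed in code to see how the value stack moves in step with the parse stack. The Python sketch below is not a parser: it takes the shift/reduce decisions from the trace as given (a real LR parser would get them from its parsing table) and only maintains the Val stack. The names run_trace and ACTIONS are illustrative, and the end marker $ is treated as lookahead that is never pushed.

```python
def run_trace(tokens, actions):
    """Replay shift/reduce actions while keeping a value stack in parallel."""
    symbols, val = [], []
    pos = 0
    for action in actions:
        if action == "shift":
            tok = tokens[pos]
            pos += 1
            symbols.append(tok)
            # Digits carry their lexical value; operators carry no attribute.
            val.append(int(tok) if tok.isdigit() else None)
        else:
            head, body_len, rule = action
            top = len(val) - 1
            ntop = top - body_len + 1          # where the result will be stored
            result = rule(val, top)            # computed from the old stack
            del symbols[ntop:]
            del val[ntop:]
            symbols.append(head)
            val.append(result)                 # Val[nTop] := result
    return symbols, val


copy_top = lambda v, top: v[top]               # unit productions keep the value

ACTIONS = [
    "shift", ("F", 1, copy_top), ("T", 1, copy_top),          # 3
    "shift", "shift", ("F", 1, copy_top),                     # * 5
    ("T", 3, lambda v, top: v[top - 2] * v[top]),             # T -> T * F
    ("E", 1, copy_top),                                       # E -> T
    "shift", "shift", ("F", 1, copy_top), ("T", 1, copy_top), # + 4
    ("E", 3, lambda v, top: v[top - 2] + v[top]),             # E -> E + T
    ("L", 1, copy_top),                                       # L -> E $ ($ is lookahead)
]

symbols, val = run_trace(list("3*5+4"), ACTIONS)
```

After the last reduction the parse stack holds only L and the value stack holds the value of the expression, mirroring the final row of the trace.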

S-attributed grammars are a class of attribute grammars characterized by having no inherited attributes: a grammar is called S-attributed if all of its attributes are synthesized.
Inherited attributes, which must be passed down from parent nodes to child nodes of the syntax tree during semantic analysis, are a problem for bottom-up parsing, because in bottom-up parsing a parent node is created only after all of its children have been created.
Attribute evaluation in S-attributed grammars can be incorporated conveniently in both top-down and bottom-up parsing. YACC is based on the S-attributed approach.
Any S-attributed grammar is also an L-attributed grammar.

What is intermediate code generation in compiler design?

The intermediate code generator receives input from its predecessor phase, the semantic analyzer, in the form of an annotated syntax tree. That syntax tree can then be converted into a linear representation, e.g., postfix notation.
Intermediate code tends to be machine-independent code.

Source code can be translated directly into target machine code, so why translate it into an intermediate code first? The reasons are as follows.


Compilers usually generate intermediate code.


• Ease of retargeting to different machines.
• Ability to perform machine-independent code optimization.
Intermediate languages:
• Postfix language: a stack-based, machine-like language.
• Syntax tree: a graphical representation.
• Three-address code: statements containing at most 3 addresses or operands.

Stack Based (one address) – compact

PUSH 2
PUSH Y
MULTIPLY
PUSH X
SUBTRACT
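A stack program like this one can be executed by a very small interpreter. The Python sketch below is an illustration only: the values chosen for X and Y are made up, and it assumes that the subtract instruction computes (second-from-top) minus (top), so the program evaluates 2 * Y - X.

```python
def run(program, env):
    """Execute a list of one-address stack instructions and return the result."""
    stack = []
    for instr in program:
        op, *args = instr.split()
        if op == "PUSH":
            arg = args[0]
            # A name is looked up in env; anything else is an integer literal.
            stack.append(env[arg] if arg in env else int(arg))
        elif op in ("MULTIPLY", "SUBTRACT"):
            right = stack.pop()   # top of stack
            left = stack.pop()    # second from top
            stack.append(left * right if op == "MULTIPLY" else left - right)
        else:
            raise ValueError(f"unknown instruction: {instr}")
    return stack.pop()


program = ["PUSH 2", "PUSH Y", "MULTIPLY", "PUSH X", "SUBTRACT"]
result = run(program, {"X": 3, "Y": 5})   # hypothetical values for X and Y
```

No instruction names an operand register or a result location, which is what makes this form so compact.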

Syntax tree: an abstract syntax tree retains the essential structure of the parse tree and eliminates unneeded nodes.

Directed Acyclic Graph (DAG)


A compact form of the abstract syntax tree that avoids duplicating common subexpressions, giving a smaller footprint as well.
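One common way to obtain a DAG instead of a tree is to look up every node before creating it and reuse an existing node with the same operator and children. The Python sketch below illustrates that idea; the DagBuilder class and the example expression (b + c) * (b + c) are assumptions chosen for illustration, not taken from the notes.

```python
class DagBuilder:
    """Creates each distinct node once and hands out integer node ids."""

    def __init__(self):
        # Maps ("leaf", name) or (operator, left_id, right_id) to a node id.
        self.nodes = {}

    def leaf(self, name):
        return self._intern(("leaf", name))

    def op(self, operator, left, right):
        return self._intern((operator, left, right))

    def _intern(self, key):
        # Reuse the node if an identical one already exists.
        if key not in self.nodes:
            self.nodes[key] = len(self.nodes)
        return self.nodes[key]


dag = DagBuilder()


def b_plus_c():
    return dag.op("+", dag.leaf("b"), dag.leaf("c"))


# (b + c) * (b + c): the second b + c reuses the node built for the first.
root = dag.op("*", b_plus_c(), b_plus_c())
```

A plain syntax tree for this expression has seven nodes; the DAG needs only four, because b, c, and b + c are each stored once.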


Three-Address Code
Intermediate code tends to be machine independent. The code generator therefore assumes that an unlimited number of storage locations (registers) is available when generating code.
For example:

a = b + c * d;

The intermediate code generator divides this expression into sub-expressions and then generates the corresponding code.

r1 = c * d;
r2 = b + r1;
a = r2
Here r1 and r2 are used as registers in the target program.
A three-address code statement uses at most three address locations to calculate an expression. Three-address code can be represented in two forms: quadruples and triples.

Quadruples
Each instruction in the quadruple representation is divided into four fields: operator, arg1, arg2, and result. The above example is represented below in quadruple format:
Op    arg1    arg2    result
*     c       d       r1
+     b       r1      r2
=     r2              a
Triples
Each instruction in the triple representation has three fields: op, arg1, and arg2. The result of each sub-expression is denoted by the position of the instruction that computes it.
Triples are similar to DAGs and syntax trees; they are equivalent to a DAG when representing expressions.

      Op    arg1    arg2
(0)   *     c       d
(1)   +     b       (0)
(2)   =     a       (1)
Triples face the problem of code immovability during optimization: because results are positional, changing the order or position of an instruction may cause problems.
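Generating quadruples from an expression is a short post-order walk over its syntax tree. The sketch below is a minimal illustration under two assumptions: it borrows Python's standard ast module to parse the assignment (so the input must be valid Python syntax), and it names temporaries r1, r2, and so on. The function name gen_quads is hypothetical.

```python
import ast

OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}


def gen_quads(source):
    """Translate one assignment statement into (op, arg1, arg2, result) quadruples."""
    quads = []
    counter = 0

    def visit(node):
        nonlocal counter
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return str(node.value)
        if isinstance(node, ast.BinOp):
            # Operands first (post-order), then the operator itself.
            left = visit(node.left)
            right = visit(node.right)
            counter += 1
            temp = f"r{counter}"
            quads.append((OPS[type(node.op)], left, right, temp))
            return temp
        raise ValueError("unsupported expression")

    stmt = ast.parse(source).body[0]
    if not isinstance(stmt, ast.Assign):
        raise ValueError("expected an assignment")
    value = visit(stmt.value)
    quads.append(("=", value, None, stmt.targets[0].id))
    return quads


quads = gen_quads("a = b + c * d")
```

For a = b + c * d this produces one quadruple for the multiplication, one for the addition, and a final copy into a. Replacing each temporary name by the index of the quadruple that defines it would give the triple form.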

Declarations
A variable or procedure has to be declared before it can be used. Declaration involves allocating space in memory and entering the type and name in the symbol table. A program may be coded and designed with the target machine structure in mind, but it may not always be possible to convert source code accurately to its target language.
Taking the whole program as a collection of procedures and sub-procedures, it becomes possible to declare all the names local to a procedure. Memory is allocated consecutively, and names are assigned memory in the sequence in which they are declared in the program.

We use an offset variable, initialized to zero {offset = 0}, that denotes the base address.


The source programming language and the target machine architecture may differ in the way names are stored, so relative addressing is used. The first name is allocated memory starting at location 0 {offset = 0}; each name declared later is allocated memory immediately after the previous one.
Example:
We take the example of the C programming language, assuming an integer variable is assigned 2 bytes of memory and a float variable is assigned 4 bytes of memory.
int a;
float b;
Allocation process:
{offset = 0}
int a;
id.type = int
id.width = 2
offset = offset + id.width
{offset = 2}
float b;
id.type = float
id.width = 4
offset = offset + id.width
{offset = 6}
To enter these details in the symbol table, a procedure enter can be used. It may have the following structure:
enter(name, type, offset)
This procedure should create an entry in the symbol table for the variable name, with its type set to type and its relative address in the data area set to offset.
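The allocation process for int a; float b; can be sketched in a few lines of Python. The SymbolTable class and its declare method are illustrative names, and the widths (2 bytes for int, 4 for float) are the ones assumed in the example above; only enter(name, type, offset) comes from the notes.

```python
WIDTHS = {"int": 2, "float": 4}   # widths assumed in the example


class SymbolTable:
    def __init__(self):
        self.entries = {}
        self.offset = 0           # next free relative address

    def enter(self, name, type_, offset):
        """Record a name with its type and relative address."""
        self.entries[name] = {"type": type_, "offset": offset}

    def declare(self, type_, name):
        """Process one declaration: enter the name, then advance the offset."""
        self.enter(name, type_, self.offset)
        self.offset += WIDTHS[type_]


table = SymbolTable()
table.declare("int", "a")     # a placed at offset 0, offset becomes 2
table.declare("float", "b")   # b placed at offset 2, offset becomes 6
```

Each name receives the offset in effect when it is declared, and the offset then grows by the width of the declared type.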
Semantic analysis
• Semantic analysis is the task of ensuring that the declarations and statements of a program are semantically correct, i.e., that their meaning is clear and consistent with the way in which control structures and data types are supposed to be used.
• Semantic analysis is the process of drawing meaning from text.


• It allows computers to understand and interpret sentences, paragraphs, or whole documents, by analyzing
their grammatical structure, and identifying relationships between individual words in a particular context.
• Semantic errors that the semantic analyzer is expected to recognize:
• Type mismatch
• Undeclared variable
• Reserved identifier misuse.
• Multiple declaration of variable in a scope.
• Accessing an out of scope variable.
• Actual and formal parameter mismatch.
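Several of the errors listed above are detected with nothing more than a stack of scopes. The Python sketch below is a simplified illustration covering multiple declaration, undeclared variables, and out-of-scope access; the ScopeChecker class and its error messages are hypothetical, and an out-of-scope use is reported the same way as an undeclared one.

```python
class ScopeChecker:
    def __init__(self):
        self.scopes = [{}]        # innermost scope is the last element
        self.errors = []

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, name, type_):
        # Only the current scope matters: shadowing an outer name is allowed.
        if name in self.scopes[-1]:
            self.errors.append(f"multiple declaration of '{name}'")
        else:
            self.scopes[-1][name] = type_

    def use(self, name):
        # Search from the innermost scope outwards.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        self.errors.append(f"undeclared variable '{name}'")
        return None


checker = ScopeChecker()
checker.declare("x", "int")
checker.declare("x", "float")   # second declaration in the same scope
checker.enter_scope()
checker.declare("y", "int")
checker.use("x")                # fine: found in the enclosing scope
checker.exit_scope()
checker.use("y")                # y no longer exists here
```

After this run, checker.errors contains one message for the repeated declaration of x and one for the use of y outside its scope.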

Type systems in compiler design


• Type checking is the process of verifying and enforcing type constraints on values.
• A compiler must check that the source program follows the syntactic and semantic conventions of the source language, including its type rules.
• It allows the programmer to limit what types may be used in certain circumstances and assigns types to values.
• The type checker determines whether these values are used appropriately or not.
• Static type checking is type checking performed at compile time.
• It checks the types of variables at compile time, which means the type of each variable is known at compile time.
• The Benefits of Static Type Checking:
• Runtime Error Protection.
• It catches syntactic errors like spurious words or extra punctuation.
• It catches wrong names like Math and Predefined Naming.
• Detects incorrect argument types.
• It catches the wrong number of arguments.
• Dynamic Type Checking is defined as the type checking being done at run time.
• In Dynamic Type Checking, types are associated with values, not variables.
What is type checking and type conversion?
• Type conversions occur when a program converts one data type to another.
• The type checker ensures that the conversion is valid and compatible with the program's type rules.
• Type inconsistencies occur when a program contains inconsistent types or typing conventions.
• Type checking an expression or statement determines whether it is type-correct, and if type-correct also
determines the type of its result.
• Type checking a declaration simply checks for type correctness.
• An expression is a combination of one or more operands, zero or more operators, and zero or more pairs of
parentheses.
• Expressions: An arithmetic expression evaluates to a single arithmetic value.
• A character expression evaluates to a single value of type character.
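Type checking an expression can be written as a recursive function that returns the type of the result or reports an error. The Python sketch below is a minimal illustration: expressions are nested tuples (op, left, right), variables are strings looked up in env, and the rule that int is implicitly converted to float in mixed arithmetic is an assumption, not a rule stated in the notes.

```python
def type_of(expr, env):
    """Return 'int' or 'float' for a type-correct expression, else raise TypeError."""
    if isinstance(expr, int):
        return "int"
    if isinstance(expr, float):
        return "float"
    if isinstance(expr, str):                 # a variable name
        if expr not in env:
            raise TypeError(f"undeclared variable '{expr}'")
        return env[expr]
    op, left, right = expr                    # a binary arithmetic expression
    lt, rt = type_of(left, env), type_of(right, env)
    if lt not in ("int", "float") or rt not in ("int", "float"):
        raise TypeError(f"operator '{op}' needs arithmetic operands, got {lt} and {rt}")
    # Assumed conversion rule: int is widened to float when the types are mixed.
    return "float" if "float" in (lt, rt) else "int"


env = {"a": "int", "b": "float", "s": "char"}   # hypothetical declarations
result_type = type_of(("+", "a", ("*", "b", 2)), env)
```

Here a + b * 2 is type-correct and has type float, while an expression such as a + s is rejected because s is not an arithmetic operand.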
