0% found this document useful (0 votes)
10 views

02 Simple Sysntax Directed Translation

Uploaded by

iasifiqbal0285
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
10 views

02 Simple Sysntax Directed Translation

Uploaded by

iasifiqbal0285
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 57

A Simple

Syntax Directed
Translation
Chapter 2

Faryal Shamsi
Lecturer Computer Science

Sukkur IBA University


Recap. Lecture
1

Structure of
Compiler

Symbol Table
Analysis
• The analysis phase of a compiler breaks up a source program into
constituent pieces and produces an internal representation for it,
called intermediate code.

• Lexical Analysis,
• Syntax and Semantics Analysis,
• Intermediate Code Generation.
A Syntax Directed Translation
Tools and Techniques
Context Free Grammar
• A set of terminal symbols, may be referred to as “tokens”.
• A set of non-terminals, may be called “syntactic variables”.
• A designation of one of the non-terminals as the start symbol.
• A set of productions, where each production consists of a
1. Non-terminal, called the head or left-side of the production,
2. an arrow, and
3. a sequence of terminals and/or non-terminals , called the body or right
side of the production.
CFG Example
• lists of digits separated by plus or minus signs .

• OR

• The string of zero terminals, written as є, is called the empty string.


• A grammar for a subset of Java statements
Suppose,
Syntax  CFG
• Using the variable expr to denote an expression and the variable stmt to
denote a statement , this structuring rule can be expressed as:

• Such a rule is called a production.


• In a production, lexical elements like the keyword if and the parentheses are
called terminals.
• Variables like expr and stmt represent sequences of terminals and are called
non-terminals.
Parse Trees
• Parsing is the problem of
• taking a start symbol and
• figuring out how to derive a string of terminal from the it following the grammar
rules

• And if it cannot be derived from the start symbol of the grammar , then
reporting syntax errors within the string.
Parse Trees
• A parse tree pictorially shows how the start symbol of a grammar derives a
string in the language.
• If non-terminal A has a production A  XYZ, then a parse tree may have an
interior node labeled A with three children labeled X, Y, and Z, from left to
right:
Parse Trees
Parse Trees
• The process of finding a parse tree for a given string of terminals using a
predefined CFG, is called parsing.

• Such as 9+5-2

• Can you parse it?


Parse tree Example
• The derivation of 9-5+2
Input:
for ( ; expr ; expr )
Ambiguity
• A grammar can have more than one parse trees generating a given string of
terminals.
• Such a grammar is said to be ambiguous.

• Since a string with more than one parse trees usually has more than one
meaning,
• we need to design unambiguous grammars for compiling applications, or to use
ambiguous grammars with additional rules to resolve the ambiguities .
Consider a CFG -

E  E+E
E  E*E
E  id

Can you parse id+id*id ?


Parsing id+id*id E
E+E |E*E | id
E E

E E E E
+ *

E E E E id
id * +

id id id id
Precedence of Operators
• Precedence? 9+5*2

• *, / should have higher precedence

• -, + should have lower precedence


Precedence of Operators
Resolving Precedence Problem
E  E+E | E*E | id

Above statement can be

EE+T
TT*F
F  id
Resolving Precedence Problem
Can you identify some problem here?

EE+T
TT*F
F  id
Resolving Precedence Problem

EE+T
TT*F
F  id

Try parsing:
• id+id*id*id
• id+id*id+id
Resolving Precedence Problem

EE+T
TT*F
F  id

• Convert into

EE+T|T
TT*F|F
F  id
Resolving Precedence Problem

E  E+T | E-T | T
T  T*F | T/F | F
F  id | E
Associativity of Operators
• Associativity of 5? 9-5+2

• +,-,*,/ are mostly considered as left associative

• Special operators like ^ used for exponential is mostly right


associative

• = and its descendants are right associative in C


• a=b=c should be treated as a=(b=c)
Removing Ambiguity

• E  E+E | id

• E  E+id | id
• Also called left associative or left recursive

• E  id+E | id
• Also called right associative or right recursive
Associativity of Operators

• A well written grammar must have well defined associativity of


operators

• Left Associativity

• Right Associativity
Associativity of Operators
4.3.4 – Left Factoring

• A left recursive grammar is not useful for predictive parsing

• Left factoring is a grammar transformation that is useful for producing


a grammar suitable for predictive, or top-down, parsing.

• When the choice between two alternative A-productions is not clear,


we may be able to rewrite the productions to defer the decision until
enough of the input has been seen that we can make the right choice.
4.3.4 – Left Factoring
• A left recursive grammar – • A left factored grammar –

AAα A  β A’
Aβ A’  α A’ | ε
4.3.4 – Left Factoring
• A left recursive grammar –
• A left factored grammar –
AAα
Aβ A  β A’
A’  α A’ | ε

EE+T
ET
4.3.4 – Left Factoring
• A left recursive grammar –
• A left factored grammar –
AAα
Aβ A  β A’
A’  α A’ | ε

EE+T
ET E  T E’
E’  +T E’ | ε
Solve

• S  (L) | i • S  Sa | Sb | c
L  L,S | S

• S  Aa | Bb
•SS0S1S A  Ac | Ad | B
S  01 B  Ba | Bc
4.3.4 – Left Factoring
• Left factoring is a grammar transformation that is useful for producing
a grammar suitable for predictive, or top-down, parsing.

• For example, if we have the two productions of a non deterministic


grammar –

stmt  if expr then stmt else stmt


stmt  if expr then stmt
4.3.4 – Left Factoring
• A left factored grammar –
• A non deterministic grammar –
4.3.4 – Left Factoring
• A left recursive grammar – • A left factored grammar –

This grammar is still ambiguous


Sometimes, you need to think out of
the box!
Infix VS Postfix Notation

• Infix expressions are readable and solvable by humans. We can easily


distinguish the order of operators, and also can use the parenthesis to
solve that part first during solving mathematical expressions.

• The computer cannot differentiate the operators and parenthesis


easily, that's why postfix conversion is needed.
Postfix Notation
Postfix Notation (Example)
• E = E1 op E2 postfix will be E = É1 É2 op

• (9-5)+2 postfix will be 95-2+


• Parentheses are not needed in postfix conversion

• 9-5+2 postfix will be 952-+


• 952+-3*
• Scan from left, first we face + sign, solve 52+ will be 7
• Now 97-3*, first we face – sign, solve 97- will be 2
• Now 23* solve it as 6
Convert into Postfix

•A*B+C/D
Convert into Postfix

• A*B+C/D
• A * (B + C) / D
Convert into Postfix

• A*B+C/D
• A * (B + C) / D
• A * (B + C / D)
Convert into Postfix

•A*B+C/D AB*CD/+
• A * (B + C) / D ABC+*D/
• A * (B + C / D) ABCD/+*
Intermediate Code Generation

• Two forms of intermediate code.


• One form, called abstract syntax trees (AST) or simply syntax trees, represents
the hierarchical syntactic structure of the source program.
• The other one is the parser, produces a syntax tree , that is further translated
into three-address code.

• Some compilers combine parsing and intermediate-code generation into one


component.
Intermediate Representation
• A common intermediate representation is a sequence of "three-address"
instructions.
• This form of intermediate code takes its name from instructions of the form x
= y op z ,
• where op is a binary operator,
• y and z the are addresses for the operands, and
• x is the address for the result of the operation.
• A three­-address instruction carries out at most one operation,
• typically a computation, a comparison, or a branch.
a) AST – Abstract Syntax Tree
b) Intermediate Code

do i = i + 1; while ( a [i] > v) ;


Consider the following
The 3 address code

You might also like