Syntax Analysis: - Check Syntax and Construct Abstract Syntax Tree
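[Figure: abstract syntax tree for the statement if (b == 0) a = b; with if at the root and the == and = nodes below it]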
• Grammar
list → list + digit
| list – digit
| digit
digit → 0 | 1 | … | 9
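[Figure: parse tree for 9 – 5 + 2. The root list expands to list + digit (2); its left list expands to list – digit (5); the innermost list derives digit 9.]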
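The grammar above can be tried out with a small parser. The following is only a sketch: the function name parse_list and the nested-tuple tree format are illustrative choices, the ASCII "-" stands in for the minus sign, and the left-recursive rules are handled with a loop, which produces the same left-leaning tree.

```python
def parse_list(s):
    """Parse a string of the grammar
         list  -> list + digit | list - digit | digit
         digit -> 0 | 1 | ... | 9
    and return the parse tree as nested tuples."""
    tokens = [c for c in s if not c.isspace()]
    if not tokens or not tokens[0].isdigit():
        raise SyntaxError("expected a digit at position 0")
    tree = ("digit", tokens[0])          # list -> digit
    i = 1
    while i < len(tokens):
        op = tokens[i]
        if op not in "+-" or i + 1 >= len(tokens) or not tokens[i + 1].isdigit():
            raise SyntaxError(f"unexpected input at position {i}")
        # list -> list op digit: the tree built so far becomes the left child
        tree = ("list", tree, op, ("digit", tokens[i + 1]))
        i += 2
    return tree

print(parse_list("9-5+2"))
# ('list', ('list', ('digit', '9'), '-', ('digit', '5')), '+', ('digit', '2'))
```

The subtraction ends up nested inside the addition, matching the tree in the figure.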
Ambiguity
• A grammar is ambiguous if it has more than one
parse tree for some string
• Consider the grammar
list → list + list
| list – list
| 0 | 1 | … | 9
[Figure: two different parse trees for one string built from the digits 9, 5 and 2]
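To see why the two trees matter, one can enumerate every parse tree of a string under the ambiguous grammar and evaluate each one. This is a sketch that assumes 9-5+2 as the example string, uses ASCII "-" for the minus sign, and expects well-formed input (digits alternating with operators); the name all_values is illustrative.

```python
def all_values(tokens):
    """Return one value per parse tree of `tokens` under
         list -> list + list | list - list | 0 | 1 | ... | 9
    Assumes tokens alternate digit, operator, digit, ..."""
    if len(tokens) == 1:
        return [int(tokens[0])]
    values = []
    # Every operator position can serve as the root of a parse tree.
    for i in range(1, len(tokens), 2):
        for left in all_values(tokens[:i]):
            for right in all_values(tokens[i + 1:]):
                values.append(left + right if tokens[i] == "+" else left - right)
    return values

print(all_values(list("9-5+2")))  # [2, 6]
```

The value 2 comes from the tree that groups the string as 9-(5+2), the value 6 from (9-5)+2.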
Ambiguity …
• Ambiguity is problematic because a program
can be given an unintended meaning
• Ambiguity can be handled in several ways
– Enforce associativity and precedence
– Rewrite the grammar (cleanest way)
• There is no algorithm that automatically
converts an arbitrary ambiguous grammar into
an unambiguous grammar accepting the
same language
• Worse, there are inherently ambiguous
languages!
Ambiguity in Programming Languages
• Dangling else problem
stmt → if expr stmt
| if expr stmt else stmt
• For this grammar, the string
if e1 if e2 s1 else s2
has two parse trees
[Figure: the two parse trees for this string. In the first, else s2 belongs to the inner if e2; in the second, else s2 belongs to the outer if e1.]
Resolving dangling else problem
• General rule: match each else with the closest
previous unmatched if. The grammar can be
rewritten as
stmt → matched-stmt
| unmatched-stmt
matched-stmt → if expr matched-stmt
else matched-stmt
| others
unmatched-stmt → if expr stmt
| if expr matched-stmt
else unmatched-stmt
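The closest-if rule can also be illustrated with a hand-written parser. The sketch below is not a direct encoding of the matched/unmatched grammar; it is a recursive-descent parser that consumes an else greedily, which gives the same result. The token conventions (whitespace-separated words, a single word for each expr and each simple statement) and the name parse_stmt are assumptions made for the example.

```python
def parse_stmt(tokens, pos=0):
    """Parse one stmt starting at tokens[pos]; return (tree, next_pos).
    An else is consumed as soon as it is seen, so it attaches to the
    closest preceding unmatched if."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    if tokens[pos] != "if":
        return tokens[pos], pos + 1            # 'others': a simple statement
    if pos + 1 >= len(tokens):
        raise SyntaxError("missing condition after 'if'")
    cond = tokens[pos + 1]
    then_branch, pos = parse_stmt(tokens, pos + 2)
    if pos < len(tokens) and tokens[pos] == "else":
        else_branch, pos = parse_stmt(tokens, pos + 1)
        return ("if", cond, then_branch, else_branch), pos
    return ("if", cond, then_branch), pos

tree, _ = parse_stmt("if e1 if e2 s1 else s2".split())
print(tree)  # ('if', 'e1', ('if', 'e2', 's1', 's2'))
```

The else s2 ends up inside the inner if e2, and the outer if e1 has no else branch.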
Associativity
• If an operand has an operator on both
sides, the side whose operator takes the
operand determines the associativity of
that operator
• In a+b+c, b is taken by the left +, so the
string is read as (a+b)+c
• +, -, *, / are left associative
• ^, = are right associative
• Grammar to generate strings with right
associative operators
right → letter = right | letter
letter → a | b | … | z
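A right-recursive rule yields right-associative trees. The sketch below parses strings of the grammar right → letter = right | letter; the name parse_right and the tuple representation are illustrative, not part of the slides.

```python
def parse_right(s):
    """Parse with  right -> letter = right | letter  and return nested tuples."""
    tokens = [c for c in s if not c.isspace()]

    def right(i):
        if i >= len(tokens) or not ("a" <= tokens[i] <= "z"):
            raise SyntaxError(f"expected a letter at position {i}")
        if i + 1 < len(tokens) and tokens[i + 1] == "=":
            # right -> letter = right: recurse on everything after '='
            sub, j = right(i + 2)
            return ("=", tokens[i], sub), j
        return tokens[i], i + 1                 # right -> letter

    tree, end = right(0)
    if end != len(tokens):
        raise SyntaxError(f"unexpected input at position {end}")
    return tree

print(parse_right("a=b=c"))  # ('=', 'a', ('=', 'b', 'c'))
```

The operand b is taken by the = on its right, so the string is read as a=(b=c).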
Precedence
• Under an ambiguous grammar, the string
a+5*2 has two possible interpretations,
because it has two different parse trees,
corresponding to (a+5)*2 and a+(5*2)
• Precedence determines the correct
interpretation.
• Next, an example of how precedence
rules are encoded in a grammar
Precedence/Associativity in the
Grammar for Arithmetic Expressions
• Ambiguous
E → E + E
| E * E
| ( E )
| num | id
Example strings: 3+2+5, 3+2*5
• Unambiguous, with precedence
and associativity rules honored
E → E + T | T
T → T * F | F
F → ( E ) | num | id
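The unambiguous E/T/F grammar maps directly onto one function per nonterminal. The following is a sketch under stated assumptions: the left-recursive rules are written as loops, the tokenizer is a simple regular expression that silently skips characters it does not recognize, and identifiers are looked up in a hypothetical env dictionary.

```python
import re

def evaluate(text, env=None):
    """Evaluate text using  E -> E + T | T,  T -> T * F | F,
    F -> ( E ) | num | id.  Values for ids come from env."""
    env = env or {}
    tokens = re.findall(r"\d+|[A-Za-z_]\w*|[+*()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def expr():                      # E -> E + T | T
        value = term()
        while peek() == "+":
            advance()
            value = value + term()   # left associative: fold into the left side
        return value

    def term():                      # T -> T * F | F
        value = factor()
        while peek() == "*":
            advance()
            value = value * factor()
        return value

    def factor():                    # F -> ( E ) | num | id
        tok = advance()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok == "(":
            value = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if tok.isdigit():
            return int(tok)
        if tok in env:
            return env[tok]
        raise SyntaxError(f"unexpected token {tok!r}")

    result = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return result

print(evaluate("3+2*5"))    # 13
print(evaluate("(3+2)*5"))  # 25
```

Because * is handled one level below +, 3+2*5 is read as 3+(2*5) without any extra precedence table.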
Parsing
• The process of determining whether a string
can be generated by a grammar
• Parsers fall into two categories:
– Top-down parsing:
Construction of the parse tree starts at the root
(from the start symbol) and proceeds towards the
leaves (tokens or terminals)
– Bottom-up parsing:
Construction of the parse tree starts from the
leaf nodes (tokens or terminals of the grammar)
and proceeds towards root (start symbol)
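As a contrast to top-down construction, the following sketch recognizes strings of the grammar list → list + digit | list - digit | digit bottom-up, with a stack and hand-coded shift and reduce steps. It is an illustration only, not a table-driven LR parser; the name shift_reduce and the returned list of reductions are choices made for this example.

```python
def shift_reduce(s):
    """Bottom-up recognition for
         list -> list + digit | list - digit | digit
    Returns the reductions in the order they are applied."""
    stack, reductions = [], []
    for ch in (c for c in s if not c.isspace()):
        if ch.isdigit():
            stack.append("digit")                 # shift, then reduce to digit
            reductions.append(f"digit -> {ch}")
            if stack == ["digit"]:
                stack[:] = ["list"]
                reductions.append("list -> digit")
            elif stack[-3:-1] in (["list", "+"], ["list", "-"]):
                op = stack[-2]
                stack[-3:] = ["list"]             # replace the handle by list
                reductions.append(f"list -> list {op} digit")
            else:
                raise SyntaxError(f"digit {ch!r} not expected here")
        elif ch in "+-":
            stack.append(ch)                      # shift the operator
        else:
            raise SyntaxError(f"unexpected character {ch!r}")
    if stack != ["list"]:
        raise SyntaxError("input is not a complete list")
    return reductions

for step in shift_reduce("9-5+2"):
    print(step)
```

The reductions start at the leaves (digit -> 9) and finish with list -> list + digit at the root, the reverse of the order in which a top-down parser would expand the same productions.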