CSC 409 Note 2
Alphabet A1 = {a, b, c}; language L1 = {a, b, c, ac, abcc, ...}
Alphabet A2 = {all C tokens}; language L2 = {all sentences of C, i.e. all C programs}
• Example 2.1. Grammar for expressions consisting of digits and plus and minus signs.
o Language of expressions L={9-5+2, 3-1, ...}
o The productions of grammar for this language L are:
list → list + digit
list → list - digit
list → digit
digit → 0|1|2|3|4|5|6|7|8|9
o list, digit : Grammar variables, Grammar symbols
o 0,1,2,3,4,5,6,7,8,9,-,+ : Tokens, Terminal symbols
• Convention specifying grammar
o Terminal symbols : boldface strings, e.g. if, num, id
o Nonterminal symbols (grammar variables) : italicized names, e.g. list, digit, A, B
• Grammar G=(N,T,P,S)
o N : a set of nonterminal symbols
o T : a set of terminal symbols, tokens
o P : a set of production rules
o S : a start symbol, S∈N
• Grammar G for a language L={9-5+2, 3-1, ...}
o G=(N,T,P,S)
N={list,digit}
T={0,1,2,3,4,5,6,7,8,9,-,+}
P : list -> list + digit
list -> list - digit
list -> digit
digit -> 0|1|2|3|4|5|6|7|8|9
S=list
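The 4-tuple above can be written down directly as a data structure. The following is a minimal Python sketch, not part of the original notes; the names `Grammar`, `is_well_formed`, and `G` are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple

class Grammar(NamedTuple):
    N: frozenset   # nonterminal symbols
    T: frozenset   # terminal symbols (tokens)
    P: dict        # nonterminal -> list of right sides (tuples of symbols)
    S: str         # start symbol

def is_well_formed(g):
    """Check S ∈ N, N and T disjoint, and every production uses only symbols of N ∪ T."""
    if g.S not in g.N or g.N & g.T:
        return False
    symbols = g.N | g.T
    return all(
        lhs in g.N and all(sym in symbols for sym in rhs)
        for lhs, alternatives in g.P.items()
        for rhs in alternatives
    )

# The grammar G for L = {9-5+2, 3-1, ...} from the notes.
G = Grammar(
    N=frozenset({"list", "digit"}),
    T=frozenset("0123456789+-"),
    P={
        "list": [("list", "+", "digit"), ("list", "-", "digit"), ("digit",)],
        "digit": [(d,) for d in "0123456789"],
    },
    S="list",
)

print(is_well_formed(G))
```

Storing P as a mapping from each nonterminal to its alternatives mirrors the `|` notation used in the productions.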
• Some definitions for a language L and its grammar G
• Derivation :
A sequence of replacements S⇒α1⇒α2⇒…⇒αn is a derivation of αn.
Example: derivations of 1+9 from the grammar G
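• Leftmost derivation (the leftmost nonterminal is replaced at each step)
list ⇒ list + digit ⇒ digit + digit ⇒ 1 + digit ⇒ 1 + 9
• Rightmost derivation (the rightmost nonterminal is replaced at each step)
list ⇒ list + digit ⇒ list + 9 ⇒ digit + 9 ⇒ 1 + 9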
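A derivation is just a sequence of replacement steps, so it can be replayed mechanically. The sketch below is illustrative only (the helper names `leftmost_step` and `derive` are assumptions, not from the notes): it replays the leftmost derivation of 1+9 by replacing the leftmost nonterminal at each step.

```python
# Productions of G; each right side is a list of grammar symbols.
PRODUCTIONS = {
    "list": [["list", "+", "digit"], ["list", "-", "digit"], ["digit"]],
    "digit": [[d] for d in "0123456789"],
}
NONTERMINALS = set(PRODUCTIONS)

def leftmost_step(form, rhs):
    """Replace the leftmost nonterminal of the sentential form with rhs."""
    for i, sym in enumerate(form):
        if sym in NONTERMINALS:
            if rhs not in PRODUCTIONS[sym]:
                raise ValueError(f"{sym} -> {' '.join(rhs)} is not a production")
            return form[:i] + rhs + form[i + 1:]
    raise ValueError("no nonterminal left to expand")

def derive(start, choices):
    """Apply the chosen right sides in order; return all sentential forms."""
    forms = [[start]]
    for rhs in choices:
        forms.append(leftmost_step(forms[-1], rhs))
    return forms

steps = derive("list", [["list", "+", "digit"], ["digit"], ["1"], ["9"]])
derivation = " => ".join(" ".join(form) for form in steps)
print(derivation)
# list => list + digit => digit + digit => 1 + digit => 1 + 9
```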
• Language of grammar L(G)
L(G) is the set of sentences that can be generated from the grammar G.
L(G) = {x | S ⇒* x}, where x is a sequence of terminal symbols (x ∈ T*)
• Example: Consider a grammar G=(N,T,P,S):
N = {S}, T = {a, b}
start symbol = S, P = {S → aSb | ε}
• Is aabb a sentence of L(G)? (derivation of the string aabb)
S ⇒ aSb ⇒ aaSbb ⇒ aaεbb = aabb (that is, S ⇒* aabb), so aabb ∈ L(G).
• There is no derivation for aa, so aa ∉ L(G).
• Note: L(G) = {a^n b^n | n ≥ 0}, where a^n b^n means n a's followed by n b's.
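Membership in this language can be checked by undoing the productions one at a time. The following is a small sketch under that idea; the function names `in_language` and `sentence` are hypothetical.

```python
def in_language(s):
    """Recognize L(G) for S -> a S b | ε by peeling off one production per call."""
    if s == "":
        return True                      # S -> ε
    if s[0] == "a" and s[-1] == "b":
        return in_language(s[1:-1])      # S -> a S b
    return False                         # no production fits

def sentence(n):
    """The sentence with n a's followed by n b's."""
    return "a" * n + "b" * n

print(in_language("aabb"))   # True
print(in_language("aa"))     # False
```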
• Parse Tree
A derivation can be conveniently represented by a derivation tree( parse tree).
o The root is labeled by the start symbol.
o Each leaf is labeled by a token or ε.
o Each interior node is labeled by a nonterminal symbol.
o When a production A → x1 … xn is applied, nodes labeled x1 … xn are added as children
of the node labeled A.
• root : the start symbol
• internal nodes : nonterminal
• leaf nodes : terminal
o Example G:
list -> list + digit | list - digit | digit
digit -> 0|1|2|3|4|5|6|7|8|9
• Leftmost derivation for 9-5+2:
list ⇒ list+digit ⇒ list-digit+digit ⇒ digit-digit+digit ⇒ 9-digit+digit
⇒ 9-5+digit ⇒ 9-5+2
• Rightmost derivation for 9-5+2:
list ⇒ list+digit ⇒ list+2 ⇒ list-digit+2 ⇒ list-5+2
⇒ digit-5+2 ⇒ 9-5+2
parse tree for 9-5+2
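Since the figure is not reproduced here, the tree can be sketched as nested tuples. The code below is an illustrative assumption, not the notes' own method: each node is `(label, child, ...)`, terminals are plain strings, and the left-recursive productions are handled with a loop that keeps wrapping the tree built so far as the left child.

```python
DIGITS = "0123456789"

def parse_list(text):
    """Build the parse tree for list -> list + digit | list - digit | digit."""
    tokens = list(text)
    if not tokens or tokens[0] not in DIGITS:
        raise SyntaxError("expected a digit")
    tree = ("list", ("digit", tokens[0]))            # list -> digit
    i = 1
    while i < len(tokens):
        if tokens[i] not in "+-" or i + 1 >= len(tokens) or tokens[i + 1] not in DIGITS:
            raise SyntaxError(f"unexpected input at position {i}")
        # list -> list op digit: the tree so far becomes the left child
        tree = ("list", tree, tokens[i], ("digit", tokens[i + 1]))
        i += 2
    return tree

def leaves(node):
    """Read the leaves left to right (the yield of the tree)."""
    if isinstance(node, str):
        return [node]
    result = []
    for child in node[1:]:
        result.extend(leaves(child))
    return result

tree = parse_list("9-5+2")
print(tree)
print("".join(leaves(tree)))   # 9-5+2
```

The root is labeled by the start symbol `list`, interior nodes by nonterminals, and the leaves read left to right give back the input string.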
Ambiguity
• A grammar is said to be ambiguous if the grammar has more than one parse tree for a given string of
tokens.
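• Example 2.5. Suppose a grammar G does not distinguish between lists and digits, unlike the
grammar of Example 2.1.
• G : string → string + string | string - string |0|1|2|3|4|5|6|7|8|9
• Under G, the string 9-5+2 has two parse trees, one grouping it as (9-5)+2 and one as 9-(5+2),
so G is ambiguous.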
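Ambiguity can be made concrete by counting parse trees. The sketch below (the function name `count_parse_trees` is hypothetical) counts the parse trees that the grammar string → string + string | string - string | digit assigns to a token string, by trying every operator as the root of the tree.

```python
from functools import lru_cache

def count_parse_trees(text):
    """Count parse trees for text under string -> string + string | string - string | digit."""
    tokens = tuple(text)

    @lru_cache(maxsize=None)
    def count(lo, hi):
        # Number of parse trees deriving tokens[lo:hi] from `string`.
        if hi - lo == 1:
            return 1 if tokens[lo] in "0123456789" else 0
        total = 0
        for k in range(lo + 1, hi - 1):
            if tokens[k] in "+-":        # operator at position k is the root
                total += count(lo, k) * count(k + 1, hi)
        return total

    return count(0, len(tokens))

print(count_parse_trees("9-5+2"))   # 2, so the grammar is ambiguous
print(count_parse_trees("9-5"))     # 1
```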
Associativity of operator
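An operator is said to be left associative if an operand with that operator on both sides of it is
taken by the operator to its left; it is right associative if the operand is taken by the operator to its right.
e.g. 9+5+2 ≡ (9+5)+2 (left associative), a=b=c ≡ a=(b=c) (right associative)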
• Left Associative Grammar :
list → list + digit | list - digit | digit
digit →0|1|…|9
• Right Associative Grammar :
right → letter = right | letter
letter → a|b|…|z
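The two groupings can be spelled out explicitly. The following is a small illustrative sketch; `group_left` and `group_right` are hypothetical helper names. It also shows why the choice matters for subtraction.

```python
from functools import reduce

def group_left(operands, op):
    """Parenthesize as a left-associative operator would: ((a op b) op c)."""
    expr = operands[0]
    for nxt in operands[1:]:
        expr = f"({expr}{op}{nxt})"
    return expr

def group_right(operands, op):
    """Parenthesize as a right-associative operator would: (a op (b op c))."""
    expr = operands[-1]
    for prev in reversed(operands[:-1]):
        expr = f"({prev}{op}{expr})"
    return expr

print(group_left(["9", "5", "2"], "-"))    # ((9-5)-2)
print(group_right(["a", "b", "c"], "="))   # (a=(b=c))

# The grouping changes the value for a non-associative operation such as '-'.
left_value = reduce(lambda acc, x: acc - x, [9, 5, 2])             # (9-5)-2
right_value = reduce(lambda acc, x: x - acc, reversed([9, 5, 2]))  # 9-(5-2)
print(left_value, right_value)   # 2 6
```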
Precedence of operators
We say that an operator (*) has higher precedence than another operator (+) if * takes its
operands before + does.
• ex. 9+5*2≡9+(5*2), 9*5+2≡(9*5)+2
• left associative operators : + , - , * , /
• right associative operators : = , **
digit → 0 | 1 | … | 9
Syntax of statements
o stmt → id = expr ;
| if (expr) stmt ;
| if (expr) stmt else stmt ;
| while (expr) stmt ;
expr → expr + term | expr - term | term
term → term * factor | term / factor | factor
factor → digit | (expr)
digit → 0 | 1 | … | 9
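The expr / term / factor layering is what encodes precedence: `*` and `/` are grouped inside term before `+` and `-` see their operands. A minimal evaluator sketch follows, assuming single-digit operands as in the grammar; the class `Parser` and function `evaluate` are hypothetical names, and the left-recursive productions are implemented as loops, which preserves left associativity.

```python
class Parser:
    def __init__(self, text):
        self.tokens = [c for c in text if not c.isspace()]
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):                       # expr -> expr + term | expr - term | term
        value = self.term()
        while self.peek() in ("+", "-"):
            if self.advance() == "+":
                value += self.term()
            else:
                value -= self.term()
        return value

    def term(self):                       # term -> term * factor | term / factor | factor
        value = self.factor()
        while self.peek() in ("*", "/"):
            if self.advance() == "*":
                value *= self.factor()
            else:
                value /= self.factor()
        return value

    def factor(self):                     # factor -> digit | (expr)
        tok = self.advance()
        if tok is not None and tok in "0123456789":
            return int(tok)
        if tok == "(":
            value = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        raise SyntaxError(f"unexpected token {tok!r}")

def evaluate(text):
    parser = Parser(text)
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError("trailing input")
    return value

print(evaluate("9+5*2"))    # 19
print(evaluate("9*5+2"))    # 47
```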
o construct a parse tree for X.
o synthesize attributes over the parse tree.
Suppose a node n in parse tree is labeled by X and X.a denotes the value of attribute
a of X at that node.
Compute X's attributes X.a using the semantic rules associated with X.
Example 2.6. SDD for infix to postfix translation
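The table of semantic rules for this example is not reproduced in the text, so the sketch below assumes the usual rules for the list/digit grammar: digit.t is the digit itself, list.t = digit.t for list → digit, and list.t = list1.t || digit.t || op for list → list1 op digit. The attribute name `t` and the tuple encoding of the tree are assumptions for illustration.

```python
def t(node):
    """Synthesized attribute t: the postfix form of the subtree rooted at node."""
    label, *children = node
    if label == "digit":            # digit -> 9           : digit.t = '9'
        return children[0]
    if len(children) == 1:          # list -> digit        : list.t = digit.t
        return t(children[0])
    left, op, right = children      # list -> list1 op digit
    return t(left) + t(right) + op  # list.t = list1.t || digit.t || op

# Parse tree for 9-5+2, written as (label, child, ...).
TREE = ("list",
        ("list", ("list", ("digit", "9")), "-", ("digit", "5")),
        "+", ("digit", "2"))

print(t(TREE))   # 95-2+
```

Each attribute value depends only on the attributes of the node's children, which is what makes `t` a synthesized attribute.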
• {print("+");} : a translation (semantic) action.
• An SDTS (syntax-directed translation scheme) generates an output for each sentence x generated by the
underlying grammar by executing the actions in the order they appear during a depth-first traversal of a parse tree for x.
1. Design a translation scheme (SDTS) for the translation.
2. Translate :
a) parse the input string x, and
b) emit the results of the actions encountered during the depth-first traversal of the parse tree.
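The two steps can be sketched by placing the actions into the parse tree as extra leaves and then walking the tree depth-first. This is an illustrative assumption about the encoding (lists for nodes, `("action", text)` tuples for embedded print actions), not the notes' own representation.

```python
def act(text):
    """An embedded semantic action {print(text)}."""
    return ("action", text)

def run_actions(node, out):
    """Depth-first, left-to-right walk; each action leaf appends its text to out."""
    if isinstance(node, tuple) and node[0] == "action":
        out.append(node[1])
    elif isinstance(node, list):
        for child in node[1:]:
            run_actions(child, out)
    # plain strings are terminals and emit nothing
    return out

def digit(d):
    return ["digit", d, act(d)]            # digit -> d {print(d)}

# Parse tree for 9-5+2 with the actions attached at the end of each production:
# list -> list + digit {print('+')} | list - digit {print('-')} | digit
TREE = ["list",
        ["list", ["list", digit("9")], "-", digit("5"), act("-")],
        "+", digit("2"), act("+")]

print("".join(run_actions(TREE, [])))   # 95-2+
```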
Example 2.8.
• SDD vs. SDTS for infix to postfix translation.
• Action translating for input 9-5+2
1) Parse.
2) Translate.
Do we have to maintain the whole parse tree?
No. Semantic actions are performed during parsing, so nodes whose semantic actions have
already been executed need not be kept.
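This point can be sketched with a translator that emits output while it parses and never builds a tree. The code is a minimal sketch for the list/digit grammar with single-digit operands; the function name `infix_to_postfix` is hypothetical.

```python
def infix_to_postfix(text):
    """Translate e.g. 9-5+2 to postfix while parsing, without storing a parse tree."""
    out = []
    pos = 0

    def digit():
        nonlocal pos
        if pos < len(text) and text[pos] in "0123456789":
            out.append(text[pos])          # action {print(digit)}
            pos += 1
        else:
            raise SyntaxError(f"digit expected at position {pos}")

    digit()                                # list -> digit
    while pos < len(text) and text[pos] in "+-":
        op = text[pos]
        pos += 1
        digit()
        out.append(op)                     # action {print(op)} runs after the right operand
    if pos != len(text):
        raise SyntaxError(f"unexpected input at position {pos}")
    return "".join(out)

print(infix_to_postfix("9-5+2"))   # 95-2+
```

Only the current input position and the output produced so far are kept, which is why the finished parts of the tree can be discarded.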
2.4 PARSING
A parser takes a token string x as input: if x ∈ L(G), it produces a parse tree for x;
otherwise it reports an error.
Top-Down parsing
1. At node n labeled with nonterminal A, select one of the productions whose left part is
A and construct children of node n with the symbols on the right side of that production.
2. Find the next node at which a sub-tree is to be constructed.
ex. G: type → simple
| ↑id
| array [ simple ] of type
simple → integer
| char
| num dotdot num
Fig 2.10. Top-down parsing while scanning the input from left to right.
Fig 2.11. Steps in the top-down construction of a parse tree.
• The selection of a production for a nonterminal may involve trial and error, which requires
backtracking.
• This process is costly: it needs 18 steps, including 5 backtracking steps.
• Procedure of top-down parsing
Let g be the grammar symbol currently pointed to and a the input symbol currently pointed to.
o if (g ∈ N): select the next untried production whose left side is g and expand it.
else if (g = a): advance both pointers, so that g and a become the next grammar symbol and
the next input symbol.
else if (g ≠ a): backtrack:
move the input pointer a back by the number of symbols of the current production that
have already been matched;
remove the right-side symbols of the current production and let g point again at
the left-side symbol of that production.
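The procedure can be sketched as a backtracking recognizer for the type grammar given earlier. This is one possible rendering, not the notes' exact algorithm: Python generators stand in for the explicit pointer bookkeeping, so "backtracking" happens when a branch yields no position and the caller moves on to the next production. The names `GRAMMAR`, `match`, and `accepts` are hypothetical.

```python
GRAMMAR = {
    "type": [["simple"], ["↑", "id"], ["array", "[", "simple", "]", "of", "type"]],
    "simple": [["integer"], ["char"], ["num", "dotdot", "num"]],
}

def match(symbols, tokens, pos):
    """Yield every input position reachable after deriving symbols from tokens[pos:]."""
    if not symbols:
        yield pos
        return
    first, rest = symbols[0], symbols[1:]
    if first in GRAMMAR:                       # g ∈ N: try each production in order
        for rhs in GRAMMAR[first]:
            for mid in match(rhs, tokens, pos):
                yield from match(rest, tokens, mid)
    elif pos < len(tokens) and tokens[pos] == first:
        yield from match(rest, tokens, pos + 1)   # g = a: advance both pointers
    # g ≠ a: nothing is yielded, so the caller falls back to its next production

def accepts(tokens):
    """True if the whole token list can be derived from the start symbol type."""
    return any(end == len(tokens) for end in match(["type"], tokens, 0))

print(accepts("array [ num dotdot num ] of integer".split()))   # True
print(accepts("array [ integer ] of".split()))                   # False
```

This approach terminates here because the grammar has no left recursion; a left-recursive grammar such as list → list + digit would make it loop.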