Ch3 - Syntax Analysis
Syntax Analysis
▪ Left Factoring
▪ Context-Free Grammars versus Regular Expressions
Chapter – 3 : Syntax Analysis 2 Bahir Dar Institute of Technology
Introduction
▪ Syntax analysis is the second phase of the compiler.
▪ The parser takes the tokens produced by lexical analysis and
builds the syntax tree (parse tree).
▪ The syntax tree can be constructed easily from a context-free
grammar.
▪ The parser reports syntax errors in an intelligible/understandable
fashion and recovers from commonly occurring errors to
continue processing the remainder of the program.
▪ Syntax analysis is performed by the syntax analyzer (parser).
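▪ Derivation for - ( id + id ), labelled RMD (rightmost derivation) on the slide:
E ⇒ -E ⇒ -(E) ⇒ -(E+E) ⇒ -(id+E) ⇒ -(id+id)
▪ NB: The step -(E+E) ⇒ -(id+E) replaces the leftmost E. A strictly rightmost
derivation replaces the rightmost non-terminal first and passes through
-(E+id) instead.
[Figure: the parse tree for - ( id + id ), grown one derivation step at a time.]
[Figure: parse tree fragments for an expression over id, * and +.]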
▪ A regular expression such as (a|b)*abb and an equivalent grammar
describe the same language: the set of strings of a's and b's ending
in abb. Such languages can be described either by a finite
automaton or by a PDA.
▪ On the other hand, the language L = {a^n b^n | n ≥ 1}, with an equal
number of a's and b's, is a prototypical example of a language that
can be described by a grammar but not by a regular expression.
▪ We say that "finite automata cannot count", meaning that a
finite automaton cannot accept a language like {a^n b^n | n ≥ 1}, which
would require it to keep count of the number of a's before it sees
the b's.
▪ Languages of this kind (context-free languages) are accepted
by a PDA, because a PDA uses a stack as its memory.
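The counting argument above can be made concrete with a small sketch. It is not a formal PDA, only a Python function that uses a list as the stack; the name accepts_anbn is chosen here for illustration.

```python
def accepts_anbn(s: str) -> bool:
    """Return True if s is in {a^n b^n | n >= 1}, using an explicit stack."""
    stack = []
    i = 0
    # Phase 1: push one marker for every leading 'a'.
    while i < len(s) and s[i] == "a":
        stack.append("A")
        i += 1
    if not stack:
        return False  # n >= 1 requires at least one 'a'
    # Phase 2: pop one marker for every 'b' that follows.
    while i < len(s) and s[i] == "b" and stack:
        stack.pop()
        i += 1
    # Accept only if the whole input was read and the stack is empty.
    return i == len(s) and not stack
```

The stack is what a finite automaton lacks: it remembers how many a's were seen, however large n is.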
Context-Free Grammars versus Regular Expressions
▪ The general comparison of Regular Expressions vs. Context-
Free Grammars:
[Figure: classification of parsers]
▪ Top-down parsers
• Recursive descent (involves backtracking)
• Predictive parsing (parsing without backtracking)
  - Recursive predictive
  - Non-recursive predictive, or LL(1)
▪ Bottom-up parsers
• Operator precedence
• LR parsing
  - SLR
  - CLR
  - LALR
2. If X is ε, then FIRST(X) = {ε}
3. If X is a non-terminal symbol and X → ε is a
production rule, then add ε to FIRST(X).
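A minimal sketch of how FIRST sets can be computed by repeating the rules until nothing changes. The grammar encoding (a dict of tuples, with the empty tuple standing for ε), the function name first_sets and the example expression grammar are assumptions made for this illustration; they are not taken from the slides. The sketch also uses the standard rule that FIRST of a terminal is the terminal itself.

```python
EPS = "ε"

def first_sets(grammar):
    """Compute FIRST for every non-terminal of `grammar`.

    `grammar` maps a non-terminal to a list of right-hand sides; each
    right-hand side is a tuple of symbols, and () stands for ε.
    """
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:                       # repeat until a fixed point is reached
        changed = False
        for nt, bodies in grammar.items():
            for body in bodies:
                before = len(first[nt])
                nullable = True
                for sym in body:
                    if sym not in grammar:          # terminal: FIRST(sym) = {sym}
                        first[nt].add(sym)
                        nullable = False
                        break
                    first[nt] |= first[sym] - {EPS}
                    if EPS not in first[sym]:
                        nullable = False
                        break
                if nullable:             # X → ε, or every symbol can derive ε
                    first[nt].add(EPS)
                changed |= len(first[nt]) != before
    return first

# Hypothetical example grammar (the usual expression grammar).
GRAMMAR = {
    "E":  [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T":  [("F", "T'")],
    "T'": [("*", "F", "T'"), ()],
    "F":  [("(", "E", ")"), ("id",)],
}
FIRST = first_sets(GRAMMAR)
```

For this grammar, E' and T' receive ε through rule 3, while E, T and F do not.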
2. If ε is in FIRST(α)
➔ for each terminal a in FOLLOW(A), add A → α to M[A,a]
▪ All other undefined entries of the parsing table are error entries.
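The table-filling rules can be sketched as follows. The FIRST and FOLLOW sets are supplied by hand for a small hypothetical grammar S → aSb | ε; the function name build_ll1_table and the data layout are assumptions for illustration. The sketch also applies the standard companion rule: for each terminal a in FIRST(α), add A → α to M[A,a].

```python
EPS = "ε"

def build_ll1_table(productions, first_of_body, follow):
    """Fill the predictive parsing table M for the given productions."""
    table = {}
    for head, body in productions:
        first = first_of_body[(head, body)]
        targets = first - {EPS}          # terminals in FIRST(α)
        if EPS in first:                 # ε in FIRST(α): also use FOLLOW(A)
            targets |= follow[head]
        for a in targets:
            if (head, a) in table:
                raise ValueError(f"conflict at M[{head},{a}]: not LL(1)")
            table[(head, a)] = (head, body)
    return table

# Hypothetical grammar: S → a S b | ε, with hand-computed sets.
PRODUCTIONS = [("S", ("a", "S", "b")), ("S", ())]
FIRST_OF_BODY = {PRODUCTIONS[0]: {"a"}, PRODUCTIONS[1]: {EPS}}
FOLLOW = {"S": {"b", "$"}}

M = build_ll1_table(PRODUCTIONS, FIRST_OF_BODY, FOLLOW)
```

Entries that are never filled are simply absent from the dictionary, which plays the role of the error entries.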
E→E+E
E→E*E          String: id1+id2*id3
E→id
▪ Rightmost derivation:
E ⇒ E+E ⇒ E+E*E ⇒ E+E*id3 ⇒ E+id2*id3 ⇒ id1+id2*id3
▪ Handle pruning (the rightmost derivation in reverse):
Right sentential form    Handle    Production
id1+id2*id3              id1       E→id
E+id2*id3                id2       E→id
E+E*id3                  id3       E→id
E+E*E                    E*E       E→E*E
E+E                      E+E       E→E+E
E
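The handle-pruning steps for id1+id2*id3 can be replayed with a short sketch. The helper reduce_once is hypothetical; it simply replaces the first occurrence of the given handle, which is sufficient for this example but is not a general way of locating handles.

```python
def reduce_once(form, handle, head):
    """Replace the first occurrence of `handle` in `form` by `head`."""
    n = len(handle)
    for i in range(len(form) - n + 1):
        if form[i:i + n] == handle:
            return form[:i] + [head] + form[i + n:]
    raise ValueError(f"handle {handle} not found in {form}")

form = ["id1", "+", "id2", "*", "id3"]
# (handle, left-hand side) pairs in the order a shift-reduce parser uses them.
steps = [
    (["id1"], "E"),            # E → id
    (["id2"], "E"),            # E → id
    (["id3"], "E"),            # E → id
    (["E", "*", "E"], "E"),    # E → E*E
    (["E", "+", "E"], "E"),    # E → E+E
]

trace = [" ".join(form)]
for handle, head in steps:
    form = reduce_once(form, handle, head)
    trace.append(" ".join(form))
```

Reading trace from last to first gives the rightmost derivation of the string from E.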
S = γ0 ⇒rm γ1 ⇒rm γ2 ⇒rm ... ⇒rm γn-1 ⇒rm γn = ω (the input string)
▪ NB: If a shift-reduce parser cannot be used for a grammar, that grammar is called
a non-LR(k) grammar. An ambiguous grammar can never be an LR grammar.
▪ LR parsers cover a wide range of grammars:
• Simple LR parser (SLR)
• Look-Ahead LR (LALR)
• Most general LR parser (LR)
[Figure: nested classes of grammars, SLR inside LALR inside LR inside CFG.]
▪ SLR, LR and LALR parsers work in the same way; only their parsing tables are
different.
LR Parsers
▪ LR parsing is attractive because:
• LR parsers can be constructed to recognize virtually
(effectively) all programming-language constructs for which
context-free grammars can be written.
• LR parsing is the most general non-backtracking shift-reduce
parsing method, yet it is still efficient.
• The class of grammars that can be parsed using LR methods
is a proper superset of the class of grammars that can be
parsed with predictive parsers.
• LL(1)-Grammars ⊂ LR(1)-Grammars
• An LR parser can detect a syntactic error as soon as it is
possible to do so on a left-to-right scan of the input.
▪ The drawback of the LR method is that it is too much work
to construct an LR parser by hand.
• Use tools e.g. yacc
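[Figure: A Configuration of LR Parsing Algorithm. The stack holds
S0 X1 S1 ... Xm-1 Sm-1 Xm Sm, with Sm on top. The LR Parsing Algorithm reads
the input, consults the two tables below and produces the output.
• Action Table: rows are states, columns are terminals and $; four different
actions will be applied.
• Goto Table: rows are states, columns are non-terminals; each item is a
state number.]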
▪ NB: I1, I4, I5 and I6 contain final items. They determine where the
'reduce' (ri) action is filled in, in the corresponding rows of the action
part of the table.
▪ NB: In the LR(0) parsing table, whenever a state contains a final item,
put ri in every column of that state's row in the action part.
E.g. in row 4, put r3, where 3 is the number that labels the production in G.
Example of LR(0) parsing Table
▪ Step 6: Check the parser by tracing its stack on the input string abb$
S→AA ---1
A→aA ---2
A→b ---3
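A minimal sketch of the stack-based check for abb$. The state numbers 0 to 6 and the ACTION/GOTO entries assume the usual LR(0) item-set construction for this grammar (state 4 holds A→b., state 5 holds S→AA., state 6 holds A→aA.); the tables on the slides are not reproduced here, so treat these entries as an assumption. The input must end with $.

```python
# Production number -> (left-hand side, length of right-hand side).
PRODUCTIONS = {1: ("S", 2), 2: ("A", 2), 3: ("A", 1)}

# Assumed LR(0) tables; ("s", n) = shift to state n, ("r", k) = reduce by k.
ACTION = {
    (0, "a"): ("s", 3), (0, "b"): ("s", 4),
    (1, "$"): ("acc", None),
    (2, "a"): ("s", 3), (2, "b"): ("s", 4),
    (3, "a"): ("s", 3), (3, "b"): ("s", 4),
}
for state, prod in ((4, 3), (5, 1), (6, 2)):
    for terminal in "ab$":               # LR(0): reduce in the whole row
        ACTION[(state, terminal)] = ("r", prod)
GOTO = {(0, "S"): 1, (0, "A"): 2, (2, "A"): 5, (3, "A"): 6}

def lr_parse(tokens):
    """Return the list of productions used, or None on a syntax error."""
    stack = [0]                          # stack of state numbers
    reductions = []
    pos = 0
    while True:
        entry = ACTION.get((stack[-1], tokens[pos]))
        if entry is None:
            return None                  # undefined entry = error
        kind, arg = entry
        if kind == "s":                  # shift: push state, advance input
            stack.append(arg)
            pos += 1
        elif kind == "r":                # reduce: pop |body| states, then goto
            head, length = PRODUCTIONS[arg]
            del stack[len(stack) - length:]
            stack.append(GOTO[(stack[-1], head)])
            reductions.append(arg)
        else:
            return reductions            # accept

result = lr_parse(list("abb$"))
```

For abb$ the parser reduces by productions 3, 2, 3, 1 in that order, which is the rightmost derivation of abb in reverse.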
▪ The stack implementation is the same as for LR(0).
LALR and CLR parser
▪ NB:
• LR(0) and SLR(1) parsers use LR(0) items to create a parsing table.
• LALR and CLR parsers use LR(1) items to
construct a parsing table.
▪ Reading assignment
• LALR parser and
• CLR parser