syntax analysis
By:
Trusha R. Patel
Asst. Prof.
CE Dept., CSPIT, CHARUSAT
Role of Parser
[Figure: Source program → Lexical Analyzer (Scanner) → token → Syntax Analyzer (Parser) → parse tree → Rest of front end. The parser requests each token with getNextToken, and both analyzers use the symbol table.]
CFG (Context Free Grammar)
A CFG consists of terminals, nonterminals, a start symbol and productions
❖ Terminals
Basic symbols from which strings are formed
“token name” is synonym for “terminal”
❖ Nonterminals
Syntactic variables that denote sets of strings
CFG (Context Free Grammar)
❖ Start symbol
One nonterminal is distinguished from the others as the start symbol
Set of strings it denotes is the language generated by the grammar
By convention, its productions are listed first
❖ Production
Specifies how the terminals and nonterminals can be combined to form strings
CFG (Context Free Grammar)
A production consists of
❖ Nonterminal called the “head” or “left side”
❖ The symbol →
❖ “body” or “right side” consisting of zero or more terminals and nonterminals
CFG (Context Free Grammar)
Grammar for arithmetic expression
Notational convention for grammar
X , Y , Z represent grammar symbols,
either nonterminals or terminals
A → α1 , A → α2 , … , A → αk may be written as
A → α1 | α2 | … | αk
Language generated by grammar
G : Grammar
L(G) : Language generated by grammar G
If two grammars generate the same language, the grammars are said to be
equivalent
Derivation
Beginning with the start symbol, each rewriting step replaces a nonterminal by
the body of one of its productions
Grammar: E → E + E | id
String: id + id + id
Derivation: E
⇒ E + E
⇒ E + E + E
⇒ id + E + E
⇒ id + id + E
⇒ id + id + id
[Figure: parse tree for id + id + id]
Derivation
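Leftmost derivation
❖ The leftmost nonterminal is always replaced first by one of its productions
Rightmost derivation (canonical derivation)
❖ The rightmost nonterminal is always replaced first by one of its productions
Grammar : E → E + E | E * E | - E | ( E ) | id
String : - ( id + id )
Leftmost : E ⇒ - E ⇒ - ( E ) ⇒ - ( E + E ) ⇒ - ( id + E ) ⇒ - ( id + id )
Rightmost : E ⇒ - E ⇒ - ( E ) ⇒ - ( E + E ) ⇒ - ( E + id ) ⇒ - ( id + id )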
Reduction
A substring that matches the body of a production is replaced by the
nonterminal at the head of that production
Grammar: E → E + E | id
String: id + id + id
Reduction: id + id + id
E + id + id
E + E + id
E + id
E + E
E
[Figure: parse tree for id + id + id built from the leaves up to the root E]
Parse tree
Graphical representation of derivation
Parse tree for the string - ( id + id ) is
        E
      /   \
     -     E
         / | \
        (  E  )
         / | \
        E  +  E
        |     |
        id    id
Ambiguity
A grammar that produces more than one parse tree for some string is said to
be an ambiguous grammar
Equivalently, some string has more than one leftmost derivation or more than one rightmost derivation
Grammar: E → E + E | E * E | id
String : id + id * id
[Figure: two parse trees for id + id * id, one with + at the root, grouping id + ( id * id ), and one with * at the root, grouping ( id + id ) * id]
CFG vs. RE
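Grammars are more powerful than regular expressions (RE): every language that an RE can describe can also be described by a grammar, but not vice versa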
RE : (a|b)*abb
Grammar :
S → aX | aS | bS
X → bY
Y → bZ
Z → ϵ
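The correspondence between the RE and the grammar above can be checked mechanically. The following is a minimal sketch, not part of the original slides: the names `GRAMMAR`, `derives` and `agree_up_to` are illustrative, and the check is only exhaustive up to a chosen string length.

```python
import re
from itertools import product

# Right-linear grammar from the slide; "" stands for ϵ.
GRAMMAR = {
    "S": ["aX", "aS", "bS"],
    "X": ["bY"],
    "Y": ["bZ"],
    "Z": [""],
}

def derives(nonterminal, text):
    """Return True if `nonterminal` can derive exactly `text`."""
    for body in GRAMMAR[nonterminal]:
        if body == "":
            if text == "":
                return True
        # Every non-empty body is one terminal followed by one nonterminal.
        elif text.startswith(body[0]) and derives(body[1], text[1:]):
            return True
    return False

PATTERN = re.compile(r"(a|b)*abb")

def agree_up_to(max_len):
    """Compare grammar and RE on every string over {a, b} up to max_len."""
    for n in range(max_len + 1):
        for chars in product("ab", repeat=n):
            s = "".join(chars)
            if derives("S", s) != bool(PATTERN.fullmatch(s)):
                return False
    return True
```

Calling `agree_up_to(8)` compares the two descriptions on all 511 strings of length 0 to 8.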
Left recursion
Left factoring
Left factoring is a grammar transformation that is useful for producing a
grammar suitable for predictive parsing.
General types of parser
• General method
• Can parse any grammar
• Methods such as
• Cocke-Younger-Kasami algorithm
• Earley’s algorithm
Top-Down Parsing
Construct the parse tree for the input string starting from the root and creating the
nodes of the parse tree in preorder (derivation)
Grammar ( G ) : E → E + E | E * E | - E | ( E ) | id
String : id + id * id
        E
      / | \
     E  +  E
     |   / | \
     id E  *  E
        |     |
        id    id
Different Top-Down Parsing Techniques
1. Recursive-Descent Parsing ( RDP )
2. Predictive Parsing
1. Recursive-Descent Parsing ( RDP )
May require backtracking to find the correct production to apply
A left-recursive grammar can cause RDP to go into an infinite loop
1. Recursive-Descent Parsing ( RDP )
Algorithm
void A( )
{
    choose an A-production, A → X1 X2 … Xk ;
    for ( i = 1 to k )
    {
        if ( Xi is a nonterminal )
            call procedure Xi( );
        else if ( Xi equals the current input symbol a )
            advance the input to the next symbol;
        else
            /* an error has occurred */ ;
    }
}
1. Recursive-Descent Parsing ( RDP )
Process:
❖ Maintain two pointers
Lookahead pointer (LP) (points to the top element of the stack)
Input pointer (IP) (points to the current symbol in the input string)
❖ If LP points to a nonterminal on the stack then
replace it by one of its productions, and make LP point to the leftmost symbol of that production
❖ If LP points to a terminal on the stack then
compare the stack symbol and the input symbol (pointed to by LP and IP)
If they match then
advance both pointers (LP and IP)
If they do not match then
backtrack
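1. Recursive-Descent Parsing ( RDP )
Grammar:
S → c A d
A → a b | a
String : c a d
[Figure: parse-tree snapshots showing the positions of LP and IP]
Step 1: expand S → c A d; c matches the input, so LP and IP advance
Step 2: expand A → a b; a matches, but b does not match d, so backtrack
Step 3: expand A → a; a matches, then d matches
Result: string matched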
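The backtracking behaviour in the c a d example can be sketched in a few lines. This is an illustrative sketch rather than the slide's algorithm verbatim: it uses Python generators to try the alternatives of a nonterminal in order, and the names `parse`, `parse_body` and `accepts` are assumptions made here. As noted earlier, a left-recursive grammar would make this loop forever.

```python
# Grammar from the example: S → c A d, A → a b | a (terminals are single characters).
GRAMMAR = {"S": [["c", "A", "d"]], "A": [["a", "b"], ["a"]]}

def parse(symbol, text, pos):
    """Yield every input position reachable after matching `symbol` at `pos`."""
    if symbol not in GRAMMAR:                 # terminal: compare with the input
        if text[pos:pos + 1] == symbol:
            yield pos + 1
        return
    for body in GRAMMAR[symbol]:              # nonterminal: try each production in order
        yield from parse_body(body, text, pos)

def parse_body(body, text, pos):
    """Match the symbols of one production body from left to right."""
    if not body:
        yield pos
        return
    for mid in parse(body[0], text, pos):
        # If the rest of the body fails, the loop resumes with the next
        # alternative for body[0]; that resumption is the backtracking step.
        yield from parse_body(body[1:], text, mid)

def accepts(text):
    return any(end == len(text) for end in parse("S", text, 0))
```

For the input `"cad"`, the alternative A → a b is tried first and fails on `d`, after which A → a succeeds, mirroring the three steps listed above.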
FIRST and FOLLOW
Used to construct top-down and bottom-up parsers
FIRST ( α ) :
Set of terminals that begin strings derived from α
FOLLOW ( α ) :
Set of terminals that can appear immediately to the right of α in some sentential form
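FIRST
FIRST ( α )
❖ α is a terminal
FIRST ( α ) = { α }
❖ α is a nonterminal: look at the productions of α
α → ϵ : add ϵ to FIRST ( α )
α → β γ , β is a terminal : add β to FIRST ( α )
α → β γ , β is a nonterminal : add FIRST ( β ) except ϵ to FIRST ( α ); if FIRST ( β ) contains ϵ, also add FIRST ( γ )
FOLLOW
FOLLOW ( α )
❖ α is the start symbol
add $ to FOLLOW ( α )
❖ Find α in the RHS of the grammar: β → … α γ
γ is ϵ (α is at the right end) : add FOLLOW ( β ) to FOLLOW ( α )
γ is a terminal : add γ to FOLLOW ( α )
γ is a nonterminal : add FIRST ( γ ) except ϵ to FOLLOW ( α ); if FIRST ( γ ) contains ϵ, also add FOLLOW ( β )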
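The FIRST and FOLLOW rules can be applied repeatedly until nothing changes. Below is a minimal fixed-point sketch for the expression grammar used in the predictive parsing slides; the dictionary layout and the function names (`first_of`, `compute_first`, `compute_follow`) are assumptions made for illustration.

```python
EPS = "ϵ"
GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F": [["(", "E", ")"], ["id"]],
}
START = "E"

def first_of(symbols, first):
    """FIRST of a sequence of grammar symbols, given FIRST of each nonterminal."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in GRAMMAR else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)                    # every symbol in the sequence can vanish
    return result

def compute_first():
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:                     # repeat until no set grows
        changed = False
        for head, bodies in GRAMMAR.items():
            for body in bodies:
                new = first_of(body, first)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first

def compute_follow(first):
    follow = {nt: set() for nt in GRAMMAR}
    follow[START].add("$")
    changed = True
    while changed:
        changed = False
        for head, bodies in GRAMMAR.items():
            for body in bodies:
                for i, sym in enumerate(body):
                    if sym not in GRAMMAR:
                        continue       # FOLLOW is defined for nonterminals only
                    rest = first_of(body[i + 1:], first)
                    new = rest - {EPS}
                    if EPS in rest:    # everything after sym can vanish
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow
```

Running `compute_first()` and then `compute_follow(...)` on this grammar reproduces the FIRST and FOLLOW sets listed on the predictive parsing slide for E, E’, T, T’ and F.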
2. Predictive Parsing
LL(1) grammar
❖ Covers most programming-language constructs
❖ Properties
Unambiguous
No left recursion
2. Predictive Parsing
Grammar:
E → T E’
E’ → + T E’ | ϵ
T → F T’
T’ → * F T’ | ϵ
F → ( E ) | id
FIRST ( E ) = { id , ( }     FOLLOW ( E ) = { $ , ) }
FIRST ( E’ ) = { + , ϵ }     FOLLOW ( E’ ) = { $ , ) }
FIRST ( T ) = { id , ( }     FOLLOW ( T ) = { $ , ) , + }
FIRST ( T’ ) = { * , ϵ }     FOLLOW ( T’ ) = { $ , ) , + }
FIRST ( F ) = { id , ( }     FOLLOW ( F ) = { $ , ) , + , * }
Parsing table (rows: nonterminals, columns: terminals, - marks an empty cell)
        id          +             *             (           )          $
E       E → TE’     -             -             E → TE’     -          -
E’      -           E’ → +TE’     -             -           E’ → ϵ     E’ → ϵ
T       T → FT’     -             -             T → FT’     -          -
T’      -           T’ → ϵ        T’ → *FT’     -           T’ → ϵ     T’ → ϵ
F       F → id      -             -             F → (E)     -          -
No cell contains more than one production, so the grammar is LL(1)
2. Predictive Parsing
(1) Parse the string id+id
STACK            INPUT         OUTPUT
$ E              id + id $
$ E’ T           id + id $     E → T E’
$ E’ T’ F        id + id $     T → F T’
$ E’ T’ id       id + id $     F → id
$ E’ T’          + id $
$ E’             + id $        T’ → ϵ
$ E’ T +         + id $        E’ → + T E’
$ E’ T           id $
$ E’ T’ F        id $          T → F T’
$ E’ T’ id       id $          F → id
$ E’ T’          $
$ E’             $             T’ → ϵ
$                $             E’ → ϵ
2. Predictive Parsing
(2) Parse the string (id+id)*id
STACK                    INPUT                  OUTPUT
$ E                      ( id + id ) * id $
$ E’ T                   ( id + id ) * id $     E → T E’
$ E’ T’ F                ( id + id ) * id $     T → F T’
$ E’ T’ ) E (            ( id + id ) * id $     F → ( E )
$ E’ T’ ) E              id + id ) * id $
$ E’ T’ ) E’ T           id + id ) * id $       E → T E’
$ E’ T’ ) E’ T’ F        id + id ) * id $       T → F T’
$ E’ T’ ) E’ T’ id       id + id ) * id $       F → id
$ E’ T’ ) E’ T’          + id ) * id $
$ E’ T’ ) E’             + id ) * id $          T’ → ϵ
$ E’ T’ ) E’ T +         + id ) * id $          E’ → + T E’
$ E’ T’ ) E’ T           id ) * id $
$ E’ T’ ) E’ T’ F        id ) * id $            T → F T’
$ E’ T’ ) E’ T’ id       id ) * id $            F → id
$ E’ T’ ) E’ T’          ) * id $
$ E’ T’ ) E’             ) * id $               T’ → ϵ
$ E’ T’ )                ) * id $               E’ → ϵ
$ E’ T’                  * id $
$ E’ T’ F *              * id $                 T’ → * F T’
$ E’ T’ F                id $
$ E’ T’ id               id $                   F → id
$ E’ T’                  $
$ E’                     $                      T’ → ϵ
$                        $                      E’ → ϵ
2. Predictive Parsing
Grammar:
be → be or bt | bt
bt → bt and bf | bf
bf → not bf | ( be ) | true | false
Remove left recursion
Grammar:
be → bt B’
B’ → or bt B’ | ϵ
bt → bf A’
A’ → and bf A’ | ϵ
bf → not bf | ( be ) | true | false
FIRST ( be ) = { not , ( , true , false }     FOLLOW ( be ) = { $ , ) }
FIRST ( B’ ) = { or , ϵ }                     FOLLOW ( B’ ) = { $ , ) }
FIRST ( bt ) = { not , ( , true , false }     FOLLOW ( bt ) = { $ , ) , or }
FIRST ( A’ ) = { and , ϵ }                    FOLLOW ( A’ ) = { $ , ) , or }
FIRST ( bf ) = { not , ( , true , false }     FOLLOW ( bf ) = { $ , ) , or , and }
Parsing table (rows: nonterminals, columns: terminals, - marks an empty cell)
      or               and               not             (               )          true            false           $
be    -                -                 be → bt B’      be → bt B’      -          be → bt B’      be → bt B’      -
B’    B’ → or bt B’    -                 -               -               B’ → ϵ     -               -               B’ → ϵ
bt    -                -                 bt → bf A’      bt → bf A’      -          bt → bf A’      bt → bf A’      -
A’    A’ → ϵ           A’ → and bf A’    -               -               A’ → ϵ     -               -               A’ → ϵ
bf    -                -                 bf → not bf     bf → ( be )     -          bf → true       bf → false      -
No cell contains more than one production, so the grammar is LL(1)
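The "remove left recursion" step used for be and bt follows one rewriting rule: A → A α | β becomes A → β A’ and A’ → α A’ | ϵ. The sketch below applies that rule to a single nonterminal; it handles immediate left recursion only, and the function name and argument layout are assumptions made for illustration.

```python
EPS = "ϵ"

def remove_immediate_left_recursion(head, bodies, new_name):
    """Rewrite A → A α | β as A → β A', A' → α A' | ϵ.

    `bodies` is a list of productions, each a list of symbols, and
    `new_name` is the name chosen for the new nonterminal A'.
    """
    recursive = [body[1:] for body in bodies if body and body[0] == head]
    others = [body for body in bodies if not body or body[0] != head]
    if not recursive:
        return {head: bodies}              # nothing to rewrite
    return {
        head: [body + [new_name] for body in others],
        new_name: [alpha + [new_name] for alpha in recursive] + [[EPS]],
    }

# be → be or bt | bt  becomes  be → bt B',  B' → or bt B' | ϵ
be_rules = remove_immediate_left_recursion(
    "be", [["be", "or", "bt"], ["bt"]], "B'")
```

Indirect left recursion (for example A → B α, B → A β) needs an additional substitution step that this sketch does not attempt.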
2. Predictive Parsing
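Grammar:
S → i E t S S’ | a
S’ → e S | ϵ
E → b
FIRST ( S ) = { i , a }      FOLLOW ( S ) = { e , $ }
FIRST ( S’ ) = { e , ϵ }     FOLLOW ( S’ ) = { e , $ }
FIRST ( E ) = { b }          FOLLOW ( E ) = { t }
Parsing table (rows: nonterminals, columns: terminals, - marks an empty cell)
      i                   t    a        e                    b        $
S     S → i E t S S’      -    S → a    -                    -        -
S’    -                   -    -        S’ → e S , S’ → ϵ    -        S’ → ϵ
E     -                   -    -        -                    E → b    -
Cell ( S’ , e ) contains two productions, so the grammar is not LL(1)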
2. Predictive Parsing
Grammar:
S → ( L ) | a
L → L , S | S
After removing left recursion:
L → S L’
L’ → , S L’ | ϵ
FIRST ( S ) = { ( , a }       FOLLOW ( S ) = { $ , , , ) }
FIRST ( L ) = { ( , a }       FOLLOW ( L ) = { ) }
FIRST ( L’ ) = { , , ϵ }      FOLLOW ( L’ ) = { ) }
No cell of the parsing table contains more than one production, so the grammar is LL(1)
2. Predictive Parsing
Grammar:
D → type list ;
list → list , id | id
type → int | float
(the space between type and list is treated as a terminal)
FIRST ( D ) = { int , float }        FOLLOW ( D ) = { $ }
FIRST ( list ) = { id }              FOLLOW ( list ) = { ; }
FIRST ( L’ ) = { , , ϵ }             FOLLOW ( L’ ) = { ; }
FIRST ( type ) = { int , float }     FOLLOW ( type ) = { ‘ ’ }
Bottom-Up Parsing
Construct the parse tree for the input string starting at the leaves (bottom) and
working up towards the root (top) (reduction)
Grammar ( G ) : E → E + E | E * E | - E | ( E ) | id
String : id + id + id
            E
          / | \
         E  +  E
       / | \   |
      E  +  E  id
      |     |
      id    id
Different Bottom-Up Parsing Techniques
1. Shift-Reduce Parsing
2. Operator Precedence Parsing
3. LR Parsing
1) Simple LR ( SLR or LR(0) )
2) Canonical LR ( CLR or LR(1) )
3) Lookahead LR ( LALR )
1. Shift-Reduce Parsing
Stack holds grammar symbols
Input buffer holds the string to be parsed
Handle always appears at the top of the stack
Use $ to mark bottom of the stack and also the right end of the input
Process:
❖ During a left-to-right scan of the input string, shift zero or more input symbols onto the
stack, until it is ready to reduce a string β
❖ Then reduce β to the head (LHS) of the appropriate production
❖ Repeat this cycle until an error is detected or until the stack contains the start symbol and the input is
empty
1. Shift-Reduce Parsing
There are 4 possible actions
❖ Shift
Shift the next input symbol onto the top of the stack
❖ Reduce
Replace handle with LHS in the stack
❖ Accept
Parsing completed successfully
❖ Error
Discover a syntax error and call an error recovery routine
1. Shift-Reduce Parsing
Grammar :
E → E + E | E * E | id
String:
id + id * id
Stack            Input             Action
$                id + id * id $    Shift
$ id             + id * id $       Reduce E → id
$ E              + id * id $       Shift
$ E +            id * id $         Shift
$ E + id         * id $            Reduce E → id
$ E + E          * id $            Shift
$ E + E *        id $              Shift
$ E + E * id     $                 Reduce E → id
$ E + E * E      $                 Reduce E → E * E
$ E + E          $                 Reduce E → E + E
$ E              $                 Accept
1. Shift-Reduce Parsing
Conflict during shift reduce parsing
2. Operator Precedence Parsing
Operator grammar
❖ The grammar has the property (among other essential requirements) that no
production right side is ϵ or has two adjacent nonterminals.
2. Operator Precedence Parsing
Define precedence relations between pairs of terminals
using three disjoint relation symbols ⋗ , ⋖ and ≐
2. Operator Precedence Parsing
How to parse string (using operator precedence table)
2. Operator Precedence Parsing
Grammar:
E → E + E | E * E | id
String:
id + id * id
Operator precedence table (rows: left side, columns: right side, . marks an empty cell)
      id   +    *    $
id    .    ⋗    ⋗    ⋗
+     ⋖    ⋗    ⋖    ⋗
*     ⋖    ⋗    ⋗    ⋗
$     ⋖    ⋖    ⋖    .
The left + has higher priority than the right +
Steps:
$ ⋖ id ⋗ + ⋖ id ⋗ * ⋖ id ⋗ $
(each id is replaced with E; now compare $ + id * id $)
$ ⋖ E + ⋖ id ⋗ * ⋖ id ⋗ $
$ ⋖ E + E ⋖ * ⋖ id ⋗ $
$ ⋖ E + E ⋖ * E ⋗ $
$ ⋖ E + E ⋗ $
$ E $
2. Operator Precedence Parsing
Algorithm : operator precedence parsing
❖ Method :
Initially the stack contains $ and the input buffer holds the string w$. To parse, we execute the
program below
set ip to point to the first symbol of w$;
repeat forever
    if $ is on top of the stack and ip points to $ then
        return
    else begin
        let a be the topmost terminal symbol on the stack and b be the symbol pointed to by ip
        if a ⋖ b or a ≐ b then begin
            push b onto the stack
            advance ip to the next input symbol
        end
        else if a ⋗ b then
            repeat
                pop the stack
            until the top stack terminal is related by ⋖ to the terminal most recently popped
        else
            error()
    end
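The parsing loop above can be sketched directly from the precedence table for E → E + E | E * E | id. This is a simplified illustration: like the algorithm, it keeps only terminals on the stack (nonterminals are not tracked), the table contains no ≐ entries, and the names `PREC` and `operator_precedence_parse` are assumptions made here.

```python
LT, GT = "<", ">"          # stand-ins for ⋖ and ⋗
# PREC[a][b] is the relation between a (on the stack) and b (in the input).
PREC = {
    "id": {"+": GT, "*": GT, "$": GT},
    "+": {"id": LT, "+": GT, "*": LT, "$": GT},
    "*": {"id": LT, "+": GT, "*": GT, "$": GT},
    "$": {"id": LT, "+": LT, "*": LT},
}

def operator_precedence_parse(tokens):
    """Return the handles (lists of terminals) in the order they are reduced."""
    tokens = tokens + ["$"]
    stack = ["$"]                       # terminals only
    pos = 0
    handles = []
    while True:
        a, b = stack[-1], tokens[pos]
        if a == "$" and b == "$":
            return handles
        rel = PREC[a].get(b)
        if rel == LT:                   # shift
            stack.append(b)
            pos += 1
        elif rel == GT:                 # reduce: pop until top ⋖ last popped
            handle = []
            while True:
                popped = stack.pop()
                handle.insert(0, popped)
                if PREC[stack[-1]].get(popped) == LT:
                    break
            handles.append(handle)
        else:
            raise SyntaxError(f"no precedence relation between {a} and {b}")
```

For `["id", "+", "id", "*", "id"]` the handles come out as the three ids, then `*`, then `+`, so the multiplication is reduced before the addition.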
2. Operator Precedence Parsing
Operator precedence function
❖ Precedence between “a” and “b” can be determined by numerical comparison of two
functions f and g
❖ f(a) = g(b) if a ≐ b
❖ f(a) < g(b) if a ⋖ b
❖ f(a) > g(b) if a ⋗ b
2. Operator Precedence Parsing
Algorithm : construct precedence functions
❖ Input :
An operator precedence matrix
❖ Output :
Precedence functions representing the input matrix, or an indication that none exist
❖ Method :
1. Create symbols “fa” and “ga” for each terminal “a”, including $
2. Partition the created symbols into as many groups as possible, in such a way that if a ≐ b then
“fa” and “gb” are in the same group
3. Create a directed graph whose nodes are the groups found in step 2
for any “a” and “b”
if a ⋖ b then place an edge from the group of “gb” to the group of “fa”
if a ⋗ b then place an edge from the group of “fa” to the group of “gb”
4. If the graph constructed in step 3 has a cycle then no precedence functions exist. If there is no
cycle then let f(a) be the length of the longest path beginning at the group of “fa” and g(a) be
the length of the longest path beginning at the group of “ga”
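The four steps can be sketched as a longest-path computation on the graph of f and g nodes. The sketch below uses the table for id, +, * and $; because that table has no ≐ entries, every symbol forms its own group. It assumes the graph is acyclic (a cyclic graph would make the recursion fail instead of reporting that no functions exist), and the names used are illustrative.

```python
from functools import lru_cache

TERMINALS = ["id", "+", "*", "$"]
LT, GT = "<", ">"          # stand-ins for ⋖ and ⋗
PREC = {
    "id": {"+": GT, "*": GT, "$": GT},
    "+": {"id": LT, "+": GT, "*": LT, "$": GT},
    "*": {"id": LT, "+": GT, "*": GT, "$": GT},
    "$": {"id": LT, "+": LT, "*": LT},
}

def precedence_functions():
    """Return (f, g) as dictionaries mapping each terminal to an integer."""
    # Step 1 and 2: one node per f_a and g_a (no ≐, so no merging is needed).
    edges = {(kind, t): set() for kind in "fg" for t in TERMINALS}
    # Step 3: a ⋖ b gives an edge g_b → f_a; a ⋗ b gives an edge f_a → g_b.
    for a, row in PREC.items():
        for b, rel in row.items():
            if rel == LT:
                edges[("g", b)].add(("f", a))
            else:
                edges[("f", a)].add(("g", b))

    # Step 4: length of the longest path starting at each node.
    @lru_cache(maxsize=None)
    def longest(node):
        return max((1 + longest(nxt) for nxt in edges[node]), default=0)

    f = {t: longest(("f", t)) for t in TERMINALS}
    g = {t: longest(("g", t)) for t in TERMINALS}
    return f, g
```

For this table the result is f = 4, 2, 4, 0 and g = 5, 1, 3, 0 for id, +, *, $ respectively, and every table entry is consistent with comparing f(a) against g(b).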
2. Operator Precedence Parsing
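Operator precedence table (rows: left side, used for f; columns: right side, used for g; . marks an empty cell)
      id   +    *    $
id    .    ⋗    ⋗    ⋗
+     ⋖    ⋗    ⋖    ⋗
*     ⋖    ⋗    ⋗    ⋗
$     ⋖    ⋖    ⋖    .
[Figure: directed graph with nodes gid , fid , f* , g* , g+ , f+ , f$ and g$]
Draw each edge from greater to less
e.g. f(+) > g(+) so there is an edge from f+ to g+
Find the longest path from each node to either f$ or g$
      id   +    *    $
f     4    2    4    0
g     5    1    3    0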
2. Operator Precedence Parsing
Parse string id + id * id
$   id   +   id   *   id   $
0   5    2   5    4   5    0
$   E    +   E    *   E    $
0        2        4        0
$   E    +   E    $
0        2        0
$   E    $
0        0
      id   +    *    $
f     4    2    4    0
g     5    1    3    0
2. Operator Precedence Parsing
Grammar:
E → E + E | E - E | E * E | E / E | E ^ E | ( E ) | id
Operator precedence table (rows: left side, columns: right side, . marks an empty cell)
      id   +    -    *    /    ^    (    )    $
id    .    ⋗    ⋗    ⋗    ⋗    ⋗    .    ⋗    ⋗
+     ⋖    ⋗    ⋗    ⋖    ⋖    ⋖    ⋖    ⋗    ⋗
-     ⋖    ⋗    ⋗    ⋖    ⋖    ⋖    ⋖    ⋗    ⋗
*     ⋖    ⋗    ⋗    ⋗    ⋗    ⋖    ⋖    ⋗    ⋗
/     ⋖    ⋗    ⋗    ⋗    ⋗    ⋖    ⋖    ⋗    ⋗
^     ⋖    ⋗    ⋗    ⋗    ⋗    ⋖    ⋖    ⋗    ⋗
(     ⋖    ⋖    ⋖    ⋖    ⋖    ⋖    ⋖    ≐    .
)     .    ⋗    ⋗    ⋗    ⋗    ⋗    .    ⋗    ⋗
$     ⋖    ⋖    ⋖    ⋖    ⋖    ⋖    ⋖    .    .
3. LR parsing
LR : L = left-to-right scan of the input , R = rightmost derivation (constructed in reverse)
3. LR parsing
3 types
❖ Simple LR (SLR)
Built from LR(0) item sets
❖ Canonical LR (CLR)
Built from LR(1) item sets
❖ Lookahead LR (LALR)
3. LR parsing : (1) SLR
Weakest of the three LR methods
Process
❖ Convert the grammar into an augmented grammar by adding a new start symbol S’ with the production S’ → S
❖ Construct LR(0) item sets
❖ Construct parsing table
3. LR parsing : (1) SLR
How to construct item set
❖ Closure function
If I is a set of items for a grammar G then closure(I) is constructed by two rules
1. Initially every item in I is in closure(I)
2. If A → α . B β is in closure(I) and B → γ is a production then B → . γ is added to closure(I);
repeat this rule until no new item can be added
❖ GOTO function
goto( I , x ) where “I” is an item set and “x” is a grammar symbol
goto( I , x ) = closure of the set of all items [ A → α x . β ] such that [ A → α . x β ] is in I
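The closure and goto definitions translate almost line by line into code. The sketch below applies them to the augmented expression grammar E’ → E, E → E + T | T, T → T * F | F, F → ( E ) | id; representing an item as a pair (production index, dot position) and the function names are choices made for this illustration.

```python
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")), ("E", ("T",)),
    ("T", ("T", "*", "F")), ("T", ("F",)),
    ("F", ("(", "E", ")")), ("F", ("id",)),
]
NONTERMINALS = {head for head, _ in GRAMMAR}

def closure(items):
    """Add B → .γ for every item with the dot in front of nonterminal B."""
    result = set(items)
    work = list(items)
    while work:
        prod, dot = work.pop()
        body = GRAMMAR[prod][1]
        if dot < len(body) and body[dot] in NONTERMINALS:
            for i, (head, _) in enumerate(GRAMMAR):
                if head == body[dot] and (i, 0) not in result:
                    result.add((i, 0))
                    work.append((i, 0))
    return frozenset(result)

def goto(items, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(p, d + 1) for p, d in items
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved)

def canonical_collection():
    """Collect all item sets reachable from closure({E' → .E})."""
    states = [closure({(0, 0)})]
    for state in states:               # the list grows while it is traversed
        symbols = {GRAMMAR[p][1][d] for p, d in state if d < len(GRAMMAR[p][1])}
        for sym in sorted(symbols):
            nxt = goto(state, sym)
            if nxt not in states:
                states.append(nxt)
    return states
```

The collection built this way contains twelve item sets, although they are discovered in a different order from the I0 to I11 numbering used on the slides.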
3. LR parsing : (1) SLR
How to construct parsing table
3. LR parsing : (1) SLR
Grammar:
E → E + T | T
T → T * F | F
F → ( E ) | id
Make augmented grammar (add the new start symbol E’)
Augmented Grammar:
E’ → E
E → E + T | T
T → T * F | F
F → ( E ) | id
3. LR parsing : (1) SLR
Construct LR(0) item sets
❖ E’ → .E has the dot before E (a nonterminal), so add the productions of E
❖ In the same way, when the dot is before T add the productions of T, and when it is before F add the productions of F
❖ After the dot there are 5 possibilities ( E , T , F , ( and id ), so prepare goto for all of them
❖ If the result is the same as a previous item set, give it the same name; if it is a new item set, give it a new name
I0 = E’ → .E
     E → .E + T
     E → .T
     T → .T * F
     T → .F
     F → .( E )
     F → .id
I1 = goto( I0 , E ) = E’ → E.
                      E → E. + T
I2 = goto( I0 , T ) = E → T.
                      T → T. * F
I3 = goto( I0 , F ) = T → F.
I4 = goto( I0 , ( ) = F → ( .E )
                      E → .E + T
                      E → .T
                      T → .T * F
                      T → .F
                      F → .( E )
                      F → .id
I5 = goto( I0 , id ) = F → id.
I6 = goto( I1 , + ) = E → E + .T
                      T → .T * F
                      T → .F
                      F → .( E )
                      F → .id
I7 = goto( I2 , * ) = T → T * .F
                      F → .( E )
                      F → .id
I8 = goto( I4 , E ) = F → ( E. )
                      E → E. + T
I2 = goto( I4 , T )
I3 = goto( I4 , F )
I4 = goto( I4 , ( )
I5 = goto( I4 , id )
3. LR parsing : (1) SLR
Construct LR(0) item sets
I9 = goto( I6 , T ) = E → E + T.
                      T → T. * F
I3 = goto( I6 , F )
I4 = goto( I6 , ( )
I5 = goto( I6 , id )
I10 = goto( I7 , F ) = T → T * F.
I4 = goto( I7 , ( )
I5 = goto( I7 , id )
I6 = goto( I8 , + )
I11 = goto( I8 , ) ) = F → ( E ).
I7 = goto( I9 , * )
[Figure: DFA of the LR(0) item sets I0 to I11, with one edge for each goto listed above]
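3. LR parsing : (1) SLR
Productions are numbered 1: E → E + T , 2: E → T , 3: T → T * F , 4: T → F , 5: F → ( E ) , 6: F → id
How the entries are filled:
❖ I5 = goto( I0 , id ), so in ( 0 , id ) enter S5 (shift 5)
❖ I1 = goto( I0 , E ), so in ( 0 , E ) enter 1
❖ I3 has the item T → F. (dot at the end); the production number of T → F is 4 and FOLLOW(T) = { + , * , ) , $ },
so in ( 3 , + ) ( 3 , * ) ( 3 , ) ) ( 3 , $ ) enter R4 (reduce 4)
Parsing table (- marks an empty cell)
Item set   action                               goto
           id    +     *     (     )     $      E    T    F
0          S5    -     -     S4    -     -      1    2    3
1          -     S6    -     -     -     Acc    -    -    -
2          -     R2    S7    -     R2    R2     -    -    -
3          -     R4    R4    -     R4    R4     -    -    -
4          S5    -     -     S4    -     -      8    2    3
5          -     R6    R6    -     R6    R6     -    -    -
6          S5    -     -     S4    -     -      -    9    3
7          S5    -     -     S4    -     -      -    -    10
8          -     S6    -     -     S11   -      -    -    -
9          -     R1    S7    -     R1    R1     -    -    -
10         -     R3    R3    -     R3    R3     -    -    -
11         -     R5    R5    -     R5    R5     -    -    -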
3. LR parsing : (1) SLR
Parse the string id * id + id
Stack                  Input              Action
0                      id * id + id $     Shift
0 id 5                 * id + id $        Reduce by F → id
0 F 3                  * id + id $        Reduce by T → F
0 T 2                  * id + id $        Shift
0 T 2 * 7              id + id $          Shift
0 T 2 * 7 id 5         + id $             Reduce by F → id
0 T 2 * 7 F 10         + id $             Reduce by T → T * F
0 T 2                  + id $             Reduce by E → T
0 E 1                  + id $             Shift
0 E 1 + 6              id $               Shift
0 E 1 + 6 id 5         $                  Reduce by F → id
0 E 1 + 6 F 3          $                  Reduce by T → F
0 E 1 + 6 T 9          $                  Reduce by E → E + T
0 E 1                  $                  Accept
How the first steps are read from the table:
❖ The entry for ( 0 , id ) is S5, so the action is shift: move one symbol from the input to the stack and place 5 after it
❖ The entry for ( 5 , * ) is R6, so the action is reduce with production 6 ( F → id ): find id on the stack and replace it with F, so the stack becomes 0 F
❖ The goto entry for ( 0 , F ) is 3, so place 3 on the stack
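The trace for id * id + id can be reproduced by a short driver that reads the SLR table. The sketch below hardcodes the action and goto entries for the expression grammar; it stacks only state numbers (the grammar symbols are implied by the states), and the names `ACTION`, `GOTO`, `PRODUCTIONS` and `slr_parse` are assumptions made for illustration.

```python
# Production number → (head, length of body), numbered as in the table.
PRODUCTIONS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3),
               4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}
ACTION = {
    0: {"id": "S5", "(": "S4"},
    1: {"+": "S6", "$": "Acc"},
    2: {"+": "R2", "*": "S7", ")": "R2", "$": "R2"},
    3: {"+": "R4", "*": "R4", ")": "R4", "$": "R4"},
    4: {"id": "S5", "(": "S4"},
    5: {"+": "R6", "*": "R6", ")": "R6", "$": "R6"},
    6: {"id": "S5", "(": "S4"},
    7: {"id": "S5", "(": "S4"},
    8: {"+": "S6", ")": "S11"},
    9: {"+": "R1", "*": "S7", ")": "R1", "$": "R1"},
    10: {"+": "R3", "*": "R3", ")": "R3", "$": "R3"},
    11: {"+": "R5", "*": "R5", ")": "R5", "$": "R5"},
}
GOTO = {0: {"E": 1, "T": 2, "F": 3}, 4: {"E": 8, "T": 2, "F": 3},
        6: {"T": 9, "F": 3}, 7: {"F": 10}}

def slr_parse(tokens):
    """Return the production numbers used for reductions, in order."""
    tokens = tokens + ["$"]
    states = [0]
    pos = 0
    reductions = []
    while True:
        action = ACTION[states[-1]].get(tokens[pos])
        if action is None:
            raise SyntaxError(f"unexpected {tokens[pos]} in state {states[-1]}")
        if action == "Acc":
            return reductions
        number = int(action[1:])
        if action[0] == "S":           # shift: push the new state, consume input
            states.append(number)
            pos += 1
        else:                          # reduce: pop |body| states, then take goto
            head, length = PRODUCTIONS[number]
            del states[len(states) - length:]
            states.append(GOTO[states[-1]][head])
            reductions.append(number)
```

`slr_parse(["id", "*", "id", "+", "id"])` returns `[6, 4, 6, 3, 2, 6, 4, 1]`, which is the sequence of reductions in the Action column of the trace.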
3. LR parsing : (2) CLR
Process
❖ Convert the grammar into an augmented grammar by adding a new start symbol S’ with the production S’ → S
❖ Construct LR(1) item sets
❖ Construct parsing table
3. LR parsing : (2) CLR
How to construct item set
❖ Closure function
If I is a set of items for a grammar G then closure(I) is constructed by two rules
1. Initially every item in I is in closure(I)
2. If [ A → α . B β , a ] is in closure(I) and B → γ is a production
then [ B → . γ , b ] is added to closure(I) for every terminal b in FIRST(βa); repeat this rule until no new item can be added
❖ GOTO function
goto( I , x ) where “I” is an item set and “x” is a grammar symbol
goto( I , x ) = closure of the set of all items [ A → α x . β , a ] such that [ A → α . x β , a ] is in I
3. LR parsing : (2) CLR
How to construct parsing table
3. LR parsing : (2) CLR
Grammar:
S → C C
C → c C | d
Make augmented grammar
Augmented Grammar:
S’ → S
S → C C
C → c C | d
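3. LR parsing : (2) CLR
❖ [ S’ → .S , $ ] is compared with [ A → α . B β , a ]
here β is ϵ and a is $ , so FIRST(βa) = { $ }
$ is added as the lookahead of S → .CC
[Figure: DFA of the LR(1) item sets I0 to I9]
From I0 : S → I1 , C → I2 , c → I3 , d → I4
From I2 : C → I5 , c → I6 , d → I7
From I3 : C → I8 , c → I3 , d → I4
From I6 : C → I9 , c → I6 , d → I7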
3. LR parsing : (2) CLR
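Productions are numbered 1: S → C C , 2: C → c C , 3: C → d
How the entries are filled:
❖ I3 = goto( I0 , c ), so enter shift 3 (S3) in ( 0 , c )
❖ I2 = goto( I0 , C ), so enter 2 in ( 0 , C )
❖ I4 contains the item C → d. , c/d with the dot at the end, so a reduce entry is needed;
C → d has production number 3 and the lookaheads are c and d, so enter R3 in ( 4 , c ) and ( 4 , d )
Parsing table (- marks an empty cell)
Item set   action               goto
           c     d     $        S    C
0          S3    S4    -        1    2
1          -     -     Acc      -    -
2          S6    S7    -        -    5
3          S3    S4    -        -    8
4          R3    R3    -        -    -
5          -     -     R1       -    -
6          S6    S7    -        -    9
7          -     -     R3       -    -
8          R2    R2    -        -    -
9          -     -     R2       -    -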
3. LR parsing : (3) LALR
Process
❖ Convert the grammar into an augmented grammar by adding a new start symbol S’ with the production S’ → S
❖ Construct LR(1) item sets
❖ Combine the item sets that have the same core but different lookaheads
❖ Construct parsing table
3. LR parsing : (3) LALR
Grammar:
S → C C
C → c C | d
Make augmented grammar
Augmented Grammar:
S’ → S
S → C C
C → c C | d
3. LR parsing : (3) LALR
Construct LR(1) item sets
I0 = S’ → .S , $
     S → .CC , $
     C → .cC , c/d
     C → .d , c/d
I1 = S’ → S. , $
I2 = S → C.C , $
     C → .cC , $
     C → .d , $
I3 = C → c.C , c/d
     C → .cC , c/d
     C → .d , c/d
I4 = C → d. , c/d
I5 = S → CC. , $
I6 = C → c.C , $
     C → .cC , $
     C → .d , $
I7 = C → d. , $
I8 = C → cC. , c/d
I9 = C → cC. , $
Combine the item sets having the same core
I36 = C → c.C , c/d/$
      C → .cC , c/d/$
      C → .d , c/d/$
I47 = C → d. , c/d/$
I89 = C → cC. , c/d/$
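The LR(1) item sets for S’ → S, S → C C, C → c C | d and the merging of sets with the same core can be sketched as follows. The sketch relies on this grammar having no ϵ-productions, so FIRST(βa) is simply FIRST of the first symbol of βa; the hardcoded `FIRST` table, the item representation (production index, dot position, lookahead) and the function names are assumptions made for illustration.

```python
GRAMMAR = [("S'", ("S",)), ("S", ("C", "C")), ("C", ("c", "C")), ("C", ("d",))]
NONTERMINALS = {"S'", "S", "C"}
# FIRST of single symbols; enough here because no symbol derives ϵ.
FIRST = {"S": {"c", "d"}, "C": {"c", "d"}, "c": {"c"}, "d": {"d"}, "$": {"$"}}

def closure(items):
    """For [A → α.Bβ, a] add [B → .γ, b] for every b in FIRST(βa)."""
    result = set(items)
    work = list(items)
    while work:
        prod, dot, look = work.pop()
        body = GRAMMAR[prod][1]
        if dot < len(body) and body[dot] in NONTERMINALS:
            after = body[dot + 1] if dot + 1 < len(body) else look
            for i, (head, _) in enumerate(GRAMMAR):
                if head != body[dot]:
                    continue
                for b in FIRST[after]:
                    if (i, 0, b) not in result:
                        result.add((i, 0, b))
                        work.append((i, 0, b))
    return frozenset(result)

def goto(items, symbol):
    return closure({(p, d + 1, a) for p, d, a in items
                    if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol})

def lr1_states():
    states = [closure({(0, 0, "$")})]
    for state in states:               # the list grows while it is traversed
        for sym in ("S", "C", "c", "d"):
            nxt = goto(state, sym)
            if nxt and nxt not in states:
                states.append(nxt)
    return states

def lalr_cores(states):
    """Group LR(1) states by core (their items with the lookaheads removed)."""
    merged = {}
    for state in states:
        core = frozenset((p, d) for p, d, _ in state)
        merged.setdefault(core, set()).update(state)
    return merged
```

On this grammar `lr1_states()` yields ten item sets and `lalr_cores` reduces them to seven, matching the merges I36, I47 and I89 above; the order in which the sets are discovered differs from the slide numbering.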
Using Ambiguous Grammar
Every ambiguous grammar fails to be LR
Using Ambiguous Grammar
Ambiguous grammar
E → E + E | E * E | ( E ) | id
Equivalent unambiguous grammar
E → E + T | T
T → T * F | F
F → ( E ) | id
Using Ambiguous Grammar
Reasons why one might want to use an ambiguous grammar instead of an unambiguous one
1. The associativities and precedence levels of the operators can be changed easily without
disturbing the productions of the ambiguous grammar or the number of states in the
resulting parser.
Syntax Analyzer Generator YACC
[Figure: y.tab.c → C Compiler → a.out]
Structure of Yacc Program
declarations
%%
translation rules
%%
supporting C routines