Chp3 Syntax Analysis
Syntax Analysis
Introduction to parsers
Syntax trees
Context-free grammars
Push-down automata
Top-down parsing
Bison - a parser generator
Bottom-up parsing
Introduction to Parsers
[Figure: the parser and the symbol table]
Syntax Trees
Syntax tree for id1 := id2 + id3 * 60:
      :=
     /  \
  id1    +
        /  \
     id2    *
           /  \
        id3    60
Context-Free Grammars (CFG)
Each grammar symbol denotes a set of strings:
id    : { id },
‘+’   : { + },
‘-’   : { - },
‘*’   : { * },
‘/’   : { / },
‘(’   : { ( },
‘)’   : { ) },
op    : { +, -, *, / },
expr  : { id, - id, ( id ), id + id, id - id, … }.
Derivations
Grammar:
1. expr → expr op expr
2. expr → ‘(’ expr ‘)’
3. expr → ‘-’ expr
4. expr → id
5. op → ‘+’
6. op → ‘-’
7. op → ‘*’
8. op → ‘/’
Derivation:
expr ⇒ - expr
     ⇒ - ( expr )
     ⇒ - ( expr op expr )
     ⇒ - ( id op expr )
     ⇒ - ( id + expr )
     ⇒ - ( id + id )
Left- & Right-Most Derivations
id + id
Ambiguous Grammars
id + id * id
[Figure: two distinct parse trees, each rooted at expr, for id + id * id]
Transform Ambiguous Grammars
[Figure: push-down automaton with its input ending in $ and $ at the bottom of its stack]
End-Of-File and Bottom-of-Stack Markers
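[Figure: run of a push-down automaton on the input a a b b $, passing through states 1 2 2 3 3 4, with the stack contents above the bottom marker $ shown at each step]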
CFG versus RE
Predictive Parsing
S → if E then S else S
  | begin L end
  | print E
L → S ; L
  | ε
E → num = num
[Figure: parse tree built with S → begin L end, L → S ; L, S → print E, E → num = num]
Choosing the Alternative Case
S → if E then S else S
  | begin L end
  | print E
L → S ; L        FIRST(S ; L) = {if, begin, print}
  | ε            FOLLOW(L) = {end}
E → num = num
An Example
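/* Token codes returned by the scanner; an enum so they can be case labels in C. */
enum {
    IF = 1, THEN = 2, ELSE = 3, BEGIN = 4,
    END = 5, PRINT = 6, SEMI = 7, NUM = 8,
    EQ = 9
};
int yylex(void);    /* scanner, defined elsewhere */
void error(void);   /* syntax error handler, defined elsewhere */
int token;          /* lookahead; set token = yylex() before calling S() */
/* Consume the lookahead if it is t, otherwise report an error. */
void match(int t)
{
    if (token == t) token = yylex(); else error();
}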
An Example
void S() {
    switch (token) {
    case IF:    match(IF); E(); match(THEN); S();
                match(ELSE); S(); break;
    case BEGIN: match(BEGIN); L();
                match(END); break;
    case PRINT: match(PRINT); E(); break;
    default:    error();
    }
}
An Example
void L() {
    switch (token) {
    case END: break;    /* L → ε, chosen because end is in FOLLOW(L) */
    case IF: case BEGIN: case PRINT:
        S(); match(SEMI); L(); break;
    default: error();
    }
}
An Example
void E() {
    switch (token) {
    case NUM:
        match(NUM); match(EQ); match(NUM);
        break;
    default: error();
    }
}
First and Follow Sets
FOLLOW(S) = { $, else, ; }
FOLLOW(L) = { end }
FOLLOW(E) = { then, $, else, ; }
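The FOLLOW sets above can be reproduced mechanically. The following is a minimal sketch, assuming a fixed-point iteration over the statement grammar with sets stored as bitmasks; the token encoding, the array names first, follow and nullable, and the function compute_sets are hypothetical names chosen for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical encoding: terminals first, then the nonterminals S, L, E. */
enum { T_IF, T_THEN, T_ELSE, T_BEGIN, T_END, T_PRINT, T_SEMI, T_NUM, T_EQ, T_EOF,
       N_S, N_L, N_E };
#define NTERM 10
#define NNONT 3
#define BIT(t) (1u << (t))

struct prod { int lhs; int rhs[7]; };   /* rhs is terminated by -1 */

static const struct prod G[] = {
    { N_S, { T_IF, N_E, T_THEN, N_S, T_ELSE, N_S, -1 } },
    { N_S, { T_BEGIN, N_L, T_END, -1 } },
    { N_S, { T_PRINT, N_E, -1 } },
    { N_L, { N_S, T_SEMI, N_L, -1 } },
    { N_L, { -1 } },                              /* L -> epsilon */
    { N_E, { T_NUM, T_EQ, T_NUM, -1 } },
};
#define NPROD (sizeof G / sizeof G[0])

unsigned first[NNONT], follow[NNONT];   /* bitmasks over terminals */
int nullable[NNONT];

/* FIRST of the symbol string s; *eps is set to 1 if s can derive epsilon. */
static unsigned first_of(const int *s, int *eps)
{
    unsigned f = 0;
    for (; *s != -1; s++) {
        if (*s < NTERM) { *eps = 0; return f | BIT(*s); }
        f |= first[*s - N_S];
        if (!nullable[*s - N_S]) { *eps = 0; return f; }
    }
    *eps = 1;
    return f;
}

/* Iterate until no FIRST, FOLLOW, or nullable entry changes. */
void compute_sets(void)
{
    int changed = 1;
    for (int i = 0; i < NNONT; i++) { first[i] = follow[i] = 0; nullable[i] = 0; }
    follow[0] = BIT(T_EOF);             /* $ follows the start symbol S */
    while (changed) {
        changed = 0;
        for (size_t k = 0; k < NPROD; k++) {
            int eps;
            assert(G[k].lhs >= N_S);    /* left-hand sides are nonterminals */
            int a = G[k].lhs - N_S;
            unsigned f = first_of(G[k].rhs, &eps);
            if ((first[a] | f) != first[a]) { first[a] |= f; changed = 1; }
            if (eps && !nullable[a]) { nullable[a] = 1; changed = 1; }
            for (const int *s = G[k].rhs; *s != -1; s++) {
                if (*s < NTERM) continue;
                int b = *s - N_S;
                unsigned g = first_of(s + 1, &eps);   /* FIRST of what follows */
                if (eps) g |= follow[a];              /* rest can vanish */
                if ((follow[b] | g) != follow[b]) { follow[b] |= g; changed = 1; }
            }
        }
    }
}
```

After calling compute_sets(), follow[0], follow[1] and follow[2] hold the sets for S, L and E, and nullable[1] records that L derives ε.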
Table-Driven Predictive Parsing
         S                         L            E
if       S → if E then S else S    L → S ; L
then
else
begin    S → begin L end           L → S ; L
end                                L → ε
print    S → print E               L → S ; L
num                                             E → num = num
;
$
An Example
Stack                     Input
$ S                       begin print num = num ; end $
$ end L begin             begin print num = num ; end $
$ end L                   print num = num ; end $
$ end L ; S               print num = num ; end $
$ end L ; E print         print num = num ; end $
$ end L ; E               num = num ; end $
$ end L ; num = num       num = num ; end $
$ end L ;                 ; end $
$ end L                   end $
$ end                     end $
$                         $
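The trace above follows a simple stack discipline that can be sketched in C. This is a minimal sketch, assuming the token encoding, the rhs array, and the functions table and ll1_parse shown here, all of which are hypothetical names; the parsing table is written as a switch rather than a two-dimensional array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical encoding: terminals first, then the nonterminals S, L, E. */
enum { T_IF, T_THEN, T_ELSE, T_BEGIN, T_END, T_PRINT, T_SEMI, T_NUM, T_EQ, T_EOF,
       N_S, N_L, N_E };
#define NTERM (T_EOF + 1)
#define STACK_MAX 128

/* Right-hand sides of the productions, each terminated by -1. */
static const int rhs[][7] = {
    { T_IF, N_E, T_THEN, N_S, T_ELSE, N_S, -1 },  /* 0: S -> if E then S else S */
    { T_BEGIN, N_L, T_END, -1 },                  /* 1: S -> begin L end */
    { T_PRINT, N_E, -1 },                         /* 2: S -> print E */
    { N_S, T_SEMI, N_L, -1 },                     /* 3: L -> S ; L */
    { -1 },                                       /* 4: L -> epsilon */
    { T_NUM, T_EQ, T_NUM, -1 },                   /* 5: E -> num = num */
};

/* The predictive parsing table: production to expand, or -1 for an error entry. */
static int table(int nonterm, int tok)
{
    switch (nonterm) {
    case N_S:
        if (tok == T_IF) return 0;
        if (tok == T_BEGIN) return 1;
        if (tok == T_PRINT) return 2;
        break;
    case N_L:
        if (tok == T_IF || tok == T_BEGIN || tok == T_PRINT) return 3;
        if (tok == T_END) return 4;
        break;
    case N_E:
        if (tok == T_NUM) return 5;
        break;
    }
    return -1;
}

/* Parse a token array that ends with T_EOF. Returns 1 on accept, 0 on error. */
int ll1_parse(const int *input)
{
    int stack[STACK_MAX];
    int sp = 0;
    size_t pos = 0;
    stack[sp++] = T_EOF;                /* bottom-of-stack marker $ */
    stack[sp++] = N_S;                  /* start symbol */
    while (sp > 0) {
        int top = stack[--sp];
        int tok = input[pos];
        if (top < NTERM) {              /* terminal on top: must match the input */
            if (top != tok) return 0;
            if (top == T_EOF) return 1; /* $ matched $: accept */
            pos++;
        } else {                        /* nonterminal: expand using the table */
            int p = table(top, tok);
            int n = 0;
            if (p < 0) return 0;
            while (rhs[p][n] != -1) n++;
            assert(sp + n <= STACK_MAX);
            while (n > 0) stack[sp++] = rhs[p][--n];  /* push in reverse order */
        }
    }
    return 0;
}
```

Calling ll1_parse on the tokens of begin print num = num ; end followed by T_EOF goes through the same stack configurations as the trace, except that the sketch matches num = num one token at a time.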
LL(1) Grammars
S  → i E t S S' | a
S' → e S | ε
E  → b
      a        b        e            i                 t    $
S     S → a                          S → i E t S S'
S'                      S' → ε                              S' → ε
                        S' → e S
E              E → b
AR
A A|
RR|
A A R R
A R R
A
A
Direct Left Recursion
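A → A α1 | A α2 | ... | A αm | β1 | β2 | ... | βn
is replaced by
A  → β1 A' | β2 A' | ... | βn A'
A' → α1 A' | α2 A' | ... | αm A' | ε
Left-recursive grammar:
E → E + T | T
T → T * F | F
F → ( E ) | id
After eliminating left recursion:
E  → T E'
E' → + T E' | ε
T  → F T'
T' → * F T' | ε
F  → ( E ) | id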
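Once left recursion is removed, the grammar can be parsed by recursive descent without looping. The following is a minimal sketch, assuming single-character tokens where the letter i stands for id; the function recognize and the helper names are hypothetical and only illustrate the transformed grammar.

```c
#include <assert.h>
#include <stddef.h>

static const char *p;   /* cursor into the input string */
static int ok;          /* cleared on the first syntax error */

static void E(void);

/* Consume c if it is the next character, otherwise record an error. */
static void match(char c) { if (*p == c) p++; else ok = 0; }

/* F -> ( E ) | id */
static void F(void)
{
    if (*p == '(') { match('('); E(); match(')'); }
    else match('i');
}

/* T' -> * F T' | epsilon */
static void Tprime(void)
{
    if (ok && *p == '*') { match('*'); F(); Tprime(); }
}

/* T -> F T' */
static void T(void) { F(); Tprime(); }

/* E' -> + T E' | epsilon */
static void Eprime(void)
{
    if (ok && *p == '+') { match('+'); T(); Eprime(); }
}

/* E -> T E' */
static void E(void) { T(); Eprime(); }

/* Returns 1 if s is derivable from E, 0 otherwise. */
int recognize(const char *s)
{
    assert(s != NULL);
    p = s;
    ok = 1;
    E();
    return ok && *p == '\0';
}
```

For example, recognize("i+i*i") succeeds, while recognize("i+") fails because T cannot start at the end of the input. With the original left-recursive rule E → E + T, the function E would call itself before consuming any input.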
Indirect Left Recursion
S → A a | b
A → A c | S d | ε
S ⇒ A a ⇒ S d a
A → A c | A a d | b d | ε
S  → A a | b
A  → b d A' | A'
A' → c A' | a d A' | ε
Left factoring
S → i E t S | i E t S e S | a
E → b
S  → i E t S S' | a
S' → e S | ε
E  → b
Bottom-Up Parsing
1. S' → S
2. S → if E then S else S
3. S → begin L end
4. S → print E
5. L → ε
6. L → S ; L
7. E → num = num
An Example
Stack                      Input                            Action
$                          begin print num = num ; end $    shift
$ begin                    print num = num ; end $          shift
$ begin print              num = num ; end $                shift
$ begin print num          = num ; end $                    shift
$ begin print num =        num ; end $                      shift
$ begin print num = num    ; end $                          reduce
$ begin print E            ; end $                          reduce
$ begin S                  ; end $                          shift
$ begin S ;                end $                            reduce
$ begin S ; L              end $                            reduce
$ begin L                  end $                            shift
$ begin L end              $                                reduce
$ S                        $                                accept
LL(k) versus LR(k)
1. S' → S
2. S → if E then S else S
3. S → begin L end
4. S → print E
5. L → ε
6. L → S ; L
7. E → num = num
An Example
Stack                                Input                            Action
$ 1                                  begin print num = num ; end $    s4
$ 1 begin 4                          print num = num ; end $          s5
$ 1 begin 4 print 5                  num = num ; end $                s7
$ 1 begin 4 print 5 num 7            = num ; end $                    s12
$ 1 begin 4 print 5 num 7 = 12       num ; end $                      s16
$ 1 begin 4 print 5 num 7 = 12 num 16   ; end $                       r7
$ 1 begin 4 print 5 E 10             ; end $                          r4
$ 1 begin 4 S 9                      ; end $                          s14
$ 1 begin 4 S 9 ; 14                 end $                            r5
$ 1 begin 4 S 9 ; 14 L 17            end $                            r6
$ 1 begin 4 L 8                      end $                            s13
$ 1 begin 4 L 8 end 13               $                                r3
$ 1 S 2                              $                                a
LR Parsing Driver
a = gettoken();
while (true) {
    s = top();
    if (action[s, a] == shift s') { push(a); push(s'); a = gettoken(); }
    else if (action[s, a] == reduce A → β) {
        pop 2 * |β| symbols off the stack;
        s' = goto[top(), A]; push(A); push(s'); }
    else if (action[s, a] == accept) { return; }
    else { error(); }
}
Bison – A Parser Generator
%{
C declarations
%}
Bison declarations
%%
Grammar rules
%%
Additional C code
An Example
%token DIGIT
%start line
%%
line: expr '\n'
    ;
expr: expr '+' term
    | term
    ;
term: term '*' factor
    | factor
    ;
factor: '(' expr ')'
    | DIGIT
    ;
An Example - expr.y
%token NEWLINE
%token ADD
%token MUL
%token LP
%token RP
%token DIGIT
%start line
%%
line: expr NEWLINE
    ;
expr: expr ADD term
    | term
    ;
term: term MUL factor
    | factor
    ;
factor: LP expr RP
    | DIGIT
    ;
An Example - expr.tab.h
%{
#include <stdio.h>
%}
%token NUMBER
%left '+' '-'
%left '*' '/'
%right UMINUS
%%
An Example
void yyerror(const char *s)
{
    /* yylineno is the scanner's current line number */
    fprintf(stderr, "%s: line %d\n", s, yylineno);
}
LR Parsing Table Generation
1. E' → E
2. E → E + T
3. E → T
4. T → T * F
5. T → F
6. F → ( E )
7. F → id
An Example
[Figure: nondeterministic automaton whose 20 states are the LR(0) items of the grammar, from E' → • E (state 1) to E' → E • (state 7), with transitions labeled by the grammar symbols E, T, F, +, *, (, ), id and by ε]
From NPDA to DPDA
closure(I) =
    repeat
        for any item A → α • X β in I
            for any production X → γ
                I = I ∪ { X → • γ }
    until I does not change
    return I
An Example
1. E' → E
2. E → E + T
3. E → T
4. T → T * F
5. T → F
6. F → ( E )
7. F → id
s1 = E' → • E
I1 = closure({s1}) = {
    E' → • E,
    E → • E + T,
    E → • T,
    T → • T * F,
    T → • F,
    F → • ( E ),
    F → • id }
The Goto Function
goto(I, X) =
    set J to the empty set
    for any item A → α • X β in I
        add A → α X • β to J
    return closure(J)
An Example
I1 = {E' → • E,
      E → • E + T, E → • T,
      T → • T * F, T → • F,
      F → • ( E ), F → • id}
goto(I1, E)
    = closure({E' → E •, E → E • + T})
    = {E' → E •, E → E • + T}
The Subset Construction Function
subset-construction(cfg) =
    initialize T to {closure({S' → • S})}
    repeat
        for each state I in T and each symbol X
            let J be goto(I, X)
            if J is not empty and not in T then
                T = T ∪ {J}
    until T does not change
    return T
An Example
I1 : {E' → • E, E → • E + T, E → • T, T → • T * F,
      T → • F, F → • ( E ), F → • id}
goto(I1, E) = I2 : {E' → E •, E → E • + T}
goto(I1, T) = I3 : {E → T •, T → T • * F}
goto(I1, F) = I4 : {T → F •}
goto(I1, '(') = I5 : {F → ( • E ), E → • E + T, E → • T,
                      T → • T * F, T → • F, F → • ( E ), F → • id}
goto(I1, id) = I6 : {F → id •}
goto(I2, '+') = I7 : {E → E + • T, T → • T * F, T → • F,
                      F → • ( E ), F → • id}
An Example
goto(I5, E) = I9 : {F → ( E • ), E → E • + T}
goto(I5, T) = I3
goto(I5, F) = I4
goto(I5, '(') = I5
goto(I5, id) = I6
goto(I7, T) = I10 : {E → E + T •, T → T • * F}
goto(I7, F) = I4
goto(I7, '(') = I5
goto(I7, id) = I6
An Example
goto(I8, F) = I11 : {T → T * F •}
goto(I8, '(') = I5
goto(I8, id) = I6
goto(I10, '*') = I8
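The closure, goto, and subset-construction functions can be sketched in C for the expression grammar. This is a minimal sketch, assuming each item set is stored as a 32-bit mask with one bit per (production, dot position) pair; the ITEM macro, the symbol encoding, and the function names are hypothetical. Instead of repeating until nothing changes, the construction walks the growing array of states, which visits the same (state, symbol) pairs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical encoding of grammar symbols. */
enum { PLUS, STAR, LP, RP, ID, NT_E, NT_T, NT_F, NT_EP, NSYM };

struct prod { int lhs, len, rhs[3]; };
static const struct prod G[7] = {
    { NT_EP, 1, { NT_E } },              /* 1. E' -> E     */
    { NT_E,  3, { NT_E, PLUS, NT_T } },  /* 2. E  -> E + T */
    { NT_E,  1, { NT_T } },              /* 3. E  -> T     */
    { NT_T,  3, { NT_T, STAR, NT_F } },  /* 4. T  -> T * F */
    { NT_T,  1, { NT_F } },              /* 5. T  -> F     */
    { NT_F,  3, { LP, NT_E, RP } },      /* 6. F  -> ( E ) */
    { NT_F,  1, { ID } },                /* 7. F  -> id    */
};

/* Bit for the item "production p (0-based) with the dot before position d". */
#define ITEM(p, d) (UINT32_C(1) << ((p) * 4 + (d)))

uint32_t closure(uint32_t I)
{
    int changed = 1;
    while (changed) {
        changed = 0;
        for (int p = 0; p < 7; p++)
            for (int d = 0; d < G[p].len; d++) {
                if (!(I & ITEM(p, d))) continue;
                int X = G[p].rhs[d];             /* symbol after the dot */
                for (int q = 0; q < 7; q++)
                    if (G[q].lhs == X && !(I & ITEM(q, 0))) {
                        I |= ITEM(q, 0);         /* add X -> . gamma */
                        changed = 1;
                    }
            }
    }
    return I;
}

uint32_t go_to(uint32_t I, int X)
{
    uint32_t J = 0;
    for (int p = 0; p < 7; p++)
        for (int d = 0; d < G[p].len; d++)
            if ((I & ITEM(p, d)) && G[p].rhs[d] == X)
                J |= ITEM(p, d + 1);             /* move the dot over X */
    return closure(J);
}

/* Fills states[] with the item sets and returns how many there are. */
int subset_construction(uint32_t states[], int max)
{
    int n = 0;
    assert(max > 0);
    states[n++] = closure(ITEM(0, 0));           /* closure({E' -> . E}) */
    for (int i = 0; i < n; i++)
        for (int X = 0; X < NSYM; X++) {
            uint32_t J = go_to(states[i], X);
            int known = (J == 0);                /* empty sets are skipped */
            for (int k = 0; k < n && !known; k++)
                known = (states[k] == J);
            if (!known) { assert(n < max); states[n++] = J; }
        }
    return n;
}
```

With this encoding, go_to(closure(ITEM(0, 0)), NT_E) yields the two items of I2. The order in which states are discovered depends on the symbol order, so the numbering may differ from I1, I2, ... in the example.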
An Example
[Figure: transition diagram of the deterministic automaton over the LR(0) item sets, states 1 to 12, with edges labeled E, T, F, +, *, (, ), id]
SLR(1) Parsing Table Generation
SLR(cfg) =
    for each state I in subset-construction(cfg)
        if A → α • a β in I and goto(I, a) = J for a terminal a then
            action[I, a] = "shift J"
        if A → α • in I and A ≠ S' then
            action[I, a] = "reduce A → α" for all a in Follow(A)
        if S' → S • in I then action[I, $] = "accept"
        if A → α • X β in I and goto(I, X) = J for a nonterminal X
            then goto[I, X] = J
    all other entries in action and goto are made error
An Example
State   +    *    (    )    id   $    E    T    F
1                 s5        s6        g2   g3   g4
2       s7                       a
3       r3   s8        r3        r3
4       r5   r5        r5        r5
5                 s5        s6        g9   g3   g4
6       r7   r7        r7        r7
7                 s5        s6             g10  g4
8                 s5        s6                  g11
9       s7             s12
10      r2   s8        r2        r2
11      r4   r4        r4        r4
12      r6   r6        r6        r6
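The table can be plugged into a small driver to check it on concrete inputs. The following is a minimal sketch, assuming the integer encoding shown in the comments (positive for shift, negative for reduce, ACC for accept, 0 for error); the names action, go_to, lhs, len, and lr_parse are hypothetical. Unlike the driver that pushes both symbols and states, this variant keeps only states on the stack, so a reduction pops |β| entries instead of 2 * |β|.

```c
#include <assert.h>

/* Column order of the table: + * ( ) id $ for actions, E T F for gotos. */
enum { PLUS, STAR, LP, RP, ID, END_ };
enum { NT_E, NT_T, NT_F };
#define ACC 100          /* hypothetical code for "accept" */
#define STACK_MAX 256

/* s > 0: shift to state s; r < 0: reduce by production -r; 0: error. */
static const int action[13][6] = {
    /*          +    *    (    )   id    $  */
    /*  0 */ {  0,   0,   0,   0,   0,   0 },   /* unused, states start at 1 */
    /*  1 */ {  0,   0,   5,   0,   6,   0 },
    /*  2 */ {  7,   0,   0,   0,   0, ACC },
    /*  3 */ { -3,   8,   0,  -3,   0,  -3 },
    /*  4 */ { -5,  -5,   0,  -5,   0,  -5 },
    /*  5 */ {  0,   0,   5,   0,   6,   0 },
    /*  6 */ { -7,  -7,   0,  -7,   0,  -7 },
    /*  7 */ {  0,   0,   5,   0,   6,   0 },
    /*  8 */ {  0,   0,   5,   0,   6,   0 },
    /*  9 */ {  7,   0,   0,  12,   0,   0 },
    /* 10 */ { -2,   8,   0,  -2,   0,  -2 },
    /* 11 */ { -4,  -4,   0,  -4,   0,  -4 },
    /* 12 */ { -6,  -6,   0,  -6,   0,  -6 },
};

static const int go_to[13][3] = {
    /*          E    T    F */
    /*  0 */ {  0,   0,   0 },
    /*  1 */ {  2,   3,   4 },
    /*  2 */ {  0,   0,   0 },
    /*  3 */ {  0,   0,   0 },
    /*  4 */ {  0,   0,   0 },
    /*  5 */ {  9,   3,   4 },
    /*  6 */ {  0,   0,   0 },
    /*  7 */ {  0,  10,   4 },
    /*  8 */ {  0,   0,  11 },
    /*  9 */ {  0,   0,   0 },
    /* 10 */ {  0,   0,   0 },
    /* 11 */ {  0,   0,   0 },
    /* 12 */ {  0,   0,   0 },
};

/* Left-hand side and right-hand-side length of productions 1..7. */
static const int lhs[8] = { 0, 0, NT_E, NT_E, NT_T, NT_T, NT_F, NT_F };
static const int len[8] = { 0, 1, 3, 1, 3, 1, 3, 1 };

/* Parse a token array that ends with END_. Returns 1 on accept, 0 on error. */
int lr_parse(const int *input)
{
    int stack[STACK_MAX];
    int sp = 0;
    stack[sp++] = 1;                          /* initial state */
    for (;;) {
        int s = stack[sp - 1];
        int act = action[s][*input];
        if (act == ACC) return 1;
        if (act > 0) {                        /* shift: push state, advance */
            assert(sp < STACK_MAX);
            stack[sp++] = act;
            input++;
        } else if (act < 0) {                 /* reduce by production -act */
            int p = -act;
            sp -= len[p];                     /* pop one state per rhs symbol */
            stack[sp] = go_to[stack[sp - 1]][lhs[p]];
            sp++;
        } else {
            return 0;                         /* error entry */
        }
    }
}
```

For the tokens of id + id * id followed by END_, the driver shifts and reduces until state 2 sees $ and accepts. Note that the lookahead only advances on a shift, never on a reduce.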
LR(1) Items
closure(I) =
    repeat
        for any item (A → α • X β, a) in I
            for any production X → γ
                for any b ∈ First(β a)
                    I = I ∪ { (X → • γ, b) }
    until I does not change
    return I
An Example
1. S' → S
2. S → C C
3. C → c C
4. C → d
I1 = closure({(S' → • S, $)}) =
    {(S' → • S, $),
     (S → • C C, $),                       First($) = {$}
     (C → • c C, c), (C → • c C, d),       First(C$) = {c, d}
     (C → • d, c), (C → • d, d)}
The Goto Function
goto(I, X) =
    set J to the empty set
    for any item (A → α • X β, a) in I
        add (A → α X • β, a) to J
    return closure(J)
An Example
goto(I1, C)
    = closure({(S → C • C, $)})
    = {(S → C • C, $), (C → • c C, $), (C → • d, $)}
The Subset Construction Function
subset-construction(cfg) =
    initialize T to {closure({(S' → • S, $)})}
    repeat
        for each state I in T and each symbol X
            let J be goto(I, X)
            if J is not empty and not in T then
                T = T ∪ {J}
    until T does not change
    return T
An Example
1. S' → S
2. S → C C
3. C → c C
4. C → d
An Example
I1: closure({(S' → • S, $)}) =
    (S' → • S, $)
    (S → • C C, $)
    (C → • c C, c/d)
    (C → • d, c/d)
I2: goto(I1, S) =
    (S' → S •, $)
I4: goto(I1, c) =
    (C → c • C, c/d)
    (C → • c C, c/d)
    (C → • d, c/d)
I5: goto(I1, d) =
    (C → d •, c/d)
State   c    d    $    S    C
1       s4   s5        g2   g3
2                 a
3       s7   s8             g6
4       s4   s5             g9
5       r4   r4
6                 r2
7       s7   s8             g10
8                 r4
9       r3   r3
10                r3
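The LR(1) closure and goto functions can be sketched the same way as for LR(0), with a lookahead added to every item. This is a minimal sketch for the grammar S → C C, C → c C | d, assuming item sets stored as 64-bit masks; the ITEM macro, first_sym, and the other names are hypothetical. It also assumes that no symbol derives ε, so First(β a) is simply the FIRST set of the first symbol of β, or {a} when β is empty.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical encoding: the terminals c, d, $ come first and double as lookaheads. */
enum { T_C, T_D, T_END, NT_S, NT_C, NT_SP, NSYM };
#define NLA 3

struct prod { int lhs, len, rhs[2]; };
static const struct prod G[4] = {
    { NT_SP, 1, { NT_S } },          /* 1. S' -> S   */
    { NT_S,  2, { NT_C, NT_C } },    /* 2. S  -> C C */
    { NT_C,  2, { T_C, NT_C } },     /* 3. C  -> c C */
    { NT_C,  1, { T_D } },           /* 4. C  -> d   */
};

/* FIRST of one symbol as a bitmask over {c, d, $}; no symbol is nullable here. */
static unsigned first_sym(int X)
{
    if (X < NLA) return 1u << X;
    return (1u << T_C) | (1u << T_D);     /* First(S) = First(C) = {c, d} */
}

/* Bit for the item (production p with the dot before position d, lookahead a). */
#define ITEM(p, d, a) (UINT64_C(1) << (((p) * 3 + (d)) * NLA + (a)))

uint64_t closure(uint64_t I)
{
    int changed = 1;
    while (changed) {
        changed = 0;
        for (int p = 0; p < 4; p++)
        for (int d = 0; d < G[p].len; d++)
        for (int a = 0; a < NLA; a++) {
            if (!(I & ITEM(p, d, a))) continue;
            int X = G[p].rhs[d];                       /* symbol after the dot */
            /* First(beta a): beta is what follows X in the production */
            unsigned la = (d + 1 < G[p].len) ? first_sym(G[p].rhs[d + 1])
                                             : (1u << a);
            for (int q = 0; q < 4; q++) {
                if (G[q].lhs != X) continue;
                for (int b = 0; b < NLA; b++)
                    if (((la >> b) & 1u) && !(I & ITEM(q, 0, b))) {
                        I |= ITEM(q, 0, b);            /* add (X -> . gamma, b) */
                        changed = 1;
                    }
            }
        }
    }
    return I;
}

uint64_t go_to(uint64_t I, int X)
{
    uint64_t J = 0;
    for (int p = 0; p < 4; p++)
        for (int d = 0; d < G[p].len; d++)
            for (int a = 0; a < NLA; a++)
                if ((I & ITEM(p, d, a)) && G[p].rhs[d] == X)
                    J |= ITEM(p, d + 1, a);            /* lookahead is kept */
    return closure(J);
}

/* Fills states[] with the LR(1) item sets and returns how many there are. */
int subset_construction(uint64_t states[], int max)
{
    int n = 0;
    assert(max > 0);
    states[n++] = closure(ITEM(0, 0, T_END));          /* (S' -> . S, $) */
    for (int i = 0; i < n; i++)
        for (int X = 0; X < NSYM; X++) {
            uint64_t J = go_to(states[i], X);
            int known = (J == 0);
            for (int k = 0; k < n && !known; k++)
                known = (states[k] == J);
            if (!known) { assert(n < max); states[n++] = J; }
        }
    return n;
}
```

For this grammar the construction yields 10 item sets, matching the 10 rows of the table, although the discovery order (and hence the numbering) may differ.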
An Example
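[Figure: transition diagram of the LR(1) automaton, states 1 to 10, with edges labeled S, C, c, d; reductions: state 2 on $ by r1, state 6 on $ by r2, state 10 on $ by r3, state 8 on $ by r4, state 9 on c/d by r3, state 5 on c/d by r4]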
The Core of LR(1) Items
State   c     d     $     S     C
1       s47   s58         g2    g3
2                   a
3       s47   s58               g6
47      s47   s58               g910
58      r4    r4    r4
6                   r2
910     r3    r3    r3
Shift/Reduce Conflicts
Stack                               Input
$ - - - if expr then stmt           else - - - $
$ - - - procid ( id                 , id ) - - - $
LR Grammars
[Figure: containment of grammar classes: SLR(1) ⊂ LALR(1) ⊂ LR(1) ⊂ LR(k), and LL(1) ⊂ LL(k) ⊂ LR(k)]