Unit-II CD

Unit-II

Syntax Analysis

Syllabus
• Role of Parser – Grammars – Error Handling – Context-free grammars – Writing a grammar – Top Down Parsing – General Strategies – Recursive Descent Parser – Predictive Parser – LL(1) Parser – Shift Reduce Parser – LR Parser – LR(0) Item – Construction of SLR Parsing Table – Introduction to LALR Parser – Error Handling and Recovery in Syntax Analyzer – YACC.

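The role of the Parser

[Figure: position of the parser in the compiler. Source program → Lexical Analyzer → (token) → Parser → Rest of front end → Intermediate representation. The parser asks the lexical analyzer for input with "get next token". Other labels: Lexical error, Symbol Table.]
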
• The syntax analyzer creates the syntactic structure of the given source program.
• This syntactic structure is usually a parse tree.
• The syntax analyzer is also known as the parser.
• The syntax of a programming language is described by a context-free grammar (CFG).
• The syntax analyzer (parser) checks whether a given source program satisfies the rules implied by a context-free grammar.
• If it does, the parser creates the parse tree of that program.
• Otherwise, the parser reports error messages.
Parsers fall into two groups:

1. Top-Down Parser
• the parse tree is created top to bottom, starting from the root.
2. Bottom-Up Parser
• the parse tree is created bottom to top, starting from the leaves.

• Both top-down and bottom-up parsers scan the input from left to right (one symbol at a time).
• Efficient top-down and bottom-up parsers can be implemented only for sub-classes of context-free grammars:
  • LL for top-down parsing
  • LR for bottom-up parsing
Context-Free Grammars
• Inherently recursive structures of a programming language are defined by a context-free grammar
G = (V, T, P, S)
• A context-free grammar consists of:
  • A finite set of terminals T (in our case, this will be the set of tokens)
  • A finite set of non-terminals V (syntactic variables)
  • A finite set of production rules P of the following form:
    A → α where A is a non-terminal and α is a string of terminals and non-terminals (including the empty string)
  • A start symbol S (one of the non-terminal symbols)

• Example:
E → E+E | E-E | E*E | E/E | -E
E → (E)
E → id
Derivations
E ⇒ E+E

• E+E derives from E
• we can replace E by E+E
• to be able to do this, we have to have a production rule E → E+E in our grammar.

E ⇒ E+E ⇒ id+E ⇒ id+id

• A sequence of replacements of non-terminal symbols is called a derivation of id+id from E.

• In general a derivation step is

αAβ ⇒ αγβ if there is a production rule A → γ in our grammar,
where α and β are arbitrary strings of terminal and non-terminal symbols

α1 ⇒ α2 ⇒ ... ⇒ αn (α1 derives αn)

⇒ : derives in one step
⇒* : derives in zero or more steps
⇒+ : derives in one or more steps
Derivation Example
E ⇒ -E ⇒ -(E) ⇒ -(E+E) ⇒ -(id+E) ⇒ -(id+id)
OR
E ⇒ -E ⇒ -(E) ⇒ -(E+E) ⇒ -(E+id) ⇒ -(id+id)

• At each derivation step, we can choose any of the non-terminals in the sentential form of G for the replacement.

• If we always choose the left-most non-terminal in each derivation step, the derivation is called a left-most derivation.

• If we always choose the right-most non-terminal in each derivation step, the derivation is called a right-most derivation.
Left-Most and Right-Most Derivations
Left-Most Derivation
E ⇒lm -E ⇒lm -(E) ⇒lm -(E+E) ⇒lm -(id+E) ⇒lm -(id+id)

Right-Most Derivation
E ⇒rm -E ⇒rm -(E) ⇒rm -(E+E) ⇒rm -(E+id) ⇒rm -(id+id)

• We will see that top-down parsers try to find the left-most derivation of the given source program.

• We will see that bottom-up parsers try to find the right-most derivation of the given source program in reverse order.
Parse Tree
• Inner nodes of a parse tree are non-terminal symbols.
• The leaves of a parse tree are terminal symbols.

• A parse tree can be seen as a graphical representation of a derivation.

[Figure: the parse tree for -(id+id) grown step by step alongside the derivation E ⇒ -E ⇒ -(E) ⇒ -(E+E) ⇒ -(id+E) ⇒ -(id+id); each derivation step adds the children of the replaced non-terminal.]

Ambiguity
• A grammar that produces more than one parse tree for a sentence is called an ambiguous grammar.
• unambiguous grammar ⇒ unique selection of the parse tree for a sentence

Two derivations of id+id*id, each with a different parse tree:

E ⇒ E+E ⇒ id+E ⇒ id+E*E ⇒ id+id*E ⇒ id+id*id

        E
      / | \
     E  +  E
     |   / | \
    id  E  *  E
        |     |
       id    id

E ⇒ E*E ⇒ E+E*E ⇒ id+E*E ⇒ id+id*E ⇒ id+id*id

          E
        / | \
       E  *  E
     / | \   |
    E  +  E  id
    |     |
   id    id

Notational Conventions

Elimination of Left Recursion
• A grammar is left-recursive if it has a non-terminal A such that there is a derivation
A ⇒+ Aα for some string α

• Top-down parsing techniques cannot handle left-recursive grammars.
• So, we have to convert our left-recursive grammar into an equivalent grammar which is not left-recursive.
• The left-recursion may appear in a single step of the derivation (immediate left-recursion), or may appear in more than one step of the derivation.

Immediate Left-Recursion
A → Aα | β where β does not start with A

⇓ eliminate immediate left recursion

A → βA’
A’ → αA’ | ε (an equivalent grammar)

In general,

A → Aα1 | ... | Aαm | β1 | ... | βn where β1 ... βn do not start with A

⇓ eliminate immediate left recursion

A → β1A’ | ... | βnA’
A’ → α1A’ | ... | αmA’ | ε (an equivalent grammar)

Immediate Left-Recursion -- Example
E → E+T | T
T → T*F | F
F → id | (E)

⇓ eliminate immediate left recursion

E → T E’
E’ → +T E’ | ε
T → F T’
T’ → *F T’ | ε
F → id | (E)

Left-Recursion -- Problem
• A grammar may not be immediately left-recursive and still be left-recursive.
• Eliminating only the immediate left-recursion does not guarantee a grammar that is free of left-recursion.

S → Aa | b
A → Sc | d
This grammar is not immediately left-recursive, but it is still left-recursive:

S ⇒ Aa ⇒ Sca or
A ⇒ Sc ⇒ Aac causes a left-recursion

• So, we have to eliminate all left-recursions from our grammar.

Eliminate Left-Recursion – Example
S → Aa | b
A → Ac | Sd | f

- Order of non-terminals: A, S

for A:
- we do not enter the inner loop.
- Eliminate the immediate left-recursion in A
A → SdA’ | fA’
A’ → cA’ | ε

for S:
- Replace S → Aa with S → SdA’a | fA’a
So, we will have S → SdA’a | fA’a | b
- Eliminate the immediate left-recursion in S
S → fA’aS’ | bS’
S’ → dA’aS’ | ε

So, the resulting equivalent grammar, which is not left-recursive, is:

S → fA’aS’ | bS’
S’ → dA’aS’ | ε
A → SdA’ | fA’
A’ → cA’ | ε

Eliminate Left-Recursion -- Algorithm
- Arrange non-terminals in some order: A1 ... An
- for i from 1 to n do {
    - for j from 1 to i-1 do {
        replace each production
            Ai → Aj γ
        by
            Ai → α1 γ | ... | αk γ
        where Aj → α1 | ... | αk
      }
    - eliminate immediate left-recursions among Ai productions
  }
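The algorithm above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the grammar representation (a dict from non-terminal to a list of symbol lists, with the empty list standing for ε) and the function names are assumptions made for this example, and the new non-terminal is simply named by appending a prime.

```python
def eliminate_immediate(nt, prods):
    """Rewrite A -> A a1 | ... | b1 | ... as A -> b A', A' -> a A' | epsilon."""
    rec = [p[1:] for p in prods if p and p[0] == nt]    # the alpha parts
    rest = [p for p in prods if not p or p[0] != nt]    # the beta parts
    if not rec:
        return {nt: prods}                              # nothing to do
    new = nt + "'"                                      # hypothetical naming scheme
    return {
        nt: [beta + [new] for beta in rest],
        new: [alpha + [new] for alpha in rec] + [[]],   # [] stands for epsilon
    }

def eliminate_left_recursion(grammar, order):
    g = {nt: [list(p) for p in ps] for nt, ps in grammar.items()}
    for i, ai in enumerate(order):
        for aj in order[:i]:
            # Replace Ai -> Aj gamma by Ai -> delta gamma for each Aj -> delta.
            expanded = []
            for p in g[ai]:
                if p and p[0] == aj:
                    expanded += [delta + p[1:] for delta in g[aj]]
                else:
                    expanded.append(p)
            g[ai] = expanded
        g.update(eliminate_immediate(ai, g[ai]))
    return g

# The example grammar S -> Aa | b, A -> Ac | Sd | f with order A, S.
grammar = {"S": [["A", "a"], ["b"]], "A": [["A", "c"], ["S", "d"], ["f"]]}
result = eliminate_left_recursion(grammar, ["A", "S"])
for head, bodies in result.items():
    print(head, "->", " | ".join(" ".join(b) or "ε" for b in bodies))
```

With the order A, S this reproduces the grammar derived by hand in the example: S → fA’aS’ | bS’, S’ → dA’aS’ | ε, A → SdA’ | fA’, A’ → cA’ | ε.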
Left Factoring
• A predictive parser insists that the grammar must be left-factored.
• Given a non-terminal A, represent its rules as:
A → αβ1 | αβ2 | … | γ
• α is the longest matching prefix of several A productions
• γ stands for the other productions, which do not begin with α
• The common prefix α should be factored out to achieve predictive parsing
• Rewrite the production rules:
A → αA’ | γ
A’ → β1 | β2 | …

Left Factoring-Example1
S → iEtS | iEtSeS | a
E → b
(i stands for “if”; t stands for “then”; and e stands for “else”)
• Left factored, this grammar becomes:
S → iEtSS’ | a
S’ → eS | ε
E → b

Left-Factoring – Example2
A → abB | aB | cdg | cdeB | cdfB
⇓
A → aA’ | cdg | cdeB | cdfB
A’ → bB | B
⇓
A → aA’ | cdA’’
A’ → bB | B
A’’ → g | eB | fB
Left-Factoring – Example3
A → ad | a | ab | abc | b
⇓
A → aA’ | b
A’ → d | ε | b | bc
⇓
A → aA’ | b
A’ → d | ε | bA’’
A’’ → ε | c
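The left-factoring steps of these examples can be sketched in Python. This is an illustrative sketch under stated assumptions: productions are lists of single-character symbols, the empty list stands for ε, and the helper names (common_prefix, left_factor) and the priming scheme for new non-terminals are hypothetical choices for this example.

```python
from collections import defaultdict

def common_prefix(seqs):
    """Longest prefix shared by every sequence in seqs."""
    prefix = []
    for symbols in zip(*seqs):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return prefix

def left_factor(grammar):
    g = {nt: [list(p) for p in ps] for nt, ps in grammar.items()}
    work = list(g)
    while work:
        nt = work.pop()
        # Group the alternatives of nt by their first symbol.
        groups = defaultdict(list)
        for p in g[nt]:
            groups[p[0] if p else None].append(p)
        new_prods = []
        for first, group in groups.items():
            if first is None or len(group) == 1:
                new_prods += group              # nothing to factor
                continue
            alpha = common_prefix(group)
            new = nt + "'"
            while new in g:                     # pick an unused primed name
                new += "'"
            g[new] = [p[len(alpha):] for p in group]
            work.append(new)                    # the new rules may need factoring too
            new_prods.append(alpha + [new])
        g[nt] = new_prods
    return g

# Example3: A -> ad | a | ab | abc | b
grammar = {"A": [list(s) for s in ["ad", "a", "ab", "abc", "b"]]}
factored = left_factor(grammar)
for head, bodies in factored.items():
    print(head, "->", " | ".join("".join(b) or "ε" for b in bodies))
```

On Example3 the sketch yields A → aA’ | b, A’ → d | ε | bA’’, A’’ → ε | c, the same result as the manual factoring.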
Top-Down Parsing
• The parse tree is created top to bottom.
• Top-down parsers:
  • Recursive-Descent Parsing
    • Backtracking is needed (if a choice of a production rule does not work, we backtrack to try other alternatives).
    • It is a general parsing technique, but not widely used.
    • Not efficient.
  • Predictive Parsing
    • No backtracking.
    • Efficient.
    • Needs a special form of grammars (LL(1) grammars).
    • Recursive Predictive Parsing is a special form of Recursive-Descent parsing without backtracking.
    • Non-Recursive (Table-Driven) Predictive Parser is also known as LL(1) parser.
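A recursive predictive parser can be sketched with one procedure per non-terminal. The sketch below assumes the non-left-recursive expression grammar E → TE’, E’ → +TE’ | ε, T → FT’, T’ → *FT’ | ε, F → (E) | id, and assumes the input has already been tokenized into a list of strings; the class and method names are illustrative only.

```python
class ParseError(Exception):
    pass

class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]   # "$" marks the end of the input
        self.pos = 0

    def look(self):
        return self.tokens[self.pos]

    def match(self, t):
        if self.look() != t:
            raise ParseError(f"expected {t!r}, found {self.look()!r}")
        self.pos += 1

    def E(self):                 # E -> T E'
        self.T()
        self.E_()

    def E_(self):                # E' -> + T E' | epsilon
        if self.look() == "+":
            self.match("+")
            self.T()
            self.E_()

    def T(self):                 # T -> F T'
        self.F()
        self.T_()

    def T_(self):                # T' -> * F T' | epsilon
        if self.look() == "*":
            self.match("*")
            self.F()
            self.T_()

    def F(self):                 # F -> ( E ) | id
        if self.look() == "(":
            self.match("(")
            self.E()
            self.match(")")
        else:
            self.match("id")

def accepts(tokens):
    """True if the token list is a sentence of the expression grammar."""
    parser = Parser(tokens)
    try:
        parser.E()
        parser.match("$")
        return True
    except ParseError:
        return False

print(accepts(["id", "+", "id", "*", "id"]))
print(accepts(["id", "+"]))
```

Each procedure chooses its production by looking only at the current token, so no backtracking is needed.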
Construction of a Predictive Parsing Table

1. Eliminate left recursion
2. Left factor the grammar
3. Compute FIRST()
4. Compute FOLLOW()
5. Construct the parsing table
6. Parse the input string

FIRST & FOLLOW
• Computing First:
  • If X is a terminal, then First(X) is {X}
  • If X → ε is a production, then add ε to First(X)
  • If X is a non-terminal and X → Y1 Y2 … Yk is a production, then place a in First(X) if for some i, a is in First(Yi) and ε is in all of First(Y1) … First(Yi-1). If ε is in all of First(Y1) … First(Yk), then add ε to First(X).

• Computing Follow:
  • Place $ in Follow(S), where S is the start symbol and $ is the input right end marker.
  • If there is a production A → αBβ, then everything in First(β) except for ε is placed in Follow(B).
  • If there is a production A → αB, or a production A → αBβ where First(β) contains ε, then everything in Follow(A) is in Follow(B).

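The rules above amount to a fixed-point computation: keep applying them until no set grows. The following Python sketch assumes the expression grammar E → TE’, E’ → +TE’ | ε, T → FT’, T’ → *FT’ | ε, F → (E) | id, encoded as a dict of symbol lists with the empty list standing for ε; the function names are choices made for this illustration.

```python
EPS = "ε"
GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}

def first_of(seq, first):
    """FIRST of a string of symbols, given the FIRST sets of the non-terminals."""
    out = set()
    for sym in seq:
        f = first.get(sym, {sym})      # FIRST of a terminal is the terminal itself
        out |= f - {EPS}
        if EPS not in f:
            return out
    out.add(EPS)                       # every symbol can derive epsilon
    return out

def compute_first(g):
    first = {nt: set() for nt in g}
    changed = True
    while changed:                     # repeat until no set grows
        changed = False
        for nt, prods in g.items():
            for p in prods:
                new = first_of(p, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first

def compute_follow(g, first, start):
    follow = {nt: set() for nt in g}
    follow[start].add("$")             # rule 1: $ is in FOLLOW(start)
    changed = True
    while changed:
        changed = False
        for a, prods in g.items():
            for p in prods:
                for i, b in enumerate(p):
                    if b not in g:
                        continue       # FOLLOW is defined for non-terminals only
                    tail = first_of(p[i + 1:], first)
                    new = tail - {EPS}             # rule 2
                    if EPS in tail:
                        new |= follow[a]           # rule 3
                    if not new <= follow[b]:
                        follow[b] |= new
                        changed = True
    return follow

FIRST = compute_first(GRAMMAR)
FOLLOW = compute_follow(GRAMMAR, FIRST, "E")
for nt in GRAMMAR:
    print(nt, sorted(FIRST[nt]), sorted(FOLLOW[nt]))
```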
Example

Grammar:
E → TE’
E’ → +TE’ | ε
T → FT’
T’ → *FT’ | ε
F → (E) | id

FIRST                      FOLLOW
FIRST(F) = { (, id }       FOLLOW(E) = { $, ) }
FIRST(T’) = { *, ε }       FOLLOW(E’) = { $, ) }
FIRST(T) = { (, id }       FOLLOW(T) = { +, ), $ }
FIRST(E’) = { +, ε }       FOLLOW(T’) = { +, ), $ }
FIRST(E) = { (, id }       FOLLOW(F) = { +, *, ), $ }
Non-recursive Predictive Parser

[Figure: model of a table-driven predictive parser. An input buffer (a + b $), a stack (X Y Z $, with X on top), the predictive parsing program, the parsing table M, and an output stream.]
Constructing the Parsing Table
• Algorithm for constructing a predictive parsing table:
1. For each production A → α of the grammar, do steps 2 and 3
2. For each terminal a in First(α), add A → α to M[A, a]
3. If ε is in First(α), add A → α to M[A, b] for each terminal b in Follow(A). If ε is in First(α) and $ is in Follow(A), add A → α to M[A, $].
4. Make each undefined entry of M be an error.

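The table-construction algorithm can be sketched as below. To keep the example short, the FIRST and FOLLOW sets of the expression grammar are written in by hand rather than computed; the data layout (a dict keyed by (non-terminal, terminal) pairs) and the function names are assumptions made for this illustration.

```python
EPS = "ε"
PRODUCTIONS = [
    ("E", ["T", "E'"]),
    ("E'", ["+", "T", "E'"]),
    ("E'", []),                        # E' -> epsilon
    ("T", ["F", "T'"]),
    ("T'", ["*", "F", "T'"]),
    ("T'", []),                        # T' -> epsilon
    ("F", ["(", "E", ")"]),
    ("F", ["id"]),
]
FIRST = {"E": {"(", "id"}, "E'": {"+", EPS}, "T": {"(", "id"},
         "T'": {"*", EPS}, "F": {"(", "id"}}
FOLLOW = {"E": {"$", ")"}, "E'": {"$", ")"}, "T": {"+", ")", "$"},
          "T'": {"+", ")", "$"}, "F": {"+", "*", ")", "$"}}

def first_of(seq):
    """FIRST of a string of grammar symbols."""
    out = set()
    for sym in seq:
        f = FIRST.get(sym, {sym})      # FIRST of a terminal is the terminal itself
        out |= f - {EPS}
        if EPS not in f:
            return out
    return out | {EPS}

def build_table():
    table = {}
    for head, body in PRODUCTIONS:
        f = first_of(body)
        targets = f - {EPS}                        # step 2
        if EPS in f:
            targets = targets | FOLLOW[head]       # step 3 ("$" included if in FOLLOW)
        for a in targets:
            if (head, a) in table:
                raise ValueError(f"conflict at M[{head}, {a}]: grammar is not LL(1)")
            table[(head, a)] = body
    return table                                   # missing keys are error entries (step 4)

M = build_table()
for (head, a), body in sorted(M.items()):
    print(f"M[{head}, {a}] = {head} -> {' '.join(body) or EPS}")
```

A multiply-defined entry means the grammar is not LL(1), which is why the sketch raises an error instead of overwriting.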
Parsing Table (for the expression grammar and the FIRST and FOLLOW sets of the example)

      id         +           *           (          )         $
E     E → TE’                            E → TE’
E’               E’ → +TE’                          E’ → ε    E’ → ε
T     T → FT’                            T → FT’
T’               T’ → ε      T’ → *FT’              T’ → ε    T’ → ε
F     F → id                             F → (E)

Parsing of Input String id+id*id

Stack       Input        Output
$E          id+id*id$
$E’T        id+id*id$    E → TE’
$E’T’F      id+id*id$    T → FT’
$E’T’id     id+id*id$    F → id
$E’T’       +id*id$
$E’         +id*id$      T’ → ε
$E’T+       +id*id$      E’ → +TE’
$E’T        id*id$
$E’T’F      id*id$       T → FT’
$E’T’id     id*id$       F → id
$E’T’       *id$
$E’T’F*     *id$         T’ → *FT’
$E’T’F      id$
$E’T’id     id$          F → id
$E’T’       $
$E’         $            T’ → ε
$           $            E’ → ε
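The stack moves in the trace for id+id*id can be reproduced by a small table-driven driver. The sketch below hard-codes the LL(1) table of the expression grammar; the dict layout and the function name ll1_parse are assumptions made for this illustration.

```python
TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "+"): [], ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [], ("T'", "$"): [],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}

def ll1_parse(tokens, start="E"):
    """Return the productions used, in order (a left-most derivation)."""
    stack = ["$", start]               # top of the stack is the end of the list
    tokens = list(tokens) + ["$"]
    pos = 0
    output = []
    while stack:
        top = stack.pop()
        a = tokens[pos]
        if top in NONTERMINALS:
            body = TABLE.get((top, a))
            if body is None:           # empty table entry
                raise SyntaxError(f"no entry M[{top}, {a}]")
            output.append((top, body))
            stack.extend(reversed(body))   # push the body, left-most symbol on top
        elif top == a:
            pos += 1                   # terminal (or $) matches the input
        else:
            raise SyntaxError(f"expected {top!r}, found {a!r}")
    return output

derivation = ll1_parse(["id", "+", "id", "*", "id"])
for head, body in derivation:
    print(head, "->", " ".join(body) or "ε")
```

The printed productions appear in the same order as the Output column of the trace.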
Bottom Up Parsing
• A bottom-up parser creates the parse tree of the given input starting
from leaves towards the root.
• A bottom-up parser tries to find the right-most derivation of the given
input in the reverse order.
• Bottom-up parsing is also known as shift-reduce parsing because its
two main actions are shift and reduce.

• Two methods of shift-reduce parsing:
  • Operator Precedence Parsing
  • LR Parsing (Left-to-right scan, Rightmost derivation in reverse)
    • SLR
    • Canonical LR
    • LALR

Shift-Reduce Parsing
Grammar:
S → aABe
A → Abc | b
B → d

Reducing a sentence (each reduced substring matches a production’s right-hand side):
abbcde
aAbcde
aAde
aABe
S

Shift-reduce corresponds to a rightmost derivation:
S ⇒rm aABe ⇒rm aAde ⇒rm aAbcde ⇒rm abbcde

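Handles
• A handle is a substring of grammar symbols in a right-sentential form that matches the right-hand side of a production, and whose reduction to that production’s left-hand side is one step of a rightmost derivation in reverse.

Grammar:
S → aABe
A → Abc | b
B → d

Right-sentential forms and their handles:
abbcde    handle: b (the first b, reduced by A → b)
aAbcde    handle: Abc
aAde      handle: d
aABe      handle: aABe
S

Handle Pruning
• A rightmost derivation in reverse can be obtained by handle pruning.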
A Stack Implementation of A Shift-Reduce Parser
• A shift-reduce parser has four possible actions:
1. Shift: The next input symbol is shifted onto the top of the stack.
2. Reduce: Replace the handle on the top of the stack by the non-terminal.
3. Accept: Successful completion of parsing.
4. Error: The parser discovers a syntax error and calls an error recovery routine.

• Initially, the stack contains only the end-marker $.
• The end of the input string is marked by the end-marker $.

Stack Implementation of Shift-Reduce Parsing

Grammar:
E → E+E
E → E*E
E → (E)
E → id

Find handles to reduce:

Stack       Input        Action
$           id+id*id$    shift
$id         +id*id$      reduce E → id
$E          +id*id$      shift
$E+         id*id$       shift
$E+id       *id$         reduce E → id
$E+E        *id$         shift
$E+E*       id$          shift
$E+E*id     $            reduce E → id
$E+E*E      $            reduce E → E*E
$E+E        $            reduce E → E+E
$E          $            accept
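Conflicts during Shift-Reduce Parsing
• Shift/reduce conflict: the parser cannot decide whether to shift the next input symbol or to reduce the handle on top of the stack.
• Reduce/reduce conflict: the parser cannot decide which of several possible reductions to make.
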
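LR-Parser Example: The Parsing Table

        Action                         Goto
State   id   +    *    (    )    $     E   T   F
0       s5             s4              1   2   3
1            s6                  acc
2            r2   s7        r2   r2
3            r4   r4        r4   r4
4       s5             s4              8   2   3
5            r6   r6        r6   r6
6       s5             s4                  9   3
7       s5             s4                      10
8            s6             s11
9            r1   s7        r1   r1
10           r3   r3        r3   r3
11           r5   r5        r5   r5

Productions: 1. E → E+T   2. E → T   3. T → T*F   4. T → F   5. F → (E)   6. F → id
(sN: shift and go to state N; rN: reduce by production N; acc: accept)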
LR Parsers
• The most powerful (yet efficient) shift-reduce parsing method is LR(k) parsing:
  • L: left-to-right scanning of the input
  • R: right-most derivation (constructed in reverse)
  • k: number of lookahead symbols (when k is omitted, it is 1)

• LR parsing is attractive because:
  • LR parsing is the most general non-backtracking shift-reduce parsing method, yet it is still efficient.
  • The class of grammars that can be parsed using LR methods is a proper superset of the class of grammars that can be parsed with predictive parsers.
    LL(1)-Grammars ⊂ LR(1)-Grammars
  • An LR parser can detect a syntactic error as soon as it is possible to do so on a left-to-right scan of the input.

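The LR Parser Algorithm

[Figure: model of an LR parser. Input buffer a1 a2 … ai … an $; a stack holding s0 … Xm-1 sm-1 Xm sm (state sm on top); the LR parsing program, which produces the output; and a parsing table with an action part and a goto part. Possible actions: shift, reduce, accept, error.]
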
Constructing LR(0) Item
• An LR(0) item of a grammar G is a production of G with a dot at some position of the right side.
Ex: A → aBb
Possible LR(0) items (four different possibilities):
A → .aBb
A → a.Bb
A → aB.b
A → aBb.

• Sets of LR(0) items will be the states of the action and goto tables of the SLR parser.
• A collection of sets of LR(0) items (the canonical LR(0) collection) is the basis for constructing SLR parsers.

1. Augmented Grammar
• G’ is G with a new production rule S’ → S, where S’ is the new starting symbol.

2. The Closure Operation
• If I is a set of LR(0) items for a grammar G, then closure(I) is the set of LR(0) items constructed from I by the two rules:

1. Initially, every LR(0) item in I is added to closure(I).
2. If A → α.Bβ is in closure(I) and B → γ is a production rule of G, then B → .γ will be in closure(I). We apply this rule until no more new LR(0) items can be added to closure(I).

Closure-Example
Grammar:
E’ → E
E → E+T
E → T
T → T*F
T → F
F → (E)
F → id

closure({E’ → .E}) =
{ E’ → .E
  E → .E+T
  E → .T
  T → .T*F
  T → .F
  F → .(E)
  F → .id }

3. goto Operation
• If I is a set of LR(0) items and X is a grammar symbol (terminal or non-terminal), then goto(I,X) is defined as follows:
  • If A → α.Xβ is in I, then every item in closure({A → αX.β}) will be in goto(I,X).

goto-example
Example:
I = { E’ → .E, E → .E+T, E → .T,
      T → .T*F, T → .F,
      F → .(E), F → .id }

goto(I,E) = { E’ → E., E → E.+T }
goto(I,T) = { E → T., T → T.*F }
goto(I,F) = { T → F. }
goto(I,() = { F → (.E), E → .E+T, E → .T, T → .T*F, T → .F,
              F → .(E), F → .id }
goto(I,id) = { F → id. }

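The closure and goto operations on LR(0) items can be sketched in a few lines of Python. The representation is an assumption made for this illustration: an item is a pair (production index, dot position) over the augmented expression grammar, and item sets are frozensets.

```python
GRAMMAR = [
    ("E'", ("E",)),                # 0: augmented start production
    ("E", ("E", "+", "T")),        # 1
    ("E", ("T",)),                 # 2
    ("T", ("T", "*", "F")),        # 3
    ("T", ("F",)),                 # 4
    ("F", ("(", "E", ")")),        # 5
    ("F", ("id",)),                # 6
]
NONTERMINALS = {head for head, _ in GRAMMAR}

def closure(items):
    """Add B -> .gamma for every item whose dot stands before a non-terminal B."""
    result = set(items)
    work = list(items)
    while work:
        prod, dot = work.pop()
        body = GRAMMAR[prod][1]
        if dot < len(body) and body[dot] in NONTERMINALS:
            for i, (head, _) in enumerate(GRAMMAR):
                if head == body[dot] and (i, 0) not in result:
                    result.add((i, 0))
                    work.append((i, 0))
    return frozenset(result)

def goto(items, symbol):
    """Move the dot over `symbol` wherever possible, then take the closure."""
    moved = {(p, d + 1) for p, d in items
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved)

I0 = closure({(0, 0)})             # closure({E' -> .E})
print(len(I0))
print(sorted(goto(I0, "E")))
print(sorted(goto(I0, "(")))
```

Here goto(I0, "E") contains the items (0, 1) and (1, 1), which are E’ → E. and E → E.+T in the notation of the example.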
SLR Parsing Table
• Layout: one row per state (0 to 11).
• Action table columns: id + * ( ) $
• Goto table columns: E T F

Constructing SLR Parsing Table
(of an augmented grammar G’)

1. Construct the canonical collection of sets of LR(0) items for G’: C = {I0,...,In}
2. Create the parsing action table as follows:
  • If a is a terminal, A → α.aβ is in Ii and goto(Ii,a) = Ij, then action[i,a] is shift j.
  • If A → α. is in Ii, then action[i,a] is reduce A → α for all a in FOLLOW(A), where A ≠ S’.
  • If S’ → S. is in Ii, then action[i,$] is accept.
  • If any conflicting actions are generated by these rules, the grammar is not SLR(1).
3. Create the parsing goto table:
  • for all non-terminals A, if goto(Ii,A) = Ij then goto[i,A] = j
4. All entries not defined by (2) and (3) are errors.
5. The initial state of the parser is the one containing S’ → .S

        Action Table                   Goto Table
state   id   +    *    (    )    $     E   T   F
0       s5             s4              1   2   3
1            s6                  acc
2            r2   s7        r2   r2
3            r4   r4        r4   r4
4       s5             s4              8   2   3
5            r6   r6        r6   r6
6       s5             s4                  9   3
7       s5             s4                      10
8            s6             s11
9            r1   s7        r1   r1
10           r3   r3        r3   r3
11           r5   r5        r5   r5

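Kernel Item & Non Kernel Item
• Kernel items: the initial item S’ → .S and all items whose dot is not at the left end of the right side.
• Non-kernel items: all items with the dot at the left end of the right side, except S’ → .S.
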
LR Parsing Algorithm
• The parsing table consists of two parts: a parsing action function and a goto function.
• The LR parsing program determines sm, the state on top of the stack, and ai, the current input symbol. It then consults action[sm, ai], which can take one of four values:
  • Shift
  • Reduce
  • Accept
  • Error

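LR Parsing Algorithm
• If action[sm, ai] = shift s, where s is a state, then the parser executes a shift move: it pushes ai and then s onto the stack and advances to the next input symbol.
• If action[sm, ai] = reduce A → β, then the parser executes a reduce move:
  • pop 2*|β| items from the stack (|β| grammar symbols and |β| states);
  • push A and then goto[s, A], where s is the state now on top of the stack.
• If action[sm, ai] = accept, parsing is completed.
• If action[sm, ai] = error, the parser has discovered an error.
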
LR Parsing Algorithm
set ip to point to the first symbol in w$
initialize stack to 0
repeat forever
    let s be the topmost state on the stack and a the symbol pointed to by ip
    if action[s, a] = shift s’
        push a then s’ onto the stack
        advance ip to the next input symbol
    else if action[s, a] = reduce A → β
        pop 2*|β| symbols off the stack
        let s’ be the state now on top of the stack
        push A then goto[s’, A] onto the stack
        output production A → β
    else if action[s, a] = accept
        return success
    else
        error()
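The driver loop can be sketched in Python using the SLR table for the grammar 1. E → E+T, 2. E → T, 3. T → T*F, 4. T → F, 5. F → (E), 6. F → id. One deliberate simplification, stated as an assumption: this sketch keeps only states on the stack, so a reduction pops |β| entries instead of 2*|β|. The dict layout and the name lr_parse are illustrative.

```python
# Production number -> (left-hand side, length of the right-hand side).
PRODS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3), 4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}

ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: {"+": "r4", "*": "r4", ")": "r4", "$": "r4"},
    4: {"id": "s5", "(": "s4"},
    5: {"+": "r6", "*": "r6", ")": "r6", "$": "r6"},
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: {"+": "r3", "*": "r3", ")": "r3", "$": "r3"},
    11: {"+": "r5", "*": "r5", ")": "r5", "$": "r5"},
}
GOTO = {
    0: {"E": 1, "T": 2, "F": 3},
    4: {"E": 8, "T": 2, "F": 3},
    6: {"T": 9, "F": 3},
    7: {"F": 10},
}

def lr_parse(tokens):
    """Return the production numbers used, i.e. a right-most derivation in reverse."""
    stack = [0]                        # states only; state 0 is the initial state
    tokens = list(tokens) + ["$"]
    pos = 0
    output = []
    while True:
        act = ACTION[stack[-1]].get(tokens[pos])
        if act is None:                # empty action entry
            raise SyntaxError(f"unexpected {tokens[pos]!r} in state {stack[-1]}")
        if act == "acc":
            return output
        if act[0] == "s":              # shift: push the new state, advance the input
            stack.append(int(act[1:]))
            pos += 1
        else:                          # reduce by production n
            n = int(act[1:])
            head, length = PRODS[n]
            del stack[-length:]        # pop one state per right-hand-side symbol
            stack.append(GOTO[stack[-1]][head])
            output.append(n)

reductions = lr_parse(["id", "+", "id", "*", "id"])
print(reductions)
```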
Canonical Collection of Sets of LR(1) Items
• The construction of the canonical collection of the sets of LR(1) items is similar to the construction of the canonical collection of the sets of LR(0) items, except that the closure and goto operations work a little differently.

The closure function for LR(1) items is defined as follows:

closure(I)
{
    repeat
        for each item [A → α.Bβ, a] in I
            for each production B → γ in the grammar
                and each terminal b in FIRST(βa)
                    add [B → .γ, b] to I;
    until no more items are added to I;
    return I;
}
goto operation
• If I is a set of LR(1) items and X is a grammar symbol (terminal or non-terminal), then goto(I,X) is defined as follows:
  • If [A → α.Xβ, a] is in I, then every item in closure({[A → αX.β, a]}) will be in goto(I,X).

Construction of The Canonical LR(1) Collection
Item(G’)
{
    C is { closure({[S’ → .S, $]}) }
    repeat
        for each I in C
            for each grammar symbol X
                if goto(I,X) is not empty and not in C
                    add goto(I,X) to C;
    until no new sets of items are added to C;
}

An Example
S’ → S
1. S → C C
2. C → c C
3. C → d

I0: closure({[S’ → .S, $]}) =
    [S’ → .S, $]
    [S → .C C, $]
    [C → .c C, c/d]
    [C → .d, c/d]
I1: goto(I0, S) =
    [S’ → S., $]
I2: goto(I0, C) =
    [S → C.C, $]
    [C → .c C, $]
    [C → .d, $]
I3: goto(I0, c) =
    [C → c.C, c/d]
    [C → .c C, c/d]
    [C → .d, c/d]
I4: goto(I0, d) =
    [C → d., c/d]
I5: goto(I2, C) =
    [S → C C., $]
I6: goto(I2, c) =
    [C → c.C, $]
    [C → .c C, $]
    [C → .d, $]
I7: goto(I2, d) =
    [C → d., $]
I8: goto(I3, C) =
    [C → c C., c/d]
I9: goto(I6, C) =
    [C → c C., $]

Transitions back to existing sets:
goto(I3, c) = I3    goto(I3, d) = I4
goto(I6, c) = I6    goto(I6, d) = I7

[Figure: transition diagram of the sets I0 to I9 with the edges listed above.]
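An Example (the LR(1) parsing table)

State   c    d    $     S    C
0       s3   s4         g1   g2
1                 a
2       s6   s7              g5
3       s3   s4              g8
4       r3   r3
5                 r1
6       s6   s7              g9
7                 r3
8       r2   r2
9                 r2

(sN: shift to state N; rN: reduce by production N; gN: goto state N; a: accept)
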
Construction of LR(1) Parsing Tables
1. Construct C’ = {I0,...,In}, the canonical collection of sets of LR(1) items for G’.
2. Create the parsing action table as follows:
  • If a is a terminal, [A → α.aβ, b] is in Ii and goto(Ii,a) = Ij, then action[i,a] is shift j.
  • If [A → α., a] is in Ii, then action[i,a] is reduce A → α, where A ≠ S’.
  • If [S’ → S., $] is in Ii, then action[i,$] is accept.
  • If any conflicting actions are generated by these rules, the grammar is not LR(1).
3. Create the parsing goto table:
  • for all non-terminals A, if goto(Ii,A) = Ij then goto[i,A] = j
4. All entries not defined by (2) and (3) are errors.
5. The initial state of the parser is the one containing [S’ → .S, $]

Construction of LALR Parsing Tables
1. Create the canonical LR(1) collection of the sets of LR(1) items for the given grammar.
2. For each core present, find all sets having that same core and replace them with a single set which is their union.
   C = {I0,...,In} ⇒ C’ = {J1,...,Jm} where m ≤ n
3. Create the parsing tables (action and goto tables) in the same way as for the LR(1) parser.
  • Note that if J = I1 ∪ ... ∪ Ik, then since I1,...,Ik have the same cores, the cores of goto(I1,X),...,goto(Ik,X) must also be the same.
  • So, goto(J,X) = K where K is the union of all sets of items having the same core as goto(I1,X).
4. If no conflict is introduced, the grammar is an LALR(1) grammar.

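Step 2, merging the LR(1) item sets that share a core, can be sketched as follows. The sketch uses the ten item sets I0 to I9 of the grammar S → CC, C → cC | d written out by hand; the tuple representation of an item (head, body, dot, lookahead) and the helper names are assumptions made for this illustration.

```python
def items(head, body, dot, lookaheads):
    """Expand the shorthand [head -> body with dot, la1/la2/...] into single items."""
    return {(head, body, dot, la) for la in lookaheads}

# The canonical LR(1) collection for S' -> S, S -> CC, C -> cC | d.
COLLECTION = [
    items("S'", "S", 0, "$") | items("S", "CC", 0, "$")
        | items("C", "cC", 0, "cd") | items("C", "d", 0, "cd"),                     # I0
    items("S'", "S", 1, "$"),                                                       # I1
    items("S", "CC", 1, "$") | items("C", "cC", 0, "$") | items("C", "d", 0, "$"),  # I2
    items("C", "cC", 1, "cd") | items("C", "cC", 0, "cd") | items("C", "d", 0, "cd"),  # I3
    items("C", "d", 1, "cd"),                                                       # I4
    items("S", "CC", 2, "$"),                                                       # I5
    items("C", "cC", 1, "$") | items("C", "cC", 0, "$") | items("C", "d", 0, "$"),  # I6
    items("C", "d", 1, "$"),                                                        # I7
    items("C", "cC", 2, "cd"),                                                      # I8
    items("C", "cC", 2, "$"),                                                       # I9
]

def core(item_set):
    """The core of an item set: its items with the lookaheads dropped."""
    return frozenset((h, b, d) for h, b, d, _ in item_set)

def merge_by_core(collection):
    """Return the merged sets and a map from old set index to merged set index."""
    index_of_core = {}
    merged = []
    mapping = {}
    for i, item_set in enumerate(collection):
        c = core(item_set)
        if c not in index_of_core:
            index_of_core[c] = len(merged)
            merged.append(set())
        merged[index_of_core[c]] |= item_set     # union of the sets with this core
        mapping[i] = index_of_core[c]
    return [frozenset(s) for s in merged], mapping

merged, mapping = merge_by_core(COLLECTION)
print(len(COLLECTION), "LR(1) sets ->", len(merged), "LALR sets")
print(mapping)
```

For this grammar the pairs I3/I6, I4/I7 and I8/I9 share cores, so the ten LR(1) sets collapse into seven LALR sets.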
Error Handling
• A good compiler should assist in identifying and locating errors
• Lexical errors
• Syntactic errors
• Semantic errors
• Logical errors

Error Recovery Strategies
• Panic mode
• Phrase-level recovery
• Error productions
• Global correction

Error Recovery in Predictive Parsing
• An error may occur in predictive parsing (LL(1) parsing):
  • if the terminal symbol on the top of the stack does not match the current input symbol.
  • if the top of the stack is a non-terminal A, the current input symbol is a, and the parsing table entry M[A,a] is empty.
• What should the parser do in an error case?
  • The parser should give an error message that is as meaningful as possible.
  • It should recover from the error and be able to continue parsing the rest of the input.

Panic-Mode Error Recovery in LL(1) Parsing
• In panic-mode error recovery, we skip all the input symbols until a synchronizing token is found.
• What is the synchronizing token?
  • All the terminal symbols in the follow set of a non-terminal can be used as the synchronizing token set for that non-terminal.
• So, a simple panic-mode error recovery for LL(1) parsing:
  • All the empty entries are marked as synch to indicate that the parser will skip all the input symbols until it finds a symbol in the follow set of the non-terminal A which is on the top of the stack. Then the parser pops that non-terminal A from the stack. Parsing continues from that state.
  • To handle unmatched terminal symbols, the parser pops the unmatched terminal symbol from the stack and issues an error message saying that the unmatched terminal was inserted.
Phrase-Level Error Recovery
• Each empty entry in the parsing table is filled with a pointer to a special error routine which will take care of that error case.
• These error routines may:
  • change, insert, or delete input symbols.
  • issue appropriate error messages.
  • pop items from the stack.
• We should be careful when we design these error routines, because we may put the parser into an infinite loop.

Error Recovery in LR Parsing
• An LR parser will detect an error when it consults the parsing action
table and finds an error entry. All empty entries in the action table are
error entries.
• Errors are never detected by consulting the goto table.
• A canonical LR parser (LR(1) parser) will never make even a single
reduction before announcing an error.
• The SLR and LALR parsers may make several reductions before
announcing an error.
• But, all LR parsers (LR(1), LALR and SLR parsers) will never shift an
erroneous input symbol onto the stack.
Panic Mode Error Recovery in LR Parsing
• Scan down the stack until a state s with a goto on a particular nonterminal A is found.
• Discard zero or more input symbols until a symbol a is found that can legitimately follow A.
  • In the simplest case a is any symbol in FOLLOW(A), but this may not work in all situations.
• The parser pushes the nonterminal A and the state goto[s,A] onto the stack, and it resumes normal parsing.

Example (grammar fragment), with an unexpected * in the input:
T → T+F | F
F → ( E ) | id
• The parser scans the stack to find a state that has a goto entry for some nonterminal (T in this case).
• It discards * (the unexpected symbol) and looks for the next symbol (id) that can follow T.
• It resumes by pushing T onto the stack, as if the invalid input * had never been encountered, and continues parsing from there.
Phrase-Level Error Recovery in LR Parsing
• Each empty entry in the action table is marked with a specific error routine.
• An error routine reflects the error that the user is most likely to have made in that case.
• An error routine inserts symbols into the stack or the input, deletes symbols from the stack or the input, or does both.
• Typical error routines:
  • Missing operand (e1)
  • Unbalanced right parenthesis (e2)
  • Missing operator (e3)
  • Missing right parenthesis (e4)

Example
Grammar:
E → E+E
  | E*E
  | (E)
  | id

Error routines:
e1: missing operand
e2: unbalanced right parenthesis
e3: missing operator
e4: missing right parenthesis

        action                         goto
State   id   +    *    (    )    $     E
0       s3   e1   e1   s2   e2   e1    1
1       e3   s4   s5   e3   e2   acc
2       s3   e1   e1   s2   e2   e1    6
3       r4   r4   r4   r4   r4   r4
4       s3   e1   e1   s2   e2   e1    7
5       s3   e1   e1   s2   e2   e1    8
6       e3   s4   s5   e3   s9   e4
7       r1   r1   s5   r1   r1   r1
8       r2   r2   r2   r2   r2   r2
YACC
• Yacc is an LALR parser generator.
• The name stands for “Yet Another Compiler-Compiler”.
• Available on different platforms:
  • UNIX, Linux

Creating an Input/Output Translator with Yacc
Yacc specification (translate.y) → Yacc compiler → y.tab.c
y.tab.c → C compiler → a.out
Input → a.out → output

Linking lex&yacc

• A Yacc source program has three parts:
declarations
%%
translation rules
%%
supporting C functions
• Ex:
E → E+T | T
T → T*F | F
F → (E) | digit

%{
#include <stdio.h>
%}
%token DIGIT
%%
expr   : expr '+' term     { $$ = $1 + $3; }
       | term
       ;
term   : term '*' factor   { $$ = $1 * $3; }
       | factor
       ;
factor : '(' expr ')'      { $$ = $2; }
       | DIGIT
       ;
%%
/* auxiliary procedures (such as yylex) go here */