Syntax Analysis

The document discusses syntax analysis, which involves checking the syntax of a program and constructing an abstract syntax tree. It covers topics such as context-free grammars, pushdown automata, parsing, ambiguity, and techniques to address issues like left recursion and the dangling else problem. Specifically, it defines context-free grammars using production rules, describes how they are used to define programming language syntax, and explains how parsers like recursive descent parsers use grammars to analyze syntax and construct parse trees. It also discusses limitations of context-free grammars and how techniques like precedence rules, left factorization, and rewriting grammars can address problems like ambiguity.

Uploaded by

Pavan Kumar

Syntax Analysis

Check syntax and construct abstract syntax tree
[Figure: abstract syntax tree for an if statement, with nodes if, ==, =, ; and leaves b]
Error reporting and recovery
Model using context free grammars
Recognize using push down automata/table driven parsers

What syntax analysis cannot do!
To check whether variables are of types on which operations are allowed
To check whether a variable has been declared before use
To check whether a variable has been initialized
These issues will be handled in semantic analysis

Limitations of regular languages

How to describe language syntax precisely and conveniently. Can regular expressions be used?
Many languages are not regular, for example strings of balanced parentheses
(((())))
{ (^i )^i | i ≥ 0 }
There is no regular expression for this language
A finite automaton may repeat states, however, it can not remember the number of times it has been to a particular state
A more powerful language is needed to describe valid strings of tokens

Syntax definition
Context free grammars
a set of tokens (terminal symbols)
a set of non terminal symbols
a set of productions of the form
nonterminal → string of terminals and non terminals
a start symbol
<T, N, P, S>
A grammar derives strings by beginning with start symbol and repeatedly replacing a non terminal by the right hand side of a production for that non terminal.
The strings that can be derived from the start symbol of a grammar G form the language L(G) defined by the grammar.

Examples
String of balanced parentheses
S → ( S ) S | ε
Grammar for a list of digits separated by + or -
list → list + digit | list - digit | digit
digit → 0 | 1 | … | 9
Derivation of 9-5+2
list ⇒ list + digit
⇒ list - digit + digit
⇒ digit - digit + digit
⇒ 9 - digit + digit
⇒ 9 - 5 + digit
⇒ 9 - 5 + 2
Therefore, the string 9-5+2 belongs to the language specified by the grammar
The name context free comes from the fact that use of a production X → … does not depend on the context of X
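As a small illustration of how a grammar defines a language, the balanced parentheses grammar S → ( S ) S | ε can be turned into a recognizer almost mechanically. The sketch below is only one possible way to do it; the function names balanced and parse_S are made up for this example.

```python
def balanced(s: str) -> bool:
    """Return True if s is in L(G) for the grammar S -> ( S ) S | epsilon."""

    def parse_S(i: int) -> int:
        # Choose S -> ( S ) S when the next symbol is '(',
        # otherwise choose S -> epsilon and consume nothing.
        if i < len(s) and s[i] == "(":
            i = parse_S(i + 1)              # the inner S
            if i >= len(s) or s[i] != ")":
                raise SyntaxError(f"expected ')' at position {i}")
            return parse_S(i + 1)           # the trailing S
        return i

    try:
        # The whole string must be consumed by the start symbol.
        return parse_S(0) == len(s)
    except SyntaxError:
        return False


if __name__ == "__main__":
    for text in ["(((())))", "()()", "(()", "())"]:
        print(repr(text), balanced(text))
```

The recursion depth follows the nesting depth of the input, which is exactly the information a finite automaton cannot keep.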

Examples
Grammar for Pascal block
block → begin statements end
statements → stmt-list | ε
stmt-list → stmt-list ; stmt
| stmt

Syntax analyzers
Testing for membership whether w
belongs to L(G) is just a yes or no
answer
However the syntax analyzer
Must generate the parse tree
Handle errors gracefully if string is not in
the language

Form of the grammar is important

Many grammars generate the same
language
Tools are sensitive to the grammar
8

Derivation
If there is a production A → α then we say that A derives α, denoted by A ⇒ α
αAβ ⇒ αγβ if A → γ is a production
If α1 ⇒ α2 ⇒ … ⇒ αn then α1 ⇒+ αn
Given a grammar G and a string w of terminals in L(G) we can write S ⇒+ w
If S ⇒* α where α is a string of terminals and non terminals of G then we say that α is a sentential form of G

Derivation

If in a sentential form only the leftmost non terminal is replaced then it becomes leftmost derivation
Every leftmost step can be written as
wAγ ⇒lm wδγ
where w is a string of terminals and A → δ is a production
Similarly, right most derivation can be defined
An ambiguous grammar is one that produces more than one leftmost/rightmost derivation of a sentence

Parse tree
It shows how the start symbol of a grammar
derives a string in the language
root is labeled by the start symbol
leaf nodes are labeled by tokens
Each internal node is labeled by a non
terminal
if A is a non-terminal labeling an internal node and x1, x2, …, xn are labels of children of that node then A → x1 x2 … xn is a production

Example
Parse tree for 9-5+2
[Figure: parse tree with root list, internal nodes list and digit, and leaves 9 - 5 + 2]

Ambiguity
A Grammar can have more than
one parse tree for a string
Consider grammar
string → string + string
| string - string
| 0 | 1 | … | 9
String 9-5+2 has two parse trees
[Figure: the two parse trees for 9-5+2, one grouping (9-5)+2 and the other 9-(5+2)]

Ambiguity

Ambiguity is problematic because meaning of the programs can be incorrect
Ambiguity can be handled in several ways
Enforce associativity and precedence
Rewrite the grammar (cleanest way)
There are no general techniques for handling ambiguity
It is impossible to convert automatically an ambiguous grammar to an unambiguous one

Associativity

If an operand has operator on both the sides, the side on which operator takes this operand is the associativity of that operator
In a+b+c b is taken by left +
+, -, *, / are left associative
^, = are right associative
Grammar to generate strings with right associative operators
right → letter = right | letter
letter → a | b | … | z

Precedence
String a+5*2 has two possible
interpretations because of two
different parse trees
corresponding to
(a+5)*2 and a+(5*2)
Precedence determines the
correct interpretation.
17

Parsing

Process of determination whether a string

can be generated by a grammar

Parsing falls in two categories:

Top-down parsing:
Construction of the parse tree starts at the root
(from the start symbol) and proceeds towards
leaves (token or terminals)
Bottom-up parsing:
Constructions of the parse tree starts from the leaf
nodes (tokens or terminals of the grammar) and
proceeds towards root (start symbol)
18

Example: Top down Parsing

Following grammar generates types of Pascal
type → simple
| ↑ id
| array [ simple ] of type
simple → integer
| char
| num dotdot num

Example
Construction of parse tree is done by
starting root labeled by start symbol
repeat following two steps
at node labeled with non terminal A select
one of the production of A and construct
children nodes
(Which production?)
find the next node at which subtree is
Constructed (Which node?)
20

Parse
array [ num dotdot num ] of integer

Start symbol type, expanded using the rule type → simple
Can not proceed as non terminal simple never generates a string beginning with token array. Therefore, requires back-tracking.
Back-tracking is not desirable, therefore take help of a look-ahead token. The current token is treated as look-ahead token. (restricts the class of grammars)

Parse array [ num dotdot num ] of integer using the look-ahead token
Start symbol type, look-ahead array: expand using the rule type → array [ simple ] of type
Left most non terminal simple, look-ahead num: expand using the rule simple → num dotdot num
Left most non terminal type, look-ahead integer: expand using the rule type → simple
Left most non terminal simple, look-ahead integer: expand using the rule simple → integer
All the tokens exhausted
Parsing completed

Recursive descent parsing

First set:
Let there be a production A → α
then First(α) is set of tokens that appear as the first token in the strings generated from α
For example :
First(simple) = {integer, char, num}
First(num dotdot num) = {num}

Define a procedure for each non terminal

procedure type;
if lookahead in {integer, char, num}
then simple
else if lookahead = ↑
then begin match(↑);
match(id)
end
else if lookahead = array
then begin match(array);
match([);
simple;
match(]);
match(of);
type
end
else error;
24

procedure simple;
if lookahead = integer
then match(integer)
else if lookahead = char
then match(char)
else if lookahead = num
then begin match(num);
match(dotdot);
match(num)
end
else
error;
procedure match(t:token);
if lookahead = t
then lookahead = next token
else error;
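
The procedures above translate almost line by line into a runnable sketch. The version below is an illustration only: the class name Parser, the helper parse, the end marker "$" and the spelling "^" for the pointer token are assumptions of this example, not part of the slides.

```python
class Parser:
    """Recursive descent parser for the Pascal type grammar, one method per non terminal."""

    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]   # "$" marks end of input (assumed)
        self.pos = 0

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def match(self, t):
        # Consume the lookahead if it is the expected token.
        if self.lookahead != t:
            raise SyntaxError(f"expected {t!r}, found {self.lookahead!r}")
        self.pos += 1

    def type_(self):
        if self.lookahead in ("integer", "char", "num"):   # First(simple)
            self.simple()
        elif self.lookahead == "^":                        # pointer type
            self.match("^")
            self.match("id")
        elif self.lookahead == "array":
            self.match("array")
            self.match("[")
            self.simple()
            self.match("]")
            self.match("of")
            self.type_()
        else:
            raise SyntaxError(f"unexpected token {self.lookahead!r}")

    def simple(self):
        if self.lookahead == "integer":
            self.match("integer")
        elif self.lookahead == "char":
            self.match("char")
        elif self.lookahead == "num":
            self.match("num")
            self.match("dotdot")
            self.match("num")
        else:
            raise SyntaxError(f"unexpected token {self.lookahead!r}")


def parse(tokens):
    """Return True if tokens form a type; raise SyntaxError otherwise."""
    p = Parser(tokens)
    p.type_()
    p.match("$")
    return True


if __name__ == "__main__":
    print(parse("array [ num dotdot num ] of integer".split()))
```

Each branch is selected by the lookahead token alone, so no back-tracking is needed.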

Ambiguity
Dangling else problem
stmt → if expr then stmt
| if expr then stmt else stmt
according to this grammar, string
if e1 then if e2 then S1 else S2
has two parse trees

[Figure: the two parse trees for if e1 then if e2 then s1 else s2. In one tree the else belongs to the inner if, in the other it belongs to the outer if.]

Resolving dangling else problem

General rule: match each else with the closest previous then. The grammar can be rewritten as
stmt → matched-stmt
| unmatched-stmt
| others
matched-stmt → if expr then matched-stmt else matched-stmt
| others
unmatched-stmt → if expr then stmt
| if expr then matched-stmt else unmatched-stmt

Left recursion
A top down parser with production A → Aα may loop forever
From the grammar A → Aα | β
left recursion may be eliminated by transforming the grammar to
A → βR
R → αR | ε

[Figure: parse tree corresponding to the left recursive grammar, and parse tree corresponding to the modified grammar with nodes A and R]
Both the trees generate string βα*

Example
Consider grammar for arithmetic expressions
E → E + T | T
T → T * F | F
F → ( E ) | id
After removal of left recursion the grammar becomes
E → T E'
E' → + T E' | ε
T → F T'
T' → * F T' | ε
F → ( E ) | id

Removal of left recursion

In general
A → Aα1 | Aα2 | … | Aαm
| β1 | β2 | … | βn
transforms to
A → β1A' | β2A' | … | βnA'
A' → α1A' | α2A' | … | αmA' | ε
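The general transformation can be sketched as a short function. This is an illustration under stated assumptions: a right hand side is represented as a list of symbols, the empty list stands for ε, and the function name remove_immediate_left_recursion is made up for this example.

```python
def remove_immediate_left_recursion(nt, productions):
    """Rewrite A -> A a1 | ... | A am | b1 | ... | bn without left recursion.

    productions is a list of right hand sides, each a list of symbols;
    the empty list stands for epsilon.
    """
    # The alpha_i: what follows A in the left recursive alternatives.
    alphas = [rhs[1:] for rhs in productions if rhs and rhs[0] == nt]
    # The beta_j: alternatives that do not start with A.
    betas = [rhs for rhs in productions if not rhs or rhs[0] != nt]
    if not alphas:
        return {nt: productions}          # nothing to do
    new_nt = nt + "'"
    return {
        nt: [beta + [new_nt] for beta in betas],
        new_nt: [alpha + [new_nt] for alpha in alphas] + [[]],
    }


if __name__ == "__main__":
    # E -> E + T | T   becomes   E -> T E',  E' -> + T E' | epsilon
    print(remove_immediate_left_recursion("E", [["E", "+", "T"], ["T"]]))
```

The sketch handles only immediate left recursion; recursion hidden across several non terminals needs a substitution step first.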

Left recursion hidden due to

many productions

Left recursion may also be introduced by two or more grammar rules. For example:
S → Aa | b
A → Ac | Sd | ε
there is a left recursion because
S ⇒ Aa ⇒ Sda
In such cases, left recursion is removed systematically
Starting from the first rule and replacing all the occurrences of the first non terminal symbol
Removing left recursion from the modified grammar

Removal of left recursion due

to many productions

After the first step (substitute S by its rhs in the rules) the grammar becomes
S → Aa | b
A → Ac | Aad | bd | ε
After the second step (removal of left recursion) the grammar becomes
S → Aa | b
A → bdA' | A'
A' → cA' | adA' | ε

Left factoring
In top-down parsing when it is not clear which production to choose for expansion of a symbol, defer the decision till we have seen enough input.
In general if A → αβ1 | αβ2
defer decision by expanding A to αA'
we can then expand A' to β1 or β2
Therefore A → αβ1 | αβ2
transforms to
A → αA'
A' → β1 | β2

Dangling else problem again

Dangling else problem can be handled by left factoring
stmt → if expr then stmt else stmt
| if expr then stmt
can be transformed to
stmt → if expr then stmt S'
S' → else stmt | ε

Predictive parsers
A non recursive top down parsing method
Parser predicts which production to use
It removes backtracking by fixing one
production for every non-terminal and input
token(s)
Predictive parsers accept LL(k) languages
First L stands for left to right scan of input
Second L stands for leftmost derivation
k stands for number of lookahead token
In practice LL(1) is used
37

Predictive parsing
Predictive parser can be implemented by maintaining an external stack
[Figure: the parser reads the input, uses a stack and the parse table, and produces the output]
Parse table is a two dimensional array M[X,a] where X is a non terminal and a is a terminal of the grammar

Parsing algorithm
The parser considers 'X' the symbol on top of stack, and 'a' the current input symbol
These two symbols determine the action to be taken by the parser
Assume that '$' is a special token that is at the bottom of the stack and terminates the input string
if X = a = $ then halt
if X = a ≠ $ then pop(X) and ip++
if X is a non terminal
then if M[X,a] = {X → UVW}
then begin pop(X); push(W,V,U)
end
else error
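The stack-based algorithm can be sketched as follows for the expression grammar E → TE', E' → +TE' | ε, T → FT', T' → *FT' | ε, F → (E) | id. The table contents, the names TABLE and ll1_parse, and the choice to return the list of productions used are assumptions of this illustration.

```python
EPS = ()   # an empty right hand side stands for epsilon

# M[X, a] for the expression grammar; missing entries are errors.
TABLE = {
    ("E", "id"): ("T", "E'"), ("E", "("): ("T", "E'"),
    ("E'", "+"): ("+", "T", "E'"), ("E'", ")"): EPS, ("E'", "$"): EPS,
    ("T", "id"): ("F", "T'"), ("T", "("): ("F", "T'"),
    ("T'", "+"): EPS, ("T'", "*"): ("*", "F", "T'"),
    ("T'", ")"): EPS, ("T'", "$"): EPS,
    ("F", "id"): ("id",), ("F", "("): ("(", "E", ")"),
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}


def ll1_parse(tokens, start="E"):
    """Return the productions used, in order; raise SyntaxError on bad input."""
    inp = list(tokens) + ["$"]
    stack = ["$", start]          # top of stack is the end of the list
    ip = 0
    output = []
    while True:
        X, a = stack[-1], inp[ip]
        if X == a == "$":
            return output                         # halt
        if X not in NONTERMINALS:
            if X != a:
                raise SyntaxError(f"expected {X!r}, found {a!r}")
            stack.pop()                           # pop(X) and ip++
            ip += 1
        else:
            rhs = TABLE.get((X, a))
            if rhs is None:
                raise SyntaxError(f"no entry M[{X},{a}]")
            stack.pop()
            stack.extend(reversed(rhs))           # push W, V, U so U is on top
            output.append((X, rhs))


if __name__ == "__main__":
    for lhs, rhs in ll1_parse(["id", "+", "id", "*", "id"]):
        print(lhs, "->", " ".join(rhs) or "ε")
```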

Example
Consider the grammar
E → T E'
E' → + T E' | ε
T → F T'
T' → * F T' | ε
F → ( E ) | id

Parse table for the grammar

      id        +           *           (         )        $
E     E→TE'                             E→TE'
E'              E'→+TE'                           E'→ε     E'→ε
T     T→FT'                             T→FT'
T'              T'→ε        T'→*FT'               T'→ε     T'→ε
F     F→id                              F→(E)

Blank entries are error states. For example E can not derive a string starting with +

Example
Stack       input              action
$E          id + id * id $     expand by E→TE'
$E'T        id + id * id $     expand by T→FT'
$E'T'F      id + id * id $     expand by F→id
$E'T'id     id + id * id $     pop id and ip++
$E'T'       + id * id $        expand by T'→ε
$E'         + id * id $        expand by E'→+TE'
$E'T+       + id * id $        pop + and ip++
$E'T        id * id $          expand by T→FT'
$E'T'F      id * id $          expand by F→id
$E'T'id     id * id $          pop id and ip++
$E'T'       * id $             expand by T'→*FT'
$E'T'F*     * id $             pop * and ip++
$E'T'F      id $               expand by F→id
$E'T'id     id $               pop id and ip++
$E'T'       $                  expand by T'→ε
$E'         $                  expand by E'→ε
$           $                  halt

Constructing parse table

Table can be constructed if for every non terminal, every lookahead symbol can be handled by at most one production
First(α) for a string of terminals and non terminals α is
Set of symbols that might begin the fully expanded (made of only tokens) version of α
Follow(X) for a non terminal X is
set of symbols that might follow the derivation of X in the input stream

Compute first sets

If X is a terminal symbol then First(X) = {X}
If X → ε is a production then ε is in First(X)
If X is a non terminal
and X → Y1Y2 … Yk is a production
then
if for some i, a is in First(Yi)
and ε is in all of First(Yj) (such that j<i)
then a is in First(X)
If ε is in First(Y1) … First(Yk) then ε is in First(X)

Example
For the expression grammar
E → T E'
E' → + T E' | ε
T → F T'
T' → * F T' | ε
F → ( E ) | id
First(E) = First(T) = First(F) = { (, id }
First(E') = { +, ε }
First(T') = { *, ε }
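The rules for First sets can be applied repeatedly until nothing changes. The sketch below does this for the expression grammar; the dictionary representation of the grammar and the names GRAMMAR, EPS and first_sets are assumptions of this illustration.

```python
EPS = "ε"

# Each non terminal maps to its right hand sides; [] stands for epsilon.
GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}


def first_sets(grammar):
    """Compute First(X) for every non terminal X by fixed point iteration."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                before = len(first[nt])
                for sym in rhs:
                    # First of a terminal is the terminal itself.
                    sym_first = first[sym] if sym in grammar else {sym}
                    first[nt] |= sym_first - {EPS}
                    if EPS not in sym_first:
                        break          # later symbols cannot start the string
                else:
                    # Every symbol can derive epsilon (or rhs is empty).
                    first[nt].add(EPS)
                if len(first[nt]) != before:
                    changed = True
    return first


if __name__ == "__main__":
    for nt, symbols in first_sets(GRAMMAR).items():
        print(f"First({nt}) =", sorted(symbols))
```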

Compute follow sets

1. Place $ in follow(S)
2. If there is a production A → αBβ
then everything in first(β) (except ε) is in follow(B)
3. If there is a production A → αB
then everything in follow(A) is in follow(B)
4. If there is a production A → αBβ
and First(β) contains ε
then everything in follow(A) is in follow(B)
Since follow sets are defined in terms of follow sets last two steps have to be repeated until follow sets converge

Example
For the expression grammar
E → T E'
E' → + T E' | ε
T → F T'
T' → * F T' | ε
F → ( E ) | id

follow(E) = follow(E') = { $, ) }
follow(T) = follow(T') = { $, ), + }
follow(F) = { $, ), +, * }
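The four rules for Follow sets can also be iterated until the sets converge. The sketch below is self-contained and therefore recomputes First sets in a compact form; the grammar representation and all function names are assumptions of this illustration.

```python
EPS = "ε"

GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}


def first_of_string(symbols, first, grammar):
    """First of a string of symbols, given First sets of the non terminals."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in grammar else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)          # the whole string can derive epsilon
    return result


def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                new = first_of_string(rhs, first, grammar)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def follow_sets(grammar, start):
    first = compute_first(grammar)
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")                               # rule 1
    changed = True
    while changed:
        changed = False
        for A, alternatives in grammar.items():
            for rhs in alternatives:
                for i, B in enumerate(rhs):
                    if B not in grammar:
                        continue                         # terminals have no Follow
                    beta_first = first_of_string(rhs[i + 1:], first, grammar)
                    new = beta_first - {EPS}             # rule 2
                    if EPS in beta_first:                # rules 3 and 4
                        new |= follow[A]
                    if not new <= follow[B]:
                        follow[B] |= new
                        changed = True
    return follow


if __name__ == "__main__":
    for nt, symbols in follow_sets(GRAMMAR, "E").items():
        print(f"follow({nt}) =", sorted(symbols))
```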

Construction of parse table

for each production A → α do
for each terminal a in first(α)
M[A,a] = A → α
If ε is in First(α)
M[A,b] = A → α
for each terminal b in follow(A)
If ε is in First(α) and $ is in follow(A)
M[A,$] = A → α
A grammar whose parse table has no multiple entries is called LL(1)

Practice Assignment
Construct LL(1) parse table for the expression
grammar
bexpr → bexpr or bterm | bterm
bterm → bterm and bfactor | bfactor
bfactor → not bfactor | ( bexpr ) | true | false
Steps to be followed
Remove left recursion
Compute first sets
Compute follow sets
Construct the parse table
Not to be submitted
50

Error handling
Stop at the first error and print a message
Compiler writer friendly
But not user friendly
Every reasonable compiler must recover from error and identify as
many errors as possible
However, multiple error messages due to a single fault must be
avoided
Error recovery methods
Panic mode
Phrase level recovery
Error productions
Global correction
51

Panic mode
Simplest and the most popular method
Most tools provide for specifying panic
mode recovery in the grammar
When an error is detected
Discard tokens one at a time until a set of
tokens is found whose role is clear
Skip to the next token that can be placed
reliably in the parse tree
52

Panic mode
Consider following code
begin
a = b + c;
x=pr;
h = x < 0;
end;
The second expression has syntax error
Panic mode recovery for begin-end block
skip ahead to next ; and try to parse the next expression
It discards one expression and tries to continue parsing
May fail if no further ; is found
53

Phrase level recovery

Make local correction to the input
Works only in limited situations
A common programming error which is
easily detected
For example insert a ; after closing } of
a class definition

Does not work very well!

Error productions
Add erroneous constructs as productions in the
grammar
Works only for most common mistakes which
can be easily identified
Essentially makes common errors as part of the
grammar
Complicates the grammar and does not work
very well
55

Global corrections
Considering the program as a whole find a
correct nearby program
Nearness may be measured using certain
metric
PL/C compiler implemented this scheme:
anything could be compiled!
It is complicated and not a very good idea!
56

Error Recovery in LL(1) parser

Error occurs when a parse table entry M[A,a]
is empty
Skip symbols in the input until a token in a
selected set (synch) appears
Place symbols in follow(A) in synch set. Skip
tokens until an element in follow(A) is seen.
Pop(A) and continue parsing
Add symbol in first(A) in synch set. Then it
may be possible to resume parsing according
to A if a symbol in first(A) appears in input.
57

Assignment
Reading assignment: Read about error
recovery in LL(1) parsers
Assignment to be submitted:
introduce synch symbols (using both follow
and first sets) in the parse table created for
the boolean expression grammar in the
previous assignment
Parse not (true and or false) and show
how error recovery works
Due on todate+10
58

Bottom up parsing
Construct a parse tree for an input string beginning at
leaves and going towards root
OR
Reduce a string w of input to start symbol of grammar
Consider a grammar
S → aABe
A → Abc | b
B → d
And reduction of a string
abbcde
aAbcde
aAde
aABe
S

Right most derivation
S ⇒ aABe
⇒ aAde
⇒ aAbcde
⇒ abbcde

Shift reduce parsing

Split string being parsed into two parts
Two parts are separated by a special character .
Left part is a string of terminals and non terminals
Right part is a string of terminals
Initially the input is .w

Shift reduce parsing

Bottom up parsing has two actions
Shift: move terminal symbol from right string to left string
if string before shift is α.pqr
then string after shift is αp.qr
Reduce: immediately on the left of . identify a string same as RHS of a production and replace it by LHS
if string before reduce action is αβ.pqr
and A → β is a production
then string after reduction is αA.pqr

Example
Assume grammar is E → E+E | E*E | id
Parse id*id+id

String        action
.id*id+id     shift
id.*id+id     reduce E→id
E.*id+id      shift
E*.id+id      shift
E*id.+id      reduce E→id
E*E.+id       reduce E→E*E
E.+id         shift
E+.id         shift
E+id.         reduce E→id
E+E.          reduce E→E+E
E.            ACCEPT

Shift reduce parsing

Symbols on the left of . are kept on a stack
Top of the stack is at .
Shift pushes a terminal on the stack
Reduce pops symbols (rhs of production) and pushes
a non terminal (lhs of production) onto the stack

The most important issue: when to shift and

when to reduce
Reduce action should be taken only if the
result can be reduced to the start symbol
63

Bottom up parsing
A more powerful parsing technique
LR grammars more expensive than LL
Can handle left recursive grammars
Can handle virtually all the programming languages
Natural expression of programming language syntax
Automatic generation of parsers (Yacc, Bison etc.)
Detects errors as soon as possible
Allows better error recovery
64

Issues in bottom up parsing

How do we know which action to take
whether to shift or reduce
Which production to use for reduction?
Sometimes parser can reduce but it should not:
X → ε can always be reduced!
Sometimes parser can reduce in different ways!
Given stack δ and input symbol a, should the parser
Shift a onto stack (making it δa)
Reduce by some production A → β assuming that stack has form αβ (making it αA)
Stack can have many combinations of αβ
How to keep track of length of β?

Handle
A string that matches right hand side of a production and whose replacement gives a step in the reverse right most derivation
If S ⇒rm* αAw ⇒rm αβw then β (corresponding to production A → β) in the position following α is a handle of αβw. The string w consists of only terminal symbols
We only want to reduce handle and not any rhs
Handle pruning: If β is a handle and A → β is a production then replace β by A
A right most derivation in reverse can be obtained by handle pruning.

Handles
Handles always appear at the top of the stack
and never inside it
This makes stack a suitable data structure
Consider two cases of right most derivation to verify the fact that handle appears on the top of the stack
S ⇒ αAz ⇒ αβByz ⇒ αβγyz
S ⇒ αBxAz ⇒ αBxyz ⇒ αγxyz
Bottom up parsing is based on recognizing handles

Handle always appears on the top

Case I: S ⇒ αAz ⇒ αβByz ⇒ αβγyz
stack     input     action
αβγ       yz        reduce by B→γ
αβB       yz        shift y
αβBy      z         reduce by A→βBy
αA        z

Case II: S ⇒ αBxAz ⇒ αBxyz ⇒ αγxyz
stack     input     action
αγ        xyz       reduce by B→γ
αB        xyz       shift x
αBx       yz        shift y
αBxy      z         reduce A→y
αBxA      z

Conflicts
The general shift-reduce technique is:
if there is no handle on the stack then shift
If there is a handle then reduce

However, what happens when there is a

choice
What action to take in case both shift and reduce are
valid?
shift-reduce conflict
Which rule to use for reduction if reduction is
possible by more than one rule?
reduce-reduce conflict

Conflicts come either because of ambiguous

grammars or parsing method is not powerful
enough
69

Shift reduce conflict

Consider the grammar E → E+E | E*E | id
and input id+id*id

If the parser reduces:
stack     input     action
E+E       *id       reduce by E→E+E
E         *id       shift
E*        id        shift
E*id                reduce by E→id
E*E                 reduce by E→E*E
E

If the parser shifts:
stack     input     action
E+E       *id       shift
E+E*      id        shift
E+E*id              reduce by E→id
E+E*E               reduce by E→E*E
E+E                 reduce by E→E+E
E

Reduce reduce conflict

Consider the grammar M → R+R | R+c | R
R → c
and input c+c

Reducing R+c by R→c:
Stack     input     action
          c+c       shift
c         +c        reduce by R→c
R         +c        shift
R+        c         shift
R+c                 reduce by R→c
R+R                 reduce by M→R+R
M

Reducing R+c by M→R+c:
Stack     input     action
          c+c       shift
c         +c        reduce by R→c
R         +c        shift
R+        c         shift
R+c                 reduce by M→R+c
M

LR parsing
[Figure: LR parser with input, stack, parse table (action and goto parts) and output]
Input contains the input string.
Stack contains a string of the form S0X1S1X2…XnSn where each Xi is a grammar symbol and each Si is a state.
Tables contain action and goto parts.
action table is indexed by state and terminal symbols.
goto table is indexed by state and non terminal symbols.

Actions in an LR (shift reduce) parser

Assume Si is top of stack and ai is current input symbol
Action [Si,ai] can have four values
1. shift ai to the stack and goto state Sj
2. reduce by a rule
3. Accept
4. error

Configurations in LR parser
Stack: S0X1S1X2…XmSm     Input: aiai+1…an$
If action[Sm,ai] = shift S
Then the configuration becomes
Stack: S0X1S1…XmSmaiS     Input: ai+1…an$
If action[Sm,ai] = reduce A → β
Then the configuration becomes
Stack: S0X1S1…Xm-rSm-r AS     Input: aiai+1…an$
Where r = |β| and S = goto[Sm-r,A]
If action[Sm,ai] = accept
Then parsing is completed. HALT
If action[Sm,ai] = error
Then invoke error recovery routine.

LR parsing Algorithm
Initial state:
Stack: S0     Input: w$

Loop{
if action[S,a] = shift S'
then push(a); push(S'); ip++
else if action[S,a] = reduce A → β
then pop (2*|β|) symbols;
push(A); push (goto[S'',A])
(S'' is the state after popping symbols)
else if action[S,a] = accept
then exit
else error
}
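The loop can be sketched in runnable form for the grammar E → E+T | T, T → T*F | F, F → (E) | id. The sketch makes two assumptions that should be read as such: the table entries below are the SLR entries for this grammar with productions numbered 1 to 6 in the order just listed, and the stack holds states only, so a reduction pops |β| entries instead of 2*|β|. The names ACTION, GOTO, PRODUCTIONS and lr_parse are made up for this example.

```python
# Production number -> (left hand side, length of right hand side).
PRODUCTIONS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3),
               4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}

ACTION = {}


def _fill(state, symbols, entry):
    """Set the same action for several terminals of one state."""
    for sym in symbols.split():
        ACTION[state, sym] = entry


for st in (0, 4, 6, 7):
    _fill(st, "id", ("s", 5))
    _fill(st, "(", ("s", 4))
_fill(1, "+", ("s", 6))
_fill(1, "$", ("acc", None))
_fill(2, "+ ) $", ("r", 2))
_fill(2, "*", ("s", 7))
_fill(3, "+ * ) $", ("r", 4))
_fill(5, "+ * ) $", ("r", 6))
_fill(8, "+", ("s", 6))
_fill(8, ")", ("s", 11))
_fill(9, "+ ) $", ("r", 1))
_fill(9, "*", ("s", 7))
_fill(10, "+ * ) $", ("r", 3))
_fill(11, "+ * ) $", ("r", 5))

GOTO = {(0, "E"): 1, (0, "T"): 2, (0, "F"): 3,
        (4, "E"): 8, (4, "T"): 2, (4, "F"): 3,
        (6, "T"): 9, (6, "F"): 3, (7, "F"): 10}


def lr_parse(tokens):
    """Return the production numbers used, in order of reduction."""
    stack = [0]                     # states only (see lead-in)
    inp = list(tokens) + ["$"]
    ip = 0
    reductions = []
    while True:
        kind, arg = ACTION.get((stack[-1], inp[ip]), ("err", None))
        if kind == "s":
            stack.append(arg)       # shift and go to state arg
            ip += 1
        elif kind == "r":
            lhs, length = PRODUCTIONS[arg]
            del stack[len(stack) - length:]          # pop |β| states
            stack.append(GOTO[stack[-1], lhs])       # state after popping
            reductions.append(arg)
        elif kind == "acc":
            return reductions
        else:
            raise SyntaxError(f"unexpected {inp[ip]!r} in state {stack[-1]}")


if __name__ == "__main__":
    print(lr_parse(["id", "+", "id", "*", "id"]))
```

The sequence of reductions is a right most derivation in reverse.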

Example

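Consider the grammar
(1) E → E+T   (2) E → T
(3) T → T*F   (4) T → F
(5) F → ( E )   (6) F → id

And its parse table
State   id    +     *     (     )     $       E    T    F
0       s5                s4                  1    2    3
1             s6                      acc
2             r2    s7          r2    r2
3             r4    r4          r4    r4
4       s5                s4                  8    2    3
5             r6    r6          r6    r6
6       s5                s4                       9    3
7       s5                s4                            10
8             s6                s11
9             r1    s7          r1    r1
10            r3    r3          r3    r3
11            r5    r5          r5    r5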

Parse id + id * id
Stack                        Input         Action
0                            id+id*id$     shift 5
0 id 5                       +id*id$       reduce by F→id
0 F 3                        +id*id$       reduce by T→F
0 T 2                        +id*id$       reduce by E→T
0 E 1                        +id*id$       shift 6
0 E 1 + 6                    id*id$        shift 5
0 E 1 + 6 id 5               *id$          reduce by F→id
0 E 1 + 6 F 3                *id$          reduce by T→F
0 E 1 + 6 T 9                *id$          shift 7
0 E 1 + 6 T 9 * 7            id$           shift 5
0 E 1 + 6 T 9 * 7 id 5       $             reduce by F→id
0 E 1 + 6 T 9 * 7 F 10       $             reduce by T→T*F
0 E 1 + 6 T 9                $             reduce by E→E+T
0 E 1                        $             ACCEPT

Parser states
Goal is to know the valid reductions at any
given point
Summarize all possible stack prefixes as a
parser state
Parser state is defined by a DFA state that
reads in the stack
Accept states of DFA are unique reductions
78

Constructing parse table

Augment the grammar
G is a grammar with start symbol S
The augmented grammar G' for G has a new start symbol S' and an additional production S' → S
When the parser reduces by this rule it will stop with accept

Viable prefixes
α is a viable prefix of the grammar if
There is a w such that αw is a right sentential form
α.w is a configuration of the shift reduce parser
As long as the parser has viable prefixes on the stack no parser error has been seen
The set of viable prefixes is a regular language (not obvious)
Construct an automaton that accepts viable prefixes

LR(0) items
An LR(0) item of a grammar G is a production of G with
a special symbol . at some position of the right side
Thus production A → XYZ gives four LR(0) items
A → .XYZ
A → X.YZ
A → XY.Z
A → XYZ.
An item indicates how much of a production has been seen at a point in the process of parsing
Symbols on the left of . are already on the stack
Symbols on the right of . are expected in the input

Start state
Start state of DFA is empty stack corresponding to S' → .S item
This means no input has been seen
The parser expects to see a string derived from S
Closure of a state adds items for all productions whose LHS occurs in an item in the state, just after .
Set of possible productions to be reduced next
Added items have . located at the beginning
No symbol of these items is on the stack as yet

Closure operation
If I is a set of items for a grammar G then
closure(I) is a set constructed as follows:
Every item in I is in closure (I)
If A → α.Bβ is in closure(I) and B → γ is a production then B → .γ is in closure(I)
Intuitively A → α.Bβ indicates that we might see a string derivable from Bβ as input
If B → γ is a production then we might see a string derivable from γ at this point

Example
Consider the grammar
E' → E
E → E+T | T
T → T*F | F
F → ( E ) | id
If I is { E' → .E } then closure(I) is
E' → .E
E → .E + T
E → .T
T → .T * F
T → .F
F → .id
F → .(E)

Applying symbols in a state

In the new state include all the
items that have appropriate input
symbol just after the .
Advance . in those items and
take closure

Goto operation
Goto(I,X), where I is a set of items and X is a grammar symbol,
is closure of set of items A → αX.β
such that A → α.Xβ is in I
Intuitively if I is set of items for some valid prefix α then goto(I,X) is set of valid items for prefix αX
If I is { E' → E. , E → E. + T } then goto(I,+) is
E → E + .T
T → .T * F
T → .F
F → .(E)
F → .id

Sets of items
C : Collection of sets of LR(0) items for grammar G'
C = { closure ( { S' → .S } ) }
repeat
for each set of items I in C
and each grammar symbol X
such that goto (I,X) is not empty and not in C
ADD goto(I,X) to C
until no more additions
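The closure, goto and sets-of-items procedures can be sketched together for the augmented expression grammar. In this illustration an item is a pair (production index, dot position); that representation and the names GRAMMAR, closure, goto and canonical_collection are assumptions of the sketch.

```python
GRAMMAR = [
    ("E'", ("E",)),                       # 0: augmented start production
    ("E", ("E", "+", "T")), ("E", ("T",)),
    ("T", ("T", "*", "F")), ("T", ("F",)),
    ("F", ("(", "E", ")")), ("F", ("id",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    """Add B -> .γ for every non terminal B that appears just after a dot."""
    result = set(items)
    work = list(items)
    while work:
        prod, dot = work.pop()
        rhs = GRAMMAR[prod][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for i, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (i, 0) not in result:
                    result.add((i, 0))
                    work.append((i, 0))
    return frozenset(result)


def goto(items, X):
    """Advance the dot over X in every item that allows it, then take closure."""
    moved = {(p, d + 1) for p, d in items
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == X}
    return closure(moved) if moved else frozenset()


def canonical_collection():
    symbols = sorted({s for _, rhs in GRAMMAR for s in rhs})
    start = closure({(0, 0)})
    collection = [start]
    seen = {start}
    for I in collection:                  # the list grows while we iterate
        for X in symbols:
            J = goto(I, X)
            if J and J not in seen:
                seen.add(J)
                collection.append(J)
    return collection


if __name__ == "__main__":
    for n, I in enumerate(canonical_collection()):
        print(f"set {n}:", sorted(I))
```

The sets are numbered in discovery order here, which need not match the I0 to I11 numbering used on the slides.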

Example
Grammar:
E' → E
E → E+T | T
T → T*F | F
F → (E) | id

I0: closure(E' → .E)
E' → .E
E → .E + T
E → .T
T → .T * F
T → .F
F → .(E)
F → .id

I1: goto(I0,E)
E' → E.
E → E. + T

I2: goto(I0,T)
E → T.
T → T. * F

I3: goto(I0,F)
T → F.

I4: goto(I0,( )
F → (.E)
E → .E + T
E → .T
T → .T * F
T → .F
F → .(E)
F → .id

I5: goto(I0,id)
F → id.

I6: goto(I1,+)
E → E + .T
T → .T * F
T → .F
F → .(E)
F → .id

I7: goto(I2,*)
T → T * .F
F → .(E)
F → .id

I8: goto(I4,E)
F → (E.)
E → E. + T
goto(I4,T) is I2
goto(I4,F) is I3
goto(I4,( ) is I4
goto(I4,id) is I5

I9: goto(I6,T)
E → E + T.
T → T. * F
goto(I6,F) is I3
goto(I6,( ) is I4
goto(I6,id) is I5

I10: goto(I7,F)
T → T * F.
goto(I7,( ) is I4
goto(I7,id) is I5

I11: goto(I8,) )
F → (E).
goto(I8,+) is I6
goto(I9,*) is I7

[Figure: transition diagram of the DFA whose states are the item sets I0 to I11, shown in three parts]

Construct SLR parse table

Construct C={I0, …, In} the collection of sets of LR(0) items
If A → α.aβ is in Ii and goto(Ii,a) = Ij
then action[i,a] = shift j
If A → α. is in Ii
then action[i,a] = reduce A → α for all a in follow(A)
If S' → S. is in Ii then action[i,$] = accept
If goto(Ii,A) = Ij
then goto[i,A]=j for all non terminals A
All entries not defined are errors

Notes
This method of parsing is called SLR (Simple LR)
LR parsers accept LR(k) languages
L stands for left to right scan of input
R stands for rightmost derivation
k stands for number of lookahead token

SLR is the simplest of the LR parsing methods. It is too weak to

handle most languages!
If an SLR parse table for a grammar does not have multiple
entries in any cell then the grammar is unambiguous
All SLR grammars are unambiguous
Are all unambiguous grammars in SLR?
94

Assignment
Construct SLR parse table for following grammar
E → E + E | E - E | E * E | E / E | ( E ) | digit
Show steps in parsing of string
9*5+(2+3*7)
Steps to be followed

Augment the grammar

Construct set of LR(0) items
Construct the parse table
Show states of parser as the given string is parsed

Due on todate+5

Example
Consider following grammar and its SLR parse table:
S' → S
S → L=R
S → R
L → *R
L → id
R → L

I0: S' → .S
S → .L=R
S → .R
L → .*R
L → .id
R → .L

I1: goto(I0, S)
S' → S.

I2: goto(I0, L)
S → L.=R
R → L.

Assignment (not to be submitted): Construct rest of the items and the parse table.

SLR parse table for the grammar

[Table: SLR parse table for the grammar; the entry action[2,=] contains both s6 and r6]

The table has multiple entries in action[2,=]

There is both a shift and a reduce entry in action[2,=]. Therefore state 2 has
a shift-reduce conflict on symbol =, However, the grammar is not
ambiguous.

Parse id=id assuming reduce action is taken in [2,=]

Stack                 input       action
0                     id=id$      shift 5
0 id 5                =id$        reduce by L→id
0 L 2                 =id$        reduce by R→L
0 R 3                 =id$        error

if shift action is taken in [2,=]

Stack                 input       action
0                     id=id$      shift 5
0 id 5                =id$        reduce by L→id
0 L 2                 =id$        shift 6
0 L 2 = 6             id$         shift 5
0 L 2 = 6 id 5        $           reduce by L→id
0 L 2 = 6 L 8         $           reduce by R→L
0 L 2 = 6 R 9         $           reduce by S→L=R
0 S 1                 $           ACCEPT

Problems in SLR parsing

No sentential form of this grammar can start with R=…
However, the reduce action in action[2,=] generates a sentential form starting with R=
Therefore, the reduce action is incorrect
In SLR parsing method state i calls for reduction on symbol a, by rule A → α if Ii contains [A → α.] and a is in follow(A)
However, when state i appears on the top of the stack, the viable prefix βα on the stack may be such that βA can not be followed by symbol a in any right sentential form
Thus, the reduction by the rule A → α on symbol a is invalid
SLR parsers can not remember the left context

Canonical LR Parsing
Carry extra information in the state so that wrong reductions by A → α will be ruled out
Redefine LR items to include a terminal symbol as a second component (look ahead symbol)
The general form of the item becomes [A → α.β, a] which is called LR(1) item.
Item [A → α., a] calls for reduction only if next input is a. The set of such symbols a will be a subset of Follow(A).

Closure(I)
repeat
for each item [A → α.Bβ, a] in I
for each production B → γ in G'
and for each terminal b in First(βa)
add item [B → .γ, b] to I
until no more additions to I
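The LR(1) closure can be sketched for the small grammar S' → S, S → CC, C → cC | d. In this illustration an item is a triple (production index, dot position, lookahead), and first_of relies on the fact that this particular grammar has no ε productions; both points, and all names used, are assumptions of the sketch.

```python
GRAMMAR = [
    ("S'", ("S",)),        # 0
    ("S", ("C", "C")),     # 1
    ("C", ("c", "C")),     # 2
    ("C", ("d",)),         # 3
]
NONTERMINALS = {"S'", "S", "C"}


def first_of(symbols):
    """First of a non-empty symbol string.

    The grammar above has no epsilon productions, so only the first
    symbol of the string matters (a simplification of this sketch).
    """
    head = symbols[0]
    if head not in NONTERMINALS:
        return {head}
    result, pending, seen = set(), [head], set()
    while pending:
        nt = pending.pop()
        if nt in seen:
            continue
        seen.add(nt)
        for lhs, rhs in GRAMMAR:
            if lhs == nt:
                if rhs[0] in NONTERMINALS:
                    pending.append(rhs[0])
                else:
                    result.add(rhs[0])
    return result


def closure(items):
    """For [A -> α.Bβ, a] add [B -> .γ, b] for every b in First(βa)."""
    result = set(items)
    work = list(items)
    while work:
        prod, dot, lookahead = work.pop()
        rhs = GRAMMAR[prod][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for b in first_of(rhs[dot + 1:] + (lookahead,)):
                for i, (lhs, _) in enumerate(GRAMMAR):
                    item = (i, 0, b)
                    if lhs == rhs[dot] and item not in result:
                        result.add(item)
                        work.append(item)
    return result


if __name__ == "__main__":
    for prod, dot, la in sorted(closure({(0, 0, "$")})):
        lhs, rhs = GRAMMAR[prod]
        print(f"[{lhs} -> {' '.join(rhs[:dot])}.{' '.join(rhs[dot:])}, {la}]")
```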

Example
Consider the following grammar
S' → S
S → CC
C → cC | d
Compute closure(I) where I={[S' → .S, $]}
S' → .S,   $
S → .CC,   $
C → .cC,   c
C → .cC,   d
C → .d,    c
C → .d,    d

Example
Construct sets of LR(1) items for the grammar on previous slide

I0: S' → .S,   $
S → .CC,   $
C → .cC,   c/d
C → .d,    c/d

I1: goto(I0,S)
S' → S.,   $

I2: goto(I0,C)
S → C.C,   $
C → .cC,   $
C → .d,    $

I3: goto(I0,c)
C → c.C,   c/d
C → .cC,   c/d
C → .d,    c/d

I4: goto(I0,d)
C → d.,    c/d

I5: goto(I2,C)
S → CC.,   $

I6: goto(I2,c)
C → c.C,   $
C → .cC,   $
C → .d,    $

I7: goto(I2,d)
C → d.,    $

I8: goto(I3,C)
C → cC.,   c/d

I9: goto(I6,C)
C → cC.,   $

Construction of Canonical LR
parse table
Construct C={I0, …, In} the sets of LR(1) items.
If [A → α.aβ, b] is in Ii and goto(Ii, a)=Ij
then action[i,a]=shift j
If [A → α., a] is in Ii
then action[i,a] = reduce A → α
If [S' → S., $] is in Ii
then action[i,$] = accept
If goto(Ii, A) = Ij then goto[i,A] = j for all non terminals A

Parse table
Productions: (1) S → CC   (2) C → cC   (3) C → d

State   c     d     $       S    C
0       s3    s4            1    2
1                   acc
2       s6    s7                 5
3       s3    s4                 8
4       r3    r3
5                   r1
6       s6    s7                 9
7                   r3
8       r2    r2
9                   r2

Notes on Canonical LR Parser

Consider the grammar discussed in the previous two slides. The language specified by the grammar is c*dc*d.
When reading input ccdccd the parser shifts c's onto the stack and then goes into state 4 after reading d. It then calls for reduction by C → d if the following symbol is c or d.
If $ follows the first d then the input string is c*d which is not in the language; the parser declares an error
On an error canonical LR parser never makes a wrong shift/reduce move. It immediately declares an error
Problem: Canonical LR parse table has a large number of states

LALR Parse table

Look Ahead LR parsers
Consider a pair of similar looking states (same kernel and different lookaheads) in the set of LR(1) items
I4: C → d., c/d     I7: C → d., $
Replace I4 and I7 by a new state I47 consisting of (C → d., c/d/$)
Similarly I3 & I6 and I8 & I9 form pairs
Merge LR(1) items having the same core

Construct LALR parse table

Construct C = {I0, ..., In}, the set of LR(1) items

For each core present in LR(1) items find all sets having
the same core and replace these sets by their union

Let C' = {J0, ..., Jm} be the resulting set of items

Construct action table as was done earlier

Let J = I1 U I2 U ... U Ik
Since I1, I2, ..., Ik have the same core, goto(I1,X), ...,
goto(Ik,X) will also have the same core
Let K = goto(I1,X) U goto(I2,X) U ... U goto(Ik,X); then
goto(J,X) = K
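
A sketch of the merge step in Python, assuming the ten LR(1) item sets of the grammar S' -> S, S -> CC, C -> cC | d are already available. The encoding (a state maps each core item to its set of lookaheads) and the naming of merged states by concatenating the original numbers are choices made for this illustration.

```python
from collections import defaultdict

# LR(1) states: core item (lhs, rhs with dot) -> set of lookaheads.
LR1_STATES = {
    0: {("S'", ".S"): {"$"}, ("S", ".CC"): {"$"},
        ("C", ".cC"): {"c", "d"}, ("C", ".d"): {"c", "d"}},
    1: {("S'", "S."): {"$"}},
    2: {("S", "C.C"): {"$"}, ("C", ".cC"): {"$"}, ("C", ".d"): {"$"}},
    3: {("C", "c.C"): {"c", "d"}, ("C", ".cC"): {"c", "d"},
        ("C", ".d"): {"c", "d"}},
    4: {("C", "d."): {"c", "d"}},
    5: {("S", "CC."): {"$"}},
    6: {("C", "c.C"): {"$"}, ("C", ".cC"): {"$"}, ("C", ".d"): {"$"}},
    7: {("C", "d."): {"$"}},
    8: {("C", "cC."): {"c", "d"}},
    9: {("C", "cC."): {"$"}},
}
LR1_GOTO = {
    (0, "S"): 1, (0, "C"): 2, (0, "c"): 3, (0, "d"): 4,
    (2, "C"): 5, (2, "c"): 6, (2, "d"): 7,
    (3, "C"): 8, (3, "c"): 3, (3, "d"): 4,
    (6, "C"): 9, (6, "c"): 6, (6, "d"): 7,
}


def merge_by_core(states, goto):
    """Union all states that share a core and remap the goto function."""
    groups = defaultdict(list)
    for number in sorted(states):
        # The core is the set of items with lookaheads ignored.
        groups[frozenset(states[number])].append(number)

    name_of = {}
    for members in groups.values():
        for number in members:
            name_of[number] = "".join(str(m) for m in members)

    merged = {}
    for members in groups.values():
        union = defaultdict(set)
        for number in members:
            for core_item, lookaheads in states[number].items():
                union[core_item] |= lookaheads
        merged[name_of[members[0]]] = dict(union)

    # States with the same core have gotos with the same core, so the
    # remapped entries for merged states agree with one another.
    merged_goto = {(name_of[i], x): name_of[j] for (i, x), j in goto.items()}
    return merged, merged_goto


lalr_states, lalr_goto = merge_by_core(LR1_STATES, LR1_GOTO)
print(sorted(lalr_states))
print(lalr_states["47"])
```

The ten LR(1) states collapse to seven: 3 and 6, 4 and 7, and 8 and 9 are merged, and the other four are unchanged.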
108

LALR parse table

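Productions: (1) S -> CC   (2) C -> cC   (3) C -> d

State | c   | d   | $   | S | C
0     | s36 | s47 |     | 1 | 2
1     |     |     | acc |   |
2     | s36 | s47 |     |   | 5
36    | s36 | s47 |     |   | 89
47    | r3  | r3  | r3  |   |
5     |     |     | r1  |   |
89    | r2  | r2  | r2  |   |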

109

Notes on LALR parse table

The modified parser behaves as the original except
that it will reduce by C -> d on inputs like ccd. The
error will eventually be caught before any
more symbols are shifted.
In general, a core is a set of LR(0) items, and an
LR(1) grammar may produce more than one
set of items with the same core.
Merging items never produces shift/reduce
conflicts but may produce reduce/reduce
conflicts.
SLR and LALR parse tables have the same number of states.
110

Notes on LALR parse table

Merging items may result in conflicts in LALR
parsers which did not exist in LR parsers
New conflicts cannot be of the shift-reduce kind:
Assume there is a shift-reduce conflict in some state of
the LALR parser with items
{[X -> α., a], [Y -> γ.aβ, b]}
Then there must have been a state in the LR parser with
the same core, containing [X -> α., a] and [Y -> γ.aβ, c]
for some c, which already has a shift-reduce conflict on a
Contradiction, because the LR parser did not have conflicts

LALR parser can have new reduce-reduce conflicts

Assume states
{[X -> α., a], [Y -> β., b]} and {[X -> α., b], [Y -> β., a]}
Merging the two states produces
{[X -> α., a/b], [Y -> β., a/b]}
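
The reduce-reduce case can be checked mechanically. In this sketch a state maps each item to its lookahead set, and the strings "alpha" and "beta" are placeholders for the two right-hand sides; the function names are illustrative.

```python
def merge(state_a, state_b):
    """Union the lookaheads of two LR(1) states that share the same core."""
    if state_a.keys() != state_b.keys():
        raise ValueError("states must have the same core")
    return {item: state_a[item] | state_b[item] for item in state_a}


def reduce_reduce_conflicts(state):
    """Lookaheads on which two different complete items both call for a reduce."""
    complete = [las for (lhs, rhs), las in state.items() if rhs.endswith(".")]
    conflicts = set()
    for index, first in enumerate(complete):
        for second in complete[index + 1:]:
            conflicts |= first & second
    return conflicts


# Two conflict-free LR(1) states with the same core.
state_1 = {("X", "alpha."): {"a"}, ("Y", "beta."): {"b"}}
state_2 = {("X", "alpha."): {"b"}, ("Y", "beta."): {"a"}}

print(reduce_reduce_conflicts(state_1))
print(reduce_reduce_conflicts(state_2))
print(reduce_reduce_conflicts(merge(state_1, state_2)))
```

Neither original state has a conflict, but in the merged state both reductions are called for on a and on b.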
111

Notes on LALR parse table

LALR parsers are not built by first making
canonical LR parse tables
There are direct, complicated but efficient
algorithms to develop LALR parsers
Relative power of various classes
SLR(1) ⊂ LALR(1) ⊂ LR(1)
SLR(k) ⊂ LALR(k) ⊂ LR(k)
LL(k) ⊂ LR(k)
112

Error Recovery
An error is detected when an entry in the action table is
found to be empty.
Panic mode error recovery can be implemented as
follows:
Scan down the stack until a state S with a goto on a
particular nonterminal A is found.
Discard zero or more input symbols until a symbol a is
found that can legitimately follow A.
Stack the state goto[S,A] and resume parsing.

Choice of A: Normally these are nonterminals
representing major program pieces such as an
expression, statement or a block. For example, if A is
the nonterminal stmt, a might be a semicolon or end.
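
The three steps can be sketched as a helper that a table-driven parser would call on an empty action entry. Everything concrete here is hypothetical: the function name, the sync map (recovery nonterminal A to the terminals that may follow it), and the state numbers in the usage example do not come from any real grammar.

```python
def panic_mode_recover(stack, tokens, pos, goto, sync):
    """Apply the three panic-mode steps; return the new input position or None.

    stack : list of parser states, top at the end (modified in place)
    goto  : dict mapping (state, nonterminal) -> state
    sync  : dict mapping each recovery nonterminal A to the set of
            terminals that can legitimately follow A
    """
    # Step 1: scan down the stack for a state S with a goto on some A.
    for depth in range(len(stack) - 1, -1, -1):
        state = stack[depth]
        nonterminal = next((a for a in sync if (state, a) in goto), None)
        if nonterminal is not None:
            break
    else:
        return None  # no state on the stack allows recovery
    del stack[depth + 1:]  # discard the states above S

    # Step 2: discard zero or more input symbols until one can follow A.
    while pos < len(tokens) and tokens[pos] not in sync[nonterminal]:
        pos += 1
    if pos == len(tokens):
        return None  # input exhausted without a synchronizing symbol

    # Step 3: stack goto[S, A]; parsing resumes with tokens[pos] as lookahead.
    stack.append(goto[(state, nonterminal)])
    return pos


# Hypothetical situation: only state 0 has a goto on stmt.
stack = [0, 5, 9]
goto_table = {(0, "stmt"): 2}
sync_sets = {"stmt": {";", "end"}}
tokens = ["x", "+", ";", "y", "$"]
new_pos = panic_mode_recover(stack, tokens, 0, goto_table, sync_sets)
print(stack, new_pos)
```

In the usage example the states above state 0 are popped, the symbols x and + are skipped, and parsing would resume at the semicolon as if a stmt had just been recognized.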
113

Parser Generator
Some common parser generators
YACC: Yet Another Compiler Compiler
Bison: GNU Software
ANTLR: ANother Tool for Language Recognition

Yacc/Bison source program specification
(accepts LALR grammars)
declarations
%%
translation rules
%%
supporting C routines
114

Yacc and Lex schema

[Figure: Token specifications -> Lex -> lex.yy.c (C code for
lexical analyzer); Grammar specifications -> Yacc -> y.tab.c
(C code for parser); both C files -> C compiler -> object code
(the parser); Input program -> Parser -> Abstract syntax tree]

Refer to YACC Manual

115
