syntax analysis

The document provides an overview of syntax analysis, focusing on context-free grammar (CFG), its components, and the role of parsers in analyzing source code. It discusses various parsing techniques, including top-down and bottom-up parsing, as well as specific methods like recursive descent and predictive parsing. Additionally, it covers concepts such as derivation, parse trees, ambiguity, and the relationship between context-free languages and regular expressions.

Syntax Analysis

By:
Trusha R. Patel
Asst. Prof.
CE Dept., CSPIT, CHARUSAT
Role of Parser

[Figure: Source program → Lexical Analyzer (Scanner) → token → Syntax Analyzer (Parser) → parse tree → rest of front end. The parser requests each token with getNextToken; both analyzers use the symbol table.]
CFG (Context Free Grammar)
A CFG consists of terminals, nonterminals, a start symbol, and productions

❖ Terminals
Basic symbols from which strings are formed
“Token name” is a synonym for “terminal”

❖ Nonterminals
Syntactic variables that denote sets of strings

❖ Start symbol
One nonterminal is distinguished as the start symbol
The set of strings it denotes is the language generated by the grammar
Its productions are listed first

❖ Production
Specifies the manner in which terminals and nonterminals can be combined to form strings

A production consists of
❖ A nonterminal called the “head” or “left side”
❖ The symbol →
❖ A “body” or “right side” consisting of zero or more terminals and nonterminals

Grammar for arithmetic expression

expression → expression + term
expression → expression - term
expression → term
term → term * factor
term → term / factor
term → factor
factor → ( expression )
factor → id

Notational convention for grammar
Symbols for terminals
❖ Lowercase letters a,b,c,…,z
❖ Operator symbols + * / etc.
❖ Punctuation symbols , ; etc.
❖ Digits 0,1,2,…,9
❖ Boldface strings id , if etc.

Symbols for nonterminals
❖ Uppercase letters A,B,C,…,Z
❖ S usually indicates the start symbol
❖ Lowercase, italic names expr , stmt etc.

X , Y , Z represent grammar symbols, either nonterminal or terminal

u , v , … , z represent strings of terminals

α , β , γ represent strings of grammar symbols (terminal and/or nonterminal)

A → α1 , A → α2 , … , A → αk may be written as
A → α1 | α2 | … | αk

Unless stated otherwise, the head of the first production is the start symbol

Language generated by grammar
G : Grammar
L(G) : Language generated by grammar G

A language generated by CFG is called CFL (Context Free Language)

If two grammars generate the same language, the grammars are said to be equivalent

Derivation
Beginning with the start symbol, each rewriting step replaces a nonterminal by the body of one of its productions

Grammar: E → E + E | id
String: id + id + id
Derivation: E ⇒ E + E ⇒ E + E + E ⇒ id + E + E ⇒ id + id + E ⇒ id + id + id

[Figure: parse tree corresponding to this derivation of id + id + id]

Leftmost derivation
❖ The leftmost nonterminal is always replaced first by the body of one of its productions
Rightmost derivation (canonical derivation)
❖ The rightmost nonterminal is always replaced first by the body of one of its productions

Grammar : E → E + E | E * E | - E | ( E ) | id
String : - ( id + id )

Leftmost derivation
E ⇒ –E ⇒ –(E) ⇒ –(E+E) ⇒ –(id+E) ⇒ –(id+id)
Rightmost derivation
E ⇒ –E ⇒ –(E) ⇒ –(E+E) ⇒ –(E+id) ⇒ –(id+id)
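
The two derivation orders can be replayed mechanically. The following is a minimal Python sketch, not part of the original slides: the function name `derive` and the list encoding of production bodies are assumptions made for illustration. It applies the same sequence of production choices to either the leftmost or the rightmost E.

```python
# Productions of E from the slide: E -> E + E | E * E | - E | ( E ) | id
BODIES = [["E", "+", "E"], ["E", "*", "E"], ["-", "E"], ["(", "E", ")"], ["id"]]

def derive(choices, leftmost=True):
    """Replay a derivation from E, expanding the leftmost or rightmost E each step."""
    form = ["E"]
    history = ["".join(form)]
    for body in choices:
        if body not in BODIES:
            raise ValueError(f"not a production body of E: {body}")
        positions = [i for i, sym in enumerate(form) if sym == "E"]
        i = positions[0] if leftmost else positions[-1]
        form[i:i + 1] = body  # replace the chosen nonterminal by the body
        history.append("".join(form))
    return history

# The same production choices, applied in leftmost and in rightmost order.
CHOICES = [["-", "E"], ["(", "E", ")"], ["E", "+", "E"], ["id"], ["id"]]
print(" => ".join(derive(CHOICES, leftmost=True)))
print(" => ".join(derive(CHOICES, leftmost=False)))
```

Both runs end in the same sentence; only the order in which the two id leaves are produced differs.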

Reduction
A substring that matches the body of a production is replaced by the nonterminal at the head of that production

Grammar: E → E + E | id
String: id + id + id
Reduction steps:
id + id + id
E + id + id
E + E + id
E + id
E + E
E

[Figure: parse tree built bottom-up over id + id + id]

Parse tree
Graphical representation of a derivation
Parse tree for the string - ( id + id ) is

        E
      /   \
     -     E
         / | \
        (  E  )
         / | \
        E  +  E
        |     |
        id    id

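Ambiguity
A grammar that produces more than one parse tree for some string is said to be an ambiguous grammar
Equivalently, some string has more than one leftmost derivation or more than one rightmost derivation

Grammar: E → E + E | E * E | id
String : id + id * id

[Figure: two parse trees for id + id * id. One has * at the root and groups the string as (id + id) * id; the other has + at the root and groups it as id + (id * id).]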
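
Ambiguity can be checked for a given string by counting its parse trees. The sketch below is an illustration only and assumes the slide's grammar E → E + E | E * E | id; the function name `count_parse_trees` is hypothetical. It tries every operator as the root of the tree and multiplies the counts of the two sides.

```python
from functools import lru_cache

def count_parse_trees(tokens):
    """Count parse trees of tokens under E -> E + E | E * E | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(lo, hi):  # number of trees deriving tokens[lo:hi] from E
        if hi - lo == 1:
            return 1 if tokens[lo] == "id" else 0
        total = 0
        for k in range(lo + 1, hi - 1):
            if tokens[k] in ("+", "*"):  # operator at position k is the root
                total += count(lo, k) * count(k + 1, hi)
        return total

    return count(0, len(tokens))

print(count_parse_trees(["id", "+", "id", "*", "id"]))
```

A count greater than one for any string is enough to show that the grammar is ambiguous.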

CFG vs. RE
Grammars are more powerful than regular expressions (RE)

Everything that can be described by an RE can be described by a grammar, but not vice-versa

Every regular language is a context-free language, but not vice-versa

RE : (a|b)*abb

Grammar :
S → aX | aS | bS
X → bY
Y → bZ
Z → ϵ
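
That the grammar and the RE describe the same language can be spot-checked by brute force. This is a small sketch under stated assumptions: the dictionary encoding of the right-linear grammar and the helper names `grammar_accepts` and `regex_accepts` are illustrative, and the check covers only short strings rather than proving equivalence.

```python
import re
from itertools import product

# Right-linear grammar from the slide; each body is (terminal, next nonterminal).
GRAMMAR = {
    "S": [("a", "X"), ("a", "S"), ("b", "S")],
    "X": [("b", "Y")],
    "Y": [("b", "Z")],
    "Z": [],  # Z has only the production Z -> ϵ
}
NULLABLE = {"Z"}

def grammar_accepts(text):
    """Track every nonterminal the derivation could currently be in."""
    current = {"S"}
    for ch in text:
        current = {nxt for nt in current for t, nxt in GRAMMAR[nt] if t == ch}
    return bool(current & NULLABLE)

def regex_accepts(text):
    return re.fullmatch(r"(a|b)*abb", text) is not None

# Compare both recognizers on every string over {a, b} up to length 7.
for n in range(8):
    for chars in product("ab", repeat=n):
        s = "".join(chars)
        assert grammar_accepts(s) == regex_accepts(s), s
print("grammar and RE agree on all strings up to length 7")
```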

Left recursion
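
The later examples in this deck remove left recursion before building a predictive parser. The standard transformation for immediate left recursion rewrites A → Aα | β as A → βA’ and A’ → αA’ | ϵ. The Python sketch below illustrates that transformation only; the function name and the list encoding (an empty list stands for ϵ) are assumptions, and indirect left recursion is not handled.

```python
def remove_immediate_left_recursion(head, bodies):
    """Rewrite A -> A α | β as A -> β A' and A' -> α A' | ϵ (empty list = ϵ)."""
    alphas = [b[1:] for b in bodies if b and b[0] == head]  # bodies of the form A α
    betas = [b for b in bodies if not b or b[0] != head]    # all other bodies
    if not alphas:
        return {head: bodies}  # nothing to do
    new_head = head + "'"
    return {
        head: [beta + [new_head] for beta in betas],
        new_head: [alpha + [new_head] for alpha in alphas] + [[]],
    }

# E -> E + T | T   becomes   E -> T E'   and   E' -> + T E' | ϵ
print(remove_immediate_left_recursion("E", [["E", "+", "T"], ["T"]]))
```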

Left factoring
Left factoring is a grammar transformation that is useful for producing a grammar suitable for predictive parsing.

If A → αβ1 | αβ2 are two productions of A and the input string begins with a nonempty string derived from α, we do not know whether to expand A to αβ1 or αβ2.

Input : grammar that is not left factored : A → αβ1 | αβ2

Output : left-factored grammar : A → αA’
A’ → β1 | β2
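
A single left-factoring pass can be sketched in a few lines of Python. This is an illustrative sketch rather than a complete algorithm: the names `common_prefix` and `left_factor` are hypothetical, an empty list stands for ϵ, and alternatives are grouped only by their first symbol, so the pass may need to be repeated on the new nonterminals.

```python
def common_prefix(bodies):
    """Longest sequence of symbols shared at the start of every body."""
    prefix = []
    for symbols in zip(*bodies):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return prefix

def left_factor(head, bodies):
    """One pass: A -> α β1 | α β2 becomes A -> α A' and A' -> β1 | β2."""
    groups = {}
    for body in bodies:  # group alternatives by their first symbol
        groups.setdefault(body[0] if body else None, []).append(body)
    result = {head: []}
    new_head = head
    for first, group in groups.items():
        if first is None or len(group) == 1:
            result[head].extend(group)  # nothing to factor
            continue
        prefix = common_prefix(group)
        new_head += "'"
        result[head].append(prefix + [new_head])
        result[new_head] = [body[len(prefix):] for body in group]
    return result

# S -> i E t S | i E t S e S | a
print(left_factor("S", [["i", "E", "t", "S"], ["i", "E", "t", "S", "e", "S"], ["a"]]))
```

For the dangling-else grammar used in the example, the result is S → i E t S S’ | a and S’ → ϵ | e S.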

General types of parser

❖ Universal parser
• General method
• Can parse any grammar
• Methods such as
• Cocke-Younger-Kasami algorithm
• Earley’s algorithm

❖ Top-down parser
• Scans the string from left to right
• Builds the parse tree from the top (root) to the bottom (leaves)
• Performs derivation

❖ Bottom-up parser
• Scans the string from left to right
• Starts from the leaves and works up to the root
• Performs reduction

Top-Down Parsing
Construct the parse tree for the input string starting from the root and creating the nodes of the parse tree in preorder (derivation)

Grammar ( G ) : E → E + E | E * E | - E | ( E ) | id
String : id + id * id

      E
    / | \
   E  +  E
   |   / | \
   id E  *  E
      |     |
      id    id

Different Top-Down Parsing Techniques
1. Recursive-Descent Parsing ( RDP )
2. Predictive Parsing

1. Recursive-Descent Parsing ( RDP )
May require backtracking to find the correct production to apply
A left-recursive grammar can cause RDP to go into an infinite loop

Algorithm
void A( )
{
    choose an A-production, A → X1 X2 … Xk ;
    for ( i = 1 to k )
    {
        if ( Xi is a nonterminal )
            call procedure Xi( );
        else if ( Xi equals the current input symbol a )
            advance the input to the next symbol;
        else
            /* error occurred */ ;
    }
}
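
The procedure above picks one production and reports an error on a mismatch; with backtracking, the parser instead tries the next alternative. The following Python sketch shows one way to do that with generators. It is an illustration, not the slides' implementation: the function names are hypothetical, and it uses the small grammar S → c A d, A → a b | a from the backtracking example in this deck.

```python
GRAMMAR = {"S": [["c", "A", "d"]], "A": [["a", "b"], ["a"]]}

def parse(symbol, tokens, pos):
    """Yield every input position reachable after matching symbol from pos."""
    if symbol not in GRAMMAR:  # terminal: must equal the current input symbol
        if pos < len(tokens) and tokens[pos] == symbol:
            yield pos + 1
        return
    for body in GRAMMAR[symbol]:  # try each production in turn (backtracking)
        yield from match_body(body, tokens, pos)

def match_body(body, tokens, pos):
    """Match the symbols of one production body left to right."""
    if not body:
        yield pos
        return
    for mid in parse(body[0], tokens, pos):
        yield from match_body(body[1:], tokens, mid)

def accepts(text):
    tokens = list(text)
    return any(end == len(tokens) for end in parse("S", tokens, 0))

print(accepts("cad"), accepts("cabd"), accepts("cd"))
```

On the input cad the parser first tries A → a b, fails on b, and then succeeds with A → a. A left-recursive production would make `parse` call itself without consuming input, which is the infinite loop mentioned above.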

Process:
❖ Maintain 2 pointers
Lookahead pointer (LP) (points to the top element of the stack)
Input pointer (IP) (points to the current symbol in the input string)
❖ If a nonterminal is on the stack (pointed to by LP) then
replace it by the body of one of its productions, and let LP point to the leftmost symbol of that body
❖ If a terminal is on the stack (pointed to by LP) then
compare the stack symbol and the input symbol (pointed to by LP and IP)
If they match then
advance both pointers (LP and IP)
If they do not match then
backtrack

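Grammar:
S → c A d
A → a b | a
String : c a d

[Figure: RDP trace with pointers LP and IP. S is expanded to c A d and c matches. A is first expanded to a b; a matches but b does not match d, so the parser backtracks. A is then expanded to a; a and d match and the whole string is matched.]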
FIRST and FOLLOW
Used to construct top-down and bottom-up parser

FIRST ( α ) :
Set of terminals that begin strings derived from α

FOLLOW ( α ) :
Set of terminals that can appear immediately to the right of α

FIRST
Rules for FIRST ( α )

❖ α is a terminal
FIRST ( α ) = { α }

❖ α is a nonterminal: look at each production of α
α → ϵ : add ϵ to FIRST ( α )
α → βγ , β is a terminal : add β to FIRST ( α )
α → βγ , β is a nonterminal : add FIRST ( β ) to FIRST ( α )
If FIRST ( β ) contains ϵ : add ( FIRST ( β ) − { ϵ } ) ∪ FIRST ( γ ) to FIRST ( α )

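FOLLOW
Rules for FOLLOW ( α ), where α is a nonterminal

❖ α is the start symbol
Add $ to FOLLOW ( α )

❖ Find α on the right-hand side of each production β → … α γ
γ is ϵ (α is the last symbol) : add FOLLOW ( β ) to FOLLOW ( α )
γ begins with a terminal : add that terminal to FOLLOW ( α )
γ begins with a nonterminal : add FIRST ( γ ) − { ϵ } to FOLLOW ( α )
If FIRST ( γ ) contains ϵ : add ( FIRST ( γ ) − { ϵ } ) ∪ FOLLOW ( β ) to FOLLOW ( α )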
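
The FIRST and FOLLOW rules are usually applied repeatedly until no set changes. The sketch below is one possible fixed-point implementation in Python, assuming the expression grammar E → T E’, E’ → + T E’ | ϵ, T → F T’, T’ → * F T’ | ϵ, F → ( E ) | id used in the predictive-parsing example; the helper names and the use of an empty list for ϵ are illustrative choices.

```python
EPS, END = "ϵ", "$"
GRAMMAR = {  # an empty body [] stands for ϵ
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}

def first_of_string(symbols, first):
    """FIRST of a string of grammar symbols, given FIRST of each nonterminal."""
    out = set()
    for s in symbols:
        f = first[s] if s in GRAMMAR else {s}
        out |= f - {EPS}
        if EPS not in f:
            return out
    out.add(EPS)  # every symbol (or the empty string) can derive ϵ
    return out

def compute_first():
    first = {a: set() for a in GRAMMAR}
    changed = True
    while changed:  # repeat until no set grows
        changed = False
        for head, bodies in GRAMMAR.items():
            for body in bodies:
                new = first_of_string(body, first)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first

def compute_follow(first, start="E"):
    follow = {a: set() for a in GRAMMAR}
    follow[start].add(END)
    changed = True
    while changed:
        changed = False
        for head, bodies in GRAMMAR.items():
            for body in bodies:
                for i, s in enumerate(body):
                    if s not in GRAMMAR:
                        continue  # FOLLOW is defined for nonterminals only
                    rest = first_of_string(body[i + 1:], first)
                    new = rest - {EPS}
                    if EPS in rest:  # what follows the head also follows s
                        new |= follow[head]
                    if not new <= follow[s]:
                        follow[s] |= new
                        changed = True
    return follow

FIRST = compute_first()
FOLLOW = compute_follow(FIRST)
print(FIRST)
print(FOLLOW)
```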


2. Predictive Parsing
A special case of RDP
No backtracking is required
Chooses the correct production by looking ahead at a fixed number of input symbols
The class of grammars for which a predictive parser can be constructed by looking k symbols ahead in the input is called the LL(k) class

In LL(k):
First L : left-to-right scan of the input string
Second L : leftmost derivation
k : number of input symbols of lookahead

LL(1) grammar
❖ Covers most programming constructs
❖ Properties
Unambiguous
No left recursion

Grammar:
E → T E’
E’ → + T E’ | ϵ
T → F T’
T’ → * F T’ | ϵ
F → ( E ) | id

FIRST ( E ) = { id , ( }      FOLLOW ( E ) = { $ , ) }
FIRST ( E’ ) = { + , ϵ }      FOLLOW ( E’ ) = { $ , ) }
FIRST ( T ) = { id , ( }      FOLLOW ( T ) = { $ , ) , + }
FIRST ( T’ ) = { * , ϵ }      FOLLOW ( T’ ) = { $ , ) , + }
FIRST ( F ) = { id , ( }      FOLLOW ( F ) = { $ , ) , + , * }

Parsing table (rows: nonterminals, columns: terminals, entries: production bodies)
Nonterminal | id  | +    | *    | (   | ) | $
E           | TE’ |      |      | TE’ |   |
E’          |     | +TE’ |      |     | ϵ | ϵ
T           | FT’ |      |      | FT’ |   |
T’          |     | ϵ    | *FT’ |     | ϵ | ϵ
F           | id  |      |      | (E) |   |

No cell contains more than one production, so the grammar is LL(1)
(1) Parse the string id+id

STACK       | INPUT     | OUTPUT
$ E         | id + id $ |
$ E’ T      | id + id $ | E → T E’
$ E’ T’ F   | id + id $ | T → F T’
$ E’ T’ id  | id + id $ | F → id
$ E’ T’     | + id $    |
$ E’        | + id $    | T’ → ϵ
$ E’ T +    | + id $    | E’ → + T E’
$ E’ T      | id $      |
$ E’ T’ F   | id $      | T → F T’
$ E’ T’ id  | id $      | F → id
$ E’ T’     | $         |
$ E’        | $         | T’ → ϵ
$           | $         | E’ → ϵ
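
The stack moves in this trace follow a simple loop: expand a nonterminal using the table, or match a terminal against the input. The Python sketch below hardcodes the LL(1) table of the expression grammar (E → T E’, E’ → + T E’ | ϵ, T → F T’, T’ → * F T’ | ϵ, F → ( E ) | id); the function name `ll1_parse` and the dictionary layout are assumptions made for illustration.

```python
# (nonterminal, lookahead) -> production body; [] stands for ϵ.
TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "+"): [], ("T'", "*"): ["*", "F", "T'"], ("T'", ")"): [], ("T'", "$"): [],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {head for head, _ in TABLE}

def ll1_parse(tokens):
    """Return the list of (head, body) productions used, in output order."""
    stack = ["$", "E"]          # top of the stack is the end of the list
    tokens = tokens + ["$"]
    pos = 0
    output = []
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            if (top, look) not in TABLE:
                raise SyntaxError(f"no table entry for ({top}, {look})")
            body = TABLE[(top, look)]
            output.append((top, body))
            stack.extend(reversed(body))  # leftmost body symbol ends up on top
        elif top == look:
            pos += 1                      # terminal (or $) matches the input
        else:
            raise SyntaxError(f"expected {top}, found {look}")
    return output

for head, body in ll1_parse(["id", "+", "id"]):
    print(head, "->", " ".join(body) or "ϵ")
```

The printed productions correspond to the OUTPUT column of the trace for id+id.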
(2) Parse the string (id+id)*id

STACK               | INPUT               | OUTPUT
$ E                 | ( id + id ) * id $  |
$ E’ T              | ( id + id ) * id $  | E → T E’
$ E’ T’ F           | ( id + id ) * id $  | T → F T’
$ E’ T’ ) E (       | ( id + id ) * id $  | F → ( E )
$ E’ T’ ) E         | id + id ) * id $    |
$ E’ T’ ) E’ T      | id + id ) * id $    | E → T E’
$ E’ T’ ) E’ T’ F   | id + id ) * id $    | T → F T’
$ E’ T’ ) E’ T’ id  | id + id ) * id $    | F → id
$ E’ T’ ) E’ T’     | + id ) * id $       |
$ E’ T’ ) E’        | + id ) * id $       | T’ → ϵ
$ E’ T’ ) E’ T +    | + id ) * id $       | E’ → + T E’
$ E’ T’ ) E’ T      | id ) * id $         |
$ E’ T’ ) E’ T’ F   | id ) * id $         | T → F T’
$ E’ T’ ) E’ T’ id  | id ) * id $         | F → id
$ E’ T’ ) E’ T’     | ) * id $            |
$ E’ T’ ) E’        | ) * id $            | T’ → ϵ
$ E’ T’ )           | ) * id $            | E’ → ϵ
$ E’ T’             | * id $              |
$ E’ T’ F *         | * id $              | T’ → * F T’
$ E’ T’ F           | id $                |
$ E’ T’ id          | id $                | F → id
$ E’ T’             | $                   |
$ E’                | $                   | T’ → ϵ
$                   | $                   | E’ → ϵ
Grammar:
be → be or bt | bt
bt → bt and bf | bf
bf → not bf | ( be ) | true | false

Remove left recursion

Grammar:
be → bt B’
B’ → or bt B’ | ϵ
bt → bf A’
A’ → and bf A’ | ϵ
bf → not bf | ( be ) | true | false

FIRST ( be ) = { not , ( , true , false }      FOLLOW ( be ) = { $ , ) }
FIRST ( B’ ) = { or , ϵ }                      FOLLOW ( B’ ) = { $ , ) }
FIRST ( bt ) = { not , ( , true , false }      FOLLOW ( bt ) = { $ , ) , or }
FIRST ( A’ ) = { and , ϵ }                     FOLLOW ( A’ ) = { $ , ) , or }
FIRST ( bf ) = { not , ( , true , false }      FOLLOW ( bf ) = { $ , ) , or , and }

Parsing table (rows: nonterminals, columns: terminals, entries: production bodies)
Nonterminal | or       | and       | not    | (     | ) | true  | false | $
be          |          |           | bt B’  | bt B’ |   | bt B’ | bt B’ |
B’          | or bt B’ |           |        |       | ϵ |       |       | ϵ
bt          |          |           | bf A’  | bf A’ |   | bf A’ | bf A’ |
A’          | ϵ        | and bf A’ |        |       | ϵ |       |       | ϵ
bf          |          |           | not bf | (be)  |   | true  | false |

No cell contains more than one production, so the grammar is LL(1)
Grammar:
S → i E t S S’ | a
S’ → e S | ϵ
E → b

FIRST ( S ) = { i , a }      FOLLOW ( S ) = { e , $ }
FIRST ( S’ ) = { e , ϵ }     FOLLOW ( S’ ) = { e , $ }
FIRST ( E ) = { b }          FOLLOW ( E ) = { t }

Parsing table (rows: nonterminals, columns: terminals, entries: production bodies)
Nonterminal | i           | t | a | e       | b | $
S           | i E t S S’  |   | a |         |   |
S’          |             |   |   | e S , ϵ |   | ϵ
E           |             |   |   |         | b |

The cell ( S’ , e ) contains two productions, so the grammar is not LL(1)

Grammar:
S → ( L ) | a
L → L , S | S

Remove left recursion

Grammar:
S → ( L ) | a
L → S L’
L’ → , S L’ | ϵ

FIRST ( S ) = { ( , a }        FOLLOW ( S ) = { $ , ‘,’ , ) }
FIRST ( L ) = { ( , a }        FOLLOW ( L ) = { ) }
FIRST ( L’ ) = { ‘,’ , ϵ }     FOLLOW ( L’ ) = { ) }

Parsing table (rows: nonterminals, columns: terminals, entries: production bodies)
Nonterminal | (     | ) | a    | ,      | $
S           | ( L ) |   | a    |        |
L           | S L’  |   | S L’ |        |
L’          |       | ϵ |      | , S L’ |

No cell contains more than one production, so the grammar is LL(1)

Grammar (‘ ’ denotes the space terminal between type and list):
D → type ‘ ’ list ;
list → list , id | id
type → int | float

Remove left recursion

Grammar:
D → type ‘ ’ list ;
list → id L’
L’ → , id L’ | ϵ
type → int | float

FIRST ( D ) = { int , float }       FOLLOW ( D ) = { $ }
FIRST ( list ) = { id }             FOLLOW ( list ) = { ; }
FIRST ( L’ ) = { ‘,’ , ϵ }          FOLLOW ( L’ ) = { ; }
FIRST ( type ) = { int , float }    FOLLOW ( type ) = { ‘ ’ }

Parsing table (rows: nonterminals, columns: terminals, entries: production bodies)
Nonterminal | ; | id    | ,       | int             | float           | ‘ ’ | $
D           |   |       |         | type ‘ ’ list ; | type ‘ ’ list ; |     |
list        |   | id L’ |         |                 |                 |     |
L’          | ϵ |       | , id L’ |                 |                 |     |
type        |   |       |         | int             | float           |     |

No cell contains more than one production, so the grammar is LL(1)

Bottom-Up Parsing
Construct the parse tree for the input string starting at the leaves (bottom) and working up towards the root (top) (reduction)

Grammar ( G ) : E → E + E | E * E | - E | ( E ) | id
String : id + id + id

[Figure: parse tree for id + id + id, built from the id leaves up to the root E]

Grammar :
E → E + T | T
T → T * F | F
F → ( E ) | id

String: id * id

String   | Handle | Reducing production
id * id  | id     | F → id
F * id   | F      | T → F
T * id   | id     | F → id
T * F    | T * F  | T → T * F
T        | T      | E → T

Handle Pruning
❖ A rightmost derivation in reverse can be obtained by handle pruning

Different Bottom-Up Parsing Techniques
1. Shift-Reduce Parsing
2. Operator Precedence Parsing
3. LR Parsing
1) Simple LR ( SLR ), built from LR(0) items
2) Canonical LR ( CLR ), built from LR(1) items
3) Lookahead LR ( LALR )

1. Shift-Reduce Parsing
Stack holds grammar symbols
Input buffer holds the string to be parsed
Handle always appears at the top of the stack
Use $ to mark bottom of the stack and also the right end of the input
Process:
❖ During a left-to-right scan of the input string, shift zero or more input symbols onto the stack until a string β on top of the stack is ready to be reduced
❖ Then reduce β to the head (LHS) of the appropriate production
❖ Repeat this cycle until an error is detected or until the stack contains the start symbol and the input is empty

1. Shift-Reduce Parsing
There are 4 possible actions

❖ Shift
Shift the next input symbol onto the top of the stack

❖ Reduce
Replace handle with LHS in the stack

❖ Accept
Parsing completed successfully

❖ Error
Discover a syntax error and call an error recovery routine

Grammar :
E → E + E | E * E | id

String:
id + id * id

Stack         | Input           | Action
$             | id + id * id $  | Shift
$ id          | + id * id $     | Reduce E → id
$ E           | + id * id $     | Shift
$ E +         | id * id $       | Shift
$ E + id      | * id $          | Reduce E → id
$ E + E       | * id $          | Shift
$ E + E *     | id $            | Shift
$ E + E * id  | $               | Reduce E → id
$ E + E * E   | $               | Reduce E → E * E
$ E + E       | $               | Reduce E → E + E
$ E           | $               | Accept

Conflicts during shift-reduce parsing

❖ Shift / reduce conflict
Cannot decide whether to shift or to reduce

❖ Reduce / reduce conflict
Cannot decide which of several reductions to make

2. Operator Precedence Parsing
Operator grammar
❖ A grammar with the property (among other essential requirements) that no production right side is ϵ or has two adjacent nonterminals.

E.g. E → EAE | (E) | -E | id
A → + | - | * | / | ^
Not an operator grammar, as EAE has consecutive nonterminals

E → E+E | E-E | E*E | E/E | E^E | (E) | -E | id
Equivalent operator grammar

Define precedence relations between pairs of terminals using the disjoint relation symbols ⋗ , ⋖ and ≐

❖ t1 ≐ t2 : t1 has the same priority as t2
❖ t1 ⋖ t2 : t1 has lower priority than t2
❖ t1 ⋗ t2 : t1 has higher priority than t2

t1 ⋖ t2 does not always imply t2 ⋗ t1

How to parse a string (using the operator precedence table)

1. Construct the operator precedence table
2. Place $ (an imaginary terminal) at the start and at the end of the string, marking each end of the string
3. Put the precedence relation between each pair of adjacent symbols in the string
4. Scan the string from left to right until the first ⋗ is encountered
5. Then scan backwards over any ≐ until a ⋖ is encountered
6. The handle is everything between the ⋖ and the ⋗ ; reduce it to the LHS of the appropriate production

Grammar:
E → E + E | E * E | id

String:
$ id + id * id $

Operator precedence table (rows: left side, columns: right side)
    | id | + | * | $
id  |    | ⋗ | ⋗ | ⋗
+   | ⋖  | ⋗ | ⋖ | ⋗
*   | ⋖  | ⋗ | ⋗ | ⋗
$   | ⋖  | ⋖ | ⋖ |

The left + has higher priority than the right + ( + ⋗ + )

1. $ ⋖ id ⋗ + ⋖ id ⋗ * ⋖ id ⋗ $ : each id is a handle and is replaced with E
2. $ E + E * E $ , with relations $ ⋖ + ⋖ * ⋗ $ : the handle is E * E
3. $ E + E $ , with relations $ ⋖ + ⋗ $ : the handle is E + E
4. $ E $

Algorithm : operator precedence parsing

❖ Method :
Initially the stack contains $ and the input buffer contains the string w$. To parse, we execute the program below
1. Set ip to point to the first symbol of w$
2. repeat forever
3.     if $ is on top of the stack and ip points to $ then
4.         return
5.     else begin
6.         let a be the topmost terminal symbol on the stack and b be the symbol pointed to by ip
7.         if a ⋖ b or a ≐ b then begin
8.             push b onto the stack
9.             advance ip to the next input symbol
10.        end
11.        else if a ⋗ b then
12.            repeat
13.                pop the stack
14.            until the top stack terminal is related by ⋖ to the terminal most recently popped
15.        else
16.            error()
17.    end
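
The algorithm can be sketched in Python as follows. This is an illustration under stated assumptions: the precedence table is the one for E → E + E | E * E | id from the earlier example, the relations ⋖ , ≐ , ⋗ are encoded as "<", "=", ">", only terminals are kept on the stack, and the function returns the terminals of each handle it pops.

```python
LT, EQ, GT = "<", "=", ">"
# PREC[a][b] is the relation between stack terminal a and input terminal b.
PREC = {
    "id": {"+": GT, "*": GT, "$": GT},
    "+": {"id": LT, "+": GT, "*": LT, "$": GT},
    "*": {"id": LT, "+": GT, "*": GT, "$": GT},
    "$": {"id": LT, "+": LT, "*": LT},
}

def op_precedence_parse(tokens):
    """Return the handles (as lists of terminals) in the order they are reduced."""
    stack = ["$"]
    tokens = tokens + ["$"]
    ip = 0
    handles = []
    while True:
        a, b = stack[-1], tokens[ip]
        if a == "$" and b == "$":
            return handles
        rel = PREC[a].get(b)
        if rel in (LT, EQ):
            stack.append(b)  # shift
            ip += 1
        elif rel == GT:
            popped = [stack.pop()]
            # Pop until the stack top is related by < to the last popped terminal.
            while PREC[stack[-1]].get(popped[-1]) != LT:
                popped.append(stack.pop())
            handles.append(popped[::-1])
        else:
            raise SyntaxError(f"no relation between {a} and {b}")

print(op_precedence_parse(["id", "+", "id", "*", "id"]))
```

For id + id * id the three id handles are reduced first, then *, then +, which matches the precedence of * over +.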

Operator precedence functions
❖ Precedence between “a” and “b” can be determined by numerical comparison of two functions f and g
❖ f(a) = g(b) if a ≐ b
❖ f(a) < g(b) if a ⋖ b
❖ f(a) > g(b) if a ⋗ b

Algorithm : construct precedence functions

❖ Input :
An operator precedence matrix

❖ Output :
Precedence functions representing the input matrix, or an indication that none exist

❖ Method :
1. Create symbols “fa” and “ga” for each terminal “a” and for $
2. Partition the created symbols into as many groups as possible, in such a way that if a ≐ b then “fa” and “gb” are in the same group
3. Create a directed graph whose nodes are the groups found in step 2
for any “a” and “b”
if a ⋖ b then place an edge from the group of “gb” to the group of “fa”
if a ⋗ b then place an edge from the group of “fa” to the group of “gb”
4. If the graph constructed in step 3 has a cycle then no precedence functions exist. If there is no cycle then let f(a) be the length of the longest path beginning at the group of “fa” and g(a) be the length of the longest path beginning at the group of “ga”
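
The four steps can be sketched in Python as below. The sketch is illustrative only: the function name `precedence_functions`, the encoding of ⋖ , ≐ , ⋗ as "<", "=", ">", and the small union-find used for the grouping step are assumptions. The input is the precedence table for E → E + E | E * E | id used in this deck.

```python
def precedence_functions(table, terminals):
    """Return (f, g) as dicts, or raise ValueError if the graph has a cycle."""
    # Step 1: one node per symbol fa and ga. Step 2: merge fa and gb when a = b.
    parent = {(k, a): (k, a) for k in "fg" for a in terminals}

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    for a, row in table.items():
        for b, rel in row.items():
            if rel == "=":
                parent[find(("f", a))] = find(("g", b))

    # Step 3: edges between groups.
    edges = {find(n): set() for n in parent}
    for a, row in table.items():
        for b, rel in row.items():
            fa, gb = find(("f", a)), find(("g", b))
            if rel == "<":
                edges[gb].add(fa)
            elif rel == ">":
                edges[fa].add(gb)

    # Step 4: longest path from each group, with cycle detection.
    state, length = {}, {}

    def longest(n):
        if state.get(n) == "active":
            raise ValueError("cycle found: no precedence functions exist")
        if n in length:
            return length[n]
        state[n] = "active"
        length[n] = max((1 + longest(m) for m in edges[n]), default=0)
        state[n] = "done"
        return length[n]

    f = {a: longest(find(("f", a))) for a in terminals}
    g = {a: longest(find(("g", a))) for a in terminals}
    return f, g

TABLE = {
    "id": {"+": ">", "*": ">", "$": ">"},
    "+": {"id": "<", "+": ">", "*": "<", "$": ">"},
    "*": {"id": "<", "+": ">", "*": ">", "$": ">"},
    "$": {"id": "<", "+": "<", "*": "<"},
}
f, g = precedence_functions(TABLE, ["id", "+", "*", "$"])
print("f:", f)
print("g:", g)
```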

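Operator precedence table (rows: left side, function f; columns: right side, function g)
    | id | + | * | $
id  |    | ⋗ | ⋗ | ⋗
+   | ⋖  | ⋗ | ⋖ | ⋗
*   | ⋖  | ⋗ | ⋗ | ⋗
$   | ⋖  | ⋖ | ⋖ |

Draw each edge from the greater to the lesser, e.g. f(+) > g(+) so there is an edge from f+ to g+
Find the longest path from each node to reach either f$ or g$

[Figure: directed graph over the nodes fid , gid , f* , g* , f+ , g+ , f$ , g$]

Resulting precedence functions
   | id | + | * | $
f  | 4  | 2 | 4 | 0
g  | 5  | 1 | 3 | 0
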
Parse string id + id * id

Precedence functions
   | id | + | * | $
f  | 4  | 2 | 4 | 0
g  | 5  | 1 | 3 | 0

$   id   +   id   *   id   $
0   5    2   5    4   5    0

$   E   +   E   *   E   $
0       2       4       0

$   E   +   E   $
0       2       0

$   E   $
0       0

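Grammar:
E → E + E | E – E | E * E | E / E | E ^ E | ( E ) | id

Operator precedence table (rows: left side, columns: right side)
    | id | + | - | * | / | ^ | ( | ) | $
id  |    | ⋗ | ⋗ | ⋗ | ⋗ | ⋗ |   | ⋗ | ⋗
+   | ⋖  | ⋗ | ⋗ | ⋖ | ⋖ | ⋖ | ⋖ | ⋗ | ⋗
-   | ⋖  | ⋗ | ⋗ | ⋖ | ⋖ | ⋖ | ⋖ | ⋗ | ⋗
*   | ⋖  | ⋗ | ⋗ | ⋗ | ⋗ | ⋖ | ⋖ | ⋗ | ⋗
/   | ⋖  | ⋗ | ⋗ | ⋗ | ⋗ | ⋖ | ⋖ | ⋗ | ⋗
^   | ⋖  | ⋗ | ⋗ | ⋗ | ⋗ | ⋖ | ⋖ | ⋗ | ⋗
(   | ⋖  | ⋖ | ⋖ | ⋖ | ⋖ | ⋖ | ⋖ | ≐ |
)   |    | ⋗ | ⋗ | ⋗ | ⋗ | ⋗ |   | ⋗ | ⋗
$   | ⋖  | ⋖ | ⋖ | ⋖ | ⋖ | ⋖ | ⋖ |   |
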
3. LR parsing
L : left-to-right scan of the input
R : rightmost derivation (constructed in reverse)

Can handle left-recursive grammars

Cannot handle ambiguous grammars

3 types

❖ Simple LR (SLR)
Built from LR(0) item sets

❖ Canonical LR (CLR)
Built from LR(1) item sets

❖ Lookahead LR (LALR)
Built from LR(1) item sets, merging the sets that have the same core

3. LR parsing : (1) SLR
The weakest of the 3 LR methods

Process
❖ Convert the grammar into an augmented grammar by adding a new start symbol S’ with the production S’ → S
❖ Construct the LR(0) item sets
❖ Construct the parsing table

How to construct item sets

❖ Production A → XYZ yields 4 items
A → . XYZ
A → X . YZ
A → XY . Z
A → XYZ .

❖ Production A → ϵ generates only one item, A → .

❖ Closure function
If I is a set of items for a grammar G then closure(I) is constructed by two rules
1. Initially every item in I is in closure(I)
2. If A → α . B β is in closure(I) and B → γ is a production then B → . γ is added to closure(I)
Apply this rule until no new item can be added

❖ GOTO function
goto( I , x ) where “I” is an item set and “x” is a grammar symbol
goto( I , x ) = closure of the set of all items [ A → α x . β ] such that [ A → α . x β ] is in I
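
Closure and goto translate almost directly into code. The Python sketch below is an illustration, assuming the augmented expression grammar E’ → E, E → E + T | T, T → T * F | F, F → ( E ) | id used in the SLR example; items are encoded as (head, body, dot position) tuples, and the function names are hypothetical.

```python
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]
NONTERMINALS = {head for head, _ in GRAMMAR}

def closure(items):
    """Add B -> .γ for every item with the dot in front of a nonterminal B."""
    result = set(items)
    work = list(items)
    while work:
        head, body, dot = work.pop()
        if dot < len(body) and body[dot] in NONTERMINALS:
            for h, b in GRAMMAR:
                if h == body[dot] and (h, b, 0) not in result:
                    result.add((h, b, 0))
                    work.append((h, b, 0))
    return frozenset(result)

def goto(items, symbol):
    """Move the dot over symbol in every item that allows it, then close."""
    moved = {(h, b, d + 1) for h, b, d in items if d < len(b) and b[d] == symbol}
    return closure(moved)

def canonical_collection():
    """All LR(0) item sets reachable from closure({E' -> .E})."""
    start = closure({("E'", ("E",), 0)})
    states, work = [start], [start]
    while work:
        state = work.pop(0)
        symbols = {b[d] for _, b, d in state if d < len(b)}
        for x in sorted(symbols):
            nxt = goto(state, x)
            if nxt not in states:  # a set equal to an existing one keeps its name
                states.append(nxt)
                work.append(nxt)
    return states

I0 = closure({("E'", ("E",), 0)})
print(len(I0), "items in I0;", len(canonical_collection()), "item sets in total")
```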

How to construct the parsing table

1. If S’ → S . is in Ii then set action[ i , $ ] to “accept”

2. If A → α . x β is in Ii , where “x” is a terminal, and goto( Ii , x ) = Ij then set action[ i , x ] to “shift j”

3. If A → α . B β is in Ii , where B is a nonterminal, and goto( Ii , B ) = Ij (the item set containing A → α B . β) then set goto[ i , B ] to j

4. If A → α . is in Ii (and A is not S’) then set action[ i , x ] to “reduce A → α” for all “x” in FOLLOW(A)

Grammar:
E → E + T | T
T → T * F | F
F → ( E ) | id

Make the augmented grammar (add the new start symbol E’)

Augmented grammar:
E’ → E
E → E + T | T
T → T * F | F
F → ( E ) | id

Number the productions (the numbers are needed in table generation)
0) E’ → E
1) E → E + T
2) E → T
3) T → T * F
4) T → F
5) F → ( E )
6) F → id

Find the FOLLOW set of all nonterminals
FOLLOW( E ) = { $ , ) , + }
FOLLOW( T ) = { $ , ) , + , * }
FOLLOW( F ) = { $ , ) , + , * }

3. LR parsing : (1) SLR
Construct LR(0) item sets

When an item has the dot in front of a nonterminal, add the productions of that nonterminal (with the dot at the start) to the item set. Prepare goto for every symbol that appears after a dot. If the result matches a previous item set, give it the same name; if it is a new item set, give it a new name.

I0 = E’ → . E
     E → . E + T
     E → . T
     T → . T * F
     T → . F
     F → . ( E )
     F → . id

I1 = goto( I0 , E ) = E’ → E .
                      E → E . + T

I2 = goto( I0 , T ) = E → T .
                      T → T . * F

I3 = goto( I0 , F ) = T → F .

I4 = goto( I0 , ( ) = F → ( . E )
                      E → . E + T
                      E → . T
                      T → . T * F
                      T → . F
                      F → . ( E )
                      F → . id

I5 = goto( I0 , id ) = F → id .

I6 = goto( I1 , + ) = E → E + . T
                      T → . T * F
                      T → . F
                      F → . ( E )
                      F → . id

I7 = goto( I2 , * ) = T → T * . F
                      F → . ( E )
                      F → . id

I8 = goto( I4 , E ) = F → ( E . )
                      E → E . + T

goto( I4 , T ) = I2
goto( I4 , F ) = I3
goto( I4 , ( ) = I4
goto( I4 , id ) = I5

I9 = goto( I6 , T ) = E → E + T .
                      T → T . * F

goto( I6 , F ) = I3
goto( I6 , ( ) = I4
goto( I6 , id ) = I5

I10 = goto( I7 , F ) = T → T * F .

goto( I7 , ( ) = I4
goto( I7 , id ) = I5

I11 = goto( I8 , ) ) = F → ( E ) .

goto( I8 , + ) = I6
goto( I9 , * ) = I7

[Figure: transition diagram over the item sets I0 to I11, with each edge labeled by the grammar symbol of the corresponding goto]

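Construct the parsing table

I5 = goto( I0 , id ), so in ( 0 , id ) make the entry S5 (shift 5)
I1 = goto( I0 , E ), so in ( 0 , E ) make the entry 1
I3 has the item T → F . (dot at the end)
The production number of T → F is 4
FOLLOW( T ) = { + , * , ) , $ }
So in ( 3 , + ) ( 3 , * ) ( 3 , ) ) ( 3 , $ ) make the entry R4 (reduce 4)

Item set | action: id | + | * | ( | ) | $ | goto: E | T | F
0        | S5 |    |    | S4 |     |     | 1 | 2 | 3
1        |    | S6 |    |    |     | Acc |   |   |
2        |    | R2 | S7 |    | R2  | R2  |   |   |
3        |    | R4 | R4 |    | R4  | R4  |   |   |
4        | S5 |    |    | S4 |     |     | 8 | 2 | 3
5        |    | R6 | R6 |    | R6  | R6  |   |   |
6        | S5 |    |    | S4 |     |     |   | 9 | 3
7        | S5 |    |    | S4 |     |     |   |   | 10
8        |    | S6 |    |    | S11 |     |   |   |
9        |    | R1 | S7 |    | R1  | R1  |   |   |
10       |    | R3 | R3 |    | R3  | R3  |   |   |
11       |    | R5 | R5 |    | R5  | R5  |   |   |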
Parse the string id * id + id

Stack            | Input           | Action
0                | id * id + id $  | Shift
0 id 5           | * id + id $     | Reduce by F → id
0 F 3            | * id + id $     | Reduce by T → F
0 T 2            | * id + id $     | Shift
0 T 2 * 7        | id + id $       | Shift
0 T 2 * 7 id 5   | + id $          | Reduce by F → id
0 T 2 * 7 F 10   | + id $          | Reduce by T → T * F
0 T 2            | + id $          | Reduce by E → T
0 E 1            | + id $          | Shift
0 E 1 + 6        | id $            | Shift
0 E 1 + 6 id 5   | $               | Reduce by F → id
0 E 1 + 6 F 3    | $               | Reduce by T → F
0 E 1 + 6 T 9    | $               | Reduce by E → E + T
0 E 1            | $               | Accept

How the first steps are obtained
In the parsing table the entry of ( 0 , id ) is S5, so the action is shift: shift one symbol from the input to the stack and place 5 after it on the stack
In the parsing table the entry of ( 5 , * ) is R6, so the action is reduce with production 6 ( F → id ): in the stack find id and replace it with F, so the stack becomes 0 F
In the parsing table the entry of ( 0 , F ) is 3, so place 3 on the stack
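
The trace can be reproduced by a small table-driven driver. The Python sketch below hardcodes the SLR action and goto tables for the productions 1) E → E + T, 2) E → T, 3) T → T * F, 4) T → F, 5) F → ( E ), 6) F → id. The encoding ("s5" for shift 5, "r2" for reduce 2, "acc" for accept) and the helper names are assumptions made for illustration, and the stack holds only state numbers.

```python
# Production number -> (head, length of body).
PRODUCTIONS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3), 4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}

def reduce_row(n, lookaheads="+ * ) $"):
    """Row fragment: reduce by production n on each listed lookahead."""
    return {t: f"r{n}" for t in lookaheads.split()}

SHIFT_OPERAND = {"id": "s5", "(": "s4"}
ACTION = {
    0: SHIFT_OPERAND, 1: {"+": "s6", "$": "acc"},
    2: {**reduce_row(2, "+ ) $"), "*": "s7"},
    3: reduce_row(4), 4: SHIFT_OPERAND, 5: reduce_row(6),
    6: SHIFT_OPERAND, 7: SHIFT_OPERAND,
    8: {"+": "s6", ")": "s11"},
    9: {**reduce_row(1, "+ ) $"), "*": "s7"},
    10: reduce_row(3), 11: reduce_row(5),
}
GOTO = {(0, "E"): 1, (0, "T"): 2, (0, "F"): 3, (4, "E"): 8, (4, "T"): 2,
        (4, "F"): 3, (6, "T"): 9, (6, "F"): 3, (7, "F"): 10}

def slr_parse(tokens):
    """Return the production numbers used, in the order of the reductions."""
    stack = [0]  # stack of states only; grammar symbols are implied
    tokens = tokens + ["$"]
    pos = 0
    reductions = []
    while True:
        action = ACTION[stack[-1]].get(tokens[pos])
        if action is None:
            raise SyntaxError(f"unexpected {tokens[pos]} in state {stack[-1]}")
        if action == "acc":
            return reductions
        n = int(action[1:])
        if action[0] == "s":
            stack.append(n)  # shift: push the new state, consume the token
            pos += 1
        else:
            head, length = PRODUCTIONS[n]
            del stack[-length:]  # pop one state per body symbol
            stack.append(GOTO[(stack[-1], head)])
            reductions.append(n)

print(slr_parse(["id", "*", "id", "+", "id"]))
```

Read backwards, the sequence of reductions is a rightmost derivation of the input.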
3. LR parsing : (2) CLR
Process
❖ Convert the grammar into an augmented grammar by adding a new start symbol S’ with the production S’ → S
❖ Construct the LR(1) item sets
❖ Construct the parsing table

How to construct item sets

❖ Closure function
If I is a set of items for a grammar G then closure(I) is constructed by two rules
1. Initially every item in I is in closure(I)
2. If [ A → α . B β , a ] is in closure(I) and B → γ is a production then [ B → . γ , b ] is added to closure(I) for every terminal b in FIRST(βa)
Apply this rule until no new item can be added

❖ GOTO function
goto( I , x ) where “I” is an item set and “x” is a grammar symbol
goto( I , x ) = closure of the set of all items [ A → α x . β , a ] such that [ A → α . x β , a ] is in I
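
The LR(1) closure differs from the LR(0) one only in the lookahead it attaches to each new item. The following Python sketch illustrates it for the augmented grammar S’ → S, S → C C, C → c C | d used in the CLR example. Items are encoded as (head, body, dot, lookahead) tuples; the names are hypothetical, and `first_of` relies on the assumption that no nonterminal of this grammar derives ϵ.

```python
GRAMMAR = [("S'", ("S",)), ("S", ("C", "C")), ("C", ("c", "C")), ("C", ("d",))]
NONTERMINALS = {"S'", "S", "C"}
FIRST = {"S": {"c", "d"}, "C": {"c", "d"}}

def first_of(symbols):
    """FIRST of a nonempty string; valid here because nothing derives ϵ."""
    s = symbols[0]
    return FIRST[s] if s in NONTERMINALS else {s}

def closure(items):
    result, work = set(items), list(items)
    while work:
        head, body, dot, la = work.pop()
        if dot < len(body) and body[dot] in NONTERMINALS:
            # Lookaheads of the new items: FIRST(βa), with β the rest of the body.
            for new_la in first_of(body[dot + 1:] + (la,)):
                for h, b in GRAMMAR:
                    item = (h, b, 0, new_la)
                    if h == body[dot] and item not in result:
                        result.add(item)
                        work.append(item)
    return frozenset(result)

def goto(items, symbol):
    moved = {(h, b, d + 1, la) for h, b, d, la in items if d < len(b) and b[d] == symbol}
    return closure(moved)

def canonical_collection():
    start = closure({("S'", ("S",), 0, "$")})
    states, work = [start], [start]
    while work:
        state = work.pop(0)
        for x in sorted({b[d] for _, b, d, _ in state if d < len(b)}):
            nxt = goto(state, x)
            if nxt not in states:
                states.append(nxt)
                work.append(nxt)
    return states

I0 = closure({("S'", ("S",), 0, "$")})
print(len(I0), "items in I0;", len(canonical_collection()), "LR(1) item sets")
```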

How to construct the parsing table

1. If [ S’ → S . , $ ] is in Ii then set action[ i , $ ] to “accept”

2. If [ A → α . x β , a ] is in Ii , where “x” is a terminal, and goto( Ii , x ) = Ij then set action[ i , x ] to “shift j”

3. If [ A → α . B β , a ] is in Ii , where B is a nonterminal, and goto( Ii , B ) = Ij (the item set containing [ A → α B . β , a ]) then set goto[ i , B ] to j

4. If [ A → α . , a ] is in Ii (and A is not S’) then set action[ i , a ] to “reduce A → α”

Grammar:
S → C C
C → c C | d

Make the augmented grammar

Augmented grammar:
S’ → S
S → C C
C → c C | d

Number the productions (the numbers are needed in table generation)
0) S’ → S
1) S → C C
2) C → c C
3) C → d

Find the FIRST set of all nonterminals
FIRST ( S ) = { c , d }
FIRST ( C ) = { c , d }

3. LR parsing : (2) CLR
Construct LR(1) item sets

S’ → . S , $ is compared with A → α . B β , a
Here β is ϵ and a is $ , so FIRST(βa) = { $ }
$ is added as the lookahead of S → . C C

I0 = S’ → . S , $
     S → . C C , $
     C → . c C , c/d
     C → . d , c/d

I1 = goto( I0 , S ) = S’ → S . , $

I2 = goto( I0 , C ) = S → C . C , $
                      C → . c C , $
                      C → . d , $

I3 = goto( I0 , c ) = C → c . C , c/d
                      C → . c C , c/d
                      C → . d , c/d

I4 = goto( I0 , d ) = C → d . , c/d

I5 = goto( I2 , C ) = S → C C . , $

I6 = goto( I2 , c ) = C → c . C , $
                      C → . c C , $
                      C → . d , $

I7 = goto( I2 , d ) = C → d . , $

I8 = goto( I3 , C ) = C → c C . , c/d

goto( I3 , c ) = I3
goto( I3 , d ) = I4

I9 = goto( I6 , C ) = C → c C . , $

goto( I6 , c ) = I6
goto( I6 , d ) = I7

[Figure: transition diagram over the item sets I0 to I9, with each edge labeled by the grammar symbol of the corresponding goto]

Construct the parsing table

I3 = goto( I0 , c ), so in ( 0 , c ) make the entry S3 (shift 3)
I2 = goto( I0 , C ), so in ( 0 , C ) make the entry 2
I4 contains the item C → d . , c/d
The dot is at the end, so a reduce entry is needed
C → d has production number 3
The lookaheads are c and d
So in ( 4 , c ) and ( 4 , d ) make the entry R3

Item set | action: c | d | $ | goto: S | C
0        | S3 | S4 |     | 1 | 2
1        |    |    | Acc |   |
2        | S6 | S7 |     |   | 5
3        | S3 | S4 |     |   | 8
4        | R3 | R3 |     |   |
5        |    |    | R1  |   |
6        | S6 | S7 |     |   | 9
7        |    |    | R3  |   |
8        | R2 | R2 |     |   |
9        |    |    | R2  |   |

3. LR parsing : (3) LALR
Process
❖ Convert the grammar into an augmented grammar by adding a new start symbol S’ with the production S’ → S
❖ Construct the LR(1) item sets
❖ Combine the item sets that have the same core but different lookaheads
❖ Construct the parsing table
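
The merging step groups LR(1) item sets by their core, that is, by their items with the lookaheads removed. The Python sketch below illustrates only that step. The item sets are written out by hand for the grammar S → C C, C → c C | d, items are plain strings, and the names `item_set` and `merge_by_core` are hypothetical.

```python
from collections import defaultdict

def item_set(cores, lookaheads):
    """Build a set of (core item, lookahead) pairs."""
    return {(core, la) for core in cores for la in lookaheads}

C_LOOP = ["C -> c.C", "C -> .cC", "C -> .d"]
STATES = {  # LR(1) item sets I3 to I9 of the grammar S -> CC, C -> cC | d
    "3": item_set(C_LOOP, "cd"),
    "6": item_set(C_LOOP, "$"),
    "4": item_set(["C -> d."], "cd"),
    "7": item_set(["C -> d."], "$"),
    "8": item_set(["C -> cC."], "cd"),
    "9": item_set(["C -> cC."], "$"),
    "5": item_set(["S -> CC."], "$"),
}

def merge_by_core(states):
    """Union the item sets whose cores (items without lookaheads) are equal."""
    groups = defaultdict(list)
    for name, items in states.items():
        core = frozenset(c for c, _ in items)
        groups[core].append(name)
    merged = {}
    for names in groups.values():
        new_name = "I" + "".join(sorted(names, key=int))
        merged[new_name] = set().union(*(states[n] for n in names))
    return merged

MERGED = merge_by_core(STATES)
print(sorted(MERGED))
```

Merging never introduces a shift/reduce conflict that the LR(1) sets did not already have, but it can introduce reduce/reduce conflicts, which is why LALR is weaker than CLR.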

Grammar:
S → C C
C → c C | d

Make the augmented grammar

Augmented grammar:
S’ → S
S → C C
C → c C | d

Number the productions (the numbers are needed in table generation)
0) S’ → S
1) S → C C
2) C → c C
3) C → d

Find the FIRST set of all nonterminals
FIRST ( S ) = { c , d }
FIRST ( C ) = { c , d }

Construct LR(1) item sets and combine the sets with the same core

I0 = S’ → . S , $
     S → . C C , $
     C → . c C , c/d
     C → . d , c/d

I1 = S’ → S . , $

I2 = S → C . C , $
     C → . c C , $
     C → . d , $

I5 = S → C C . , $

I3 = C → c . C , c/d ; C → . c C , c/d ; C → . d , c/d
I6 = C → c . C , $ ; C → . c C , $ ; C → . d , $
I36 = C → c . C , c/d/$
      C → . c C , c/d/$
      C → . d , c/d/$

I4 = C → d . , c/d
I7 = C → d . , $
I47 = C → d . , c/d/$

I8 = C → c C . , c/d
I9 = C → c C . , $
I89 = C → c C . , c/d/$

Construct the parsing table

Item set | action: c | d | $ | goto: S | C
0        | S36 | S47 |     | 1 | 2
1        |     |     | Acc |   |
2        | S36 | S47 |     |   | 5
36       | S36 | S47 |     |   | 89
47       | R3  | R3  | R3  |   |
5        |     |     | R1  |   |
89       | R2  | R2  | R2  |   |

Using Ambiguous Grammar
Every ambiguous grammar fails to be LR

Certain types of ambiguous grammars are nevertheless useful in the specification and implementation of languages

An ambiguous grammar can provide a shorter, more natural specification than any equivalent unambiguous grammar

Ambiguous grammar
E → E + E | E * E | ( E ) | id
❖ The grammar is ambiguous because it does not specify the associativity and precedence of the operators + and *

Equivalent unambiguous grammar
E → E + T | T
T → T * F | F
F → ( E ) | id
❖ Generates the same language but gives + a lower precedence than * and makes both operators left associative

Reasons why we might want to use the ambiguous grammar instead of the unambiguous one

1. The associativities and precedence levels of the operators can easily be changed without disturbing the productions of the ambiguous grammar or the number of states in the resulting parser.

2. The parser for the unambiguous grammar will spend a large fraction of its time reducing by the productions E → T and T → F. The parser for the ambiguous grammar will not waste time reducing by these single productions.

Syntax Analyzer Generator YACC

[Figure: Yacc source file (.y) → Yacc compiler → y.tab.c ; y.tab.c → C compiler → a.out ; input stream → a.out → sequence of tokens]

Structure of Yacc Program
declarations
%%
translation rules
%%
supporting C routines