syntax analysis

The document provides an overview of syntax analysis, focusing on context-free grammar (CFG), its components, and the role of parsers in analyzing source code. It discusses various parsing techniques, including top-down and bottom-up parsing, as well as specific methods like recursive descent and predictive parsing. Additionally, it covers concepts such as derivation, parse trees, ambiguity, and the relationship between context-free languages and regular expressions.

Uploaded by

22dce033

Syntax Analysis

By:
Trusha R. Patel
Asst. Prof.
CE Dept., CSPIT, CHARUSAT
Role of Parser
[Figure: Source program → Lexical Analyzer (Scanner) → token → Syntax Analyzer (Parser) → parse tree → rest of front end. The parser requests each token with getNextToken; both analyzers use the symbol table.]
CFG (Context Free Grammar)
A CFG consists of terminals, nonterminals, a start symbol, and productions

❖ Terminals
Basic symbols from which strings are formed
“token name” is synonym for “terminal”

❖ Nonterminals
Syntactic variables that denote sets of strings

CFG (Context Free Grammar)
A CFG consists of terminals, nonterminals, a start symbol, and productions

❖ Start symbol
One nonterminal distinguished from the others
Set of strings it denotes is the language generated by the grammar
Its productions are listed first

❖ Production
Specifies the manner in which terminals and nonterminals can be combined to form strings

CFG (Context Free Grammar)
A production consists of
❖ Nonterminal called the “head” or “left side”
❖ The symbol →
❖ “body” or “right side” consisting of zero or more terminals and nonterminals
CFG (Context Free Grammar)
Grammar for arithmetic expression

expression → expression + term
expression → expression - term
expression → term
term → term * factor
term → term / factor
term → factor
factor → ( expression )
factor → id
Notational convention for grammar
Symbols for terminals
❖ Lowercase letters a,b,c,…,z
❖ Operator symbols + * / etc.
❖ Punctuation symbols , ; etc.
❖ Digits 0,1,2,…,9
❖ Boldface strings id , if etc.

Symbols for nonterminals

❖ Uppercase letters A,B,C,…,Z
❖ S usually indicates the start symbol
❖ Lowercase, italic names expr , stmt etc.

Notational convention for grammar
X , Y , Z represent grammar symbols,
either nonterminal or terminal

u , v , … , z represent strings of terminals

α , β , γ represent strings of grammar symbols (terminal and/or nonterminal)

A → α1 , A → α2 , … , A → αk may be written as
A → α1 | α2 | … | αk

Unless stated otherwise, the head of the first production is the start symbol

Language generated by grammar
G : Grammar
L(G) : Language generated by grammar G

A language generated by a CFG is called a CFL (Context Free Language)

If two grammars generate the same language, the grammars are said to be
equivalent

Derivation
Beginning with the start symbol, each rewriting step replaces a nonterminal by
the body of one of its productions

Grammar: E → E + E | id
String: id + id + id
Derivation: E ⇒ E + E ⇒ E + E + E ⇒ id + E + E ⇒ id + id + E ⇒ id + id + id
[Figure: parse tree for id + id + id corresponding to this derivation]

Derivation
Leftmost derivation
❖ The leftmost nonterminal is always replaced first by the body of one of its productions
Rightmost derivation (canonical derivation)
❖ The rightmost nonterminal is always replaced first by the body of one of its productions

Grammar : E → E + E | E * E | - E | ( E ) | id
String : - ( id + id )

Left most derivation

E ⇒ –E ⇒ –(E) ⇒ –(E+E) ⇒ –(id+E) ⇒ –(id+id)
Right most derivation
E ⇒ –E ⇒ –(E) ⇒ –(E+E) ⇒ –(E+id) ⇒ –(id+id)
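
The two derivations above can be replayed mechanically. The following is a minimal sketch, assuming the productions are numbered 0 to 4 in the order they are written in the grammar; the names `GRAMMAR`, `derive` and `CHOICES` are illustrative and not part of the slides.

```python
# Productions of E, indexed 0..4 in the order written in the grammar.
GRAMMAR = {
    "E": [["E", "+", "E"], ["E", "*", "E"], ["-", "E"], ["(", "E", ")"], ["id"]],
}

def derive(start, choices, leftmost=True):
    """Apply the chosen production indices one at a time, always expanding
    the leftmost (or rightmost) nonterminal of the current sentential form."""
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        positions = [i for i, sym in enumerate(form) if sym in GRAMMAR]
        pos = positions[0] if leftmost else positions[-1]
        form[pos:pos + 1] = GRAMMAR[form[pos]][choice]
        steps.append(" ".join(form))
    return steps

# -E, (E), E+E, id, id : the same production choices serve both derivations.
CHOICES = [2, 3, 0, 4, 4]
leftmost_steps = derive("E", CHOICES, leftmost=True)
rightmost_steps = derive("E", CHOICES, leftmost=False)
print(" ⇒ ".join(leftmost_steps))
print(" ⇒ ".join(rightmost_steps))
```

Both runs end in the same sentence; they differ only in the fourth sentential form, where the leftmost run has replaced the left E and the rightmost run the right E.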

Reduction
A substring that matches the body of a production is
replaced by the nonterminal at the head of that production

Grammar: E → E + E | id
String: id + id + id
Reduction: id + id + id
E + id + id
E + E + id
E + id
E + E
E
[Figure: parse tree for id + id + id built from the leaves up to the root]

Parse tree
Graphical representation of a derivation
Parse tree for the string - ( id + id ) is

        E
       / \
      -   E
        / | \
       (  E  )
        / | \
       E  +  E
       |     |
       id    id

Ambiguity
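A grammar that produces more than one parse tree for some string is said to
be an ambiguous grammar
Equivalently, some string has more than one leftmost derivation or more than one rightmost derivation

Grammar: E → E + E | E * E | id
String : id + id * id
[Figure: two parse trees for id + id * id. In one the root is E → E * E and the left E derives id + id; in the other the root is E → E + E and the right E derives id * id]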
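
Ambiguity of this particular grammar can be checked by counting parse trees. The sketch below is hard-wired to E → E + E | E * E | id (it is not a general ambiguity test, and the function name is illustrative): a tree for a token span is either a single id or a split at some operator.

```python
from functools import lru_cache

def count_parse_trees(tokens):
    """Count parse trees of `tokens` under E -> E + E | E * E | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of parse trees that derive tokens[i:j] from E.
        if j - i == 1:
            return 1 if tokens[i] == "id" else 0
        total = 0
        for k in range(i + 1, j - 1):          # k is the root operator position
            if tokens[k] in ("+", "*"):
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))

trees = count_parse_trees("id + id * id".split())
print(trees)
```

A count above 1 for any string is enough to show that the grammar is ambiguous.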

CFG vs. RE
Grammars are more powerful than REs

Everything that can be described by an RE can be described by a grammar, but not vice versa

Every regular language is a context-free language, but not vice versa

CFG vs. RE
RE : (a|b)*abb

Grammar :
S → aX | aS | bS
X → bY
Y → bZ
Z → ϵ
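
As a sanity check that this grammar and the RE describe the same language, the following sketch treats the right-linear grammar as a nondeterministic automaton and compares it with Python's `re` module on every string over {a, b} up to length 8. The `RULES` encoding and the function name are assumptions made for illustration.

```python
import re
from itertools import product

# Each rule is (terminal, next nonterminal); ("", None) encodes Z -> ϵ.
RULES = {
    "S": [("a", "X"), ("a", "S"), ("b", "S")],
    "X": [("b", "Y")],
    "Y": [("b", "Z")],
    "Z": [("", None)],
}

def generates(string):
    """True if the grammar derives `string`, tracking all reachable nonterminals."""
    states = {"S"}
    for ch in string:
        states = {nxt for st in states for (t, nxt) in RULES[st] if t == ch and nxt}
    return any(("", None) in RULES[st] for st in states)

mismatches = [
    "".join(chars)
    for n in range(9)
    for chars in product("ab", repeat=n)
    if generates("".join(chars)) != bool(re.fullmatch(r"(a|b)*abb", "".join(chars)))
]
print(mismatches)
```

An empty list means the two descriptions agree on all tested strings; it is evidence for, not a proof of, equivalence.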

Left recursion
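
A minimal sketch of the standard transformation for immediate left recursion, in which A → Aα | β becomes A → βA’ and A’ → αA’ | ϵ. The function name and the list-of-symbols representation are assumptions for illustration; indirect left recursion is not handled.

```python
def remove_immediate_left_recursion(head, bodies):
    """Rewrite A -> A a | b as A -> b A' and A' -> a A' | ϵ."""
    new_head = head + "'"
    recursive = [b[1:] for b in bodies if b and b[0] == head]   # the α parts
    others = [b for b in bodies if not b or b[0] != head]       # the β parts
    if not recursive:
        return {head: bodies}
    return {
        head: [b + [new_head] for b in others],
        new_head: [a + [new_head] for a in recursive] + [["ϵ"]],
    }

result = remove_immediate_left_recursion("E", [["E", "+", "T"], ["T"]])
print(result)
```

For E → E + T | T this yields E → T E’ and E’ → + T E’ | ϵ, the form used in the predictive parsing examples.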

Left factoring
Left factoring is a grammar transformation that is useful for producing a
grammar suitable for predictive parsing.

If A → αβ1 | αβ2 are two productions of A and
the input string begins with a nonempty string derived from α,
we do not know whether to expand A to αβ1 or αβ2.

Left factoring

i/p : Non left factored grammar : A → α β1 | α β2

o/p : Left factored grammar : A → α A’
A’ → β1 | β2
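
The transformation can be sketched in a few lines. This version is a simplification that assumes all bodies of A share the prefix α (the general algorithm groups alternatives by prefix and repeats); the helper names are illustrative.

```python
def common_prefix(bodies):
    """Longest sequence of symbols that starts every body."""
    prefix = []
    for symbols in zip(*bodies):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return prefix

def left_factor(head, bodies):
    """Rewrite A -> α β1 | α β2 as A -> α A' and A' -> β1 | β2."""
    alpha = common_prefix(bodies)
    if not alpha:
        return {head: bodies}
    new_head = head + "'"
    rest = [body[len(alpha):] or ["ϵ"] for body in bodies]
    return {head: [alpha + [new_head]], new_head: rest}

factored = left_factor("S", [["i", "E", "t", "S"], ["i", "E", "t", "S", "e", "S"]])
print(factored)
```

Applied to S → i E t S | i E t S e S, it produces S → i E t S S’ and S’ → ϵ | e S.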

General types of parser

Universal Parser
• General method
• Can parse any grammar
• Methods such as
• Cocke-Younger-Kasami algorithm
• Earley’s algorithm

Top-down Parser
• Scan string from left to right
• Build parse tree from top (root) to the bottom (leaves)
• Perform derivation

Bottom-up Parser
• Scan string from left to right
• Start from leaves and work up to root
• Perform reduction

Top-Down Parsing
Construct parse tree for the input string starting from root and creating the
nodes of parse tree in preorder (derivation)

Grammar ( G ) : E → E + E | E * E | - E | ( E ) | id
String : id + id * id

            E
          / | \
         E  +  E
         |   / | \
        id  E  *  E
            |     |
            id    id

Different Top-Down Parsing Techniques
1. Recursive-Descent Parsing ( RDP )
2. Predictive Parsing

1. Recursive-Descent Parsing ( RDP )
May require backtracking to find the correct production to apply
A left-recursive grammar can cause RDP to go into an infinite loop

1. Recursive-Descent Parsing ( RDP )
Algorithm
void A( )
{
    choose an A-production, A → X1 X2 … Xk ;
    for ( i = 1 to k )
    {
        if ( Xi is a nonterminal )
            call procedure Xi( );
        else if ( Xi equals the current input symbol a )
            advance the input to the next symbol;
        else
            /* error occurred */ ;
    }
}

1. Recursive-Descent Parsing ( RDP )
Process:
❖ Maintain 2 pointers
Lookahead pointer (LP) (points to the top element of the stack)
Input pointer (IP) (points to a symbol in the input string)
❖ If a nonterminal is on the stack (pointed to by LP) then
replace it by one of its productions, and LP points to the leftmost symbol of that production
❖ If a terminal is on the stack (pointed to by LP) then
compare stack and input (pointed to by LP and IP)
If they match then
advance both pointers (LP and IP)
If they do not match then
backtrack

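1. Recursive-Descent Parsing ( RDP )
Grammar:
S → cAd
A → ab | a
String : c a d
[Figure: successive parse trees with the LP and IP pointers. S is expanded to c A d and c matches; A is first expanded to a b, a matches but b does not match d, so the parser backtracks; A is then expanded to a, after which a and d match and the whole string is matched]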
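
The backtracking behaviour on this grammar can be reproduced with a small generator-based sketch. It is not the LP/IP stack formulation of the slides: each call yields every input position at which a sequence of symbols can finish, and exhausting one alternative before trying the next plays the role of backtracking. All names are illustrative.

```python
GRAMMAR = {"S": [["c", "A", "d"]], "A": [["a", "b"], ["a"]]}

def parse(symbols, text, pos=0):
    """Yield every input position reachable after matching `symbols` from `pos`."""
    if not symbols:
        yield pos
        return
    first, rest = symbols[0], symbols[1:]
    if first in GRAMMAR:
        for body in GRAMMAR[first]:              # try the alternatives in order
            for mid in parse(body, text, pos):
                yield from parse(rest, text, mid)
    elif pos < len(text) and text[pos] == first:
        yield from parse(rest, text, pos + 1)
    # On a mismatch nothing is yielded, so the caller moves to its next alternative.

def accepts(text):
    return any(end == len(text) for end in parse(["S"], text))

print(accepts("cad"), accepts("cabd"), accepts("cd"))
```

For the input cad the alternative A → ab fails at b, and A → a is tried next, as in the figure.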
FIRST and FOLLOW
Used to construct top-down and bottom-up parsers

FIRST ( α ) :
Set of terminals that begin strings derived from α

FOLLOW ( α ) :
Set of terminals that can appear immediately to the right of α in some sentential form

FIRST
FIRST ( α )
❖ α is a terminal
FIRST ( α ) = { α }
❖ α is a nonterminal: look at each production of α
α → ϵ : add ϵ to FIRST ( α )
α → β γ where β is a terminal : add β to FIRST ( α )
α → β γ where β is a nonterminal : add FIRST ( β ) to FIRST ( α )
If FIRST ( β ) contains ϵ : add ( FIRST ( β ) − { ϵ } ) U FIRST ( γ ) instead

FOLLOW
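FOLLOW ( α )
❖ α is the start symbol
Add $ to FOLLOW ( α )
❖ α is a nonterminal: find α in the RHS of the grammar, in productions of the form β → … α γ
γ is ϵ (α is the last symbol) : add FOLLOW ( β ) to FOLLOW ( α )
γ begins with a terminal : add that terminal to FOLLOW ( α )
γ begins with a nonterminal : add FIRST ( γ ) − { ϵ } to FOLLOW ( α )
If FIRST ( γ ) contains ϵ : also add FOLLOW ( β ) to FOLLOW ( α )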
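
The rules for FIRST and FOLLOW can be run as a fixed-point computation. The sketch below assumes the non-left-recursive expression grammar used in the predictive parsing example (E → T E’, and so on) with start symbol E; the function names and the dictionary encoding are illustrative.

```python
EPS = "ϵ"
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F":  [["(", "E", ")"], ["id"]],
}

def first_of(symbols, first):
    """FIRST of a string of grammar symbols, given FIRST of each nonterminal."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in GRAMMAR else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)                      # every symbol can vanish (or none given)
    return result

def compute_first():
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:                       # repeat until no set grows
        changed = False
        for head, bodies in GRAMMAR.items():
            for body in bodies:
                new = first_of(body, first)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first

def compute_follow(first, start="E"):
    follow = {nt: set() for nt in GRAMMAR}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for head, bodies in GRAMMAR.items():
            for body in bodies:
                for i, sym in enumerate(body):
                    if sym not in GRAMMAR:
                        continue
                    tail = first_of(body[i + 1:], first)
                    new = tail - {EPS}
                    if EPS in tail:      # the rest can vanish: inherit FOLLOW(head)
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

FIRST = compute_first()
FOLLOW = compute_follow(FIRST)
print(FIRST)
print(FOLLOW)
```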

2. Predictive Parsing
Specific case of RDP
No backtracking is required
Chooses the correct production by looking ahead at the input a fixed number
of symbols
The class of grammars for which a predictive parser can be constructed by
looking k symbols ahead in the input is called the LL(k) class

LL(k):
First L : left to right scan of input string
Second L : leftmost derivation
k : “k” input symbols of lookahead

2. Predictive Parsing
LL(1) grammar
❖ Cover most programming constructs
❖ Properties
Unambiguous
No left-recursion

2. Predictive Parsing
Grammar:
E → T E’
E’ → + T E’ | ϵ
T → F T’
T’ → * F T’ | ϵ
F → ( E ) | id

FIRST ( E ) = { id , ( }      FOLLOW ( E ) = { $ , ) }
FIRST ( E’ ) = { + , ϵ }      FOLLOW ( E’ ) = { $ , ) }
FIRST ( T ) = { id , ( }      FOLLOW ( T ) = { $ , ) , + }
FIRST ( T’ ) = { * , ϵ }      FOLLOW ( T’ ) = { $ , ) , + }
FIRST ( F ) = { id , ( }      FOLLOW ( F ) = { $ , ) , + , * }

Parsing table (rows: nonterminals, columns: terminals, cells: production bodies)
Nonterminal   id       +        *        (        )     $
E             T E’                       T E’
E’                     + T E’                     ϵ     ϵ
T             F T’                       F T’
T’                     ϵ        * F T’            ϵ     ϵ
F             id                         ( E )
No cell contains more than one production, so the grammar is LL(1)
2. Predictive Parsing
(1) Parse the string id+id
STACK         INPUT        OUTPUT
$ E           id + id $
$ E’ T        id + id $    E → T E’
$ E’ T’ F     id + id $    T → F T’
$ E’ T’ id    id + id $    F → id
$ E’ T’       + id $
$ E’          + id $       T’ → ϵ
$ E’ T +      + id $       E’ → + T E’
$ E’ T        id $
$ E’ T’ F     id $         T → F T’
$ E’ T’ id    id $         F → id
$ E’ T’       $
$ E’          $            T’ → ϵ
$             $            E’ → ϵ
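
The trace can be reproduced with a small table-driven parser. This is a sketch that hard-codes the LL(1) table of the expression grammar as a Python dictionary (an empty list stands for an ϵ body); the function and variable names are illustrative.

```python
# (nonterminal, lookahead) -> production body; [] means the body is ϵ.
TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "+"): [], ("T'", "*"): ["*", "F", "T'"], ("T'", ")"): [], ("T'", "$"): [],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}

def predictive_parse(tokens):
    """Return the productions applied, in order, or raise SyntaxError."""
    tokens = list(tokens) + ["$"]
    stack = ["$", "E"]                     # top of the stack is the end of the list
    pos, output = 0, []
    while stack:
        top, lookahead = stack.pop(), tokens[pos]
        if top in NONTERMINALS:
            body = TABLE.get((top, lookahead))
            if body is None:
                raise SyntaxError(f"no table entry for ({top}, {lookahead})")
            output.append(f"{top} → {' '.join(body) or 'ϵ'}")
            stack.extend(reversed(body))   # leftmost body symbol ends up on top
        elif top == lookahead:
            pos += 1                       # terminal (or $) matched
        else:
            raise SyntaxError(f"expected {top}, found {lookahead}")
    return output

applied = predictive_parse(["id", "+", "id"])
print(applied)
```

The returned list corresponds to the OUTPUT column of the trace.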
2. Predictive Parsing
(2) Parse the string (id+id)*id
STACK                INPUT                 OUTPUT
$ E                  ( id + id ) * id $
$ E’ T               ( id + id ) * id $    E → T E’
$ E’ T’ F            ( id + id ) * id $    T → F T’
$ E’ T’ ) E (        ( id + id ) * id $    F → ( E )
$ E’ T’ ) E          id + id ) * id $
$ E’ T’ ) E’ T       id + id ) * id $      E → T E’
$ E’ T’ ) E’ T’ F    id + id ) * id $      T → F T’
$ E’ T’ ) E’ T’ id   id + id ) * id $      F → id
$ E’ T’ ) E’ T’      + id ) * id $
$ E’ T’ ) E’         + id ) * id $         T’ → ϵ
$ E’ T’ ) E’ T +     + id ) * id $         E’ → + T E’
$ E’ T’ ) E’ T       id ) * id $
$ E’ T’ ) E’ T’ F    id ) * id $           T → F T’
$ E’ T’ ) E’ T’ id   id ) * id $           F → id
$ E’ T’ ) E’ T’      ) * id $
$ E’ T’ ) E’         ) * id $              T’ → ϵ
$ E’ T’ )            ) * id $              E’ → ϵ
$ E’ T’              * id $
$ E’ T’ F *          * id $                T’ → * F T’
$ E’ T’ F            id $
$ E’ T’ id           id $                  F → id
$ E’ T’              $
$ E’                 $                     T’ → ϵ
$                    $                     E’ → ϵ
2. Predictive Parsing
Grammar:
be → be or bt | bt
bt → bt and bf | bf
bf → not bf | ( be ) | true | false

Remove left recursion

Grammar:
be → bt B’
B’ → or bt B’ | ϵ
bt → bf A’
A’ → and bf A’ | ϵ
bf → not bf | ( be ) | true | false

FIRST ( be ) = { not , ( , true , false }      FOLLOW ( be ) = { $ , ) }
FIRST ( B’ ) = { or , ϵ }                      FOLLOW ( B’ ) = { $ , ) }
FIRST ( bt ) = { not , ( , true , false }      FOLLOW ( bt ) = { $ , ) , or }
FIRST ( A’ ) = { and , ϵ }                     FOLLOW ( A’ ) = { $ , ) , or }
FIRST ( bf ) = { not , ( , true , false }      FOLLOW ( bf ) = { $ , ) , or , and }

Parsing table (rows: nonterminals, columns: terminals, cells: production bodies)
Nonterminal   or         and        not        (          )          true       false      $
be                                  bt B’      bt B’                 bt B’      bt B’
B’            or bt B’                                    ϵ                                ϵ
bt                                  bf A’      bf A’                 bf A’      bf A’
A’            ϵ          and bf A’                        ϵ                                ϵ
bf                                  not bf     ( be )                true       false

No cell contains more than one production, so the grammar is LL(1)
2. Predictive Parsing
Grammar:
S → i E t S S’ | a
S’ → e S | ϵ
E → b

FIRST ( S ) = { i , a }       FOLLOW ( S ) = { e , $ }
FIRST ( S’ ) = { e , ϵ }      FOLLOW ( S’ ) = { e , $ }
FIRST ( E ) = { b }           FOLLOW ( E ) = { t }

Parsing table (rows: nonterminals, columns: terminals, cells: production bodies)
Nonterminal   i            t    a    e          b    $
S             i E t S S’        a
S’                                   e S , ϵ         ϵ
E                                               b

The cell ( S’ , e ) holds two productions ( S’ → e S and S’ → ϵ ), so the grammar is not LL(1)

2. Predictive Parsing
Grammar:
S → ( L ) | a
L → L , S | S

Remove left recursion

Grammar:
S → ( L ) | a
L → S L’
L’ → , S L’ | ϵ

FIRST ( S ) = { ( , a }        FOLLOW ( S ) = { $ , , , ) }
FIRST ( L ) = { ( , a }        FOLLOW ( L ) = { ) }
FIRST ( L’ ) = { , , ϵ }       FOLLOW ( L’ ) = { ) }

Parsing table (rows: nonterminals, columns: terminals, cells: production bodies)
Nonterminal   (        )     a        ,        $
S             ( L )          a
L             S L’           S L’
L’                     ϵ              , S L’

No cell contains more than one production, so the grammar is LL(1)

2. Predictive Parsing
Grammar:
D → type list ;
list → list , id | id
type → int | float
(a space terminal ‘ ’ separates type from list)

Remove left recursion

Grammar:
D → type list ;
list → id L’
L’ → , id L’ | ϵ
type → int | float

FIRST ( D ) = { int , float }       FOLLOW ( D ) = { $ }
FIRST ( list ) = { id }             FOLLOW ( list ) = { ; }
FIRST ( L’ ) = { , , ϵ }            FOLLOW ( L’ ) = { ; }
FIRST ( type ) = { int , float }    FOLLOW ( type ) = { ‘ ’ }

Parsing table (rows: nonterminals, columns: terminals, cells: production bodies)
Nonterminal   ;     id      ,         int           float         ‘ ’   $
D                                     type list ;   type list ;
list                id L’
L’            ϵ             , id L’
type                                  int           float

No cell contains more than one production, so the grammar is LL(1)

Bottom-Up Parsing
Construct parse tree for the input string starting at the leaves (bottom) and
working up towards the root (top) (reduction)

Grammar ( G ) : E → E + E | E * E | - E | ( E ) | id
String : id + id + id
[Figure: parse tree for id + id + id being built from the leaves up towards the root E]

Bottom-Up Parsing

Grammar :
E → E + T | T
T → T * F | F
F → ( E ) | id
String: id * id

String      Handle    Reducing production
id * id     id        F → id
F * id      F         T → F
T * id      id        F → id
T * F       T * F     T → T * F
T           T         E → T
Bottom-Up Parsing
Handle Pruning
❖ A rightmost derivation in reverse can be obtained by handle pruning

Different Bottom-Up Parsing Techniques
1. Shift-Reduce Parsing
2. Operator Precedence Parsing
3. LR Parsing
1) Simple LR ( SLR ; built from LR(0) items )
2) Canonical LR ( CLR ; built from LR(1) items )
3) Lookahead LR ( LALR )

1. Shift-Reduce Parsing
Stack holds grammar symbols
Input buffer holds the string to be parsed
Handle always appears at the top of the stack
Use $ to mark bottom of the stack and also the right end of the input
Process:
❖ During a left to right scan of the input string, shift zero or more input symbols onto the
stack, until it is ready to reduce a string β
❖ Then reduce β to the head (LHS) of the appropriate production
❖ Repeat this cycle until an error is detected or until the stack contains the start symbol and the input is
empty

1. Shift-Reduce Parsing
There are 4 possible actions

❖ Shift
Shift the next input symbol onto the top of the stack

❖ Reduce
Replace handle with LHS in the stack

❖ Accept
Parsing completed successfully

❖ Error
Discover a syntax error and call an error recovery routine

1. Shift-Reduce Parsing
Grammar :
E → E + E | E * E | id
String:
id + id * id

Stack           Input            Action
$               id + id * id $   Shift
$ id            + id * id $      Reduce E → id
$ E             + id * id $      Shift
$ E +           id * id $        Shift
$ E + id        * id $           Reduce E → id
$ E + E         * id $           Shift
$ E + E *       id $             Shift
$ E + E * id    $                Reduce E → id
$ E + E * E     $                Reduce E → E * E
$ E + E         $                Reduce E → E + E
$ E             $                Accept

1. Shift-Reduce Parsing
Conflicts during shift-reduce parsing

❖ Shift / reduce conflict

Cannot decide whether to shift or to reduce

❖ Reduce / reduce conflict

Cannot decide which of several reductions to make

2. Operator Precedence Parsing
Operator grammar
❖ The grammar has the property (among other essential requirements) that no
production right side is ϵ or has two adjacent nonterminals.

E.g. E → E A E | ( E ) | - E | id
A → + | - | * | / | ^
Not an operator grammar, as E A E has consecutive nonterminals

Equivalent operator grammar
E → E + E | E - E | E * E | E / E | E ^ E | ( E ) | - E | id

2. Operator Precedence Parsing
Define precedence relations between pairs of terminals
using the disjoint relation symbols ⋗ , ⋖ and ≐

❖ t1 ≐ t2 : t1 has the same priority as t2
❖ t1 ⋖ t2 : t1 has lower priority than t2
❖ t1 ⋗ t2 : t1 has higher priority than t2

t1 ⋖ t2 does not always imply t2 ⋗ t1

2. Operator Precedence Parsing
How to parse string (using operator precedence table)

1. Construct the operator precedence table
2. Place $ (an imaginary terminal) at the start and at the end of the string (marking each end
of the string)
3. Put the relation between each pair of adjacent symbols in the string
4. Scan the string from left to right until the first ⋗ is encountered
5. Then scan back over any ≐ until a ⋖ is encountered
6. The handle is everything between ⋖ and ⋗ ; reduce it to the LHS of the appropriate production

2. Operator Precedence Parsing
Grammar:
E → E + E | E * E | id
String:
$ id + id * id $

Operator precedence table (rows: left side, columns: right side)
      id   +    *    $
id         ⋗    ⋗    ⋗
+     ⋖    ⋗    ⋖    ⋗
*     ⋖    ⋗    ⋗    ⋗
$     ⋖    ⋖    ⋖
+ ⋗ + : the left + has higher priority than the right +

Each id is replaced with E; then the operators are compared
$ ⋖ id ⋗ + ⋖ id ⋗ * ⋖ id ⋗ $
$ ⋖ E + ⋖ id ⋗ * ⋖ id ⋗ $
$ ⋖ E + ⋖ E * ⋖ id ⋗ $
$ ⋖ E + ⋖ E * E ⋗ $
$ ⋖ E + E ⋗ $
$ E $

2. Operator Precedence Parsing
Algorithm : operator precedence parsing

❖ Method :
Initially the stack contains $ and the input buffer holds the string w$. To parse, execute the
program below

set ip to point to the first symbol of w$
repeat forever
    if $ is on top of the stack and ip points to $ then
        return
    else begin
        let a be the topmost terminal symbol on the stack and b the symbol pointed to by ip
        if a ⋖ b or a ≐ b then begin
            push b onto the stack
            advance ip to the next input symbol
        end
        else if a ⋗ b then
            repeat
                pop the stack
            until the top stack terminal is related by ⋖ to the terminal most recently popped
        else
            error()
    end
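
The algorithm can be sketched directly in Python. This version assumes the four-terminal precedence table for E → E + E | E * E | id, writes the relations as "<", "=" and ">", and keeps only terminals on the stack, since nonterminals do not influence the decisions. It returns the handles as lists of terminals; `PREC` and the function name are illustrative.

```python
# PREC[a][b] is the relation between stack terminal a and input terminal b.
PREC = {
    "id": {"+": ">", "*": ">", "$": ">"},
    "+":  {"id": "<", "+": ">", "*": "<", "$": ">"},
    "*":  {"id": "<", "+": ">", "*": ">", "$": ">"},
    "$":  {"id": "<", "+": "<", "*": "<"},
}

def op_precedence_parse(tokens):
    """Return the terminals of each handle, in the order the handles are reduced."""
    tokens = list(tokens) + ["$"]
    stack, pos, handles = ["$"], 0, []
    while True:
        a, b = stack[-1], tokens[pos]
        if a == "$" and b == "$":
            return handles
        relation = PREC[a].get(b)
        if relation in ("<", "="):
            stack.append(b)                  # shift
            pos += 1
        elif relation == ">":
            handle = []
            while True:                      # pop until top ⋖ last popped terminal
                popped = stack.pop()
                handle.insert(0, popped)
                if PREC[stack[-1]].get(popped) == "<":
                    break
            handles.append(handle)
        else:
            raise SyntaxError(f"no relation between {a} and {b}")

handles = op_precedence_parse(["id", "+", "id", "*", "id"])
print(handles)
```

The * handle is reduced before the + handle, which reflects + ⋖ * in the table. Because nonterminals are ignored, a parser of this kind can accept some strings that the grammar does not generate.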

2. Operator Precedence Parsing
Operator precedence function
❖ Precedence between “a” and “b” can be determined by a numerical comparison
of two functions f and g
❖ f(a) = g(b) if a ≐ b
❖ f(a) < g(b) if a ⋖ b
❖ f(a) > g(b) if a ⋗ b

2. Operator Precedence Parsing
Algorithm : construct precedence functions

❖ Input :
An operator precedence matrix

❖ Output :
Precedence functions representing the input matrix, or an indication that none exist

2. Operator Precedence Parsing
Algorithm : construct precedence functions

❖ Method :
1. Create symbols “fa” and “ga” for each terminal “a” and for $
2. Partition the created symbols into as many groups as possible, in such a way that if a ≐ b then
“fa” and “gb” are in the same group
3. Create a directed graph whose nodes are the groups found in step 2
for any “a” and “b”
if a ⋖ b then place an edge from the group of “gb” to the group of “fa”
if a ⋗ b then place an edge from the group of “fa” to the group of “gb”
4. If the graph constructed in step 3 has a cycle then no precedence functions exist. If there is no
cycle then let f(a) be the length of the longest path beginning at the group of “fa” and g(a) be
the length of the longest path beginning at the group of “ga”

2. Operator Precedence Parsing

Operator precedence table (rows: left side, f ; columns: right side, g)
      id   +    *    $
id         ⋗    ⋗    ⋗
+     ⋖    ⋗    ⋖    ⋗
*     ⋖    ⋗    ⋗    ⋗
$     ⋖    ⋖    ⋖

Draw an edge from the greater node to the lesser node
e.g. f(+) > g(+) so there is an edge from f+ to g+
[Figure: directed graph over the nodes fid , gid , f+ , g+ , f* , g* , f$ , g$]
For each node, find the longest path that reaches either f$ or g$

      id   +    *    $
f     4    2    4    0
g     5    1    3    0
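
The construction can be checked with a short sketch. It assumes the table contains no ≐ relation, so every symbol forms its own group and step 2 is skipped; relations are written as "<" and ">", and the names are illustrative.

```python
TABLE = {
    "id": {"+": ">", "*": ">", "$": ">"},
    "+":  {"id": "<", "+": ">", "*": "<", "$": ">"},
    "*":  {"id": "<", "+": ">", "*": ">", "$": ">"},
    "$":  {"id": "<", "+": "<", "*": "<"},
}

def precedence_functions(table):
    """Build the f/g graph and return (f, g) as longest-path lengths."""
    terminals = list(table)
    edges = {(kind, t): set() for kind in "fg" for t in terminals}
    for a, row in table.items():
        for b, relation in row.items():
            if relation == "<":              # a ⋖ b : edge g_b -> f_a
                edges[("g", b)].add(("f", a))
            elif relation == ">":            # a ⋗ b : edge f_a -> g_b
                edges[("f", a)].add(("g", b))

    memo, visiting = {}, set()

    def longest(node):
        if node in memo:
            return memo[node]
        if node in visiting:
            raise ValueError("cycle in graph: no precedence functions exist")
        visiting.add(node)
        memo[node] = max((1 + longest(nxt) for nxt in edges[node]), default=0)
        visiting.remove(node)
        return memo[node]

    f = {t: longest(("f", t)) for t in terminals}
    g = {t: longest(("g", t)) for t in terminals}
    return f, g

f, g = precedence_functions(TABLE)
print(f)
print(g)
```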

2. Operator Precedence Parsing
Parse string id + id * id

      id   +    *    $
f     4    2    4    0
g     5    1    3    0

$   id   +   id   *   id   $
0   5    2   5    4   5    0

$   E   +   E   *   E   $
0       2       4       0

$   E   +   E   $
0       2       0

$   E   $
0       0

2. Operator Precedence Parsing
operator precedence table
Grammar:
E → E + E | E – E | E * E | E / E | E ^ E | ( E ) | id

Rows: left side, columns: right side
      id   +    -    *    /    ^    (    )    $
id         ⋗    ⋗    ⋗    ⋗    ⋗         ⋗    ⋗
+     ⋖    ⋗    ⋗    ⋖    ⋖    ⋖    ⋖    ⋗    ⋗
-     ⋖    ⋗    ⋗    ⋖    ⋖    ⋖    ⋖    ⋗    ⋗
*     ⋖    ⋗    ⋗    ⋗    ⋗    ⋖    ⋖    ⋗    ⋗
/     ⋖    ⋗    ⋗    ⋗    ⋗    ⋖    ⋖    ⋗    ⋗
^     ⋖    ⋗    ⋗    ⋗    ⋗    ⋖    ⋖    ⋗    ⋗
(     ⋖    ⋖    ⋖    ⋖    ⋖    ⋖    ⋖    ≐
)          ⋗    ⋗    ⋗    ⋗    ⋗         ⋗    ⋗
$     ⋖    ⋖    ⋖    ⋖    ⋖    ⋖    ⋖

3. LR parsing
LR : L = left to right scan , R = rightmost derivation (constructed in reverse)

Can handle left-recursive grammars

Cannot handle ambiguous grammars

3. LR parsing
3 types

❖ Simple LR (SLR)
Built from LR(0) item sets

❖ Canonical LR (CLR)
Built from LR(1) item sets; handles LR(1) grammars

❖ Lookahead LR (LALR)

3. LR parsing : (1) SLR
Weakest of the 3 LR methods

Process
❖ Convert the grammar into an augmented grammar by adding one more start production S’ → S
❖ Construct LR(0) item sets
❖ Construct parsing table

3. LR parsing : (1) SLR
How to construct item set

❖ Production A → XYZ yields 4 items

A → . XYZ
A → X . YZ
A → XY . Z
A → XYZ .

❖ Production A → ϵ generates only one item, A → .

3. LR parsing : (1) SLR
How to construct item set

❖ Closure function
If I is a set of items for a grammar G then closure(I) is constructed by two rules
1. Initially every item in I is in closure(I)
2. If A → α . B β is in closure(I) and B → γ is a production, then B → . γ is added to closure(I);
repeat this rule until no new item can be added

❖ GOTO function
goto( I , x ) where “I” is an item set and “x” is a grammar symbol
goto( I , x ) = closure of the set of all items [ A → α x . β ] such that [ A → α . x β ] is in I
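
The two functions translate almost line for line into code. The sketch below assumes the augmented expression grammar E’ → E, E → E + T | T, T → T * F | F, F → ( E ) | id and represents an item as a pair (production number, dot position); the names are illustrative.

```python
GRAMMAR = [
    ("E'", ("E",)),            # 0
    ("E", ("E", "+", "T")),    # 1
    ("E", ("T",)),             # 2
    ("T", ("T", "*", "F")),    # 3
    ("T", ("F",)),             # 4
    ("F", ("(", "E", ")")),    # 5
    ("F", ("id",)),            # 6
]
NONTERMINALS = {head for head, _ in GRAMMAR}
SYMBOLS = sorted({sym for _, body in GRAMMAR for sym in body})

def closure(items):
    """Add B -> . γ for every nonterminal B that appears right after a dot."""
    items = set(items)
    work = list(items)
    while work:
        prod, dot = work.pop()
        body = GRAMMAR[prod][1]
        if dot < len(body) and body[dot] in NONTERMINALS:
            for i, (head, _) in enumerate(GRAMMAR):
                if head == body[dot] and (i, 0) not in items:
                    items.add((i, 0))
                    work.append((i, 0))
    return frozenset(items)

def goto(items, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(p, d + 1) for p, d in items
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved)

def canonical_collection():
    states = [closure({(0, 0)})]
    for state in states:                   # the list grows while it is traversed
        for symbol in SYMBOLS:
            target = goto(state, symbol)
            if target and target not in states:
                states.append(target)
    return states

I0 = closure({(0, 0)})
states = canonical_collection()
print(len(I0), len(states))
```

The state numbering produced by this traversal is not guaranteed to match the slide numbering; only the sets themselves are comparable.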

3. LR parsing : (1) SLR
How to construct parsing table

1. If S’ → S . is in Ii then set action[ i , $ ] to accept

2. If A → α . x β is in Ii , where “x” is a terminal, and goto( Ii , x ) = Ij then
set action[ i , x ] to “shift j”

3. If A → α . B β is in Ii , where B is a nonterminal, and goto( Ii , B ) = Ij then
set goto[ i , B ] to j

4. If A → α . is in Ii (A is not S’) then set action[ i , x ] to “reduce A → α” for all “x” in FOLLOW(A)

Number the productions (the numbers are needed in table generation)
0) E’ → E
1) E → E + T
2) E → T
3) T → T * F
4) T → F
5) F → ( E )
6) F → id

Find the FOLLOW set of every nonterminal
FOLLOW( E ) = { $ , ) , + }
FOLLOW( T ) = { $ , ) , + , * }
FOLLOW( F ) = { $ , ) , + , * }

3. LR parsing : (1) SLR
Construct LR(0) item sets
I0 starts with E’ → . E ; the dot is followed by the nonterminal E, so add the productions of E
E → . T has the dot before T, so add the productions of T; T → . F has the dot before F, so add the productions of F
Prepare goto for every symbol that follows a dot
A new item set gets a new name; an item set that matches a previous one gets the same name

I0 = E’ → . E
     E → . E + T
     E → . T
     T → . T * F
     T → . F
     F → . ( E )
     F → . id
I1 = goto( I0 , E ) = E’ → E .
     E → E . + T
I2 = goto( I0 , T ) = E → T .
     T → T . * F
I3 = goto( I0 , F ) = T → F .
I4 = goto( I0 , ( ) = F → ( . E )
     E → . E + T
     E → . T
     T → . T * F
     T → . F
     F → . ( E )
     F → . id
I5 = goto( I0 , id ) = F → id .
I6 = goto( I1 , + ) = E → E + . T
     T → . T * F
     T → . F
     F → . ( E )
     F → . id
I7 = goto( I2 , * ) = T → T * . F
     F → . ( E )
     F → . id
I8 = goto( I4 , E ) = F → ( E . )
     E → E . + T
I2 = goto( I4 , T )
I3 = goto( I4 , F )
I4 = goto( I4 , ( )
I5 = goto( I4 , id )
I9 = goto( I6 , T ) = E → E + T .
     T → T . * F
I3 = goto( I6 , F )
I4 = goto( I6 , ( )
I5 = goto( I6 , id )
I10 = goto( I7 , F ) = T → T * F .
I4 = goto( I7 , ( )
I5 = goto( I7 , id )
I11 = goto( I8 , ) ) = F → ( E ) .
I6 = goto( I8 , + )
I7 = goto( I9 , * )

3. LR parsing : (1) SLR
[Figure: transition diagram of the item sets I0 to I11, with edges labeled by grammar symbols according to the goto function, e.g. I0 –E→ I1 –+→ I6 –T→ I9 –*→ I7]

3. LR parsing : (1) SLR
I5 = goto( I0 , id ), so in ( 0 , id ) enter S5 (shift 5)
I1 = goto( I0 , E ), so in ( 0 , E ) enter 1
I3 has the item T → F . (dot at the end); the production number of T → F is 4 and
FOLLOW( T ) = { + , * , ) , $ }, so in ( 3 , + ) , ( 3 , * ) , ( 3 , ) ) and ( 3 , $ ) enter R4 (reduce 4)

           action                              goto
Item set   id    +     *     (     )     $     E    T    F
0          S5                S4                1    2    3
1                S6                      Acc
2                R2    S7          R2    R2
3                R4    R4          R4    R4
4          S5                S4                8    2    3
5                R6    R6          R6    R6
6          S5                S4                     9    3
7          S5                S4                          10
8                S6                S11
9                R1    S7          R1    R1
10               R3    R3          R3    R3
11               R5    R5          R5    R5
3. LR parsing : (1) SLR
The parsing table entry ( 0 , id ) is S5, so the action is shift:
shift one element from the input to the stack and place 5 after it on the stack
The parsing table entry ( 5 , * ) is R6, so the action is reduce with production 6 ( F → id ):
on the stack find id and replace it with F, so the stack becomes 0 F ;
the parsing table entry ( 0 , F ) is 3, so place 3 on the stack

Stack              Input            Action
0                  id * id + id $   Shift
0 id 5             * id + id $      Reduce by F → id
0 F 3              * id + id $      Reduce by T → F
0 T 2              * id + id $      Shift
0 T 2 * 7          id + id $        Shift
0 T 2 * 7 id 5     + id $           Reduce by F → id
0 T 2 * 7 F 10     + id $           Reduce by T → T * F
0 T 2              + id $           Reduce by E → T
0 E 1              + id $           Shift
0 E 1 + 6          id $             Shift
0 E 1 + 6 id 5     $                Reduce by F → id
0 E 1 + 6 F 3      $                Reduce by T → F
0 E 1 + 6 T 9      $                Reduce by E → E + T
0 E 1              $                Accept
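
The driver loop behind this trace is short. The following sketch hard-codes the SLR table of the expression grammar (action entries as strings such as "s5" and "r6", production numbers as in the numbered grammar) and keeps only state numbers on the stack; the encoding and names are assumptions for illustration.

```python
# Production number -> (head, length of body).
PRODS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3), 4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}
ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: {"+": "r4", "*": "r4", ")": "r4", "$": "r4"},
    4: {"id": "s5", "(": "s4"},
    5: {"+": "r6", "*": "r6", ")": "r6", "$": "r6"},
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: {"+": "r3", "*": "r3", ")": "r3", "$": "r3"},
    11: {"+": "r5", "*": "r5", ")": "r5", "$": "r5"},
}
GOTO = {0: {"E": 1, "T": 2, "F": 3}, 4: {"E": 8, "T": 2, "F": 3},
        6: {"T": 9, "F": 3}, 7: {"F": 10}}

def lr_parse(tokens):
    """Return the production numbers used in reductions, in order."""
    tokens = list(tokens) + ["$"]
    states, pos, reductions = [0], 0, []
    while True:
        action = ACTION[states[-1]].get(tokens[pos])
        if action is None:
            raise SyntaxError(f"unexpected {tokens[pos]} in state {states[-1]}")
        if action == "acc":
            return reductions
        number = int(action[1:])
        if action[0] == "s":
            states.append(number)          # shift: push the new state
            pos += 1
        else:
            head, length = PRODS[number]
            del states[-length:]           # pop one state per body symbol
            states.append(GOTO[states[-1]][head])
            reductions.append(number)

reductions = lr_parse(["id", "*", "id", "+", "id"])
print(reductions)
```

Read backwards, the list of reductions is a rightmost derivation of the input.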
3. LR parsing : (2) CLR
Process
❖ Convert the grammar into an augmented grammar by adding one more start production S’ → S
❖ Construct LR(1) item sets
❖ Construct parsing table

3. LR parsing : (2) CLR
How to construct item set

❖ Closure function
If I is a set of items for a grammar G then closure(I) is constructed by two rules
1. Initially every item in I is in closure of I
2. If A → α . B β , a is in closure(I) and B → γ is a production,
then B → . γ , b is added to closure(I) for every terminal b in FIRST(βa); repeat this rule until no new item can be added

❖ GOTO function
goto( I , x ) where “I” is an item set and “x” is a grammar symbol
goto( I , x ) = closure of the set of all items [ A → α x . β , a ] such that [ A → α . x β , a ] is in I

3. LR parsing : (2) CLR
How to construct parsing table

1. If S’ → S . , $ is in Ii then set action[ i , $ ] to accept

2. If A → α . x β , a is in Ii , where “x” is a terminal, and goto( Ii , x ) = Ij then
set action[ i , x ] to “shift j”

3. If A → α . B β , a is in Ii , where B is a nonterminal, and goto( Ii , B ) = Ij then
set goto[ i , B ] to j

4. If A → α . , a is in Ii (A is not S’) then set action[ i , a ] to “reduce A → α”

3. LR parsing : (2) CLR
Grammar:
S → C C
C → c C | d

Make augmented grammar

Augmented Grammar:
S’ → S
S → C C
C → c C | d

Number the productions (the numbers are needed in table generation)
0) S’ → S
1) S → C C
2) C → c C
3) C → d

Find the FIRST set of every nonterminal
FIRST ( S ) = { c , d }
FIRST ( C ) = { c , d }

3. LR parsing : (2) CLR
Construct LR(1) item sets
S’ → . S , $ needs to be compared with A → α . B β , a
here β is ϵ and a is $ , so FIRST(βa) = { $ }
$ is added as the lookahead of S → . C C

I0 = S’ → . S , $
     S → . C C , $
     C → . c C , c/d
     C → . d , c/d
I1 = goto( I0 , S ) = S’ → S . , $
I2 = goto( I0 , C ) = S → C . C , $
     C → . c C , $
     C → . d , $
I3 = goto( I0 , c ) = C → c . C , c/d
     C → . c C , c/d
     C → . d , c/d
I4 = goto( I0 , d ) = C → d . , c/d
I5 = goto( I2 , C ) = S → C C . , $
I6 = goto( I2 , c ) = C → c . C , $
     C → . c C , $
     C → . d , $
I7 = goto( I2 , d ) = C → d . , $
I8 = goto( I3 , C ) = C → c C . , c/d
I3 = goto( I3 , c )
I4 = goto( I3 , d )
I9 = goto( I6 , C ) = C → c C . , $
I6 = goto( I6 , c )
I7 = goto( I6 , d )
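
The LR(1) construction differs from the LR(0) one only in the lookahead carried by each item. The sketch below assumes the augmented grammar S’ → S, S → C C, C → c C | d, takes the FIRST sets as given, and relies on the fact that no nonterminal of this grammar derives ϵ (so FIRST of a string is FIRST of its first symbol). Items are triples (production number, dot position, lookahead); the names are illustrative.

```python
GRAMMAR = [("S'", ("S",)), ("S", ("C", "C")), ("C", ("c", "C")), ("C", ("d",))]
NONTERMINALS = {"S'", "S", "C"}
SYMBOLS = ["C", "S", "c", "d"]
FIRST = {"S": {"c", "d"}, "C": {"c", "d"}}

def first_of(symbols):
    """FIRST of a nonempty symbol string; valid only because nothing derives ϵ."""
    head = symbols[0]
    return FIRST[head] if head in NONTERMINALS else {head}

def closure(items):
    items = set(items)
    work = list(items)
    while work:
        prod, dot, look = work.pop()
        body = GRAMMAR[prod][1]
        if dot < len(body) and body[dot] in NONTERMINALS:
            for b in first_of(body[dot + 1:] + (look,)):     # FIRST(βa)
                for i, (head, _) in enumerate(GRAMMAR):
                    item = (i, 0, b)
                    if head == body[dot] and item not in items:
                        items.add(item)
                        work.append(item)
    return frozenset(items)

def goto(items, symbol):
    moved = {(p, d + 1, a) for p, d, a in items
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved)

def canonical_collection():
    states = [closure({(0, 0, "$")})]
    for state in states:                   # the list grows while it is traversed
        for symbol in SYMBOLS:
            target = goto(state, symbol)
            if target and target not in states:
                states.append(target)
    return states

I0 = closure({(0, 0, "$")})
states = canonical_collection()
print(len(states))
```

Each combined lookahead such as c/d corresponds to two separate triples in this representation.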
3. LR parsing : (2) CLR

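[Figure: transition diagram of the LR(1) item sets I0 to I9. I0 –S→ I1 , I0 –C→ I2 , I0 –c→ I3 , I0 –d→ I4 , I2 –C→ I5 , I2 –c→ I6 , I2 –d→ I7 , I3 –C→ I8 , I3 –c→ I3 , I3 –d→ I4 , I6 –C→ I9 , I6 –c→ I6 , I6 –d→ I7]
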
3. LR parsing : (2) CLR
I3 = goto( I0 , c ), so in ( 0 , c ) enter S3 (shift 3)
I2 = goto( I0 , C ), so in ( 0 , C ) enter 2
I4 contains the item C → d . , c/d
The dot is at the end, so a reduce entry is needed
C → d has production number 3
The lookaheads are c and d
So in ( 4 , c ) and ( 4 , d ) enter R3

           action            goto
Item set   c     d     $     S    C
0          S3    S4          1    2
1                      Acc
2          S6    S7               5
3          S3    S4               8
4          R3    R3
5                      R1
6          S6    S7               9
7                      R3
8          R2    R2
9                      R2

3. LR parsing : (3) LALR
Process
❖ Convert the grammar into an augmented grammar by adding one more start production S’ → S
❖ Construct LR(1) item sets
❖ Combine the item sets that have the same core but different lookaheads
❖ Construct parsing table

3. LR parsing : (3) LALR
Grammar:
S → C C
C → c C | d

Make augmented grammar

Augmented Grammar:
S’ → S
S → C C
C → c C | d

Number the productions (the numbers are needed in table generation)
0) S’ → S
1) S → C C
2) C → c C
3) C → d

Find the FIRST set of every nonterminal
FIRST ( S ) = { c , d }
FIRST ( C ) = { c , d }

3. LR parsing : (3) LALR
Construct LR(1) item sets and combine the sets that have the same core

I0 = S’ → . S , $
     S → . C C , $
     C → . c C , c/d
     C → . d , c/d
I1 = S’ → S . , $
I2 = S → C . C , $
     C → . c C , $
     C → . d , $
I5 = S → C C . , $

I3 = C → c . C , c/d          I6 = C → c . C , $
     C → . c C , c/d               C → . c C , $
     C → . d , c/d                 C → . d , $
I36 = C → c . C , c/d/$
      C → . c C , c/d/$
      C → . d , c/d/$

I4 = C → d . , c/d            I7 = C → d . , $
I47 = C → d . , c/d/$

I8 = C → c C . , c/d          I9 = C → c C . , $
I89 = C → c C . , c/d/$

3. LR parsing : (3) LALR

           action            goto
Item set   c     d     $     S    C
0          S36   S47         1    2
1                      Acc
2          S36   S47              5
36         S36   S47              89
47         R3    R3    R3
5                      R1
89         R2    R2    R2
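
The merging step itself can be sketched independently of the item-set construction. Here the ten LR(1) item sets of the grammar S → C C, C → c C | d are written out by hand as dictionaries from core item to lookahead set, and sets with equal cores are combined; the representation and names are illustrative.

```python
# State number -> {core item: set of lookaheads}.
LR1_STATES = {
    0: {"S' → .S": {"$"}, "S → .CC": {"$"}, "C → .cC": {"c", "d"}, "C → .d": {"c", "d"}},
    1: {"S' → S.": {"$"}},
    2: {"S → C.C": {"$"}, "C → .cC": {"$"}, "C → .d": {"$"}},
    3: {"C → c.C": {"c", "d"}, "C → .cC": {"c", "d"}, "C → .d": {"c", "d"}},
    4: {"C → d.": {"c", "d"}},
    5: {"S → CC.": {"$"}},
    6: {"C → c.C": {"$"}, "C → .cC": {"$"}, "C → .d": {"$"}},
    7: {"C → d.": {"$"}},
    8: {"C → cC.": {"c", "d"}},
    9: {"C → cC.": {"$"}},
}

def merge_by_core(states):
    """Combine states whose items agree when lookaheads are ignored."""
    merged = {}
    for number, items in sorted(states.items()):
        core = frozenset(items)                    # the item strings only
        names, lookaheads = merged.setdefault(core, ([], {}))
        names.append(str(number))
        for item, look in items.items():
            lookaheads.setdefault(item, set()).update(look)
    # Name each merged state by joining the numbers of its members, e.g. "36".
    return {"".join(names): looks for names, looks in merged.values()}

lalr_states = merge_by_core(LR1_STATES)
print(sorted(lalr_states))
```

Ten LR(1) states shrink to seven LALR states, which is why the LALR table has rows 36, 47 and 89 in place of 3, 4, 6, 7, 8 and 9.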

Using Ambiguous Grammar
Every ambiguous grammar fails to be LR

Certain types of ambiguous grammars are useful in the specification and

implementation of languages

An ambiguous grammar can provide a shorter, more natural specification than any
equivalent unambiguous grammar

Using Ambiguous Grammar
Ambiguous grammar
E → E + E | E * E | ( E ) | id
❖ The grammar is ambiguous because it does not specify the associativity and
precedence of the operators + and *

Equivalent unambiguous grammar
E → E + T | T
T → T * F | F
F → ( E ) | id
❖ Generates the same language but gives + a lower precedence than * and makes
both operators left associative

Using Ambiguous Grammar
Reasons one might want to use an ambiguous grammar instead of an unambiguous one

1. The associativities and precedence levels of the operators can be changed easily without
disturbing the productions of the ambiguous grammar or the number of states in the
resulting parser.

2. A parser for the unambiguous grammar will spend a large fraction of its time reducing by the
productions E → T and T → F.
The parser for the ambiguous grammar will not waste time reducing by these single
productions.

Syntax Analyzer Generator YACC

[Figure: Yacc source file (.y) → Yacc compiler → y.tab.c ; y.tab.c → C compiler → a.out ; input stream → a.out → sequence of tokens]

Structure of Yacc Program
declarations
%%
translation rules
%%
Supporting C routines

No ratings yet
Replacing A Relative Clause by A Participle Construction
2 pages
Preposition Worksheet
No ratings yet
Preposition Worksheet
1 page
2030 Size
No ratings yet
2030 Size
444 pages
FCE Writing Plan Template
No ratings yet
FCE Writing Plan Template
1 page
Edited Year 8 Workbook-1
No ratings yet
Edited Year 8 Workbook-1
116 pages
PhilRice Style Guide
No ratings yet
PhilRice Style Guide
49 pages
The Preposition Aus German
No ratings yet
The Preposition Aus German
4 pages
Conjunctions - Exercises
No ratings yet
Conjunctions - Exercises
2 pages
Test Reported Speech 9 2
No ratings yet
Test Reported Speech 9 2
1 page
Narative Text
No ratings yet
Narative Text
23 pages
Types of Translation, by Dr. Shadia Yousef Banjar
No ratings yet
Types of Translation, by Dr. Shadia Yousef Banjar
13 pages
Enclosure No. 2 To Division Memorandum No.
No ratings yet
Enclosure No. 2 To Division Memorandum No.
8 pages
Repaso II de Ingles - Admision Ps
No ratings yet
Repaso II de Ingles - Admision Ps
9 pages
English HSSC-I Pre-board exam 2025
No ratings yet
English HSSC-I Pre-board exam 2025
6 pages
The Semantic Representation of Causation and Agentivity Thomason
No ratings yet
The Semantic Representation of Causation and Agentivity Thomason
16 pages
Writing Simple Sentences Correctly Using Colourful Substitution Table
No ratings yet
Writing Simple Sentences Correctly Using Colourful Substitution Table
20 pages
Cavite State University: College of Arts and Sciences Department of Languages and Mass Communication
No ratings yet
Cavite State University: College of Arts and Sciences Department of Languages and Mass Communication
9 pages
VALYRIAN
100% (1)
VALYRIAN
9 pages
Grammar Booklet
No ratings yet
Grammar Booklet
40 pages
Preterite Stem-Changing Verbs II
No ratings yet
Preterite Stem-Changing Verbs II
17 pages
Courseguide Junior IELTS 5.0
No ratings yet
Courseguide Junior IELTS 5.0
6 pages
The Students' Ability To Identify Nominal and Verbal Sentences in English of Grade Viii
No ratings yet
The Students' Ability To Identify Nominal and Verbal Sentences in English of Grade Viii
6 pages
Capítulo 6 SPAN
No ratings yet
Capítulo 6 SPAN
22 pages
Intensifiers: Quite
No ratings yet
Intensifiers: Quite
3 pages
Actividad 1.1 English IV
No ratings yet
Actividad 1.1 English IV
4 pages
Descriptive Text
No ratings yet
Descriptive Text
3 pages
Gerund and Infinitive Forms: C. They Will Consider Granting You Money
No ratings yet
Gerund and Infinitive Forms: C. They Will Consider Granting You Money
25 pages

syntax analysis

Uploaded by

syntax analysis

Uploaded by

Syntax Analysis

expression → expression + term

Symbols for nonterminals

u, v, …, z represent strings of terminals

α, β, γ, … represent strings of grammar symbols (terminals and/or nonterminals)

Unless stated otherwise, the head of the first production is the start symbol

The language generated by a CFG is called a CFL (context-free language)

Leftmost derivation

Everything that can be described by a RE can be described by a grammar, but not vice versa

Every regular language is a context-free language, but not vice versa

If A → αβ1 | αβ2 are two productions of A and

i/p : Non-left-factored grammar : A → αβ1 | αβ2

o/p : Left-factored grammar : A → αA’ , A’ → β1 | β2
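The transformation above can be sketched in Python. This is a minimal illustration, not the slides' own algorithm: the function name `left_factor_once`, the tuple representation of production bodies, and the use of the empty tuple for ϵ are all assumptions made for the example. It performs a single factoring pass for one nonterminal; a full left-factoring routine would repeat the pass until no two alternatives share a prefix.

```python
from collections import defaultdict

def left_factor_once(head, bodies):
    """One left-factoring pass for a single nonterminal (illustrative sketch).

    bodies is a list of tuples of grammar symbols; () stands for epsilon.
    Alternatives that start with the same symbol are grouped, and the longest
    common prefix of each group is pulled out into a new primed nonterminal.
    """
    groups = defaultdict(list)
    for body in bodies:
        groups[body[:1]].append(body)          # key is () for an epsilon body

    result = {head: []}
    fresh = head
    for key, group in groups.items():
        if key == () or len(group) < 2:
            result[head].extend(group)         # nothing to factor here
            continue
        prefix = group[0]
        for body in group[1:]:                 # shrink to the common prefix
            n = 0
            while n < min(len(prefix), len(body)) and prefix[n] == body[n]:
                n += 1
            prefix = prefix[:n]
        fresh += "'"                           # A', A'', ... (hypothetical naming)
        result[head].append(prefix + (fresh,))
        result[fresh] = [body[len(prefix):] for body in group]
    return result

# A -> alpha beta1 | alpha beta2   becomes   A -> alpha A',  A' -> beta1 | beta2
print(left_factor_once("A", [("alpha", "beta1"), ("alpha", "beta2")]))

# Dangling-else style grammar: S -> i E t S | i E t S e S | a
print(left_factor_once("S", [("i", "E", "t", "S"),
                             ("i", "E", "t", "S", "e", "S"),
                             ("a",)]))
```

In the second call the shorter alternative is entirely consumed by the common prefix, so the new nonterminal gets an ϵ alternative, shown as the empty tuple.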

Universal Top-down Bottom-up

• Scan string from left to right

FIRST(α) = FIRST(β) ∪ FIRST(γ)

FOLLOW(α) FIRST(γ) ∪ FOLLOW(γ)
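As a rough illustration of how FIRST and FOLLOW sets are usually computed, the sketch below iterates the standard textbook rules until nothing changes. The function names, the dictionary representation of the grammar, the empty tuple for an ϵ body, and the example expression grammar are assumptions chosen for the example, not taken from the slides.

```python
EPS, END = "ε", "$"

def first_of(symbols, first):
    """FIRST of a string of grammar symbols, given FIRST of each nonterminal."""
    out = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})      # a terminal's FIRST is itself
        out |= sym_first - {EPS}
        if EPS not in sym_first:
            return out
    out.add(EPS)                               # every symbol can derive epsilon
    return out

def first_follow(grammar, start):
    """Fixed-point computation of FIRST and FOLLOW for every nonterminal."""
    first = {nt: set() for nt in grammar}
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                new = first_of(body, first)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
                for i, sym in enumerate(body):
                    if sym not in grammar:     # only nonterminals have FOLLOW
                        continue
                    rest = first_of(body[i + 1:], first)
                    new = rest - {EPS}
                    if EPS in rest:            # the tail can vanish
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return first, follow

# Example grammar (an assumption): the usual non-left-recursive expression grammar.
grammar = {
    "E":  [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T":  [("F", "T'")],
    "T'": [("*", "F", "T'"), ()],
    "F":  [("(", "E", ")"), ("id",)],
}
first, follow = first_follow(grammar, "E")
for nt in grammar:
    print(nt, sorted(first[nt]), sorted(follow[nt]))
```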

“k” input symbols of lookahead

Multiple productions in one cell, so the grammar is not LL(1)
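The check can be illustrated with a small Python sketch that fills a predictive parsing table and reports every cell holding more than one production. The grammar below is the left-factored dangling-else grammar commonly used for this point; it, the hand-written FIRST and FOLLOW sets, and all function names are assumptions for the example.

```python
EPS = "ε"

def first_of(symbols, first):
    """FIRST of a string of grammar symbols, given FIRST of each nonterminal."""
    out = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})      # a terminal's FIRST is itself
        out |= sym_first - {EPS}
        if EPS not in sym_first:
            return out
    out.add(EPS)
    return out

def ll1_table(grammar, first, follow):
    """Map (nonterminal, terminal) to the list of productions placed in that cell."""
    table = {}
    for head, bodies in grammar.items():
        for body in bodies:
            f = first_of(body, first)
            lookaheads = f - {EPS}
            if EPS in f:                       # body can vanish: use FOLLOW(head)
                lookaheads |= follow[head]
            for t in lookaheads:
                table.setdefault((head, t), []).append(body)
    return table

def conflicts(table):
    """Cells with more than one production; empty means the grammar is LL(1)."""
    return {cell: prods for cell, prods in table.items() if len(prods) > 1}

# S -> i E t S S' | a ;  S' -> e S | epsilon ;  E -> b   (() stands for epsilon)
grammar = {
    "S":  [("i", "E", "t", "S", "S'"), ("a",)],
    "S'": [("e", "S"), ()],
    "E":  [("b",)],
}
first = {"S": {"i", "a"}, "S'": {"e", EPS}, "E": {"b"}}
follow = {"S": {"$", "e"}, "S'": {"$", "e"}, "E": {"t"}}

print(conflicts(ll1_table(grammar, first, follow)))
```

Here the cell for S' on input e receives both S' → e S and S' → ϵ, which is why this grammar is not LL(1).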

Remove left recursion
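A minimal sketch of the usual rewrite for immediate left recursion, A → Aα | β becoming A → βA’ and A’ → αA’ | ϵ, is shown below. The function name and the tuple representation are assumptions for the example, and the sketch does not deal with indirect left recursion or with a production of the form A → A.

```python
def remove_immediate_left_recursion(head, bodies):
    """Rewrite A -> A alpha | beta as A -> beta A', A' -> alpha A' | epsilon.

    bodies is a list of tuples of grammar symbols; () stands for epsilon.
    """
    alphas = [b[1:] for b in bodies if b[:1] == (head,)]   # tails of recursive bodies
    betas = [b for b in bodies if b[:1] != (head,)]        # non-recursive bodies
    if not alphas:
        return {head: list(bodies)}                        # nothing to do
    new = head + "'"                                       # hypothetical fresh name
    return {
        head: [beta + (new,) for beta in betas],
        new: [alpha + (new,) for alpha in alphas] + [()],
    }

# E -> E + T | T   becomes   E -> T E',  E' -> + T E' | epsilon
print(remove_immediate_left_recursion("E", [("E", "+", "T"), ("T",)]))
```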

String Handle Reducing production

❖ Shift / reduce conflict

❖ Reduce / reduce conflict

E.g. E → EAE | (E) | -E | id

E → E+E | E-E | E*E | E/E | E^E | (E) | -E | id (equivalent operator grammar)

❖ t1 ≐ t2 : t1 has the same priority as t2

t1 ⋖ t2 and t2 ⋗ t1 are not always equivalent

1. Construct operator precedence table

Can handle left-recursive grammars

Cannot handle ambiguous grammars

❖ Production A → XYZ yields 4 items

❖ Production A → ϵ generates only one item, A → .
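The count follows from placing the dot at every position of the body, as this short sketch shows. The representation of an item as a (head, symbols before the dot, symbols after the dot) triple and the helper names are assumptions for the example.

```python
def lr0_items(head, body):
    """All LR(0) items of one production: one per dot position in the body."""
    return [(head, body[:i], body[i:]) for i in range(len(body) + 1)]

def show(item):
    """Render an item with '·' marking the dot position."""
    head, before, after = item
    return f"{head} → {' '.join(before + ('·',) + after)}"

for item in lr0_items("A", ("X", "Y", "Z")):
    print(show(item))

# An epsilon production has an empty body, so there is exactly one item.
for item in lr0_items("A", ()):
    print(show(item))
```

A body of length n always gives n + 1 items, which is 4 for XYZ and 1 for ϵ.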

1. If S’ → S . is in Ii then set action[ i , $ ] to accept

2. If A → α . x β is in Ii , where “x” is terminal , and goto( Ii , x ) = Ij then

3. If A → α . B β is in Ii , where B is nonterminal then

4. If A → α . is in Ii then set action[ i , x ] to “reduce A → α” for all “x” in FOLLOW(A)
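The four rules can be tried out with the sketch below, which builds the LR(0) item sets and then fills the action and goto tables. It assumes the standard textbook completion of rules 2 and 3 (a shift to state j on a terminal, a goto entry j on a nonterminal). The function names, the state numbering, the example expression grammar, and the hand-supplied FOLLOW sets are all assumptions for the example, so state numbers need not match any numbering used elsewhere.

```python
def closure(items, grammar):
    """Items are (head, body, dot) triples; add items for nonterminals after the dot."""
    items, work = set(items), list(items)
    while work:
        _, body, dot = work.pop()
        if dot < len(body) and body[dot] in grammar:
            for prod in grammar[body[dot]]:
                item = (body[dot], prod, 0)
                if item not in items:
                    items.add(item)
                    work.append(item)
    return frozenset(items)

def goto(items, symbol, grammar):
    moved = {(h, b, d + 1) for h, b, d in items if d < len(b) and b[d] == symbol}
    return closure(moved, grammar)

def slr_table(grammar, start, follow):
    aug = start + "'"                                   # augmented start symbol
    g = dict(grammar)
    g[aug] = [(start,)]
    states, trans, i = [closure({(aug, (start,), 0)}, g)], {}, 0
    while i < len(states):                              # states grows as we go
        for x in sorted({b[d] for _, b, d in states[i] if d < len(b)}):
            target = goto(states[i], x, g)
            if target not in states:
                states.append(target)
            trans[i, x] = states.index(target)
        i += 1
    action, goto_tab = {}, {}
    for i, state in enumerate(states):
        for h, b, d in state:
            if d < len(b) and b[d] in g:
                goto_tab[i, b[d]] = trans[i, b[d]]      # rule 3
            elif d < len(b):                            # rule 2
                action.setdefault((i, b[d]), set()).add(("shift", trans[i, b[d]]))
            elif h == aug:
                action.setdefault((i, "$"), set()).add(("accept",))   # rule 1
            else:
                for x in follow[h]:                     # rule 4
                    action.setdefault((i, x), set()).add(("reduce", h, b))
    return states, action, goto_tab

def accepts(tokens, action, goto_tab):
    """Table-driven shift-reduce driver; False on error or on a conflicting cell."""
    stack, pos, tokens = [0], 0, list(tokens) + ["$"]
    while True:
        entry = action.get((stack[-1], tokens[pos]))
        if not entry or len(entry) != 1:
            return False
        kind, *rest = next(iter(entry))
        if kind == "accept":
            return True
        if kind == "shift":
            stack.append(rest[0])
            pos += 1
        else:
            head, body = rest
            del stack[len(stack) - len(body):]          # pop one state per body symbol
            stack.append(goto_tab[stack[-1], head])

grammar = {
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}
follow = {"E": {"$", ")", "+"}, "T": {"$", ")", "+", "*"}, "F": {"$", ")", "+", "*"}}
states, action, goto_tab = slr_table(grammar, "E", follow)
print(len(states), accepts(["id", "*", "id", "+", "id"], action, goto_tab))
```

Action cells are stored as sets so that a shift/reduce or reduce/reduce conflict shows up as a cell with more than one entry instead of being silently overwritten.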

Number the productions: 0) E’ → E

Find the FOLLOW sets: FOLLOW( E ) = { $ , ) , + }

I10 = goto( I7 , F ) = T → T*F .

1. If S’ → S . , $ is in Ii then set action[ i , $ ] to accept

2. If A → α . x β , a is in Ii , where “x” is terminal , and goto( Ii , x ) = Ij then

3. If A → α . B β is in Ii , where B is nonterminal then

4. If A → α . , a is in Ii then set action[ i , a ] to “reduce A → α”

Number the productions: 0) S’ → S

Find the FIRST sets: FIRST( S ) = { c , d }

Construct LR(1) item sets: I4 = goto( I0 , d ) = C → d. , c/d ; I3 = goto( I3 , c ) = C → c.C , c/d

Item action goto

Certain types of ambiguous grammars are useful in the specification and

Ambiguous grammar provides a shorter, more natural specification than any

❖ Grammar is ambiguous because it does

❖ Generates same language but give + a

2. Unambiguous grammar will spend a large fraction of time reducing by the

[Figure: creating a parser with Yacc. Labels: Yacc source, Yacc, input stream, a.out, sequence of tokens]
