CD 2,3 Unit's Material
By mani gopinadh
3 Syntax Analysis
Syllabus
Syntax analysis - Role of a parser - Classification of parsing techniques - Top down
parsing - First and Follow - LL(1) Grammars, Non-recursive predictive parsing - Error
recovery in predictive parsing.
Contents
3.1 Introduction
3.2 Role of Parser
3.3 Context Free Grammar (CFG) . . . Aug/Sept.-07, Set-2, Marks 8;
    May-08,09, Set-2,4; Aug/Sept.-08, Set-1; April-09, Set-2, Marks 8
3.4 Classification of Parsing Techniques
3.5 Top Down Parsing . . . May-08,09, Set-1,2,4; Aug/Sept.-08, Set-1;
    April-09, Set-2; January-10, Set-4, Marks 8
3.6 Recursive Descent Parser . . . May-05,09, Set-4,3; Nov.-03, Set-3, Marks 8
3.7 LL(1) Grammars
3.8 Non Recursive Predictive Parsing
3.9 Strategies to Recover from Syntactic Errors
3.10 Error Recovery in Predictive Parsing
3.1 Introduction
The syntax analysis is the second phase in compilation. The syntax analyzer
basically checks for the syntax of the language. A syntax analyzer takes the tokens
from the lexical analyzer and groups them in such a way that some programming
structure (syntax) can be recognized. After grouping the tokens, if at all any syntax
cannot be recognized then a syntactic error will be generated. This overall process is
called syntax checking of the language.
Definition of parser
A parsing or syntax analysis is a process which takes the input string w and produces
either a parse tree (syntactic structure) or generates the syntactic errors.
For example :
a = b + 10;
The above programming statement is first given to lexical analyzer. The lexical
analyzer will divide it into group of tokens. The syntax analyzer takes the tokens as
input, and generates a tree like structure called parse tree.
[Figure omitted: parse tree of the statement]
Fig. 3.1.1 Parse tree for a = b + 10
The parse tree drawn above is for some programming statement. It shows how a
statement gets parsed according to its syntactic specification.
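The two stages described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the token kinds (`id`, `num`), the helper names `tokenize` and `parse_assignment`, and the nested-tuple shape of the tree are not from the text, and the parser accepts only the single statement form `id = operand + operand ;`.

```python
import re

# One token per match: a number, an identifier, or any other single character.
TOKEN_RE = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|(.))")

def tokenize(text):
    """Lexical analysis: split the statement into (kind, lexeme) tokens."""
    tokens = []
    for number, name, symbol in TOKEN_RE.findall(text):
        if number:
            tokens.append(("num", number))
        elif name:
            tokens.append(("id", name))
        elif symbol.strip():                 # skip stray whitespace
            tokens.append((symbol, symbol))
    return tokens

def parse_assignment(tokens):
    """Syntax analysis for the single form:  id = operand + operand ;"""
    kinds = [kind for kind, _ in tokens]
    lexemes = [lexeme for _, lexeme in tokens]
    well_formed = (
        len(tokens) == 6
        and kinds[0] == "id" and kinds[1] == "="
        and kinds[2] in ("id", "num") and kinds[3] == "+"
        and kinds[4] in ("id", "num") and kinds[5] == ";"
    )
    if not well_formed:
        raise SyntaxError("expected: id = operand + operand ;")
    # The tree is a nested tuple: (operator, left subtree, right subtree).
    return ("=", lexemes[0], ("+", lexemes[2], lexemes[4]))

tree = parse_assignment(tokenize("a = b + 10;"))
```

Here `tree` is `("=", "a", ("+", "b", "10"))`, and a statement such as `a = + 10;` raises `SyntaxError` instead of producing a tree.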
3.1.1 Basic Issues in Parsing
* There are two important issues in parsing :
i) Specification of syntax
ii) Representation of input after parsing.
* A very important issue in parsing is specification of syntax in programming
language. Specification of syntax means how to write any programming statement.
There are certain characteristics of specification of syntax :
i) This specification should be precise and unambiguous.
ii) This specification should be in detail, i.e. it should cover all the details of the
programming language.
iii) This specification should be complete.
Such a specification is called "Context Free Grammar".
TECHNICAL PUBLICATIONS - An up thrust for knowledge
* Another important issue in parsing is representation of the input after parsing.
This is important because all the subsequent phases of compiler take the
information from the parse tree being generated, and because the information
carried by an input programming statement should not change after the syntax
tree is built for it.
* Lastly the most crucial issue is the parsing algorithm based on which we get the
parse tree for the given input. We will discuss different approaches to parsing :
Top-down and bottom-up. And we will study parsing algorithms concerning
these approaches. We are mainly interested in following issues -
* How these algorithms work ?
* Are they efficient in nature ?
* What are their merits and limitations ?
* Kind of input they require.
Keep one thing in mind that we are now interested in simply syntax of the language.
We are not interested in meaning right now. Hence we will do only syntax analysis at
this phase. Checking of the meaning (semantic) from syntactically correct input will be
studied in the next phase of compilation.
The modified view of front end is as shown below.
[Figure omitted: the parser and the semantic analyzer connected through module interfaces, with an error handler, producing intermediate code]
Fig. 3.1.2 Front end of compiler
3.1.2 Why Lexical and Syntax Analyzer are Separated Out ?
[Figure omitted: the lexical analyzer reads the input source program; the parser demands a token and the lexical analyzer supplies it]
The lexical analyzer scans the input program and collects the tokens from it. On the
other hand the parser builds a parse tree using these tokens. These are two important
activities and these activities are independently carried out by these two phases.
Separating out these two phases has two advantages - firstly it accelerates the process
of compilation and secondly the errors in the source input can be identified precisely.
3.2 Role of Parser
In the process of compilation the parser and lexical analyzer work together. That
means, when the parser requires a string of tokens it invokes the lexical analyzer. In
turn, the lexical analyzer supplies tokens to the syntax analyzer (parser).
[Figure omitted: the source program enters the lexical analyzer, which supplies tokens to the parser on demand; the parser passes its output to the rest of the compiler; the symbol table and the error handler are shared]
Fig. 3.2.1 Role of parser
The parser collects sufficient number of tokens and builds a parse tree. Thus by
building the parse tree, the parser finds the syntactical errors if any. It is also
necessary that the parser should recover from commonly occurring errors so that
the remaining task of processing the input can be continued.
3.3 Context Free Grammar (CFG)
The context free grammar G is a collection of following things.
1. V is a set of non-terminals.
2. T is a set of terminals.
3. S is a start symbol.
4. P is a set of production rules.
Thus G can be represented as G = (V, T, P, S)
The production rules are given in following form -
Non-terminal → (V ∪ T)*
Example : Let the language L = a^n b^n where n ≥ 1. Let G = (V, T, P, S)
where V = {S},
T = {a, b}
and S is a start symbol then, give the production rules.
Solution :
P = { S → aSb
      S → ab }
The production rules actually define the language a^n b^n.
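To see that the two rules S → aSb and S → ab really produce strings of the form a^n b^n, a small sketch can apply them mechanically. The function name `generate` is made up for this illustration, and the sketch relies on every sentential form here containing at most one S.

```python
def generate(n_max):
    """Yield a^n b^n for n = 1 .. n_max by applying the productions.

    S -> aSb is applied n - 1 times, then S -> ab finishes the derivation.
    """
    for n in range(1, n_max + 1):
        form = "S"
        for _ in range(n - 1):
            form = form.replace("S", "aSb")   # S -> aSb
        form = form.replace("S", "ab")        # S -> ab
        yield form

strings = list(generate(3))
```

Each generated string has exactly as many a's as b's, with all a's first, because every application of S → aSb adds one a on the left and one b on the right.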
The non-terminal symbol occurs at the left hand side. These are the symbols which
need to be expanded. The terminal symbols are nothing but the tokens used in the
language. Thus any language construct can be defined by the context free grammar. For
example if we want to define the declarative sentence using context free grammar then
it could be as follows,
State → Type List Terminator
Type → int | float
List → List , id
List → id
Terminator → ;
Using above rules we can derive,
[Figure omitted: parse tree with root State and children Type, List and Terminator]
Fig. 3.3.1 Parse tree for derivation of int id, id, id ;
Hence int id, id, id ; can be defined by means of above context free grammar.
Following rules are to be followed while writing a CFG.
1. A single non-terminal should be at L.H.S.
2. The rule should be always in the form of L.H.S. → R.H.S. where R.H.S. may be the
combination of non-terminal and terminal symbols.
3. The NULL derivation can be specified as NT → ε.
4. One of the non-terminals should be start symbol and conventionally we should
write the rules for this non-terminal.
3.3.1 Derivation and Parse Trees
Derivation from S means generation of string w from S. For constructing a derivation
two things are important.
i) Choice of non-terminal from several others.
ii) Choice of rule from production rules for corresponding non-terminal.
Definition of derivation tree
Let G = (V, T, P, S) be a Context Free Grammar.
The derivation tree is a tree which can be constructed by following properties,
i) The root has label S.
ii) Every vertex has a label from (V ∪ T ∪ {ε}).
iii) If there exists a vertex A with children R1, R2, ..., Rn then there should be
production A → R1 R2 ... Rn.
iv) The leaf nodes are from set T and interior nodes are from set V.
Instead of choosing the arbitrary non-terminal one can choose
i) Either leftmost non-terminal in a sentential form, then it is called leftmost
derivation.
ii) Or rightmost non-terminal in a sentential form, then it is called rightmost
derivation.
Example : Consider the grammar given below -
E → E+E | E-E | E*E | E/E | a | b
Obtain leftmost and rightmost derivation for the string a + b * a + b.
Solution : Leftmost derivation
E
E+E
E+E+E
a+E+E
a+E*E+E
a+b*E+E
a+b*a+E
a+b*a+b
Rightmost derivation
E
E+E
E+E+E
E+E+b
E+E*E+b
E+E*a+b
E+b*a+b
a+b*a+b
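The difference between the two derivations is only which non-terminal gets rewritten at each step. The sketch below replays a list of chosen right-hand sides and records every sentential form; the function name `derive` and the `leftmost` flag are illustrative, and the grammar is stored with single-character symbols, which is an assumption that happens to fit this example.

```python
GRAMMAR = {"E": ["E+E", "E-E", "E*E", "E/E", "a", "b"]}

def derive(start, choices, leftmost=True):
    """Rewrite the leftmost (or rightmost) non-terminal with each chosen right-hand side."""
    form = start
    forms = [form]
    for rhs in choices:
        positions = [i for i, ch in enumerate(form) if ch in GRAMMAR]
        if not positions:
            raise ValueError("no non-terminal left to expand")
        pos = positions[0] if leftmost else positions[-1]
        if rhs not in GRAMMAR[form[pos]]:
            raise ValueError(f"{form[pos]} -> {rhs} is not a production")
        form = form[:pos] + rhs + form[pos + 1:]
        forms.append(form)
    return forms

left = derive("E", ["E+E", "E+E", "a", "E*E", "b", "a", "b"], leftmost=True)
right = derive("E", ["E+E", "E+E", "b", "E*E", "a", "b", "a"], leftmost=False)
```

Both lists end in `a+b*a+b`, and their intermediate forms are the sentential forms of the leftmost and rightmost derivations of that string.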
Example : Consider the grammar
S → (L) | a
L → L, S | S
a) What are the terminals, non-terminals and start symbol ?
b) Find parse trees for the following sentences :
i) (a, a)
ii) (a, (a, a))
iii) (a, ((a, a), (a, a)))
c) Construct a leftmost derivation for each of the sentences in (b).
d) Construct a rightmost derivation for each of the sentences in (b).
e) What language does the grammar generate ?
Solution : a) The terminals are a, the comma and the two parentheses ( and ).
The non-terminals are V = {L, S}
The start symbol is S.
b) The parse trees for i) (a, a), ii) (a, (a, a)) and iii) (a, ((a, a), (a, a))) are
[Figures omitted: one parse tree per sentence, each with root S]
c) Leftmost derivations
i) (a, a)
S ⇒ (L) ⇒ (L, S) ⇒ (S, S) ⇒ (a, S) ⇒ (a, a)
ii) (a, (a, a))
S ⇒ (L) ⇒ (L, S) ⇒ (S, S) ⇒ (a, S) ⇒ (a, (L)) ⇒ (a, (L, S))
  ⇒ (a, (S, S)) ⇒ (a, (a, S)) ⇒ (a, (a, a))
iii) (a, ((a, a), (a, a)))
S ⇒ (L) ⇒ (L, S) ⇒ (S, S) ⇒ (a, S) ⇒ (a, (L)) ⇒ (a, (L, S))
  ⇒ (a, (S, S)) ⇒ (a, ((L), S)) ⇒ (a, ((L, S), S)) ⇒ (a, ((S, S), S))
  ⇒ (a, ((a, S), S)) ⇒ (a, ((a, a), S)) ⇒ (a, ((a, a), (L)))
  ⇒ (a, ((a, a), (L, S))) ⇒ (a, ((a, a), (S, S)))
  ⇒ (a, ((a, a), (a, S))) ⇒ (a, ((a, a), (a, a)))
d) Rightmost derivations
i) (a, a)
S ⇒ (L) ⇒ (L, S) ⇒ (L, a) ⇒ (S, a) ⇒ (a, a)
ii) (a, (a, a))
S ⇒ (L) ⇒ (L, S) ⇒ (L, (L)) ⇒ (L, (L, S)) ⇒ (L, (L, a))
  ⇒ (L, (S, a)) ⇒ (L, (a, a)) ⇒ (S, (a, a)) ⇒ (a, (a, a))
iii) (a, ((a, a), (a, a)))
S ⇒ (L) ⇒ (L, S) ⇒ (L, (L)) ⇒ (L, (L, S)) ⇒ (L, (L, (L)))
  ⇒ (L, (L, (L, S))) ⇒ (L, (L, (L, a))) ⇒ (L, (L, (S, a)))
  ⇒ (L, (L, (a, a))) ⇒ (L, (S, (a, a))) ⇒ (L, ((L), (a, a)))
  ⇒ (L, ((L, S), (a, a))) ⇒ (L, ((L, a), (a, a)))
  ⇒ (L, ((S, a), (a, a))) ⇒ (L, ((a, a), (a, a)))
  ⇒ (S, ((a, a), (a, a))) ⇒ (a, ((a, a), (a, a)))
e) This grammar generates nested lists with well formed parentheses : a string is
either a single a, or a parenthesized, comma-separated list of one or more elements,
each of which is again such a string.
Example : Consider the following grammar, with the productions used in the solution
below,
S → 0A | 1B
A → 0S | 1B | 1
B → 0A | 1S
Obtain the leftmost derivations for the strings i) 0101 ii) 1100101
Solution : i) Leftmost derivation :
S
0A
01B
010A
0101
ii) Leftmost derivation :
S
1B
11S
110A
1100S
11001B
110010A
1100101
3.3.2 Ambiguous Grammar
A grammar G is said to be ambiguous if it generates more than one parse tree for
some sentence of language L(G).
For example
E → E+E | E*E | (E) | id
Then for id + id * id there are two parse trees : one groups the input as
(id + id) * id and the other as id + (id * id).
[Figure omitted: (a) Parse tree 1 (b) Parse tree 2]
Fig. 3.3.2 Ambiguous grammar
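Ambiguity can be checked on a small input by counting parse trees. The sketch below does this for the alternatives E → E+E | E*E | id only (the parenthesized alternative is left out to keep it short, which is a simplification of the grammar above). The function name `count_parse_trees` and the token spelling are assumptions of this example.

```python
from functools import lru_cache

def count_parse_trees(tokens):
    """Count parse trees of the token list under E -> E+E | E*E | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of ways E can derive tokens[i:j].
        if j - i == 1:
            return 1 if tokens[i] == "id" else 0
        total = 0
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "*"):
                # tokens[k] is the operator at the root: E -> E op E
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))

trees = count_parse_trees(["id", "+", "id", "*", "id"])
```

The count for id + id * id is 2, one tree with + at the root and one with * at the root, so the grammar is ambiguous. A count of 1 for one particular string does not prove that a grammar is unambiguous.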
Example : Show that following grammar is ambiguous.
S → aSbS
S → bSaS
S → ε
Solution : Consider a string 'abab'. We can construct two parse trees for deriving
'abab'.
[Figure omitted: Parse tree 1 and Parse tree 2]
The rightmost derivation for abab is -
S
aSbS
aSbaSbS
aSbaSb
aSbab
abab
This is a language containing all the strings with equal number of a's and b's.
Example : Prove that the following grammar is ambiguous.
S → AB
B → ab
A → aa
A → a
B → b
Solution : An ambiguous grammar derives two different parse trees for the same input.
Consider the input aab. It can be derived with A → aa and B → b, or with A → a and
B → ab.
[Figure omitted: Parse tree 1 and Parse tree 2]
As there are two different parse trees for input aab, it is an ambiguous grammar.
3.4 Classification of Parsing Techniques
As we know, there are two parsing techniques, and these parsing techniques work on
the following principle.
1. The parser scans the input string from left to right and identifies whether the
derivation is leftmost or rightmost.
2. The parser makes use of production rules for choosing the appropriate derivation.
The different parsing techniques use different approaches in selecting the
appropriate rules for derivation. And finally a parse tree is constructed.
When the parse tree can be constructed from root and expanded to leaves then such
type of parser is called Top-down parser. The name itself tells us that the parse tree can
be built from top to bottom.
When the parse tree can be constructed from leaves to root, then such type of parser
is called as bottom-up parser. Thus the parse tree is built in bottom up manner.
[Figure omitted: classification tree of parsing techniques; the legible leaves are the SLR, LALR and LR parsers]
Fig. 3.4.1 Parsing techniques
Let us discuss the parsing techniques in detail.
3.5 Top Down Parsing
Consider a grammar.
S → xPz
P → yw | y
Consider the input string xyz as shown below.
[Figure omitted: input buffer holding x y z]
Now we will construct the parse tree for above grammar deriving the given input
string. And for this derivation we will make use of top down approach.
Step 1 :
The first leftmost leaf of the parse tree matches with the first input symbol. Hence
we will advance the input pointer. The next leaf node is P. We have to expand the
node P. After expansion we get the node y which matches with the input symbol y.
Fig. 3.5.1 (a)
Step 2 :
Now the next node is w which is not matching with the input symbol z. Hence we go
back to see whether there is another alternative of P. The other alternative for P is y
which matches with current input symbol. And thus we could produce a successful
parse tree for given input string.
Fig. 3.5.1 (b)
Step 3 :
We halt and declare that the parsing is completed successfully.
Fig. 3.5.1 (c)
In top-down parsing selection of proper rule is very important task. And this
selection is based on trial and error technique. That means we have to select a
particular rule and if it is not producing the correct input string then we need to
backtrack and then we have to try another production. This process has to be repeated
until we get the correct input string. After trying all the productions if we found every
production unsuitable for the string match then in that case the parse tree cannot
be built.
3.5.1 Problems with Top-down Parsing
There are certain problems in top-down parsing. In order to implement the parsing
we need to eliminate these problems. Let us discuss these problems and how to remove
them.
1. Backtracking
Backtracking is a technique in which for expansion of non-terminal symbol we
choose one alternative and if some mismatch occurs then we try another alternative if
any.
For example :
S → xPz
P → yw | y
[Fig. 3.5.2 omitted: the figure and the text accompanying it are not legible in the scan]
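The trial and error behaviour described above can be sketched as a small backtracking recognizer for the grammar S → xPz, P → yw | y. The names `parse` and `accepts` are illustrative. The generator yields every input position at which a list of symbols can finish matching, so a failed alternative simply yields nothing and the next alternative is tried.

```python
GRAMMAR = {
    "S": [["x", "P", "z"]],
    "P": [["y", "w"], ["y"]],       # alternatives are tried in this order
}

def parse(symbols, text, pos=0):
    """Yield each position where `symbols` can finish matching text[pos:]."""
    if not symbols:
        yield pos
        return
    head, rest = symbols[0], symbols[1:]
    if head in GRAMMAR:
        for alternative in GRAMMAR[head]:
            # Expand the non-terminal; if this alternative fails, the loop backtracks.
            yield from parse(alternative + rest, text, pos)
    elif pos < len(text) and text[pos] == head:
        yield from parse(rest, text, pos + 1)    # terminal matched, advance input

def accepts(text):
    return any(end == len(text) for end in parse(["S"], text))
```

For the input xyz the alternative P → yw is tried first and fails on w, after which P → y succeeds. A left recursive grammar would make this sketch recurse without end, which is the problem discussed next.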
2. Left recursion
The left recursive grammar is a grammar which is as given below.
A ⇒+ Aα
Here ⇒+ means deriving the input in one or more steps. The A can be non-terminal
and α denotes some input string. If left recursion is present in the grammar then it
creates serious problems. Because of left recursion the top-down parser can enter in
infinite loop. This is as shown in the Fig. 3.5.3.
[Figure omitted: A repeatedly expanded to Aα]
Fig. 3.5.3 Left recursion
Thus expansion of A causes further expansion of A only and due to generation of A,
Aα, Aαα, Aααα, ..., the input pointer will not be advanced. This causes major problem
in top-down parsing and therefore elimination of left recursion is a must.
To eliminate left recursion we need to modify the grammar. Let G be a context free
grammar having a production rule with left recursion.
A → Aα
A → β          ... (1)
Then we eliminate left recursion by re-writing the production rule as :
A → βA'
A' → αA'       ... (2)
A' → ε
Thus a new symbol A' is introduced. We can also verify whether the modified
grammar is equivalent to original or not.
Fig. 3.5.4
For example : Consider the grammar
E → E+T | T
We can map this grammar with the rule A → Aα | β. Then using equation (2) we
can say,
A = E
α = +T
β = T
Then the rule becomes,
E → TE'
E' → +TE' | ε
Similarly for the rule
T → T*F | F
we can eliminate left recursion as
T → FT'
T' → *FT' | ε
The grammar for arithmetic expression can be equivalently written as -
E → E+T | T     ⇒    E → TE'
                     E' → +TE' | ε
T → T*F | F     ⇒    T → FT'
                     T' → *FT' | ε
F → (E) | id    ⇒    F → (E) | id
Example : Consider the following grammar
A → ABd | Aa | a
B → Be | b
Remove left recursion.
Solution : Consider the rule,
A → ABd | Aa | a
We map this grammar with the rule A → Aα | β.
For A → ABd | a we have α = Bd and β = a.
This can be eliminated by re-writing the production rule as :
A → βA'     ⇒    A → aA'
A' → αA'    ⇒    A' → BdA'
A' → ε      ⇒    A' → ε
For A → Aa | a we have α = a and β = a.
This left recursion can be eliminated as -
A → βA'     ⇒    A → aA'
A' → αA'    ⇒    A' → aA'
A' → ε      ⇒    A' → ε
To summarize
A → ABd | Aa | a     ⇒    A → aA'
                          A' → BdA' | aA' | ε
Now consider B → Be | b.
We map this grammar with the rule A → Aα | β, with α = e and β = b. (Here e is a
terminal of the grammar, not the empty string ε.)
We use the rule A → βA', A' → αA' | ε
B → bB'
B' → eB'
B' → ε
To summarize, the grammar without left recursion will be
A → aA'
A' → BdA' | aA' | ε
B → bB'
B' → eB' | ε
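The rewriting rule A → Aα | β to A → βA', A' → αA' | ε is mechanical, so it can be sketched as a short function. The name `eliminate_left_recursion` and the list-of-symbols representation are choices made for this sketch. It handles only immediate left recursion and assumes that none of the β alternatives is ε.

```python
def eliminate_left_recursion(nonterminal, alternatives):
    """Rewrite A -> A alpha | beta as A -> beta A', A' -> alpha A' | ε.

    `alternatives` is a list of right-hand sides, each a list of symbols.
    """
    new = nonterminal + "'"
    alphas = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    betas = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not alphas:
        return {nonterminal: alternatives}        # nothing to do
    return {
        nonterminal: [beta + [new] for beta in betas],
        new: [alpha + [new] for alpha in alphas] + [["ε"]],
    }

e_rules = eliminate_left_recursion("E", [["E", "+", "T"], ["T"]])
a_rules = eliminate_left_recursion("A", [["A", "B", "d"], ["A", "a"], ["a"]])
```

`e_rules` holds E → TE' and E' → +TE' | ε, and `a_rules` holds A → aA' and A' → BdA' | aA' | ε, matching the results worked out by hand above.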
3. Left factoring
If the grammar is left factored then it becomes suitable for use. Basically left
factoring is used when it is not clear which of the two alternatives is to be used to
expand the non-terminal. By left factoring we may be able to re-write the production in
which the decision can be deferred until enough of the input is seen to make the right
choice.
In general if
A → αβ1 | αβ2
is a production then it is not possible for us to take a decision whether to choose the
first rule or the second. In such a situation the above grammar can be left factored as
A → αA'
A' → β1 | β2
For example : Consider the following grammar.
S → iEtS | iEtSeS | a
E → b
The left factored grammar becomes,
S → iEtSS' | a
S' → eS | ε
E → b
Example : Do left factoring in the following grammar -
A → aAB | aA | a
B → bB | b
Solution : If the rule A → αβ1 | αβ2 | ... is a production then the grammar needs to be
left factored. Consider
A → aAB | aA | a
Here α = a, β1 = AB, β2 = A and β3 = ε.
We have to convert it to
A → αA'               ⇒    A → aA'
A' → β1 | β2 | β3     ⇒    A' → AB | A | ε
Similarly
B → bB | b
Here α = b, β1 = B and β2 = ε.
B → αB'               ⇒    B → bB'
B' → β1 | β2          ⇒    B' → B | ε
To summarize, the grammar with left factor operation will be -
A → aA'
A' → AB | A | ε
B → bB'
B' → B | ε
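One round of left factoring can be sketched as follows: alternatives that start with the same symbol are grouped, their longest common prefix α is pulled out, and the remainders become the alternatives of a new non-terminal. The names `common_prefix` and `left_factor` are made up for this sketch, and it performs a single round only, as the worked examples do.

```python
def common_prefix(sequences):
    """Longest list of symbols that every sequence starts with."""
    prefix = []
    for symbols in zip(*sequences):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return prefix

def left_factor(nonterminal, alternatives):
    """One round of left factoring; each alternative is a non-empty list of symbols."""
    groups = {}
    for alt in alternatives:
        groups.setdefault(alt[0], []).append(alt)    # group by first symbol
    result = {nonterminal: []}
    new = nonterminal + "'"
    for group in groups.values():
        if len(group) == 1:
            result[nonterminal].append(group[0])     # nothing to factor
            continue
        prefix = common_prefix(group)
        result[nonterminal].append(prefix + [new])   # A -> alpha A'
        result[new] = [alt[len(prefix):] or ["ε"] for alt in group]
        new += "'"                                   # fresh name for the next group
    return result

s_rules = left_factor("S", [["i", "E", "t", "S"], ["i", "E", "t", "S", "e", "S"], ["a"]])
a_rules = left_factor("A", [["a", "A", "B"], ["a", "A"], ["a"]])
```

`s_rules` gives S → iEtSS' | a with S' → ε | eS, and `a_rules` gives A → aA' with A' → AB | A | ε. In the second result the alternatives AB and A of A' still share a prefix, so a full left factoring would apply another round to A'.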
4. Ambiguity
The ambiguous grammar is not desirable in top-down parsing. Hence we need to
remove the ambiguity from the grammar if it is present.
For example :
E → E+E | E*E | id
is an ambiguous grammar. We will design the parse tree for id + id * id as follows.
[Figure omitted: (a) Parse tree 1 (b) Parse tree 2]
Fig. 3.5.5 Ambiguous grammar
For removing the ambiguity we will apply one rule : If the grammar has left
associative operator (such as +, -, *, /) then induce the left recursion and if the
grammar has right associative operator (exponential operator) then induce the right
recursion.
The unambiguous grammar is
E → E+T
E → T
T → T*F
T → F
F → id
Note one thing that the grammar is unambiguous but it is left recursive and
elimination of such left recursion is again a must.
[Figure omitted: top-down parsers divide into backtracking and predictive parsers; the figure shows recursive descent under predictive parsers]
Fig. 3.5.6 Types of top-down parsers
There are two types by which the top-down parsing can be performed.
1. Backtracking
2. Predictive parsing
A backtracking parser will try different production rules to find the match for the
input string by backtracking each time. Backtracking is more powerful than predictive
parsing. But this technique is slower and it requires exponential time in general. Hence
backtracking is not preferred for practical compilers.
As the name suggests the predictive parser tries to predict the next construction using
one or more lookahead symbols from input string. There are two types of predictive
parsers :
1. Recursive descent
2. LL(1) parser
Let us discuss these types along with some examples.
3.6 Recursive Descent Parser
A parser that uses collection of recursive procedures for parsing the given input
string is called Recursive Descent (RD) parser. In this type of parser the CFG is used to
build the recursive routines. The R.H.S. of the production rule is directly converted to a
program. For each non-terminal a separate procedure is written and body of the
procedure (code) is R.H.S. of the corresponding non-terminal.
Basic steps for construction of RD parser
The R.H.S. of the rule is directly converted into program code symbol by symbol.
1. If the input symbol is non-terminal then a call to the procedure corresponding to
the non-terminal is made.
2. If the input symbol is terminal then it is matched with the lookahead from input.
The lookahead pointer has to be advanced on matching of the input symbol.
3. If the production rule has many alternates then all these alternates have to be
combined into a single body of procedure.
4. The parser should be activated by a procedure corresponding to the start symbol.
Let us take one example to understand the construction of RD parser. Consider the
grammar having start symbol E.
E → num T
T → * num T | ε
procedure E
{
    if lookahead = num then
    {
        match(num);
        T;                       /* call to procedure T */
    }
    else
        error;
    if lookahead = $
    {
        declare success;         /* Return on success */
    }
    else
        error;
}                                /* end of procedure E */
procedure T
{
    if lookahead = '*'
    {
        match('*');
        if lookahead = num
        {
            match(num);
            T;
        }                        /* inner if closed */
        else
            error;
    }                            /* outer if closed */
    /* Here the other alternate, T -> ε, is combined into the same procedure :
       nothing is matched */
}                                /* end of procedure T */
procedure match(token t)
{
    if lookahead = t
        lookahead = next_token;  /* lookahead pointer is advanced */
    else
        error;
}                                /* end of procedure match */
procedure error
{
    print("Error!!!");
}                                /* end of procedure error */
Fig. 3.6.1 Pseudo code for recursive descent parser
The parser constructed above is recursive descent parser for the given grammar. We
can see from above code that the procedures for the non-terminals are written and the
procedure body is simply mapping of R.H.S. of the corresponding rule.
Consider that the input string is 3*4. We will parse this string using above given
procedures.
[Figure omitted: trace of the parse, applying E → num T, then T → * num T, then T → ε, then "Declare success !"]
The parser will be activated by calling procedure E. Since the first input character 3
is matching with num the procedure match(num) will be invoked and then the
lookahead will point to next token. And a call to procedure T is given.
A match with * is found hence lookahead = next_token.
Now '4' is matching with num hence again the procedure for match(num) is
fulfilled. Then procedure for T is invoked. And T is replaced by ε.
As lookahead pointer points to $ we quit by reporting success. Thus the input string
can be parsed using recursive descent parser.
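The same grammar, E → num T and T → * num T | ε, can be turned into a runnable recursive descent parser. The sketch below is one possible rendering in Python, not the book's code: the class name `Parser`, the exception `ParseError` and the helper `to_tokens` are assumptions, and `to_tokens` treats every digit as one num token, so it only handles single-digit numbers.

```python
class ParseError(Exception):
    pass

def to_tokens(text):
    """Tiny lexer: every digit becomes a 'num' token, other characters stand for themselves."""
    return ["num" if ch.isdigit() else ch for ch in text if not ch.isspace()]

class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]      # "$" marks the end of input
        self.pos = 0

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def match(self, expected):
        if self.lookahead != expected:
            raise ParseError(f"expected {expected!r}, found {self.lookahead!r}")
        self.pos += 1                           # advance the lookahead pointer

    def E(self):                                # E -> num T
        self.match("num")
        self.T()

    def T(self):                                # T -> * num T | ε
        if self.lookahead == "*":
            self.match("*")
            self.match("num")
            self.T()
        # otherwise T -> ε : nothing to match

    def parse(self):
        self.E()                                # start symbol activates the parser
        self.match("$")                         # all input must be consumed
        return True

ok = Parser(to_tokens("3*4")).parse()
```

Raising an exception instead of printing an error message is a design choice of this sketch; it lets the caller see exactly where the mismatch occurred.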
Construction of recursive descent parser is easy. However the programming language
that we choose for RD parser should support recursion. The internal details are not
accessible in this type of parser. For instance, in this parser we cannot access the current
leftmost sentential form. Secondly at any instant, we cannot access the stack containing
recursive calls.
Writing procedures for a left recursive grammar is critical, because a procedure for
a rule of the form A → Aα would call itself without consuming any input. To keep
such procedures simple we first have to left factor the grammar. We cannot write
recursive descent parsers for all types of context free grammars.
Example : Consider the following grammar
E → T + E | T
T → V * T | V
V → id
Write down procedures for non-terminals of the grammar to make a recursive descent
parser.
Solution :
Procedure E()
{
    T();
    if (lookahead = '+')
    {
        match('+');
        E();
    }
    /* otherwise the alternate E -> T applies and nothing more is matched */
    if (lookahead = '$') { declare SUCCESS; }
}
Procedure T()
{
    V();
    if (lookahead = '*')
    {
        match('*');
        T();
    }
    /* otherwise the alternate T -> V applies */
}
Procedure V()
{
    if (lookahead = 'id')
        match('id');
    else
        error();
}
Procedure match(token t)
{
    if (lookahead = t)
        lookahead = next_token;
    else
        error();
}
Procedure error()
{
    Print("Error !");
}
Advantages of recursive descent parser
1. Recursive descent parsers are simple to build.
2. Recursive descent parsers can be constructed with the help of parse tree.
Limitations of recursive descent parser
1. Recursive descent parsers are not very efficient as compared to other parsing
techniques.
2. There are chances that the program for RD Parser may enter in an infinite loop for
some input.
3. Recursive descent parser can not provide good error messaging.
4. It is difficult to parse the string if lookahead symbol is arbitrarily long.
3.7 LL(1) Grammars
This top-down parsing algorithm is of non-recursive type. In this type of parsing a
table is built. For LL(1) - The first L means the input is scanned from left to right. The
second L means it uses leftmost derivation for input string. And the number 1 means it
uses only one input symbol (lookahead) to predict the parsing process.
The simple block diagram for LL(1) parser is as given below.
[Figure omitted: the LL(1) parser reads the input buffer, uses a stack and the parsing table, and produces the output]
Fig. 3.7.1 Model for LL(1) parser
The data structures used by LL(1) are i) Input buffer ii) Stack iii) Parsing table.
The LL(1) parser uses input buffer to store the input tokens. The stack is used to hold
the left sentential form. The symbols in R.H.S. of rule are pushed into the stack in reverse
order, i.e. from right to left. Thus use of stack makes this algorithm non-recursive. The
table is basically a two dimensional array. The table has row for non-terminal and
column for terminals. The table can be represented as M[A,a] where A is a non-terminal
and a is current input symbol. The parser works as follows -
The parsing program reads top of the stack and a current input symbol. With the
help of these two symbols the parsing action is determined. The parsing actions can be -
Top    Input token    Parsing action
a      a              Pop a and advance lookahead to next token.
A      a              Refer table M[A,a]; if entry at M[A,a] is error, report Error.
A      a              Refer table M[A,a]; if entry at M[A,a] is A → PQR then pop A,
                      then push R, then push Q, then push P.
The parser consults the table M[A,a] each time while taking the parsing actions hence
this type of parsing method is called table driven parsing algorithm. The configuration
of LL(1) parser can be defined by top of the stack and a lookahead token. One by one
configuration is performed and the input is successfully parsed if the parser reaches the
halting configuration. When the stack is empty and next token is $ then it corresponds
to successful parse.
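3.8 Non Recursive Predictive Parsing
The construction of predictive LL(1) parser is based on two very important functions
and those are FIRST and FOLLOW.
For construction of Predictive LL(1) parser we have to follow the following steps -
1. Computation of FIRST and FOLLOW function.
2. Construct the Predictive Parsing Table using FIRST and FOLLOW functions.
3. Parse the input string with the help of Predictive Parsing table.
FIRST function
FIRST(α) is a set of terminal symbols that are first symbols appearing at R.H.S. in
derivation of α. If α ⇒* ε then ε is also in FIRST(α).
Following are the rules used to compute the FIRST functions.
1. If a is a terminal symbol then FIRST(a) = {a}.
2. If there is a rule X → ε then ε is in FIRST(X).
3. For the rule A → X1 X2 X3 ... Xk, FIRST(A) contains FIRST(X1) except ε, and it
also contains FIRST(Xj) except ε for every j such that X1 ... Xj-1 all derive ε.
If every one of X1 ... Xk derives ε then ε is in FIRST(A).
FOLLOW function
FOLLOW(A) is the set of terminal symbols that can appear immediately to the right
of A in some sentential form. The rules used to compute FOLLOW, as they are applied
in the computation below, are :
1. For the start symbol S, place $ in FOLLOW(S).
2. If there is a production A → αBβ then everything in FIRST(β) except ε is in
FOLLOW(B).
3. If there is a production A → αB, or A → αBβ where FIRST(β) contains ε, then
everything in FOLLOW(A) is in FOLLOW(B).
Consider the grammar
E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id
As E → TE', FIRST(E) = FIRST(T).
As T → FT', FIRST(T) = FIRST(F).
As F → (E) and F → id, FIRST(F) = {(, id}.
Hence FIRST(E) = FIRST(T) = FIRST(F) = {(, id}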
FIRST(E') = {+, ε}
As E' → +TE'
E' → ε, by referring computation rule 2, ε is in FIRST(E').
The first terminal symbol appearing at R.H.S. of production rule for E' is added in the
FIRST function.
FIRST(T') = {*, ε}
As T' → *FT'
T' → ε
The first terminal symbol appearing at R.H.S. of production rule for T' is added in
the FIRST function. Now we will compute FOLLOW function.
FOLLOW(E) -
i) As there is a rule F → (E) the symbol ')' appears immediately to the right of E.
Hence ')' will be in FOLLOW(E).
ii) The computation rule is A → αBβ; we can map this rule with F → (E), then A = F,
α = (, B = E, β = ).
FOLLOW(B) = FIRST(β) = FIRST( ) ) = { ) }
FOLLOW(E) = { ) }
Since E is a start symbol, add $ to follow of E.
Hence FOLLOW(E) = { ), $ }
FOLLOW(E') -
i) E → TE' : the computational rule is A → αBβ.
A = E, α = T, B = E', β = ε, then by computational rule 3 everything in
FOLLOW(A) is in FOLLOW(B), i.e. everything in FOLLOW(E) is in FOLLOW(E').
FOLLOW(E') = { ), $ }
ii) E' → +TE' : the computational rule is A → αBβ.
A = E', α = +T, B = E', β = ε, then by computational rule 3 everything in
FOLLOW(A) is in FOLLOW(B), i.e. everything in FOLLOW(E') is in FOLLOW(E'),
which adds nothing new.
FOLLOW(E') = { ), $ }
We can observe in the given grammar that ) is really following E.
FOLLOW(T) -
We have to observe two rules
E → TE'
E' → +TE'
i) Consider E → TE'; we will map it with A → αBβ.
A = E, α = ε, B = T, β = E'; by computational rule 2 FOLLOW(B) contains
FIRST(β) - {ε}.
That is FOLLOW(T) contains FIRST(E') - {ε}
= {+, ε} - {ε}
= {+}
ii) Consider E' → +TE'; we will map it with A → αBβ.
A = E', α = +, B = T, β = E'; since E' can derive ε, by computational rule 3
everything in FOLLOW(A) is in FOLLOW(B), i.e. everything in FOLLOW(E') is in
FOLLOW(T). This adds { ), $ }.
Finally FOLLOW(T) = {+} ∪ { ), $ }
= { +, ), $ }
We can observe in the given grammar that + and ) are really following T.
FOLLOW(T') -
T → FT'
We will map this rule with A → αBβ, then A = T, α = F, B = T', β = ε, then
FOLLOW(T') = FOLLOW(T) = { +, ), $ }
T' → *FT'
We will map this rule with A → αBβ, then A = T', α = *F, B = T', β = ε, then
everything in FOLLOW(T') is in FOLLOW(T'), which adds nothing new.
Hence FOLLOW(T') = { +, ), $ }
FOLLOW(F) -
Consider T → FT' or T' → *FT', then by computational rule 2,
T → FT'                              T' → *FT'
A → αBβ                              A → αBβ
A = T, α = ε, B = F, β = T'          A = T', α = *, B = F, β = T'
FOLLOW(B) ⊇ FIRST(β) - {ε}           FOLLOW(B) ⊇ FIRST(β) - {ε}
FOLLOW(F) ⊇ FIRST(T') - {ε}          FOLLOW(F) ⊇ FIRST(T') - {ε}
FOLLOW(F) ⊇ {*}                      FOLLOW(F) ⊇ {*}
Consider T → FT' again. As T' can derive ε, by computational rule 3 everything in
FOLLOW(A) is in FOLLOW(B), i.e. everything in FOLLOW(T) is in FOLLOW(F).
Hence FOLLOW(F) contains { +, ), $ }
Finally FOLLOW(F) = {*} ∪ { +, ), $ }
FOLLOW(F) = { +, *, ), $ }
To summarize above computation
Symbols    FIRST        FOLLOW
E          { (, id }    { ), $ }
E'         { +, ε }     { ), $ }
T          { (, id }    { +, ), $ }
T'         { *, ε }     { +, ), $ }
F          { (, id }    { +, *, ), $ }
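The hand computation above can be cross-checked with a short fixed-point iteration: the sets are grown by re-applying the FIRST and FOLLOW rules until nothing changes. This is a sketch, with the names `first_of` and `compute_first_follow` and the spelling of the grammar as a Python dictionary chosen for the illustration.

```python
EPS, END = "ε", "$"
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F":  [["(", "E", ")"], ["id"]],
}

def first_of(symbols, first):
    """FIRST of a string of grammar symbols, given the FIRST sets found so far."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in GRAMMAR else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)          # every symbol can vanish, or the string is empty
    return result

def compute_first_follow(start):
    first = {nt: set() for nt in GRAMMAR}
    follow = {nt: set() for nt in GRAMMAR}
    follow[start].add(END)                            # $ follows the start symbol
    changed = True
    while changed:                                    # repeat until no set grows
        changed = False
        for head, alternatives in GRAMMAR.items():
            for alt in alternatives:
                before = len(first[head])
                first[head] |= first_of(alt, first)
                changed |= len(first[head]) != before
                for i, sym in enumerate(alt):
                    if sym not in GRAMMAR:
                        continue
                    tail = first_of(alt[i + 1:], first)
                    before = len(follow[sym])
                    follow[sym] |= tail - {EPS}       # what can start the rest of the rule
                    if EPS in tail:                   # the rest can vanish
                        follow[sym] |= follow[head]
                    changed |= len(follow[sym]) != before
    return first, follow

FIRST, FOLLOW = compute_first_follow("E")
```

The resulting sets agree with the summary table, for example `FOLLOW["F"]` is the set containing +, *, ) and $.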
Algorithm for predictive parsing table -
The construction of predictive parsing table is an important activity in predictive
parsing method. This algorithm requires FIRST and FOLLOW functions.
Input : The Context Free Grammar G.
Output : Predictive Parsing table M.
Algorithm :
For the rule A → α of grammar G
1. For each a in FIRST(α) create entry M[A,a] = A → α where a is terminal symbol.
2. For ε in FIRST(α) create entry M[A,b] = A → α where b is the symbols from
FOLLOW(A).
3. If ε is in FIRST(α) and $ is in FOLLOW(A) then create entry in the table
M[A,$] = A → α.
4. All the remaining entries in the table M are marked as SYNTAX ERROR.
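The algorithm can be sketched directly in Python for the expression grammar E → TE', E' → +TE' | ε, T → FT', T' → *FT' | ε, F → (E) | id. To keep the example self-contained, the FIRST and FOLLOW sets of that grammar are written out as constants rather than computed. The names `first_of` and `build_table` and the dictionary layout of the table are assumptions of this sketch.

```python
EPS = "ε"
PRODUCTIONS = [
    ("E",  ["T", "E'"]),
    ("E'", ["+", "T", "E'"]),
    ("E'", [EPS]),
    ("T",  ["F", "T'"]),
    ("T'", ["*", "F", "T'"]),
    ("T'", [EPS]),
    ("F",  ["(", "E", ")"]),
    ("F",  ["id"]),
]
FIRST = {"E": {"(", "id"}, "E'": {"+", EPS}, "T": {"(", "id"},
         "T'": {"*", EPS}, "F": {"(", "id"}}
FOLLOW = {"E": {")", "$"}, "E'": {")", "$"}, "T": {"+", ")", "$"},
          "T'": {"+", ")", "$"}, "F": {"+", "*", ")", "$"}}

def first_of(symbols):
    """FIRST of a right-hand side; a terminal's FIRST set is the terminal itself."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result

def build_table():
    """Map (non-terminal, terminal) to the right-hand side to apply; missing keys are errors."""
    table = {}
    for head, body in PRODUCTIONS:
        lookaheads = first_of(body)                   # step 1
        if EPS in lookaheads:                         # steps 2 and 3
            lookaheads = (lookaheads - {EPS}) | FOLLOW[head]
        for terminal in lookaheads:
            if (head, terminal) in table:
                raise ValueError(f"conflict at M[{head}, {terminal}]: grammar is not LL(1)")
            table[head, terminal] = body
    return table

M = build_table()
```

A cell that would receive two productions raises an error here, because such a conflict means the grammar is not LL(1). Step 4 of the algorithm is represented by keys that are simply absent from `M`.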
Now we will make use of the above algorithm to create the parsing table for the
grammar -
E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id
First create the table with one row for each non-terminal (E, E', T, T', F) and one
column for each terminal (id, +, *, (, ), $).
Now we will fill up the entries in the table using the above given algorithm. For that
consider each rule one by one.
E → TE'
A → α
A = E, α = TE'
FIRST(TE') = FIRST(T) = {(, id}, because T does not derive ε
M[E,(] = E → TE'
M[E,id] = E → TE'
E' → +TE'
A → α
A = E', α = +TE'
FIRST(+TE') = {+}
Hence M[E',+] = E' → +TE'
E' → ε
A → α
A = E', α = ε then
FOLLOW(E') = { ), $ }
M[E',)] = E' → ε
M[E',$] = E' → ε
T → FT'
A → α
A = T, α = FT'
FIRST(FT') = FIRST(F) = {(, id}
Hence M[T,(] = T → FT'
And M[T,id] = T → FT'
T' → *FT'
A → α
A = T', α = *FT'
FIRST(*FT') = {*}
Hence M[T',*] = T' → *FT'
T' → ε
A → α
A = T', α = ε
FOLLOW(T') = { +, ), $ }
Hence M[T',+] = T' → ε
M[T',)] = T' → ε
M[T',$] = T' → ε
F → (E)
A → α
A = F, α = (E)
FIRST((E)) = { ( }
Hence M[F,(] = F → (E)
F → id
A → α
A = F, α = id
FIRST(id) = { id }
Hence M[F,id] = F → id
The complete table can be as shown below. Blank entries denote SYNTAX ERROR.
       id          +            *            (           )          $
E      E → TE'                               E → TE'
E'                 E' → +TE'                             E' → ε     E' → ε
T      T → FT'                               T → FT'
T'                 T' → ε       T' → *FT'                T' → ε     T' → ε
F      F → id                                F → (E)
Now the input string id + id * id $ can be parsed using the above table. At the initial configuration the stack will contain start symbol E, and the input string is placed in the input buffer.

Stack       Input               Action
$E          id + id * id $

Now symbol E is at top of the stack and input pointer is at first id, hence M[E, id] is referred. This entry tells us E → TE', so we will push E' first then T.

Stack       Input               Action
$E'T        id + id * id $      E → TE'
$E'T'F      id + id * id $      T → FT'
$E'T'id     id + id * id $      F → id
$E'T'       + id * id $
Thus it is observed that the input is scanned from left to right and we always follow the leftmost derivation while parsing the input string. Also, at a time only one input symbol is referred to for taking the parsing action. Hence the name of this parser is LL(1). The LL(1) parser is a table driven predictive parser. Left recursive and ambiguous grammars are not allowed for an LL(1) parser.
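The table driven parsing loop described above can be sketched as follows. This is a minimal sketch under stated assumptions: the table is written out by hand for the expression grammar of this section, an empty list stands for an ε body, and the names `TABLE`, `NONTERMINALS` and `ll1_parse` are illustrative only.

```python
NONTERMINALS = {"E", "E'", "T", "T'", "F"}

# Predictive parsing table M for the expression grammar; [] stands for ε.
TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "+"): [], ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [], ("T'", "$"): [],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}

def ll1_parse(tokens, table=TABLE, start="E"):
    """Parse a token list and return the productions used (a leftmost derivation)."""
    stack = ["$", start]             # start symbol sits above the end marker
    tokens = list(tokens) + ["$"]
    pos = 0
    applied = []
    while stack:
        top = stack.pop()
        look = tokens[pos]           # the single lookahead symbol
        if top in NONTERMINALS:
            body = table.get((top, look))
            if body is None:         # blank entry means syntax error
                raise SyntaxError(f"no entry M[{top}, {look}]")
            applied.append((top, body))
            stack.extend(reversed(body))   # push the body so its first symbol is on top
        elif top == look:
            pos += 1                 # terminal (or $) matched, advance the input
        else:
            raise SyntaxError(f"expected {top}, found {look}")
    return applied

steps = ll1_parse(["id", "+", "id", "*", "id"])
```

Pushing the rule body in reverse order is what makes the parser expand the leftmost nonterminal first.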
Example : Show that the following grammar :
S → AaAb | BbBa
A → ε
B → ε
is LL(1).
Solution : Consider the grammar :
S → AaAb
S → BbBa
A → ε
B → ε
Now we will compute FIRST and FOLLOW functions.
FIRST(S) = {a, b} if we put
S → AaAb
S → aAb      When A → ε
Also S → BbBa
S → bBa      When B → ε
FIRST(A) = FIRST(B) = {ε}
FOLLOW(S) = {$}
FOLLOW(A) = FOLLOW(B) = {a, b}
The LL(1) parsing table is

        a            b            $
S       S → AaAb     S → BbBa
A       A → ε        A → ε
B       B → ε        B → ε

Now consider the string "ba". For parsing -

Stack       Input     Action
$S          ba$       S → BbBa
$aBbB       ba$       B → ε
$aBb        ba$       match b
$aB         a$        B → ε
$a          a$        match a
$           $         Accept

This shows that the given grammar is LL(1).
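The FIRST and FOLLOW sets used in this solution can also be computed mechanically by iterating until nothing changes. The following is a hedged sketch of that standard fixed-point computation; the function names and the grammar encoding (an empty list for an ε body) are assumptions made for illustration.

```python
EPS = "ε"  # empty string marker (assumed notation)

def first_of_sequence(symbols, first):
    """FIRST of a symbol sequence using the current FIRST sets."""
    out = set()
    for sym in symbols:
        f = first[sym] if sym in first else {sym}  # terminal: FIRST is itself
        out |= f - {EPS}
        if EPS not in f:
            return out
    out.add(EPS)
    return out

def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:                      # repeat until no set grows
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                before = len(first[head])
                first[head] |= first_of_sequence(body, first)
                changed |= len(first[head]) != before
    return first

def compute_follow(grammar, first, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")              # $ follows the start symbol
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                for i, sym in enumerate(body):
                    if sym not in grammar:
                        continue        # only nonterminals have FOLLOW sets
                    rest = first_of_sequence(body[i + 1:], first)
                    new = rest - {EPS}
                    if EPS in rest:     # A -> αB or A -> αBβ with β deriving ε
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

grammar = {
    "S": [["A", "a", "A", "b"], ["B", "b", "B", "a"]],
    "A": [[]],
    "B": [[]],
}
first = compute_first(grammar)
follow = compute_follow(grammar, first, "S")
```

For this grammar the two S rules have disjoint FIRST sets ({a} and {b}), which is why each table cell receives at most one entry.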
Example : For the following grammar find FIRST and FOLLOW sets for each of the non terminals.
S → aAB | bA | ε
A → aAb | ε
B → bB | ε
Solution : FIRST(S) = The first terminal symbol appearing on R.H.S. of production rule for S.
FIRST(S) = {a, b, ε}
FIRST(A) = The first terminal symbol appearing on R.H.S. of production rule for A.
FIRST(A) = {a, ε}
FIRST(B) = The first terminal symbol appearing on R.H.S. of production rule for B.
FIRST(B) = {b, ε}
Now, we will compute FOLLOW function as follows :
FOLLOW(S) = {$} because S is the start symbol.
FOLLOW(A) = {b} because
Consider the rule,
A → αBβ which can be mapped with
S → aAB. Then FIRST(β) = FIRST(B) = {b, ε}.
Then without ε remains b. Hence b ∈ FOLLOW(A). Now consider the rule A → αB, then everything in FOLLOW(A) is in FOLLOW(B). This rule can be mapped with
S → bA
∴ everything in FOLLOW(S) is in FOLLOW(A), which adds $ to FOLLOW(A).
To summarize, FOLLOW(A) = {b, $}
Now consider the rule,
S → aAB
If we map it with A → αB then
everything in FOLLOW(S) is in FOLLOW(B).
FOLLOW(B) = {$}.
Then according to this rule,

FIRST(S) = {a, b, ε}     FOLLOW(S) = {$}
FIRST(A) = {a, ε}        FOLLOW(A) = {b, $}
FIRST(B) = {b, ε}        FOLLOW(B) = {$}
Solution : Let the given grammar be,
S → iCtSA | a
A → eS | ε
C → b
where (i, t, e, a, b) are terminals.
Now we will compute FIRST and FOLLOW for given nonterminals.
FIRST(S) = {i, a}      FOLLOW(S) = {e, $}
FIRST(A) = {e, ε}      FOLLOW(A) = {e, $}
FIRST(C) = {b}         FOLLOW(C) = {t}
The predictive parsing table will be

        a          b          e          t        i              $
S       S → a                                     S → iCtSA
A                             A → eS                             A → ε
                              A → ε
C                  C → b

As we have got multiple entries in M[A, e] given grammar is not LL(1) grammar.
The (i, t, e, a, b) are the terminal symbols because they do not derive any production rule.
Example : Consider the grammar
textp → atom | list
atom → number | identifier
list → (textp-seg)
textp-seg → textp, textp-seg | textp
i) Left factor this grammar.
ii) Construct FIRST and FOLLOW sets for the nonterminals.
iii) Show that resulting grammar is LL(1).
iv) Construct LL(1) parsing table for the resulting grammar.
Solution : For the given grammar, first list out the set of terminals and nonterminals.
Set of terminals is = {number, identifier, (, ), ','}
Set of nonterminals is = {textp, atom, list, textp-seg}
i) Consider the rule
textp-seg → textp, textp-seg | textp
and map it with A → αβ1 | αβ2, where α = textp, β1 = , textp-seg and β2 = ε.
The grammar after left factorization will be -
A → αA'
A' → β1 | β2
i.e. textp-seg → textp textp-seg'
textp-seg' → , textp-seg | ε
ii) After left factoring, all the production rules can be enlisted as below -
textp → atom | list
atom → number | identifier
list → (textp-seg)
textp-seg → textp textp-seg'
textp-seg' → , textp-seg | ε
Now for each non-terminal symbol the FIRST and FOLLOW could be computed.
FIRST(textp) = First terminal symbol appearing on R.H.S. for the rule
textp → atom → number | identifier
∴ number and identifier are added in FIRST(textp).
textp → list → (textp-seg)
∴ ( will be in FIRST(textp).
∴ FIRST(textp) = {(, number, identifier}
FIRST(atom) = First terminal symbol appearing at R.H.S. of rule for atom.
atom → number | identifier
FIRST(atom) = {number, identifier}
FIRST(list) = {(}
FIRST(textp-seg) = The first symbol appearing at R.H.S. of rule for textp-seg.
textp-seg → textp textp-seg', where textp → atom | list, atom → number | identifier and list → (textp-seg).
FIRST(textp-seg) = {number, identifier, (}
FIRST(textp-seg') = {',', ε}
Now we will compute FOLLOW.
FOLLOW(textp) = i) As textp is a start symbol add $ to it.
ii) textp-seg → textp textp-seg' and textp-seg' → , textp-seg, so ',' can follow textp.
∴ ',' is added in FOLLOW.
iii) As textp-seg' can also derive ε, everything in FOLLOW(textp-seg) is in FOLLOW(textp), so ) is added as well.
FOLLOW(textp) = {$, ',', )}
FOLLOW(atom) = FOLLOW(textp) because textp → atom.
FOLLOW(atom) = {$, ',', )}
FOLLOW(list) = FOLLOW(textp) because textp → list.
FOLLOW(list) = {$, ',', )}
FOLLOW(textp-seg) = i) list → (textp-seg)
i.e. ) follows textp-seg.
∴ ) is added in FOLLOW.
∴ FOLLOW(textp-seg) = {)}
FOLLOW(textp-seg') = i) As there exists a rule A → αB then everything in FOLLOW(A) is in FOLLOW(B).
textp-seg → textp textp-seg' maps with A = textp-seg, α = textp, B = textp-seg'.
∴ FOLLOW(textp-seg') = FOLLOW(textp-seg) = {)}
To summarize,
FIRST(textp) = {(, number, identifier}
FIRST(atom) = {number, identifier}
FIRST(list) = {(}
FIRST(textp-seg) = {(, number, identifier}
FIRST(textp-seg') = {',', ε}
FOLLOW(textp-seg) = {)}
FOLLOW(textp-seg') = {)}
iv) The predictive parsing table can be constructed as follows -

Row textp :       M[textp, number] = textp → atom
                  M[textp, identifier] = textp → atom
                  M[textp, (] = textp → list
Row atom :        M[atom, number] = atom → number
                  M[atom, identifier] = atom → identifier
Row list :        M[list, (] = list → (textp-seg)
Row textp-seg :   M[textp-seg, number] = textp-seg → textp textp-seg'
                  M[textp-seg, identifier] = textp-seg → textp textp-seg'
                  M[textp-seg, (] = textp-seg → textp textp-seg'
Row textp-seg' :  M[textp-seg', ','] = textp-seg' → , textp-seg
                  M[textp-seg', )] = textp-seg' → ε

All the remaining entries are blank. As each cell in the above table contains a unique entry, the given grammar is LL(1).
Example : Construct the predictive parser for the following grammar.
S → (L) | a
L → L, S | S
Solution : The given grammar is left recursive because of
L → L, S | S
We will first eliminate the left recursion. As A → Aα | β can be converted as
A → βA'
A' → αA' | ε
we can write L → L, S | S as
L → SL'
L' → , SL' | ε
Now the grammar taken for predictive parsing is -
S → (L) | a
L → SL'
L' → , SL' | ε
Now we will compute FIRST and FOLLOW of non-terminals.
FIRST(S) = {(, a}
FIRST(L) = {(, a}
FIRST(L') = {',', ε}
FOLLOW(S) = {',', ), $}
FOLLOW(L) = FOLLOW(L') = {)}
The predictive parsing table can be constructed as

        a           (           )          ,               $
S       S → a       S → (L)
L       L → SL'     L → SL'
L'                              L' → ε     L' → , SL'
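Example : Show the behaviour of the parser on the sentence (a, a) using the grammar specified above.
Solution : As we have constructed a predictive parsing table in Example 3.8.6 we will parse the string (a, a) using that table as shown below.

Stack       Input       Action
$S          (a, a)$     S → (L)
$)L(        (a, a)$     match (
$)L         a, a)$      L → SL'
$)L'S       a, a)$      S → a
$)L'a       a, a)$      match a
$)L'        , a)$       L' → , SL'
$)L'S,      , a)$       match ,
$)L'S       a)$         S → a
$)L'a       a)$         match a
$)L'        )$          L' → ε
$)          )$          match )
$           $           Accept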
Example : Construct predictive parsing table for the grammar
E → E+T | T, T → TF | F, F → F* | a | b.
Solution : Let
E → E+T | T
T → TF | F
F → F* | a | b
be the given grammar.
Step 1 : Eliminate left recursion - The formula to eliminate left recursion is
A → Aα | β  ⇒  A → βA'
               A' → αA' | ε
1) E → E+T | T  ⇒  E → TE'
                   E' → +TE' | ε
2) T → TF | F   ⇒  T → FT'
                   T' → FT' | ε
3) F → F* | a | b  ⇒  F → aF' | bF'
                      F' → *F' | ε
To summarize,
E → TE'
E' → +TE' | ε
T → FT'
T' → FT' | ε
F → aF' | bF'
F' → *F' | ε
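The formula used in Step 1 can be applied mechanically. The sketch below removes immediate left recursion from the rules of one nonterminal; the function name, the list-of-lists encoding of rule bodies and the use of an empty list for ε are assumptions made for illustration, and indirect left recursion is not handled.

```python
def eliminate_left_recursion(head, bodies):
    """Rewrite A -> Aα | β as A -> βA', A' -> αA' | ε for a single nonterminal."""
    alphas = [b[1:] for b in bodies if b and b[0] == head]   # the α parts
    betas = [b for b in bodies if not b or b[0] != head]     # the β parts
    if not alphas:
        return {head: bodies}          # no immediate left recursion, keep as is
    new_head = head + "'"
    return {
        head: [beta + [new_head] for beta in betas],
        new_head: [alpha + [new_head] for alpha in alphas] + [[]],  # [] is ε
    }

rules_E = eliminate_left_recursion("E", [["E", "+", "T"], ["T"]])
rules_T = eliminate_left_recursion("T", [["T", "F"], ["F"]])
rules_F = eliminate_left_recursion("F", [["F", "*"], ["a"], ["b"]])
```

The three results correspond to the rewritten rules for E, T and F listed in the summary above.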
Now we will compute FIRST and FOLLOW.
As F → aF' | bF'
the first symbols on R.H.S. are a and b, and
FIRST(E) = FIRST(T) = FIRST(F)
FIRST(E) = {a, b}
FIRST(T) = {a, b}
FIRST(F) = {a, b}
As E' → +TE' | ε
the first symbol on R.H.S. gives
FIRST(E') = {+, ε}
Then T' → FT' | ε becomes T' → aF'T' | bF'T' | ε if we replace F by aF' and bF'. Taking the first symbol on R.H.S.,
FIRST(T') = {a, b, ε}
FIRST(F') = {*, ε}
Now we will compute FOLLOW.
FOLLOW(E)
FOLLOW(E) = {$} as E is the start symbol.
FOLLOW(E')
E → TE' is a production which can be mapped against A → αB. According to this rule everything in FOLLOW(A) is in FOLLOW(B). Hence FOLLOW(E') = {$}
FOLLOW(T)
In E' → +TE' the symbol T is followed by E' and FIRST(E') = {+, ε}, i.e. + follows T.
As E → TE' and E' can derive ε, everything in FOLLOW(E) is in FOLLOW(T).
∴ $ is added in FOLLOW(T).
FOLLOW(T) = {+, $}
FOLLOW(T')
The rule T → FT' resembles A → αB. Then
FOLLOW(B) = FOLLOW(A)
∴ FOLLOW(T') = FOLLOW(T) = {+, $}
FOLLOW(F)
In T → FT' and T' → FT' the symbol F is followed by T', and FIRST(T') = {a, b, ε}, so a and b come after F.
These rules can also be mapped with A → αBβ where β derives ε. According to this rule everything in FOLLOW(T') is in FOLLOW(F).
Hence
FOLLOW(F) = {a, b, +, $}
FOLLOW(F')
F → aF' is one rule which can be mapped with A → αB. According to this rule we get
FOLLOW(F') = FOLLOW(F)
∴ FOLLOW(F') = {a, b, +, $}
The predictive parsing table can be constructed as follows -
FIRST(E) = {a, b}
Hence we will add the rule E → TE' in M[E, a] and M[E, b].
FIRST(T) = {a, b}
M[T, a] = T → FT' and M[T, b] = T → FT'
M[F, a] = F → aF' and M[F, b] = F → bF'
FIRST(E') = {+, ε}
M[E', +] = E' → +TE'
ε is in FIRST(E') and FOLLOW(E') = {$}, then
M[E', $] = E' → ε
FIRST(T') = {a, b, ε}
M[T', a] = M[T', b] = T' → FT'
ε is in FIRST(T') and FOLLOW(T') = {+, $}, then M[T', +] = M[T', $] = T' → ε
FIRST(F') = {*, ε}
M[F', *] = F' → *F'
ε is in FIRST(F') and FOLLOW(F') = {a, b, +, $},
then M[F', a] = M[F', b] = M[F', +] = M[F', $] = F' → ε
Hence predictive parsing table will be

        a            b            +             *             $
E       E → TE'      E → TE'
E'                                E' → +TE'                   E' → ε
T       T → FT'      T → FT'
T'      T' → FT'     T' → FT'     T' → ε                      T' → ε
F       F → aF'      F → bF'
F'      F' → ε       F' → ε       F' → ε        F' → *F'      F' → ε
Example : Compute FIRST and FOLLOW sets for all nonterminals in the following grammar
S → Aa | bAc | Bc | bBa
A → d
B → d
Solution : FIRST(S) = First terminal symbol appearing on R.H.S.
= {b, d}
FIRST(A) = First terminal symbol appearing on R.H.S.
= {d}
FIRST(B) = First terminal symbol appearing on R.H.S.
= {d}
FOLLOW(S) = {$} as S is the start symbol.
Consider the rule
S → Aa which can be mapped with A → αBβ, with α = ε, B = A and β = a.
Hence FOLLOW(A) = FIRST(β) = FIRST(a) = {a}
Similarly S → bAc can be mapped with A → αBβ.
Then FOLLOW(A) = FIRST(β) = FIRST(c) = {c}
∴ FOLLOW(A) = {a, c}
Consider the rule,
S → bBa which can be mapped with A → αBβ.
FOLLOW(B) = FIRST(β) = FIRST(a) = {a}
S → Bc can be mapped with A → αBβ, with α = ε.
FOLLOW(B) = FIRST(β) = FIRST(c) = {c}
∴ FOLLOW(B) = {a, c}
To summarize,
FIRST(S) = {b, d}     FOLLOW(S) = {$}
FIRST(A) = {d}        FOLLOW(A) = {a, c}
FIRST(B) = {d}        FOLLOW(B) = {a, c}
3.9 Strategies to Recover from Syntactic Errors
These strategies are given in detail as below.
i) Panic mode
• This strategy is used by most parsing methods.
• This is simple to implement.
• In this method on discovering an error, the parser discards input symbols one at a time. This process is continued until one of a designated set of synchronizing tokens is found. Synchronizing tokens are delimiters such as semicolon or end. These tokens indicate an end of input statement.
• Thus in panic mode recovery a considerable amount of input is skipped without checking it for additional errors.
• This method guarantees not to go in an infinite loop.
• If there are only a few errors in the same statement then this strategy is the best choice.
ii) Phrase level recovery
• In this method, on discovering an error the parser performs local correction on the remaining input.
• It can replace a prefix of the remaining input by some string. This actually helps the parser to continue its job.
• The local correction can be replacing a comma by a semicolon, deletion of extra semicolons or inserting a missing semicolon. The type of local correction is decided by the compiler designer.
• While doing the replacement, care should be taken not to go in an infinite loop.
• This method is used in many error-repairing compilers.
• The drawback of this method is that it finds it difficult to handle the situations where the actual error has occurred before the point of detection.
iii) Error production
• If we have knowledge of common errors that can be encountered then we can incorporate these errors by augmenting the grammar of the corresponding language with error productions that generate the erroneous constructs.
• If an error production is used then during parsing, we can generate an appropriate error message and parsing can be continued.
• This method is extremely difficult to maintain, because if we change the grammar then it becomes necessary to change the corresponding error productions.
iv) Global production
• We often want a compiler that makes very few changes in processing an incorrect input string.
• We expect fewer insertions, deletions, and changes of tokens to recover from erroneous input.
• Such methods increase time and space requirements at parsing time.
• Global production is thus simply a theoretical concept.

3.10 Error Recovery in Predictive Parsing
• An error is detected during predictive parsing when the terminal on top of the stack does not match the next input symbol, or when nonterminal A is on top of the stack, a is the next input symbol, and parsing table entry M[A, a] is empty.
• Panic-mode error recovery is based on the idea of skipping symbols on the input until a token in a selected set of synchronizing tokens appears.
Following are some ways by which the synchronizing set can be chosen -
1. Place all symbols in FOLLOW(A) into the synchronizing set for nonterminal A. If we skip tokens until an element of FOLLOW(A) is seen and pop A from the stack, it is likely that parsing can continue.
2. We might add keywords that begin statements to the synchronizing sets for the nonterminals generating expressions.
3. We can add symbols in FIRST(A) to the synchronizing set for nonterminal A. Due to this it may be possible to resume parsing according to A if a symbol in FIRST(A) appears in the input.
4. If a terminal on top of stack cannot be matched, a simple idea is to pop the terminal and issue a message saying that the terminal was inserted.
5. If a nonterminal can generate the empty string, then the production deriving ε can be used as a default. This may postpone some error detection, but cannot cause an error to be missed. This approach reduces the number of nonterminals that have to be considered during error recovery.
Consider following parsing table.

Non terminals   id          +            *            (           )          $
E               E → TE'                               E → TE'     synch      synch
E'                          E' → +TE'                             E' → ε     E' → ε
T               T → FT'     synch                     T → FT'     synch      synch
T'                          T' → ε       T' → *FT'                T' → ε     T' → ε
F               F → id      synch        synch        F → (E)     synch      synch

• Synch indicates the synchronising tokens obtained from the FOLLOW set of the nonterminal. The FOLLOW sets for the given grammar are
FOLLOW(E) = FOLLOW(E') = {), $}
FOLLOW(T) = FOLLOW(T') = {+, ), $}
FOLLOW(F) = {+, *, ), $}
• If parser looks up entry M[A, a] as blank then the input symbol a is skipped.
• If entry is synch then the nonterminal on the top of the stack is popped.
• If token on the top of the stack does not match the input symbol then we pop the token from the stack.

Stack       Input              Comments
$E          + id * * id $      Error, skip +
$E          id * * id $
$E'T        id * * id $
$E'T'F      id * * id $
$E'T'id     id * * id $
$E'T'       * * id $
$E'T'F*     * * id $
$E'T'F      * id $             Error. M[F, *] = synch. F has been popped.
$E'T'       * id $
$E'T'F*     * id $
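The three recovery rules above can be sketched on top of a table driven parser. This is a minimal sketch under stated assumptions: the table and FOLLOW sets are written out by hand for the expression grammar, a synch entry is represented implicitly as "blank entry and lookahead in FOLLOW(A)", and the name `parse_with_panic_mode` and the message texts are illustrative only.

```python
NONTERMINALS = {"E", "E'", "T", "T'", "F"}
TABLE = {  # predictive parsing table; [] stands for an ε body
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "+"): [], ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [], ("T'", "$"): [],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}
FOLLOW = {  # used as the synchronizing sets
    "E": {")", "$"}, "E'": {")", "$"},
    "T": {"+", ")", "$"}, "T'": {"+", ")", "$"},
    "F": {"+", "*", ")", "$"},
}

def parse_with_panic_mode(tokens):
    """Parse to the end of input and return the list of recovery actions taken."""
    stack = ["$", "E"]
    tokens = list(tokens) + ["$"]
    pos, errors = 0, []
    while stack:
        top, look = stack[-1], tokens[pos]
        if top not in NONTERMINALS:
            if top == look:
                stack.pop()
                pos += 1                      # normal match
            elif top == "$":
                errors.append(f"skip extra input {look}")
                pos += 1                      # stack exhausted, discard leftovers
            else:
                errors.append(f"pop unmatched terminal {top}")
                stack.pop()
        elif (top, look) in TABLE:
            stack.pop()
            stack.extend(reversed(TABLE[(top, look)]))
        elif look in FOLLOW[top]:
            errors.append(f"synch: pop {top}")  # synch entry: pop the nonterminal
            stack.pop()
        else:
            errors.append(f"skip {look}")       # blank entry: skip the input symbol
            pos += 1
    return errors

errors = parse_with_panic_mode(["+", "id", "*", "*", "id"])
```

For the erroneous input + id * * id the sketch reports the same two recovery actions as the trace: the leading + is skipped and F is popped on the second *.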
Q.1 Explain the reasons for separating lexical analysis phase from syntax analysis.
Ans. : Refer section 3.1.2.
Q.2 What is ambiguous grammar ? Eliminate ambiguities for the grammar :
E → E+E | E*E | (E) | id.
Ans. : Ambiguous grammar : Refer section
Elimination of ambiguity : Refer section 3.5.1(4).
Q.3 Consider the following grammar,
S → 0A | 1B | 0 | 1
A → 0S | 1B | 1
B → 0A | 1S
Construct leftmost derivations and parse tree for the following sentences.
i) 0101 ii) 1100101
Ans. : Refer example 3.3.4
Q.4 Construct predictive parsing table for the following grammar,
E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id.
Ans. : Refer example 3.8.1.
Q.5 What is recursive descent parser ? Construct a recursive descent parser for the following grammar,
E → E+T | T
T → TF | F
F → F* | a | b.
Ans. : Recursive descent parser : Refer section 3.6
The given grammar is
E → E+T | T
T → TF | F
F → F* | a | b.
which is a left recursive grammar. We will eliminate this left recursion before writing the recursive descent parser. The rule to eliminate left recursion is
If A → Aα | β then A → βA'
                   A' → αA' | ε
Consider
1. E → E+T | T  ⇒  E → TE'
                   E' → +TE' | ε
2. T → TF | F   ⇒  T → FT'
                   T' → FT' | ε
3. F → F* | a | b  ⇒  F → aF' | bF'
                      F' → *F' | ε
To summarize,
E → TE'
E' → +TE' | ε
T → FT'
T' → FT' | ε
F → aF' | bF'
F' → *F' | ε
Now the recursive descent parser is
Procedure E ( )
{
    T ( );
    Edash ( );
}
Procedure Edash ( )
{
    if (lookahead = '+')
    {
        match ('+');
        T ( );
        Edash ( );
    }
    else null;
}
Procedure T ( )
{
    F ( );
    Tdash ( );
}
Procedure Tdash ( )
{
    if (lookahead = 'a' or lookahead = 'b')
    {
        F ( );
        Tdash ( );
    }
    else null;
}
Q.6 Consider the following grammar,
E → T+E | T
pow tv |
Write down the procedures for the non-terminals of the grammar to make a recursive descent parser.
Ans. : Refer example 3.6.1
Q.7 Construct a predictive parsing table for the following grammar.
E → E+T | T, T → TF | F, F → F* | a | b.
Ans. : Refer example 3.8.8.
Q.8 What is an LL(1) grammar ? Can you convert every context free grammar into LL(1) ?
Ans. : The LL(1) grammar is a kind of grammar in which the input is scanned from left to right. It uses the leftmost derivation for the input string and uses only one input symbol (lookahead) to predict the parsing process. There are no multiple entries in any cell of the parsing table designed for parsing the input using LL(1) grammar.
Not every context free grammar can be converted to LL(1) grammar. A CFG can be LL(1) only if it is not left recursive and it is an unambiguous grammar; eliminating left recursion and left factoring convert many such grammars, but not all of them, into LL(1) form.
For example : Refer example 3.8.8.
Q.9 What are the limitations of recursive descent parsers ?
Ans. : Refer section 3.6.
Q.10 Write a recursive descent parser for the grammar,
bexpr → bexpr or bterm | bterm
bterm → bterm and bfactor | bfactor
bfactor → not bfactor | (bexpr) | true | false
Where or, and, not, (, ), true, false are terminal symbols.
Ans. : For the given grammar, assume
bexpr = E
bterm = T
bfactor = F
not = !
or = +
and = *
Now the grammar becomes
E → E+T | T
T → T*F | F
F → !F | (E) | true | false
But the grammar is left recursive. We will eliminate the left recursion using following rule.
If A → Aα | β then A → βA'
                   A' → αA' | ε
1. E → E+T | T  ⇒  E → TE'
                   E' → +TE' | ε
2. T → T*F | F  ⇒  T → FT'
                   T' → *FT' | ε
3. F → !F | (E) | true | false does not contain left recursion.
To summarize
E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → !F | (E) | true | false
The procedures for recursive descent parsing are -
Procedure E ( )
{
    T ( );
    Edash ( );
}
Procedure Edash ( )
{
    if (lookahead = '+')
    {
        match ('+');
        T ( );
        Edash ( );
    }
    else null;
}
Procedure T ( )
{
    F ( );
    Tdash ( );
}
Procedure Tdash ( )
{
    if (lookahead = '*')
    {
        match ('*');
        F ( );
        Tdash ( );
    }
    else null;
}
Procedure F ( )
{
    if (lookahead = '!')
    {
        match ('!');
        F ( );
    }
    else if (lookahead = '(')
    {
        match ('(');
        E ( );
        match (')');
    }
    else if (lookahead = true)
        match (true);
    else if (lookahead = false)
        match (false);
    else
        error ( );
}
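The procedures above translate almost line by line into a runnable program. The following Python sketch is one possible rendering, not the book's code: it works directly on the original tokens or, and, not instead of +, *, !, and the class name `BoolParser`, the method names and the helper `accepts` are assumptions made for illustration.

```python
class BoolParser:
    """Recursive descent parser for the boolean grammar after left recursion removal."""

    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]   # $ marks the end of input
        self.pos = 0

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def match(self, expected):
        if self.lookahead != expected:
            raise SyntaxError(f"expected {expected}, found {self.lookahead}")
        self.pos += 1

    def bexpr(self):            # E -> T E'
        self.bterm()
        self.bexpr_rest()

    def bexpr_rest(self):       # E' -> or T E' | ε
        if self.lookahead == "or":
            self.match("or")
            self.bterm()
            self.bexpr_rest()

    def bterm(self):            # T -> F T'
        self.bfactor()
        self.bterm_rest()

    def bterm_rest(self):       # T' -> and F T' | ε
        if self.lookahead == "and":
            self.match("and")
            self.bfactor()
            self.bterm_rest()

    def bfactor(self):          # F -> not F | ( E ) | true | false
        if self.lookahead == "not":
            self.match("not")
            self.bfactor()
        elif self.lookahead == "(":
            self.match("(")
            self.bexpr()
            self.match(")")
        elif self.lookahead in ("true", "false"):
            self.match(self.lookahead)
        else:
            raise SyntaxError(f"unexpected token {self.lookahead}")


def accepts(tokens):
    """Return True when the whole token list is a valid boolean expression."""
    parser = BoolParser(tokens)
    try:
        parser.bexpr()
        parser.match("$")       # all input must be consumed
        return True
    except SyntaxError:
        return False
```

For instance, `accepts("not ( true or false ) and true".split())` succeeds, while `accepts("true or".split())` is rejected because a bterm is missing after or.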
Q.11 Eliminate ambiguity if any from the grammar for boolean expressions,
bexpr → bexpr or bterm | bterm
bterm → bterm and bfactor | bfactor
bfactor → not bfactor | (bexpr) | true | false
Where or, and, not, (, ), true, false are terminals in the grammar.
Ans. : For the given grammar, assume
bexpr = E
bterm = T
bfactor = F
not = !
or = +
and = *
The grammar then becomes -
E → E+T | T
T → T*F | F
F → !F | (E) | true | false
This grammar is an unambiguous grammar. It is obtained from the grammar -
E → E+E
E → E*E
E → !E
E → true
E → false
But the unambiguous grammar is left recursive. The left recursion can be eliminated. For elimination of left recursion refer answer of Q.10.
Q.12 Consider the grammar given below
E → E+E | E-E | E*E | E/E | a | b
Obtain leftmost and rightmost derivation for the string a+b*a+b
Ans. : Refer example 3.3.2.
What are ti
Ans. : Refer section 3.5.1
Q.14 Consider the following grammar
S → (L) | a
L → L, S | S
Construct leftmost derivations and parse trees for the following sentences.
i) (a, (a, a))
ii) (a, ((a, a), (a, a)))
Ans. : Refer example 3.3.3
Q.15 Explain backtracking with example.
Ans. : Refer section 3.5.1(1).
Q.16 Compute FIRST and FOLLOW sets for all nonterminals in the following grammar.
S → Aa | bAc | Bc | bBa
A → d
B → d
Ans. : Refer example 3.8.9.
Q.17 Write a CFG for the 'while' statement in 'C' language.
Ans. : The CFG G = {V, T, P, S} where V is a set of nonterminals, T is a set of terminal symbols, P is a set of production rules and S is a start symbol. The set of production rules P = {
S → while (condition) stmt
condition → id relop id
relop → < | > | <= | >=
S → while (condition) { L }
L → stmt L | stmt
}
V = {S, condition, relop, L}
T = {while, (, ), {, }, id, stmt, <, >, <=, >=}

Simple LR Parsers
Syllabus
Introduction to Simple LR - Why LR parsers - Model of an LR parsers - Operator precedence - Shift reduce parsing - Difference between LR and LL parsers, Construction of SLR tables.
Contents
4.1 Bottom-up Parsing
4.2 Why LR Parsers ?
4.3 Model of an LR Parser
4.4 Operator Precedence : Aug/Sept.-08, Set-1,4; May-08, Set-1; May-06, 04, Set-4,1; May-07, Set-3,4; Dec.-05, Set-1,2,4; April-03, Set-3; Marks 8, Marks 10
4.5 Shift Reduce Parsing
4.6 Difference between LR and LL Parsers
4.7 Construction of SLR Parsing Table : May-06, 04, Set-2,3,4; Aug/Sept.-07, 06, Set-4,1,3; May-05, Set-3; May-05, 04, Set-3,1; Marks 12
4.1 Bottom-up Parsing
In this section, we will discuss how an input string gets parsed efficiently. In compilers, this task is done by programs called parsers. The task of a parser is to check the input string completely for its syntactic correctness and to report error messages on syntactically incorrect input strings. In the bottom-up parsing method, the input string is taken first and we try to reduce this string with the help of the grammar and try to obtain the start symbol. The process of parsing halts successfully as soon as we reach the start symbol.
The parse tree is constructed from bottom to up, that is from leaves to root. In this process, the input symbols are placed at the leaf nodes after successful parsing. The bottom-up parse tree is created starting from leaves; the leaf nodes together are reduced further to internal nodes, these internal nodes are further reduced and eventually a root node is obtained. The internal nodes are created from the list of terminal and non-terminal symbols. This involves -
Non-terminal for internal node = Non-terminal ∪ terminal
In this process, basically the parser tries to identify the R.H.S. of a production rule and replace it by the corresponding L.H.S. This activity is called reduction. Thus the crucial but prime task in bottom-up parsing is to find the productions that can be used for reduction. The bottom-up parse tree construction process indicates that the tracing of derivations is to be done in reverse order.
For example
Consider the grammar for declarative statement,
S → TL;
T → int | float
L → L, id | id
The input string is float id, id;
Parse Tree
[Fig. 4.1.1 Bottom-up parsing : the parse tree is built in steps 1 to 9. We start from the leaf node float and reduce it to T, read the next string from the input, reduce id to L using L → id, read the next strings from the input, reduce L, id to L and finally reduce T L ; to S.]
Step 10 : The sentential forms produced while constructing this parse tree are
float id, id;
T id, id;
T L, id;
T L;
S
Step 11 : Thus looking at the sentential forms we can say that the rightmost derivation in reverse order is performed.
Thus basic steps in bottom-up parsing are
1) Reduction of input string to start symbol.
2) The sentential forms that are produced in the reduction process should trace out rightmost derivation in reverse.
Handle Pruning
As said earlier, the crucial task in bottom-up parsing is to find the substring that can be reduced by an appropriate non-terminal. Such a substring is called handle.
In other words, handle is a string or substring that matches the right side of a production and whose reduction to the non-terminal on the left hand side of that production represents one step along the reverse of a rightmost derivation. Formally we can define handle as follows.
"Handle of right sentential form γ is a production A → β and a position of γ where the string β may be found and replaced by A to produce the previous right sentential form in rightmost derivation of γ".
For example
Consider the grammar
E → E+E
E → id
Now consider the string id + id + id and the rightmost derivation is
E ⇒ E+E
  ⇒ E+E+E
  ⇒ E+E+id
  ⇒ E+id+id
  ⇒ id+id+id
In each right sentential form, the substring that is reduced at that step is the handle.

Right sentential form     Handle     Production
id + id + id              id         E → id
E + id + id               id         E → id
E + E + id                id         E → id
E + E + E                 E+E        E → E+E
E + E                     E+E        E → E+E
E

Thus a bottom-up parser is essentially a process of detecting handles and using them in reduction. This process is called handle pruning.
4.2 Why LR Parsers ?
This is the most efficient method of bottom-up parsing which can be used to parse the large class of context free grammars. This method is also called LR(k) parsing. Here
• L stands for left to right scanning.
• R stands for rightmost derivation in reverse.
• k is number of input symbols. When k is omitted k is assumed to be 1.
Properties of LR parser -
LR parser is widely used for following reasons.
1. LR parsers can be constructed to recognize most of the programming languages for which context free grammar can be written.
2. The class of grammar that can be parsed by LR parser is a superset of class of grammars that can be parsed using predictive parsers.
3. LR parser works using non backtracking shift reduce technique, yet it is an efficient one.
LR parsers detect syntactical errors very efficiently.
4.3 Model of an LR Parser
The structure of LR parser is as given in following Fig. 4.3.1.
[Fig. 4.3.1 Structure of LR parser : input buffer holding the input tokens, stack, driving program, parsing table with action and goto parts, output]
It consists of input buffer for storing the input string, a stack for storing the grammar symbols, output and a parsing table comprised of two parts, namely action and goto. There is one parsing program which is actually a driving program and reads the input symbol one at a time from the input buffer.
The driving program works on following line.
1. It initializes the stack with the start state and invokes scanner (lexical analyzer) to get next token.
2. It determines s_j, the state currently on the top of the stack and a_i, the current input symbol.
3. It consults the parsing table for the action[s_j, a_i] which can have one of the four values.
i) s_i means shift state i.
ii) r_j means reduce by rule j.
iii) Accept means successful parsing is done.
iv) Error indicates syntactical error.
Types of LR parser
Following diagram represents the types of LR parser.
[Fig. 4.3.2 Techniques of LR parsers : LR parser branches into SLR parser, LALR parser and Canonical LR parser]
The SLR means simple LR parser, LALR means Lookahead LR parser and canonical LR or simply "LR" parser - these are the three members of LR family. The overall structure of all these LR parsers is same. All are table driven parsers. The relative powers of these parsers is SLR(1) ≤ LALR(1) ≤ LR(1). That means canonical LR parses larger class than LALR and LALR parses larger class than SLR parser.
4.4 Operator Precedence
A grammar G is said to be operator precedence if it possesses following properties -
1. No production on the right side is ε.
2. There should not be any production rule possessing two adjacent non-terminals at the right hand side.
Consider the grammar for arithmetic expressions.
E → EAE | (E) | -E | id
A → + | - | * | / | ↑
This grammar is not an operator precedence grammar as the production rule
E → EAE
contains consecutive non-terminals. Hence first we will convert it into equivalent operator precedence grammar by removing A.
E → E+E | E-E | E*E | E/E | E↑E
E → (E) | -E | id
In operator precedence parsing we will first define precedence relations <·, ≐ and ·> between pair of terminals. The meaning of these relations is
p <· q    p gives more precedence to q, i.e. q has higher precedence than p.
p ≐ q     p has same precedence as q.
p ·> q    p takes precedence over q.
These meanings appear similar to the less than, equal to and greater than operators.
Now by considering the precedence relation between the arithmetic operators we will construct the operator precedence table. The operators precedences we have considered are for id, +, *, $.

        id      +       *       $
id              ·>      ·>      ·>
+       <·      ·>      <·      ·>
*       <·      ·>      ·>      ·>
$       <·      <·      <·

Fig. 4.4.1 Precedence relation table
Now consider the string.
id + id * id
We will insert $ symbols at the start and end of the input string. We will also insert precedence operators by referring the precedence relation table.
$ <· id ·> + <· id ·> * <· id ·> $
We will follow following steps to parse the given string -
i) Scan the input from left to right until first ·> is encountered.
ii) Scan backwards over ≐ until <· is encountered.
iii) The handle is a string between <· and ·>.
The parsing can be done as follows.

$ <· id ·> + <· id ·> * <· id ·> $    Handle id is obtained between <· ·>. Reduce this by E → id.
E + <· id ·> * <· id ·> $             Handle id is obtained between <· ·>. Reduce this by E → id.
E + E * <· id ·> $                    Handle id is obtained between <· ·>. Reduce this by E → id.
E + E * E                             Remove all the non-terminals.
+ *                                   Insert $ at the beginning and at the end. Also insert the precedence operators.
$ <· + <· * ·> $                      The * operator is surrounded by <· ·>. This indicates that * becomes handle. We have to reduce E*E operation first.
$ <· + ·> $                           Now + becomes handle. Hence we reduce E+E.
$ $                                   Parsing is done.
Advantage of Operator Precedence Parsing
1. This type of parsing is simple to implement.
Disadvantages of Operator Precedence Parsing
1. The operator like minus has two different precedences (unary and binary). Hence it is hard to handle tokens like minus sign.
2. This kind of parsing is applicable to only small class of grammars.
Difference between Operator Precedence Parser and Simple Precedence Parser
1. Operator precedence parser : The operators <·, ≐, ·> are involved to show the relationship among the symbols.
   Simple precedence parser : The operators are not used to show the precedence.
2. Operator precedence parser : This is not an efficient parser.
3. Operator precedence parser : For only the terminal symbols the precedence is defined.
   Simple precedence parser : The precedence is defined for both terminal and non terminal symbols.
4. Operator precedence parser : This type of parser is simple to implement.
   Simple precedence parser : This type of parser is complex to implement.
5. Operator precedence parser : This type of parser is applicable to small class of grammar.
   Simple precedence parser : This type of parser is comparatively applicable to larger class of grammar.
6. Operator precedence parser : The operators like minus, unary or binary, have two different precedences. Hence it is difficult to handle such tokens.
   Simple precedence parser : The unique precedence relation can be applied to the tokens.
Operator Precedence Parsing Algorithm
1. Set i pointer to first symbol of string w. The string will be represented as w$.
2. If $ is on the top of the stack and if $ is the symbol pointed by i then return.
3. If a is on the top of the stack and if the symbol b is read by pointer i then
   a) if a <· b or a ≐ b then
         push b on to the stack
         advance the pointer i to next input symbol.
   b) else if a ·> b then
         repeat
            pop the stack
         until (top symbol of the stack <· recently popped terminal symbol)
         Here popping the symbol means reducing the terminal symbol by equivalent non terminal.
   c) else error ( ).
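The algorithm above can be sketched in Python for the terminals id, +, * and $. This is a hedged sketch: the relation table is the usual one for these operators, written with "<" and ">" in place of <· and ·>, and the names `PRECEDENCE` and `operator_precedence_parse` are assumptions. Like the algorithm as stated, it keeps only terminals on the stack, so it does not check that every operator actually has its operands.

```python
# PRECEDENCE[a][b] is the relation between stack terminal a and input terminal b.
PRECEDENCE = {
    "id": {"+": ">", "*": ">", "$": ">"},
    "+": {"id": "<", "+": ">", "*": "<", "$": ">"},
    "*": {"id": "<", "+": ">", "*": ">", "$": ">"},
    "$": {"id": "<", "+": "<", "*": "<"},
}

def operator_precedence_parse(tokens):
    """Return the terminals of each handle, in the order the reductions happen."""
    stack = ["$"]
    tokens = list(tokens) + ["$"]
    pos = 0
    handles = []
    while True:
        a, b = stack[-1], tokens[pos]
        if a == "$" and b == "$":
            return handles                 # step 2: successful completion
        relation = PRECEDENCE[a].get(b)
        if relation in ("<", "="):         # step 3a: shift
            stack.append(b)
            pos += 1
        elif relation == ">":              # step 3b: reduce
            popped_terminals = []
            while True:
                popped = stack.pop()
                popped_terminals.append(popped)
                # stop when the new stack top yields precedence to what was popped
                if PRECEDENCE[stack[-1]].get(popped) == "<":
                    break
            handles.append(" ".join(reversed(popped_terminals)))
        else:                              # step 3c: no relation defined
            raise SyntaxError(f"no precedence relation between {a} and {b}")

handles = operator_precedence_parse(["id", "+", "id", "*", "id"])
```

For id + id * id the three id handles are reduced first, then * and finally +, which matches the order of reductions in the worked example of this section.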
Example : Construct an operator precedence parser for the grammar,
S → iEtS | iEtSeS | a
E → b | c | d
Where a, b, c, d, e, i, t are terminals.
Solution : Let,
S → iEtS | iEtSeS | a
E → b | c | d
be the given grammar.
Here, i stands for if
E stands for Expression
t stands for then
S stands for statement
The symbols a, b, c, d are the terminal symbols.
• The precedence is if > then > else, i.e. i > t > e.
• Similarly $ symbol will have least precedence.
• The terminal symbols a, b, c and d have highest precedence over i, t and e.
• The nonterminal symbols have less precedence over a, b, c, d, i, e, t. Hence the precedence table can be designed from these relations.
Simulation : Consider a valid string ibtaea for simulation.

Stack        Input        Action
$            ibtaea$      Push i
$i           btaea$       Push b
$ib          taea$        Reduce by E → b
$iE          taea$        Push t
$iEt         aea$         Push a
$iEta        ea$          Reduce by S → a
$iEtS        ea$          Push e
$iEtSe       a$           Push a
$iEtSea      $            Reduce by S → a
$iEtSeS      $            Reduce by S → iEtSeS
$S           $            Accept
Precedence Functions
During operator precedence parsing, the table of precedence relations need not be stored. Instead, the operator precedence parser uses precedence functions that map terminal symbols to integers, so that the precedence relations between the symbols are implemented by numerical comparison.
For example
Consider two functions f and g. For these functions the precedence relation can be shown with some integers: a <· b is encoded as f(a) < g(b), a ≐ b as f(a) = g(b) and a ·> b as f(a) > g(b).
Input : Operator precedence table.
Output : Precedence functions.
1. Create function symbols f_a and g_a for each grammar terminal a and for $.
2. Partition the symbols into groups so that f_a and g_b are in the same group if a ≐ b.
3. Create a directed graph whose nodes are these groups, adding edges in the following manner:
   a) If a <· b then add an edge from the group of g_b to the group of f_a.
   b) If a ·> b then add an edge from the group of f_a to the group of g_b.
4. If the constructed graph contains a cycle then no precedence functions exist. If it contains no cycle then precedence functions exist: f(a) is the length of the longest path beginning at the group of f_a and g(a) is the length of the longest path beginning at the group of g_a.
Fig. 4.4.2 Precedence graph
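The graph construction above can be tried out with a short Python sketch. The relation table REL is an assumed example for the terminals id, +, * and $ (it is not the table of the text), it contains no ≐ relations so every group is a single node, and the name precedence_functions is hypothetical.

```python
# Assumed operator precedence table; "<" means a <. b and ">" means a .> b.
REL = {
    ("id", "+"): ">", ("id", "*"): ">", ("id", "$"): ">",
    ("+", "id"): "<", ("+", "+"): ">", ("+", "*"): "<", ("+", "$"): ">",
    ("*", "id"): "<", ("*", "+"): ">", ("*", "*"): ">", ("*", "$"): ">",
    ("$", "id"): "<", ("$", "+"): "<", ("$", "*"): "<",
}

def precedence_functions(rel):
    """Build the f/g graph and return (f, g) as longest-path lengths."""
    edges = {}
    for (a, b), r in rel.items():
        if r == "<":                       # a <. b : edge g_b -> f_a
            edges.setdefault(("g", b), set()).add(("f", a))
        elif r == ">":                     # a .> b : edge f_a -> g_b
            edges.setdefault(("f", a), set()).add(("g", b))

    memo, visiting = {}, set()

    def longest(node):
        if node in memo:
            return memo[node]
        if node in visiting:               # back edge: the graph has a cycle
            raise ValueError("cycle found: no precedence functions exist")
        visiting.add(node)
        best = max((1 + longest(n) for n in edges.get(node, ())), default=0)
        visiting.discard(node)
        memo[node] = best
        return best

    terminals = {t for pair in rel for t in pair}
    f = {t: longest(("f", t)) for t in terminals}
    g = {t: longest(("g", t)) for t in terminals}
    return f, g

f, g = precedence_functions(REL)
print(f)
print(g)
```

With these functions every relation in the assumed table can be checked numerically, for example + <· * holds because f(+) = 2 is less than g(*) = 3.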
4.5 Shift Reduce Parsing
A shift reduce parser attempts to construct the parse tree from leaves to root. Thus it works on the same principle as a bottom up parser. A shift reduce parser requires the following data structures.
1. The input buffer storing the input string.
2. A stack for storing and accessing the L.H.S. and R.H.S. of rules.
The initial configuration of the shift reduce parser is as shown below.
Stack : $        Input buffer : w$
Fig. 4.5.1 Initial configuration
The parser performs the following basic operations.
1. Shift : Moving of the symbols from the input buffer onto the stack; this action is called shift.
2. Reduce : If the handle appears on the top of the stack then reduction of it by the appropriate rule is done. That means the R.H.S. of the rule is popped off and the L.H.S. is pushed in. This action is called reduce action.
3. Accept : If the stack contains the start symbol only and the input buffer is empty at the same time then that action is called accept. When the accept state is obtained in the process of parsing then it means a successful parsing is done.
4. Error : A situation in which the parser can neither shift nor reduce the symbols, and cannot even perform the accept action, is called an error.
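The four operations can be illustrated with a deliberately naive Python sketch that reduces whenever the top of the stack matches the right-hand side of some rule. That greedy handle choice is an assumption that happens to work for the small palindrome grammar S → 0S0 | 1S1 | 2 used here; it is not a general method for finding handles. The names RULES and shift_reduce are hypothetical.

```python
# Grammar S -> 0S0 | 1S1 | 2, written as (LHS, RHS) pairs.
RULES = [("S", ("0", "S", "0")), ("S", ("1", "S", "1")), ("S", ("2",))]

def shift_reduce(text, rules=RULES, start="S"):
    """Return the list of parsing actions for the input string `text`."""
    stack = ["$"]
    buf = list(text) + ["$"]
    trace = []
    while True:
        for lhs, rhs in rules:
            if tuple(stack[-len(rhs):]) == rhs:       # handle on top of stack
                del stack[-len(rhs):]                 # pop the R.H.S.
                stack.append(lhs)                     # push the L.H.S.
                trace.append(f"reduce {lhs} -> {''.join(rhs)}")
                break
        else:                                         # no reduction possible
            if stack == ["$", start] and buf == ["$"]:
                trace.append("accept")
                return trace
            if buf == ["$"]:                          # cannot shift, reduce or accept
                raise SyntaxError("error: input rejected")
            stack.append(buf.pop(0))
            trace.append("shift")

for step in shift_reduce("10201"):
    print(step)
```

For the input 10201 the sketch shifts three symbols, reduces 2 to S, and then alternates shift and reduce until only S remains on the stack.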
Let us take some examples to learn the shift-reduce parser.
Example : Consider the grammar
E → E - E
E → E * E
E → id
Perform shift-reduce parsing of the input string "id1 - id2 * id3".
Solution :
Stack | Input buffer | Parsing action
$ | id1-id2*id3$ | Shift
$id1 | -id2*id3$ | Reduce by E → id
$E | -id2*id3$ | Shift
$E- | id2*id3$ | Shift
$E-id2 | *id3$ | Reduce by E → id
$E-E | *id3$ | Shift
$E-E* | id3$ | Shift
$E-E*id3 | $ | Reduce by E → id
$E-E*E | $ | Reduce by E → E*E
$E-E | $ | Reduce by E → E-E
$E | $ | Accept
Here we have followed two rules,
1. If the incoming operator has more priority than the in-stack operator then perform shift.
2. If the in-stack operator has the same or more priority than the incoming operator then perform reduce.
Example : Consider the following grammar
S → TL;
T → int | float
L → L, id | id
Parse the input string int id, id; using shift-reduce parser.
Solution :
Stack | Input buffer | Parsing action
$ | int id, id;$ | Shift
$int | id, id;$ | Reduce by T → int
$T | id, id;$ | Shift
$T id | , id;$ | Reduce by L → id
$TL | , id;$ | Shift
$TL, | id;$ | Shift
$TL, id | ;$ | Reduce by L → L, id
$TL | ;$ | Shift
$TL; | $ | Reduce by S → TL;
$S | $ | Accept
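Example : Consider the following grammar
S → (L) | a
L → L, S | S
Parse the input string (a, (a, a)) using shift-reduce parser.
Solution :
Stack | Input buffer | Parsing action
$ | (a,(a,a))$ | Shift
$( | a,(a,a))$ | Shift
$(a | ,(a,a))$ | Reduce S → a
$(S | ,(a,a))$ | Reduce L → S
$(L | ,(a,a))$ | Shift
$(L, | (a,a))$ | Shift
$(L,( | a,a))$ | Shift
$(L,(a | ,a))$ | Reduce S → a
$(L,(S | ,a))$ | Reduce L → S
$(L,(L | ,a))$ | Shift
$(L,(L, | a))$ | Shift
$(L,(L,a | ))$ | Reduce S → a
$(L,(L,S | ))$ | Reduce L → L, S
$(L,(L | ))$ | Shift
$(L,(L) | )$ | Reduce S → (L)
$(L,S | )$ | Reduce L → L, S
$(L | )$ | Shift
$(L) | $ | Reduce S → (L)
$S | $ | Accept
Example : Design shift reduce parser for the following grammar :
S → 0S0 | 1S1 | 2
Solution : To design the shift-reduce parser we will consider the input "10201".
Stack | Input buffer | Parsing action
$ | 10201$ | Shift
$1 | 0201$ | Shift
$10 | 201$ | Shift
$102 | 01$ | Reduce S → 2
$10S | 01$ | Shift
$10S0 | 1$ | Reduce S → 0S0
$1S | 1$ | Shift
$1S1 | $ | Reduce S → 1S1
$S | $ | Accept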
4.6 Difference between LR and LL Parsers
Sr. No. | LR Parsers | LL Parsers
1. | These are bottom up parsers. | These are top down parsers.
2. | This is complex to implement. | This is simple to implement.
3. | For LR(1) the first L means the input is scanned from left to right. The second R means it uses rightmost derivation in reverse for the input string. The number 1 indicates that one lookahead symbol is used to predict the parsing process. | For LL(1) the first L means the input is scanned from left to right. The second L means it uses leftmost derivation for the input string. The number 1 indicates that one lookahead symbol is used to predict the parsing process.
4. | These are efficient parsers. | This is less efficient.
5. | It is applied to a large class of programming languages. | It is applied to a small class of languages.
4.7 Simple LR Parsers
We will start with the simplest form of LR parsing called the SLR parser. It is the weakest of the three methods but it is the easiest to implement. The parsing can be done as follows.
Context free grammar → Construction of canonical set of items → Construction of SLR parsing table → Parsing of input string → Output
Fig. 4.7.1 Working of SLR (1)
Definition of LR(0) items and related terms -
1) The LR(0) item for grammar G is a production rule in which the symbol • is inserted at some position in the R.H.S. of the rule.
For example
The production S → ε generates only one item S → •.
2) Augmented grammar : If a grammar G is having start symbol S then the augmented grammar is a new grammar G' in which S' is a new start symbol such that S' → S. The purpose of this grammar is to indicate the acceptance of input. That is, when the parser is about to reduce S' → S it reaches the acceptance state.
A grammar for which an SLR parser can be constructed is called an SLR grammar.
3) Kernel items : It is the collection of the item S' → •S and all the items whose dots are not at the leftmost end of the R.H.S. of the rule.
Non-kernel items : The collection of all the items in which • is at the left end of the R.H.S. of the rule.
4) Functions closure and goto : These are two important functions required to create the collection of canonical set of items.
5) Viable prefix : It is the set of prefixes of right sentential forms that can appear on the stack during shift/reduce actions.
Closure operation -
For a context free grammar G, if I is the set of items then the function closure(I) can be constructed using the following rules.
1. Consider I is a set of canonical items; initially every item in I is added to closure(I).
2. If A → α•Bβ is a rule in closure(I) and there is another rule for B such as B → γ then
closure(I) : A → α•Bβ
             B → •γ
This rule has to be applied until no more new items can be added to closure(I).
The meaning of rule A → α•Bβ is that during derivation of the input string at some point we may require strings derivable from Bβ as input. A non-terminal immediately to the right of • indicates that it has to be expanded shortly.
Goto operation -
The function goto can be defined as follows.
If there is a production A → α•Bβ then goto(A → α•Bβ, B) = A → αB•β. That means simply shifting • one position ahead over the grammar symbol (may be terminal or non-terminal). If the rule A → α•Bβ is in I then the same goto function can be written as goto(I, B).
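A minimal Python sketch of the two functions is given below, assuming an item is stored as a triple (L.H.S., R.H.S., dot position) and the grammar is a dictionary from non-terminal to its alternatives. This representation and the names GRAMMAR, closure and goto are illustrative choices, not notation from the text. The grammar used is X → Xb | a.

```python
# Grammar X -> Xb | a ; each alternative is a tuple of symbols.
GRAMMAR = {"X": [("X", "b"), ("a",)]}

def closure(items, grammar):
    """Add B -> .gamma for every non-terminal B that appears right after a dot."""
    result = set(items)
    work = list(items)
    while work:
        lhs, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in grammar:     # non-terminal after the dot
            for prod in grammar[rhs[dot]]:
                item = (rhs[dot], prod, 0)
                if item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)

def goto(items, symbol, grammar):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(l, r, d + 1) for (l, r, d) in items if d < len(r) and r[d] == symbol}
    return closure(moved, grammar)

def show(item):
    lhs, rhs, dot = item
    return f"{lhs} -> {''.join(rhs[:dot])}.{''.join(rhs[dot:])}"

I = closure({("X", ("X", "b"), 0)}, GRAMMAR)
print(sorted(show(i) for i in I))
print(sorted(show(i) for i in goto(I, "X", GRAMMAR)))
print(sorted(show(i) for i in goto(I, "a", GRAMMAR)))
```

Here the dot is printed as a full stop; closure(I) contains X → •Xb and X → •a, and the two goto calls move the dot over X and over a respectively.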
Example : Consider the grammar
X → Xb | a
Compute closure(I) and goto(I).
Solution : Let
I : X → •Xb
closure(I) = X → •Xb
             X → •a
The goto function can be computed as
goto(I, X) = X → X•b (X → •Xb gives X → X•b)
Similarly goto(I, a) gives X → a• (X → •a gives X → a•).
Example : Consider the grammar
S → AS | b
A → SA | a
Compute closure(I) and goto(I).
Solution : First we will write the grammar using the dot operator.
I : S → •AS
    S → •b
    A → •SA
    A → •a
Applying closure(I) adds no further items. We will call this state I0.
Now we apply goto on each symbol of I0.
I0 : S → •AS
     S → •b
     A → •SA
     A → •a
I1 : goto(I0, A)
     S → A•S
     S → •AS
     S → •b
     A → •SA
     A → •a
I2 : goto(I0, b)
     S → b•
I3 : goto(I0, S)
     A → S•A
     A → •SA
     A → •a
     S → •AS
     S → •b
I4 : goto(I0, a)
     A → a•
Example : Consider the following grammar :
S → Aa | bAc | Bc | bBa
A → d
B → d
Compute closure(I) and goto(I).
Solution : First we will write the grammar using the dot operator.
I0 : S → •Aa
     S → •bAc
     S → •Bc
     S → •bBa
     A → •d
     B → •d
Now we will apply goto on each symbol from state I0. Each goto(I) will generate new states.
I1 : goto(I0, A)
     S → A•a
I2 : goto(I0, b)
     S → b•Ac
     S → b•Ba
     A → •d
     B → •d
I3 : goto(I0, B)
     S → B•c
I4 : goto(I0, d)
     A → d•
     B → d•
Construction of canonical collection of set of items -
1. For the grammar G initially add closure(S' → •S) to the set of items C.
2. For each set of items Ii in C and for each grammar symbol X (may be terminal or non-terminal) compute goto(Ii, X). If goto(Ii, X) is not empty and not already in C then add it to C. This process has to be repeated until no more sets of items can be added to C.
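The two steps can be sketched in Python for the expression grammar E → E+T | T, T → T*F | F, F → (E) | id. In this sketch an item is a pair (production index, dot position); that encoding and the names PRODS, closure, goto and canonical_collection are assumptions made for illustration.

```python
# Augmented expression grammar; production 0 is E' -> E.
PRODS = [("E'", ("E",)), ("E", ("E", "+", "T")), ("E", ("T",)),
         ("T", ("T", "*", "F")), ("T", ("F",)),
         ("F", ("(", "E", ")")), ("F", ("id",))]
NONTERMS = {lhs for lhs, _ in PRODS}

def closure(items):
    items = set(items)
    changed = True
    while changed:
        changed = False
        for p, dot in list(items):
            rhs = PRODS[p][1]
            if dot < len(rhs) and rhs[dot] in NONTERMS:
                for q, (lhs, _) in enumerate(PRODS):
                    if lhs == rhs[dot] and (q, 0) not in items:
                        items.add((q, 0))
                        changed = True
    return frozenset(items)

def goto(items, X):
    return closure({(p, d + 1) for p, d in items
                    if d < len(PRODS[p][1]) and PRODS[p][1][d] == X})

def canonical_collection():
    """Return (list of item sets, transition table {(state, symbol): state})."""
    states = [closure({(0, 0)})]          # step 1: closure of E' -> .E
    transitions = {}
    for state in states:                  # the list grows while we iterate
        symbols = []                      # symbols right after a dot, in item order
        for p, d in sorted(state):
            rhs = PRODS[p][1]
            if d < len(rhs) and rhs[d] not in symbols:
                symbols.append(rhs[d])
        for X in symbols:                 # step 2: add every new goto(I, X)
            target = goto(state, X)
            if target not in states:
                states.append(target)
            transitions[(states.index(state), X)] = states.index(target)
    return states, transitions

states, transitions = canonical_collection()
print(len(states), transitions[(0, "E")], transitions[(4, "E")])
```

The loop stops on its own once no goto produces a set that is not already in the list, which is the termination condition of step 2.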
Now we will consider one grammar and construct the set of items by applying the closure and goto functions.
Example :
E → E+T
E → T
T → T*F
T → F
F → (E)
F → id
In this grammar we will add the augmented rule E' → •E in I, and then we have to apply closure(I).
I0 : E' → •E
     E → •E+T
     E → •T
     T → •T*F
     T → •F
     F → •(E)
     F → •id
The item set I0 is constructed starting from the rule E' → •E. Immediately to the right of • is E. Hence we have applied closure(I0) and thereby we add the E-productions with • at the left end of the rule. That means we have added E → •E+T and E → •T in I0. But again, as we can see, the rule E → •T which we have added contains the non-terminal T immediately to the right of •. So we have to add the T-productions in I0 :
T → •T*F and T → •F.
In the T-productions, after • come T and F respectively. Since we have already added the T-productions we will not add those again. But we will add all the F-productions having dots. The F → •(E) and F → •id will be added. Now we can see that after the dot, ( and id are coming in these two productions. The ( and id are terminal symbols and do not derive any rule. Hence our closure function terminates over here. Since there is no rule further, we will stop creating I0.
Now apply goto(I0, E)
E' → •E      Shift dot to right    E' → E•
E → •E+T                            E → E•+T
Thus I1 becomes,
goto(I0, E)
I1 : E' → E•
     E → E•+T
Since in I1 there is no non-terminal after the dot we cannot apply closure(I1).
By applying goto on T of I0,
goto(I0, T)
I2 : E → T•
     T → T•*F
Since in I2 there is no non-terminal after the dot we cannot apply closure(I2).
By applying goto on F of I0,
goto(I0, F)
I3 : T → F•
Since after the dot in I3 there is nothing, we cannot apply closure(I3).
By applying goto on ( of I0, E comes after the dot; hence we will apply closure on E, then on T, then on F.
goto(I0, ( )
I4 : F → (•E)
     E → •E+T
     E → •T
     T → •T*F
     T → •F
     F → •(E)
     F → •id
By applying goto on id of I0,
goto(I0, id)
I5 : F → id•
Since in I5 there is no non-terminal to the right of the dot we cannot apply the closure function here. Thus we have completed applying gotos on I0. We will now consider I1 for applying goto. In I1 there are two productions E' → E• and E → E•+T. There is no point in applying goto on E' → E•, hence we will consider E → E•+T for application of goto.
goto(I1, +)
I6 : E → E+•T
     T → •T*F
     T → •F
     F → •(E)
     F → •id
There is no other rule in I1 for applying goto. So we will move to I2. In I2 there are two productions E → T• and T → T•*F. We will apply goto on *.
goto(I2, *)
I7 : T → T*•F
     F → •(E)
     F → •id
The goto cannot be applied on I3. Apply goto on E in I4. In I4 there are two productions having E after the dot (F → (•E) and E → •E+T). Hence we will apply goto on both of these productions. The I8 becomes,
goto(I4, E)
I8 : F → (E•)
     E → E•+T
If we apply goto on (I4, T) we get E → T• and T → T•*F, which is I2 only. Hence we will not apply goto on T. Similarly we will not apply goto on F, ( and id as we get the states I3, I4, I5 again. Hence these gotos are not applied, to avoid repetition.
There is no point in applying goto on I5. Hence now we will move ahead by applying goto on I6 for T.
goto(I6, T)
I9 : E → E+T•
     T → T•*F
Then,
goto(I7, F)
I10 : T → T*F•
Then,
goto(I8, ) )
I11 : F → (E)•
Applying goto on I9, I10, I11 is of no use. Thus now there is no item that can be added in the set of items. The collection of set of items is from I0 to I11.
Construction of SLR parsing table -
As we have seen in the structure of the SLR parser, there are two parts of the SLR parsing table and those are action and goto. By considering basic parsing actions such as shift, reduce, accept and error we will fill up the action table. The goto table can be filled up using the goto function. Let us see the algorithm -
Input : An augmented grammar G'
Output : SLR parsing table.
Algorithm :
1. Initially construct set of items C = {I0, I1, I2, ..., In} where C is a collection of set of LR(0) items for the input grammar G'.
2. The parsing actions are based on each item set Ii. The actions are as given below.
   a. If A → α•aβ is in Ii and goto(Ii, a) = Ij then set action[i, a] as "shift j". Note that a must be a terminal symbol.
   b. If there is a rule A → α• in Ii then set action[i, a] to "reduce A → α" for all symbols a, where a ∈ FOLLOW(A). Note that A must not be the augmented start symbol S'.
   c. If S' → S• is in Ii then the entry in the action table action[i, $] = "accept".
3. The goto part of the SLR table can be filled as : The goto transitions for state i are considered for non-terminals only. If goto(Ii, A) = Ij then goto[i, A] = j.
4. All the entries not defined by rule 2 and 3 are considered to be "error".
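Example : Construct the SLR parsing table for the grammar E → E+T | T, T → T*F | F, F → (E) | id.
Solution : We will first construct a collection of canonical set of items for the above grammar. The set of items generated by this method are also called SLR(0) items. As there is no lookahead symbol in this set of items, zero is put in the bracket.
I0 : E' → •E ; E → •E+T ; E → •T ; T → •T*F ; T → •F ; F → •(E) ; F → •id
I1 : goto(I0, E) : E' → E• ; E → E•+T
I2 : goto(I0, T) : E → T• ; T → T•*F
I3 : goto(I0, F) : T → F•
I4 : goto(I0, ( ) : F → (•E) ; E → •E+T ; E → •T ; T → •T*F ; T → •F ; F → •(E) ; F → •id
I5 : goto(I0, id) : F → id•
I6 : goto(I1, +) : E → E+•T ; T → •T*F ; T → •F ; F → •(E) ; F → •id
I7 : goto(I2, *) : T → T*•F ; F → •(E) ; F → •id
I8 : goto(I4, E) : F → (E•) ; E → E•+T
I9 : goto(I6, T) : E → E+T• ; T → T•*F
I10 : goto(I7, F) : T → T*F•
I11 : goto(I8, ) ) : F → (E)•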
We can design a DEA for above set of items as “Oo
To state
'y
To stato
4
Scanned with CamScannerDesign :
compen 4-27 Simple LR Parsers
In the given DFA every state is final state, The state Ip is ini
pea recognizes viable prefixes. Ig is initial state. Note that the
For example - For the item,
1, : goto (Ip. E)
ESEe
E> Ee+T
Tg : goto (Iy, +) =
ESE+eT
Ig: 99to (Ig, T)
ESE+T+
‘The viable prefixes E, E+ and E+T are recognized here continuing in this fashion. The
DFA can be constructed for set of items. Thus DFA helps in recognizing valid viable
prefixes that can appear on the top of the stack.
Now we will also prepare a FOLLOW set for all the non-terminals because we require the FOLLOW set according to rule 2b of the parsing table algorithm.
[Note for readers : Below is a manual method of obtaining FOLLOW. We have already discussed the method of obtaining FOLLOW by using rules. This manual method can be used for getting FOLLOW quickly.]
FOLLOW(E') = As E' is a start symbol, $ will be placed
           = {$}
FOLLOW(E) :
E' → E, that means E' = E = start symbol. ∴ we will add $.
E → E+T, the + is following E. ∴ we will add +.
F → (E), the ) is following E. ∴ we will add ).
∴ FOLLOW(E) = {+, ), $}
FOLLOW(T) :
As E' → E and E → T, E' = E = T = start symbol. ∴ we will add $.
E → E+T
E → T+T (∵ E → T)
The + is following T hence we will add +.
T → T*F
As * is following T we will add *.
F → (E)
F → (T) (∵ E → T)
As ) is following T we will add ).
∴ FOLLOW(T) = {+, *, ), $}
FOLLOW(F) :
As E' → E, E → T and T → F, E' = E = T = F = start symbol. We will add $.
E → E+T
E → T+T (∵ E → T)
E → F+T (∵ T → F)
The + is following F hence we will add +.
T → T*F
T → F*F (∵ T → F)
As * is following F we will add *.
F → (E)
F → (T) (∵ E → T)
F → (F) (∵ T → F)
As ) is following F we will add ).
∴ FOLLOW(F) = {+, *, ), $}
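The FOLLOW sets obtained manually above can be cross-checked with a small fixed-point computation. The sketch below assumes a grammar without ε-productions (true for the expression grammar, but not in general), and the names GRAMMAR, first_sets and follow_sets are illustrative.

```python
# Augmented expression grammar as (LHS, RHS) pairs; no epsilon-productions.
GRAMMAR = [("E'", ["E"]), ("E", ["E", "+", "T"]), ("E", ["T"]),
           ("T", ["T", "*", "F"]), ("T", ["F"]),
           ("F", ["(", "E", ")"]), ("F", ["id"])]
NONTERMS = {lhs for lhs, _ in GRAMMAR}

def first_sets():
    """FIRST of each non-terminal (valid only because nothing derives epsilon)."""
    first = {n: set() for n in NONTERMS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            head = rhs[0]
            new = first[head] if head in NONTERMS else {head}
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first

def follow_sets(start="E'"):
    first = first_sets()
    follow = {n: set() for n in NONTERMS}
    follow[start].add("$")                       # $ follows the start symbol
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            for i, sym in enumerate(rhs):
                if sym not in NONTERMS:
                    continue
                if i + 1 < len(rhs):             # something follows sym in the rule
                    nxt = rhs[i + 1]
                    new = first[nxt] if nxt in NONTERMS else {nxt}
                else:                            # sym ends the rule: inherit FOLLOW(lhs)
                    new = follow[lhs]
                if not new <= follow[sym]:
                    follow[sym] |= new
                    changed = True
    return follow

for name, symbols in sorted(follow_sets().items()):
    print(name, sorted(symbols))
```

The inheritance step (a non-terminal at the end of a rule receives FOLLOW of the left-hand side) is the mechanical counterpart of the "E' = E = T = F = start symbol" argument used in the manual method.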
Building of parsing table
Now from the canonical collection of set of items, consider I0.
E' → •E
E → •E+T
E → •T
T → •T*F
T → •F
F → •(E)
F → •id
Consider F → •(E)
A → α•aβ
A = F, α = ε, a = (, β = E)
goto(I0, ( ) = I4
∴ action[0, ( ] = shift 4
Similarly for F → •id the entry in the action table is action[0, id] = shift 5, because goto(I0, id) = I5.
The other items in I0 do not give any action. Hence we will find the actions from I0 to I11.
State | id | + | * | ( | ) | $
0 | s5 | | | s4 | |
1 | | s6 | | | |
2 | | | s7 | | |
3 | | | | | |
4 | s5 | | | s4 | |
5 | | | | | |
6 | s5 | | | s4 | |
7 | s5 | | | s4 | |
8 | | s6 | | | s11 |
9 | | | s7 | | |
10 | | | | | |
11 | | | | | |
Thus the SLR parsing table is filled up with the shift actions. Now we will fill it up with the reduce and accept actions.
According to rule 2.c of the parsing table algorithm, there is a production E' → E• in I1. Hence we will add the action "accept" in action[1, $].
Now in state I2 there is the item E → T•. Matching it with A → α• of rule 2.b gives A = E and α = T.
b) If action[s, a] = reduce A → β then pop 2*|β| symbols off the stack, then push A, then push goto[s', A] where s' is the state now on the top of the stack.
c) If action[s, a] = accept then halt the parsing process. It indicates successful parsing.
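Let us take one valid string for the grammar
1) E → E+T
2) E → T
3) T → T*F
4) T → F
5) F → (E)
6) F → id
Input string : id*id+id
We will consider two data structures while taking the parsing actions and those are the stack and the input buffer.
Stack | Input buffer | Action table | Goto table | Parsing action
$0 | id*id+id$ | [0, id] = s5 | | Shift
$0id5 | *id+id$ | [5, *] = r6 | [0, F] = 3 | Reduce by F → id
$0F3 | *id+id$ | [3, *] = r4 | [0, T] = 2 | Reduce by T → F
$0T2 | *id+id$ | [2, *] = s7 | | Shift
$0T2*7 | id+id$ | [7, id] = s5 | | Shift
$0T2*7id5 | +id$ | [5, +] = r6 | [7, F] = 10 | Reduce by F → id
$0T2*7F10 | +id$ | [10, +] = r3 | [0, T] = 2 | Reduce by T → T*F
$0T2 | +id$ | [2, +] = r2 | [0, E] = 1 | Reduce by E → T
$0E1 | +id$ | [1, +] = s6 | | Shift
$0E1+6 | id$ | [6, id] = s5 | | Shift
$0E1+6id5 | $ | [5, $] = r6 | [6, F] = 3 | Reduce by F → id
$0E1+6F3 | $ | [3, $] = r4 | [6, T] = 9 | Reduce by T → F
$0E1+6T9 | $ | [9, $] = r1 | [0, E] = 1 | Reduce by E → E+T
$0E1 | $ | accept | | Accept
In the above table, in the first row we get action[0, id] = s5, which means shift id from the input buffer onto the stack and then push the state number 5. In the second row we get action[5, *] as r6, which means reduce by rule 6, F → id. After the reduction, by referring to goto[0, F] we get state number 3, hence 3 is pushed onto the stack. Note that for every reduce action a goto is performed. The process of parsing is continued in this fashion and finally when we get action[1, $] = accept we halt by successfully parsing the input string.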
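The table-driven parse traced above can be sketched in Python. The ACTION and GOTO dictionaries below are written by hand from the item sets I0 to I11 and the FOLLOW sets of this grammar, so they should be treated as an assumption rather than as a table quoted from the text. The sketch keeps only state numbers on the stack, so a reduction pops |β| entries instead of the 2*|β| symbols popped when grammar symbols and states are interleaved. The name lr_parse is hypothetical.

```python
# Production number -> (LHS, length of RHS), numbered as in the example.
PRODS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3), 4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}

def reduce_row(n, symbols=("+", "*", ")", "$")):
    return {s: f"r{n}" for s in symbols}

ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: reduce_row(4),
    4: {"id": "s5", "(": "s4"},
    5: reduce_row(6),
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: reduce_row(3),
    11: reduce_row(5),
}
GOTO = {(0, "E"): 1, (0, "T"): 2, (0, "F"): 3, (4, "E"): 8, (4, "T"): 2,
        (4, "F"): 3, (6, "T"): 9, (6, "F"): 3, (7, "F"): 10}

def lr_parse(tokens):
    """Return the production numbers used, in the order the reductions happen."""
    stack = [0]                              # state numbers only
    buf = list(tokens) + ["$"]
    reductions = []
    while True:
        act = ACTION[stack[-1]].get(buf[0])
        if act is None:                      # blank entry means error
            raise SyntaxError(f"unexpected {buf[0]!r} in state {stack[-1]}")
        if act == "acc":
            return reductions
        if act[0] == "s":                    # shift: push the new state
            stack.append(int(act[1:]))
            buf.pop(0)
        else:                                # reduce by production n
            n = int(act[1:])
            lhs, size = PRODS[n]
            del stack[-size:]
            stack.append(GOTO[(stack[-1], lhs)])   # goto after every reduce
            reductions.append(n)

print(lr_parse(["id", "*", "id", "+", "id"]))
```

The printed sequence of production numbers corresponds row by row to the reduce actions in the trace for id*id+id.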
Example : Consider the following grammar
E → E+T | T
T → TF | F
F → F* | a | b
Construct the SLR parsing table for this grammar. Also parse the input a*b+a.
Solution : Let us number the production rules in the grammar.
1) E → E+T
2) E → T
3) T → TF
4) T → F
5) F → F*
6) F → a
7) F → b
Now we will build the canonical set of SLR(0) items. We will first introduce an augmented grammar E' → •E and then the initial set of items I0 will be generated as follows.
I0 : E' → •E
     E → •E+T     As after • the symbol E appears we will add rules of E.
     E → •T
     T → •TF      After • the T appears, so add rules of T.
     T → •F
     F → •F*      As after • the F appears, so add rules of F.
     F → •a
     F → •b
Now we will use the goto function. From state I0, goto on E, T, F and a, b will be applied step by step. Each goto transition will generate a new state Ii.
I1 : goto(I0, E)
     E' → E•
     E → E•+T
I2 : goto(I0, T)
     E → T•
     T → T•F
     F → •F*
     F → •a
     F → •b
I3 : goto(I0, F)
     T → F•
     F → F•*
I4 : goto(I0, a)
     F → a•
I5 : goto(I0, b)
     F → b•
Now we will start applying goto transitions on state I1. From state I1 it is possible to apply goto transitions only on +. Hence
I6 : goto(I1, +)
     E → E+•T
     T → •TF      As after • the T comes we will add the T productions;
     T → •F       the same is true for F.
     F → •F*
     F → •a
     F → •b
The goto transitions will be applied on state I2 now. We will choose F to apply the goto transition because there is no point in applying goto on T.
I7 : goto(I2, F)
     T → TF•
     F → F•*
If we apply goto on a or b from state I2 then we will get F → a• and F → b•, which are states I4 and I5 respectively. Hence we will not consider I4 and I5 again. Now move to state I3; from I3 the goto transition is on *. Hence
I8 : goto(I3, *)
     F → F*•
As there is no point in applying goto on states I4 and I5, we will choose state I6 for the goto transition.
I9 : goto(I6, T)
     E → E+T•
     T → T•F
     F → •F*
     F → •a
     F → •b
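Now we will first obtain FOLLOW of E, T and F, as the FOLLOW computation is required when the SLR parsing table is built.
FOLLOW(E) = {+, $}
FOLLOW(T) = {+, a, b, $}
FOLLOW(F) = {+, *, a, b, $}
As by rule 2 and 4, E → T and T → F, we can state E = T = F. But E is a start symbol. Then by rule 2 and 4, T and F can act as start symbol. ∴ we have added $ in FOLLOW(E), FOLLOW(T) and FOLLOW(F).
The SLR parsing table can be constructed as follows.
State | a | b | + | * | $ | E | T | F
0 | s4 | s5 | | | | 1 | 2 | 3
1 | | | s6 | | accept | | |
2 | s4 | s5 | r2 | | r2 | | | 7
3 | r4 | r4 | r4 | s8 | r4 | | |
4 | r6 | r6 | r6 | r6 | r6 | | |
5 | r7 | r7 | r7 | r7 | r7 | | |
6 | s4 | s5 | | | | | 9 | 3
7 | r3 | r3 | r3 | s8 | r3 | | |
8 | r5 | r5 | r5 | r5 | r5 | | |
9 | s4 | s5 | r1 | | r1 | | | 7
The columns a, b, +, * and $ form the action part; the columns E, T and F form the goto part.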
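Now we will parse the input a*b+a using the above parse table.
Stack | Input buffer | Action
$0 | a*b+a$ | Shift
$0a4 | *b+a$ | reduce by F → a
$0F3 | *b+a$ | Shift
$0F3*8 | b+a$ | reduce by F → F*
$0F3 | b+a$ | reduce by T → F
$0T2 | b+a$ | Shift
$0T2b5 | +a$ | reduce by F → b
$0T2F7 | +a$ | reduce by T → TF
$0T2 | +a$ | reduce by E → T
$0E1 | +a$ | Shift
$0E1+6 | a$ | Shift
$0E1+6a4 | $ | reduce by F → a
$0E1+6F3 | $ | reduce by T → F
$0E1+6T9 | $ | reduce by E → E+T
$0E1 | $ | Accept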
Thus the input string is parsed.
Example : Construct the collection of LR(0) item sets and draw the goto graph for the following grammar S → SS | a | ε. Indicate the conflicts (if any) in the various states of the SLR parser.
Solution : We will number the production rules in the grammar.
1) S → SS
2) S → a
3) S → ε
Let us construct the canonical set of items for the augmented grammar -
I0 : S' → •S
     S → •SS
     S → •a
     S → •
I1 : goto(I0, S)
     S' → S•
     S → S•S
     S → •SS
     S → •a
     S → •
I2 : goto(I0, a)
     S → a•
I3 : goto(I1, S)
     S → SS•
     S → S•S
     S → •SS
     S → •a
     S → •
The goto graph will be
[Figure : goto graph over I0 to I3]
The construction of the SLR(1) table will be as follows.
In state I3 there is a production rule S → •a. If we apply goto(I3, a) then we get state I2. Hence M[3, a] = s2. In I3 there is a production S → SS• which matches with A → α•. Rule S → SS is rule number 1. And FOLLOW(S) = {a, $} ∴ M[3, a] = M[3, $] = r1.
State | a | $ | S (goto)
0 | s2 | | 1
1 | s2 | Accept | 3
2 | r2 | r2 |
3 | s2 / r1 | r1 |
Conflict occurs [shift-reduce conflict] in the entry M[3, a].
Hence conflict occurs in state I3.
Example : Construct LR(0) parsing table for the following grammar
S → cA | ccB
A → cA | a
B → ccB | b
Solution : We will first number the grammar rules
1) S → cA
2) S → ccB
3) A → cA
4) A → a
5) B → ccB
6) B → b
Now we will construct the canonical set of items
I0 : S' → •S
     S → •cA
     S → •ccB
I1 : goto(I0, S)
     S' → S•
I2 : goto(I0, c)
     S → c•A
     S → c•cB
     A → •cA
     A → •a
I3 : goto(I2, A)
     S → cA•
I4 : goto(I2, c)
     S → cc•B
     A → c•A
     B → •ccB
     B → •b
     A → •cA
     A → •a
I5 : goto(I2, a)
     A → a•
I6 : goto(I4, B)
     S → ccB•
I7 : goto(I4, A)
     A → cA•
I8 : goto(I4, c)
     B → c•cB
     A → c•A
     A → •cA
     A → •a
I9 : goto(I4, b)
     B → b•
I10 : goto(I8, c)
     B → cc•B
     A → c•A
     B → •ccB
     B → •b
     A → •cA
     A → •a
I11 : goto(I10, B)
     B → ccB•
The remaining gotos lead to existing states : goto(I4, a) = goto(I8, a) = goto(I10, a) = I5, goto(I8, A) = goto(I10, A) = I7, goto(I10, b) = I9 and goto(I10, c) = I8.
FOLLOW(S) = {$}
FOLLOW(A) = {$}
FOLLOW(B) = {$}
The SLR parsing table can be constructed as follows -
State | a | b | c | $ | S | A | B
0 | | | s2 | | 1 | |
1 | | | | accept | | |
2 | s5 | | s4 | | | 3 |
3 | | | | r1 | | |
4 | s5 | s9 | s8 | | | 7 | 6
5 | | | | r4 | | |
6 | | | | r2 | | |
7 | | | | r3 | | |
8 | s5 | | s10 | | | 7 |
9 | | | | r6 | | |
10 | s5 | s9 | s8 | | | 7 | 11
11 | | | | r5 | | |
Q.1 What is an operator grammar ? Give an example.
Ans. : Refer section 4.4.
Q.2 Write an operator precedence parsing algorithm.
Ans. : Refer section 4.4.2.
Q.3 What are the advantages and disadvantages of operator precedence parser ?
Ans. : Refer section 4.4.
Q.4 Construct an operator precedence parser for the grammar,
S → iEtS | iEtSeS | a, E → b | c | d
Where a, b, c, d, e, i, t are terminals.
Ans. : Refer example 4.4.1.
Q.5 List out the rules for constructing the simple precedence relation for a CFG.
Ans. : Refer section 4.4.
Q.6 What is a SLR grammar ? or Define LR(0) grammar. or What is an LR(0) grammar ?
Ans. : Refer section 4.7.
Q.7 Construct SLR parsing table for the following grammar,
E → E+T | T, T → TF | F, F → F* | a | b
Ans. : Refer section 4.7.5.
Q.8 Distinguish between top-down and bottom-up parsing.
Ans. :
Sr. No. | Top down parser | Bottom up parser
1. | Parse tree can be built from root to leaves. | Parse tree is built from leaves to root.
2. | This is simple to implement. | This is complex to implement.
3. | This is a less efficient parsing technique. Various problems that occur during the top down technique are ambiguity and left recursion. | When the bottom up parser handles an ambiguous grammar, conflicts occur in the parse table.
4. | It is applicable to a small class of languages. | It is applicable to a broad class of languages.
5. | Various parsing techniques are 1) Recursive descent parser 2) Predictive parser. | Various parsing techniques are 1) Shift reduce 2) Operator precedence 3) LR parser.
More Powerful LR Parsers
May-09, Set-4, +++ +++ 9505+ Marks 8
on of LR Parsers
ise Ambigui May-09, Set-3, ++ Marks 8
very in LR Parsing
ic Parser Generator
Construction of CLR(1)
In the canonical LR technique a lookahead symbol is generated while constructing the set of items. Hence the collection of set of items is referred to as LR(1). The value 1 in the bracket indicates that there is one lookahead symbol in the set of items.
We follow the same steps as discussed in the SLR parsing technique and those are
1. Construction of canonical set of items along with the lookahead.
2. Building canonical LR parsing table.
3. Parsing the input string using canonical LR parsing table.
Construction of canonical set of items along with the lookahead
1. For the grammar G initially add [S' → •S, $] in the set of items C.
2. For each set of items Ii in C and for each grammar symbol X (may be terminal or non-terminal) compute goto(Ii, X). If goto(Ii, X) is not empty and not in C then add it to C. The set of items has to be constructed until no more sets of items can be added to C.
3. The closure function can be computed as follows.
For each item [A → α•Xβ, a], each rule X → γ and each b ∈ FIRST(βa), if [X → •γ, b] is not in I then add [X → •γ, b] to I.
4. Similarly the goto function can be computed as : For each item [A → α•Xβ, a] in I, if [A → αX•β, a] is not in the goto items then add [A → αX•β, a] to the goto items, and finally take the closure of the goto items.
This process is repeated until no more set of items can be added to the collection C.
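Steps 3 and 4 can be sketched in Python for the grammar S → CC, C → aC | d. An LR(1) item is stored here as (production index, dot position, lookahead). The FIRST sets are written by hand and the helper first_of assumes a grammar without ε-productions; both are simplifying assumptions of this sketch, as are all the names used.

```python
# Augmented grammar: 0) S' -> S  1) S -> CC  2) C -> aC  3) C -> d
PRODS = [("S'", ("S",)), ("S", ("C", "C")), ("C", ("a", "C")), ("C", ("d",))]
NONTERMS = {"S'", "S", "C"}
FIRST = {"S": {"a", "d"}, "C": {"a", "d"}}      # hand-computed, no epsilon rules

def first_of(seq):
    """FIRST of a non-empty symbol string (epsilon-free grammar assumed)."""
    head = seq[0]
    return FIRST[head] if head in NONTERMS else {head}

def closure(items):
    items = set(items)
    work = list(items)
    while work:
        p, dot, la = work.pop()
        rhs = PRODS[p][1]
        if dot < len(rhs) and rhs[dot] in NONTERMS:       # item [A -> alpha.X beta, a]
            for b in first_of(rhs[dot + 1:] + (la,)):     # b in FIRST(beta a)
                for q, (lhs, _) in enumerate(PRODS):
                    if lhs == rhs[dot] and (q, 0, b) not in items:
                        items.add((q, 0, b))              # add [X -> .gamma, b]
                        work.append((q, 0, b))
    return frozenset(items)

def goto(items, X):
    moved = {(p, d + 1, la) for p, d, la in items
             if d < len(PRODS[p][1]) and PRODS[p][1][d] == X}
    return closure(moved)

I0 = closure({(0, 0, "$")})
I2 = goto(I0, "C")
I3 = goto(I0, "a")
I6 = goto(I2, "a")
print(len(I0), sorted(I3), sorted(I6))
```

The sets I3 and I6 contain the same productions with the same dot positions and differ only in their lookaheads, which is why the canonical LR construction keeps them as two separate states.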
Example :
S → CC
C → aC | d
Construct LR(1) set of items for the above grammar.
Solution : We will initially add S' → •S, $ as the first rule in I0. Now match
S' → •S, $ with
[A → α•Xβ, a]
Hence S' → •S, $
      A → α•Xβ, a
      A = S', α = ε, X = S, β = ε, a = $
If there is a production X → γ then add X → •γ, b
S → •CC, b ∈ FIRST(βa)
         b ∈ FIRST(ε$) as ε$ = $
         b ∈ FIRST($)
         b = {$}
∴ S → •CC, $ will be added in I0.
Now S → •CC, $ is in I0; we will match it with A → α•Xβ, a
A = S, α = ε, X = C, β = C, a = $
If there is a production X → γ then add X → •γ, b
C → •aC, b ∈ FIRST(βa)
C → •d   b ∈ FIRST(C$)
         b ∈ FIRST(C) as FIRST(C) = {a, d}
         b = {a, d}
∴ C → •aC, a or d will be added in I0.
Similarly C → •d, a/d will be added in I0.
Hence I0 :
S' → •S, $
S → •CC, $
C → •aC, a/d
C → •d, a/d
Here a/d is used to denote a or d. That means for the production C → •aC there are two items, [C → •aC, a] and [C → •aC, d].
Now apply goto on I0.
S' → •S, $
S → •CC, $
C → •aC, a/d
C → •d, a/d
Hence
goto(I0, S)
I1 : S' → S•, $
Now apply goto on C in I0.
S → C•C, $ is added in I2. Now as C comes after the dot we will add the rules of C.
A = S, α = C, X = C, β = ε, a = $
X → •γ, b where b ∈ FIRST(βa)
C → •aC  b ∈ FIRST(ε$)
C → •d   b ∈ FIRST($)
Hence
C → •aC, $ and C → •d, $ will be added to I2.
I2 : goto(I0, C)
     S → C•C, $
     C → •aC, $
     C → •d, $
Now we will apply goto on a of I0 for the rule C → •aC, a/d; that becomes C → a•C, a/d, which will be added in I3.
C → a•C, a/d
A = C, α = a, X = C, β = ε, a = a/d
Hence X → •γ, b
C → •aC  b ∈ FIRST(βa)
C → •d   b ∈ FIRST(εa) or FIRST(εd)
         b = a/d
Hence
I3 : goto(I0, a)
     C → a•C, a/d
     C → •aC, a/d
     C → •d, a/d
Now if we apply goto on d of I0 in the rule C → •d, a/d then we get C → d•, a/d; hence I4 becomes
I4 : goto(I0, d)
     C → d•, a/d
As applying gotos on I0 is over, we will move to I1, but we get no new production from I1. We will apply goto on C in I2. There is no closure possible in this state.
I5 : goto(I2, C)
     S → CC•, $
We will apply goto on a from I2 on the rule C → •aC, $ and we get the state I6.
C → a•C, $
A → α•Xβ, a
A = C, α = a, X = C, β = ε, a = $
X → •γ
C → •aC and C → •d
b ∈ FIRST(βa)
b ∈ FIRST(ε$)
b = $
I6 : goto(I2, a)
     C → a•C, $
     C → •aC, $
     C → •d, $
You can note one thing : I3 and I6 are different because the second component in I3 and I6 is different.
Apply goto on d of I2 for the rule C → •d, $.
I7 : goto(I2, d)
     C → d•, $
Now if we apply goto on a and d of I3 we will get I3 and I4 respectively and there is no point in repeating the states. So we will apply goto on C of I3.
I8 : goto(I3, C)
     C → aC•, a/d
For I4 and I5 there is no point in applying gotos. Applying gotos on a and d of I6 gives I6 and I7 respectively. Hence we will apply goto on I6 for C for the rule C → a•C, $.
I9 : goto(I6, C)
     C → aC•, $
For the remaining states I7, I8 and I9 we cannot apply goto. Hence the process of construction of the set of LR(1) items is completed. Thus the set of LR(1) items consists of the states I0 to I9.
I0 : S' → •S, $ ; S → •CC, $ ; C → •aC, a/d ; C → •d, a/d
I1 : goto(I0, S) : S' → S•, $
I2 : goto(I0, C) : S → C•C, $ ; C → •aC, $ ; C → •d, $
I3 : goto(I0, a) : C → a•C, a/d ; C → •aC, a/d ; C → •d, a/d
I4 : goto(I0, d) : C → d•, a/d
I5 : goto(I2, C) : S → CC•, $
I6 : goto(I2, a) : C → a•C, $ ; C → •aC, $ ; C → •d, $
I7 : goto(I2, d) : C → d•, $
I8 : goto(I3, C) : C → aC•, a/d
I9 : goto(I6, C) : C → aC•, $
Construction of canonical LR parsing table
To construct the canonical LR parsing table, first of all we will see the algorithm and then we will learn how to apply that algorithm on some example. The parsing table is similar to the SLR parsing table, comprised of action and goto parts.
Input : An augmented grammar G'.
Output : The canonical LR parsing table.
Algorithm :
1. Initially construct set of items C = {I0, I1, I2, ..., In} where C is a collection of set of LR(1) items for the input grammar G'.
2. The parsing actions are based on each item set Ii. The actions are as given below.
   a) If [A → α•aβ, b] is in Ii and goto(Ii, a) = Ij then create an entry in the action table action[i, a] = shift j.
   b) If there is a production [A → α•, a] in Ii then in the action table action[i, a] = reduce by A → α. Here A should not be S'.
   c) If there is a production [S' → S•, $] in Ii then action[i, $] = accept.
3. The goto part of the LR table can be filled as : The goto transitions for state i are considered for non-terminals only. If goto(Ii, A) = Ij then goto[i, A] = j.
4. All the entries not defined by rule 2 and 3 are considered to be "error".
Example : Construct the LR(1) parsing table for the following grammar,
1) S → CC
2) C → aC
3) C → d
Solution : First we will construct the set of LR(1) items.
I0 : S' → •S, $ ; S → •CC, $ ; C → •aC, a/d ; C → •d, a/d
I1 : goto(I0, S) : S' → S•, $
I2 : goto(I0, C) : S → C•C, $ ; C → •aC, $ ; C → •d, $
I3 : goto(I0, a) : C → a•C, a/d ; C → •aC, a/d ; C → •d, a/d
I4 : goto(I0, d) : C → d•, a/d
I5 : goto(I2, C) : S → CC•, $
I6 : goto(I2, a) : C → a•C, $ ; C → •aC, $ ; C → •d, $
I7 : goto(I2, d) : C → d•, $
I8 : goto(I3, C) : C → aC•, a/d
I9 : goto(I6, C) : C → aC•, $
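The DFA for the set of items can be drawn as follows.
[Figure : DFA over the item sets I0 to I9]
Now consider I0, in which there is a rule matching with [A → α•aβ, b], namely
C → •aC, a/d, and if the goto is applied on a then we get the state I3. Hence we will create the entry action[0, a] = shift 3. Similarly, in I0,
C → •d, a/d
A → α•aβ, b
A = C, α = ε, a = d, β = ε, b = a/d
goto(I0, d) = I4
hence action[0, d] = shift 4.
For state I4
C → d•, a/d
A → α•, a
A = C, α = d, a = a/d
action[4, a] = reduce by C → d i.e. rule 3
action[4, d] = reduce by C → d i.e. rule 3
S' → S•, $ is in I1. So we will create action[1, $] = accept.
The goto table can be filled by using the goto functions.
goto(I0, S) = I1. Hence goto[0, S] = 1.
goto(I0, C) = I2. Hence goto[0, C] = 2.
Continuing in this fashion we can fill up the LR(1) parsing table as follows.
State | a | d | $ | S | C
0 | s3 | s4 | | 1 | 2
1 | | | accept | |
2 | s6 | s7 | | | 5
3 | s3 | s4 | | | 8
4 | r3 | r3 | | |
5 | | | r1 | |
6 | s6 | s7 | | | 9
7 | | | r3 | |
8 | r2 | r2 | | |
9 | | | r2 | |
The columns a, d and $ form the action part; the columns S and C form the goto part.
The remaining blank entries in the table are considered as syntactical errors.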
Parsing the input using LR(1) parsing table
Using the above parsing table we can parse the input string "aadd" as
Stack | Input buffer | Action table | Goto table | Parsing action
$0 | aadd$ | action[0, a] = s3 | | Shift
$0a3 | add$ | action[3, a] = s3 | | Shift
$0a3a3 | dd$ | action[3, d] = s4 | | Shift
$0a3a3d4 | d$ | action[4, d] = r3 | [3, C] = 8 | Reduce by C → d
$0a3a3C8 | d$ | action[8, d] = r2 | [3, C] = 8 | Reduce by C → aC
$0a3C8 | d$ | action[8, d] = r2 | [0, C] = 2 | Reduce by C → aC
$0C2 | d$ | action[2, d] = s7 | | Shift
$0C2d7 | $ | action[7, $] = r3 | [2, C] = 5 | Reduce by C → d
$0C2C5 | $ | action[5, $] = r1 | [0, S] = 1 | Reduce by S → CC
$0S1 | $ | accept | | Accept
Thus the given input string is successfully parsed using the LR parser or canonical LR parser.
(SEES Construct a DFA whose states are the canonical
ing augmented grammar,
collection of LR(1) items ¢,
ESO
BaBlb
Solution : Initially we will start with
S → •A, $
We will add the rules A → •BA and A → • to I0.
Now let us map the rule [A → α•Xβ, a] with S → •A, $
such that A = S, α = ε, X = A, β = ε, a = $.
Then the second component of A → •BA and A → •
will be = FIRST (βa)
= FIRST (ε$)
= FIRST ($)
= {$}
S → •A, $
A → •BA, $
A → •, $
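As there is a rule A → •BA we must add the rules B → •aB and B → •b. To compute the second component of B's rules we will use A → •BA, $ for mapping with [A → α•Xβ, a]
such that A = A, α = ε, X = B, β = A, a = $.
The second component of B → •aB and B → •b
will be = FIRST (βa)
= FIRST (A$)
= {a, b, $}
Because A can derive ε (by A → ε), FIRST(A$) contains FIRST(A) without ε, that is {a, b}, and also FIRST($) = {$}.
B → •aB, a/b/$
B → •b, a/b/$
Hence I0 will be
I0 :
S → •A, $
A → •BA, $
A → •, $
B → •aB, a/b/$
B → •b, a/b/$
I1 : goto (I0, A)
S → A•, $
I2 : goto (I0, B)
A → B•A, $
A → •BA, $
A → •, $
B → •aB, a/b/$
B → •b, a/b/$
I3 : goto (I0, a)
B → a•B, a/b/$
B → •aB, a/b/$
B → •b, a/b/$
I4 : goto (I0, b)
B → b•, a/b/$
I5 : goto (I2, A)
A → BA•, $
I6 : goto (I3, B)
B → aB•, a/b/$
The remaining transitions lead to existing states: goto (I2, B) = I2, goto (I2, a) = I3, goto (I2, b) = I4, goto (I3, a) = I3 and goto (I3, b) = I4.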
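The closure step used in this example can be sketched in Python. This is a minimal illustration, not the book's procedure: the item encoding (head, body, dot position, lookahead) and the helper names `first_of_string`, `compute_first`, `closure` and `lookaheads` are assumptions made for the sketch. It computes I0 for the grammar S → A, A → BA | ε, B → aB | b and shows that the lookaheads of the B-items come from FIRST(A$).

```python
EPS = "ε"

# Grammar of the example above: S -> A, A -> B A | ε, B -> a B | b
GRAMMAR = {
    "S": [("A",)],
    "A": [("B", "A"), ()],        # () stands for the ε-production
    "B": [("a", "B"), ("b",)],
}

def first_of_string(symbols, first):
    """FIRST of a symbol string; EPS is included only if every symbol can vanish."""
    result = set()
    for sym in symbols:
        f = first.get(sym, {sym})  # a terminal (or $) is its own FIRST set
        result |= f - {EPS}
        if EPS not in f:
            return result
    result.add(EPS)
    return result

def compute_first(grammar):
    """Fixed-point computation of FIRST for every nonterminal."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, bodies in grammar.items():
            for body in bodies:
                add = first_of_string(body, first)
                if not add <= first[nt]:
                    first[nt] |= add
                    changed = True
    return first

def closure(items, grammar, first):
    """LR(1) closure: for [A -> α•Xβ, a] add [X -> •γ, b] for each b in FIRST(βa)."""
    result = set(items)
    work = list(items)
    while work:
        head, body, dot, la = work.pop()
        if dot == len(body) or body[dot] not in grammar:
            continue               # dot at the end, or in front of a terminal
        x, beta = body[dot], body[dot + 1:]
        for b in first_of_string(beta + (la,), first):
            for prod in grammar[x]:
                item = (x, prod, 0, b)
                if item not in result:
                    result.add(item)
                    work.append(item)
    return result

def lookaheads(items, head, body):
    """Collect the second components of the item [head -> •body]."""
    return {la for h, b, dot, la in items if (h, b, dot) == (head, body, 0)}

FIRST = compute_first(GRAMMAR)
I0 = closure({("S", ("A",), 0, "$")}, GRAMMAR, FIRST)
print(sorted(lookaheads(I0, "B", ("a", "B"))))   # ['$', 'a', 'b']
```

Because A is nullable, the end marker $ reaches the B-items through FIRST(A$); without it the one-symbol input "b" could not be reduced by B → b.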
The DFA can be constructed as follows -
[Figure: DFA whose states are the item sets I0 to I6, connected by the goto transitions]
Fig. 5.1.2 DFA
5.2 LALR Parsing
In this type of parser the lookahead symbol is generated for each set of items. The tables obtained by this method are smaller in size than those of the LR(k) parser. In fact the states of SLR and LALR parsing are always same. Most of the programming languages use LALR parsers.
We follow the same steps as discussed in SLR and canonical LR parsing techniques and those are
1. Construction of canonical set of items along with the lookahead.
2. Building LALR parsing table.
3. Parsing the input string using LALR parsing table.
Construction of set of LR(1) items along with the lookahead
The construction of LR(1) items is same as discussed in section 5.1. But the only difference is that in construction of LR(1) items for LR parser, we have kept two states separate if their second components are different, but in this case we will merge the two states by merging the first and second components from both the states.
For example in section 5.1 we have got I3 and I6 because of different second components, but for LALR parser we will consider these two states as same by merging these states. i.e.
I3 + I6 = I36
Hence
I36 : goto (I0, a)
C → a•C, a/d/$
C → •aC, a/d/$
C → •d, a/d/$
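Let us take one example to understand the construction of LR(1) items for LALR parser.
Example : Consider the grammar
S → CC
C → aC
C → d
Construct set of LR(1) items for LALR parser.
Solution : First we will construct set of LR(1) items.
I0 :
S' → •S, $
S → •CC, $
C → •aC, a/d
C → •d, a/d
I1 : goto (I0, S)
S' → S•, $
I2 : goto (I0, C)
S → C•C, $
C → •aC, $
C → •d, $
I3 : goto (I0, a)
C → a•C, a/d
C → •aC, a/d
C → •d, a/d
I4 : goto (I0, d)
C → d•, a/d
I5 : goto (I2, C)
S → CC•, $
I6 : goto (I2, a)
C → a•C, $
C → •aC, $
C → •d, $
I7 : goto (I2, d)
C → d•, $
I8 : goto (I3, C)
C → aC•, a/d
I9 : goto (I6, C)
C → aC•, $
We will merge states 3, 6 then 4, 7 and 8, 9.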
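I0 :
S' → •S, $
S → •CC, $
C → •aC, a/d
C → •d, a/d
I1 : goto (I0, S)
S' → S•, $
I2 : goto (I0, C)
S → C•C, $
C → •aC, $
C → •d, $
I36 : goto (I0, a)
C → a•C, a/d/$
C → •aC, a/d/$
C → •d, a/d/$
I47 : goto (I0, d)
C → d•, a/d/$
I5 : goto (I2, C)
S → CC•, $
I89 : goto (I36, C)
C → aC•, a/d/$
We have merged two states I3 and I6 and made the second component as a or d or $. The production rule will remain as it is. Similarly for I4 and I7, and for I8 and I9. The set of items consists of the states {I0, I1, I2, I36, I47, I5, I89}.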
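The merging of states with the same first components can be sketched in Python as below. This is only an illustration under stated assumptions: items are written as (dotted production string, lookahead) pairs, and the names `LR1_STATES`, `core` and `merge_by_core` are invented for the sketch. Only the states of the example that take part in the merging, plus I5 for contrast, are listed.

```python
from collections import defaultdict

# Some canonical LR(1) states of S -> CC, C -> aC | d.
# An item is (production with dot, lookahead).
LR1_STATES = {
    3: {("C -> a.C", "a"), ("C -> a.C", "d"),
        ("C -> .aC", "a"), ("C -> .aC", "d"),
        ("C -> .d", "a"), ("C -> .d", "d")},
    6: {("C -> a.C", "$"), ("C -> .aC", "$"), ("C -> .d", "$")},
    4: {("C -> d.", "a"), ("C -> d.", "d")},
    7: {("C -> d.", "$")},
    5: {("S -> CC.", "$")},
    8: {("C -> aC.", "a"), ("C -> aC.", "d")},
    9: {("C -> aC.", "$")},
}

def core(items):
    """First components only: the dotted productions without lookaheads."""
    return frozenset(production for production, _ in items)

def merge_by_core(states):
    """Union all states that share a core; name the result by joining the numbers."""
    groups = defaultdict(list)
    for number in sorted(states):
        groups[core(states[number])].append(number)
    merged = {}
    for numbers in groups.values():
        name = "".join(str(n) for n in numbers)
        merged[name] = set().union(*(states[n] for n in numbers))
    return merged

LALR_STATES = merge_by_core(LR1_STATES)
print(sorted(LALR_STATES))   # ['36', '47', '5', '89']
```

After merging, the item C → d• in state 47 carries the lookaheads a, d and $, exactly as in the listing above.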
Construction of LALR parsing table -
The algorithm for construction of LALR parsing table is as given below.
Step 1 : Construct the LR(1) set of items.
Step 2 : Merge the two states Ii and Ij if the first components (i.e. the production rules with dots) are matching, and create a new state replacing the older states, such as Iij = Ii ∪ Ij.
Step 3 : The parsing actions are based on each item set Ii. The actions are as given below.
a) If [A → α•aβ, b] is in Ii and goto(Ii, a) = Ij then create an entry in the action table action[i, a] = shift j.
b) If there is a production [A → α•, a] in Ii then in the action table action[i, a] = reduce by A → α. Here A should not be S'.
c) If there is a production S' → S•, $ in Ii then action[i, $] = accept.
Step 4 : The goto part of the LR table can be filled as : The goto transitions for state i are considered for nonterminals only. If goto(Ii, A) = Ij then goto[i, A] = j.
Step 5 : If there is a parsing action conflict then the algorithm fails to produce an LALR parser and the grammar is not LALR(1). All the entries not defined by rule 3 and 4 are considered to be "error".
Example : Construct the parsing table for LALR(1) parser.
Solution : First the set of LR(1) items can be constructed as follows with merged states.
I0 :
S' → •S, $
S → •CC, $
C → •aC, a/d
C → •d, a/d
I1 : goto (I0, S)
S' → S•, $
I2 : goto (I0, C)
S → C•C, $
C → •aC, $
C → •d, $
I36 : goto (I0, a)
C → a•C, a/d/$
C → •aC, a/d/$
C → •d, a/d/$
I47 : goto (I0, d)
C → d•, a/d/$
I5 : goto (I2, C)
S → CC•, $
I89 : goto (I36, C)
C → aC•, a/d/$
Now consider state I0. There is a match with the rule [A → α•aβ, b] and goto(Ii, a) = Ij. I0 contains C → •aC, a/d and if the goto is applied on a then we get the state I36. Hence we will create the entry action[0, a] = shift 36. Similarly, in I0
C → •d, a/d
A → α•aβ, b
A = C, α = ε, a = d, β = ε, b = a/d
goto(I0, d) = I47
Hence action[0, d] = shift 47.
For state I47,
C → d•, a/d/$
A → α•, a
A = C, α = d, a = a/d/$
action[47, a] = reduce by C → d i.e. rule 3
action[47, d] = reduce by C → d i.e. rule 3
action[47, $] = reduce by C → d i.e. rule 3
S' → S•, $ is in I1.
So we will create action[1, $] = accept.
The goto table can be filled by using the goto functions.
For instance goto(I0, S) = I1. Hence goto[0, S] = 1. Continuing in this fashion we can fill up the LALR(1) parsing table as follows.
State | Action: a | d | $ | Goto: S | C
0 | s36 | s47 | | 1 | 2
1 | | | accept | |
2 | s36 | s47 | | | 5
36 | s36 | s47 | | | 89
47 | r3 | r3 | r3 | |
5 | | | r1 | |
89 | r2 | r2 | r2 | |
Any string belonging to the given grammar can be parsed using the LALR parser. The blank entries are supposed to be syntactical errors.
Parsing the input string using LALR parser
The strings described by the regular expression a*da*d belong to grammar G. We will consider the input string "aadd" for parsing by using the LALR parsing table.
Stack | Input buffer | Action table | Goto table | Parsing action
$0 | aadd$ | action[0,a]=s36 | | Shift
$0a36 | add$ | action[36,a]=s36 | | Shift
$0a36a36 | dd$ | action[36,d]=s47 | | Shift
$0a36a36d47 | d$ | action[47,d]=r3 | goto[36,C]=89 | Reduce by C → d
$0a36a36C89 | d$ | action[89,d]=r2 | goto[36,C]=89 | Reduce by C → aC
$0a36C89 | d$ | action[89,d]=r2 | goto[0,C]=2 | Reduce by C → aC
$0C2 | d$ | action[2,d]=s47 | | Shift
$0C2d47 | $ | action[47,$]=r3 | goto[2,C]=5 | Reduce by C → d
$0C2C5 | $ | action[5,$]=r1 | goto[0,S]=1 | Reduce by S → CC
$0S1 | $ | accept | |
Thus the LALR and LR parser will mimic one another on the same input.
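The table-driven parse shown above can be sketched in Python. This is a hedged illustration only: the `ACTION` and `GOTO` dictionaries are written out by hand from the merged item sets of this example (rules 1) S → CC, 2) C → aC, 3) C → d), the stack keeps state names only, and the function name `lr_parse` is an assumption of the sketch.

```python
# rule number -> (head, length of right-hand side)
PRODUCTIONS = {1: ("S", 2), 2: ("C", 2), 3: ("C", 1)}

ACTION = {
    ("0", "a"): ("s", "36"), ("0", "d"): ("s", "47"),
    ("1", "$"): ("acc", None),
    ("2", "a"): ("s", "36"), ("2", "d"): ("s", "47"),
    ("36", "a"): ("s", "36"), ("36", "d"): ("s", "47"),
    ("47", "a"): ("r", 3), ("47", "d"): ("r", 3), ("47", "$"): ("r", 3),
    ("5", "$"): ("r", 1),
    ("89", "a"): ("r", 2), ("89", "d"): ("r", 2), ("89", "$"): ("r", 2),
}
GOTO = {("0", "S"): "1", ("0", "C"): "2", ("2", "C"): "5", ("36", "C"): "89"}

def lr_parse(text):
    """Return (accepted, trace) for an input string over the terminals a and d."""
    stack = ["0"]                      # stack of state names
    tokens = list(text) + ["$"]
    pos = 0
    trace = []
    while True:
        entry = ACTION.get((stack[-1], tokens[pos]))
        if entry is None:              # blank entry: syntactical error
            return False, trace
        kind, arg = entry
        if kind == "s":                # shift: push the state, consume the token
            stack.append(arg)
            pos += 1
            trace.append(f"shift {arg}")
        elif kind == "r":              # reduce: pop |rhs| states, then take the goto
            head, length = PRODUCTIONS[arg]
            del stack[len(stack) - length:]
            stack.append(GOTO[(stack[-1], head)])
            trace.append(f"reduce {arg}")
        else:
            trace.append("accept")
            return True, trace

accepted, steps = lr_parse("aadd")
print(accepted, steps)
```

The trace for "aadd" follows the rows of the table above: three shifts, reductions by rules 3, 2 and 2, one more shift, reductions by rules 3 and 1, then accept.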
Example : Construct LALR parsing table for the following grammar :
S → Aa | bAc | dc | bda
A → d
Parse the input string bdc using the table generated by you.
Solution : Let us first number the production rules as below.
1) S → Aa
2) S → bAc
3) S → dc
4) S → bda
5) A → d
Now we will construct canonical set of LR(1) items for the above grammar.
I0 :
S' → •S, $
S → •Aa, $
S → •bAc, $
S → •dc, $
S → •bda, $
A → •d, a
In the above set of items we will start from S' → •S, $. After the dot S comes, hence we will add the rules deriving S. The second component is $. Now we have got the rule
S → •Aa, $
resembling [A → α•Xβ, a]. If we map X to A then the second component of X → •γ is FIRST(βa). Our rule is A → •d and the second component is FIRST(a$) = a.
Hence A → •d, a will be added in I0.
I1 : goto (I0, S)
S' → S•, $
We will carry the second component as it is.
I2 : goto (I0, A)
S → A•a, $
I3 : goto (I0, b)
S → b•Ac, $
S → b•da, $
A → •d, c
For S → b•Ac, $ we get FIRST(βa) = FIRST(c$) = c. Hence the second component of A → •d is c.
I4 : goto (I0, d)
S → d•c, $
A → d•, a
I5 : goto (I2, a)
S → Aa•, $
I6 : goto (I3, A)
S → bA•c, $
I7 : goto (I3, d)
S → bd•a, $
A → d•, c
I8 : goto (I4, c)
S → dc•, $
I9 : goto (I6, c)
S → bAc•, $
I10 : goto (I7, a)
S → bda•, $
In the above set of canonical items no two states are having common production rules. Hence we cannot merge these states. The same set of items will be considered for building the LALR parsing table.
We construct the LALR parsing table using the following rules.
1. If [A → α•aβ, b] is in Ii and goto(Ii, a) = Ij then action[i, a] = shift j.
2. If there is a production [A → α•, a] in some state Ii then action[i, a] = reduce by A → α.
3. If there is a production S' → S•, $ in Ii then action[i, $] = accept.
State | Action: a | b | c | d | $ | Goto: S | A
0 | | s3 | | s4 | | 1 | 2
1 | | | | | accept | |
2 | s5 | | | | | |
3 | | | | s7 | | | 6
4 | r5 | | s8 | | | |
5 | | | | | r1 | |
6 | | | s9 | | | |
7 | s10 | | r5 | | | |
8 | | | | | r3 | |
9 | | | | | r2 | |
10 | | | | | r4 | |
Consider the input "bdc" for parsing with the help of the above LALR parsing table.
Stack | Input buffer | Action
$0 | bdc$ | Shift 3
$0b3 | dc$ | Shift 7
$0b3d7 | c$ | Reduce by A → d
$0b3A6 | c$ | Shift 9
$0b3A6c9 | $ | Reduce by S → bAc
$0S1 | $ | Accept
Thus the input string gets parsed completely.
Example : Show that the following grammar
S → Aa | bAc | Bc | bBa
A → d
B → d
is LR(1) but not LALR(1).
Solution : We will number the production rules in the given grammar.
1) S → Aa
2) S → bAc
3) S → Bc
4) S → bBa
5) A → d
6) B → d
Now we will first construct the canonical set of LR(1) items.
For the augmented grammar rule S' → •S the second component is $. As S' → •S, $ matches [A → α•Xβ, a] where α is ε, X is S, β is ε and a is $, we apply closure on S and add all the production rules deriving S. The second component of these rules will be $. This is because for [S' → •S, $], FIRST(βa) = FIRST(ε$) = FIRST($) = $. Hence we will get
S' → •S, $
S → •Aa, $
S → •bAc, $
S → •Bc, $
S → •bBa, $
Now we have got one rule [S → •Aa, $] which is matching with [A → α•Xβ, a] and [X → γ]. Hence we will apply the closure on A. Hence [A → •d] will be added in I0. Now the second component of A → •d will be decided. [S → •Aa, $] is matching with [A → α•Xβ, a] having X as A, β as a and a as $.
Then FIRST(βa) = FIRST(a$) = FIRST(a) = a. Hence rule [A → •d, a] will be added.
Now consider the rule S → •Bc, $. It suggests to apply closure on B (as B comes immediately after the dot). This rule is matching with [A → α•Xβ, a] and [X → γ]. Hence we will add the rules deriving B. Hence B → •d will be added in the above list. Now S → •Bc, $ can be mapped with [A → α•Xβ, a] as
α = ε
X = B
β = c
a = $
Since FIRST(βa) = FIRST(c$) = FIRST(c) = c, the rule [B → •d, c] will be added.
Hence finally I0 will be -
I0 :
S' → •S, $
S → •Aa, $
S → •bAc, $
S → •Bc, $
S → •bBa, $
A → •d, a
B → •d, c
Continue with applying goto on each symbol.
I1 : goto (I0, S)
S' → S•, $
There is no chance of applying closure or goto in this state. Hence it will have only one rule.
I2 : goto (I0, A)
S → A•a, $
Applied goto on A. But as after the dot a terminal symbol comes we cannot apply any rule further. The second component is carried as it is.
I3 : goto (I0, b)
S → b•Ac, $
S → b•Ba, $
A → •d, c
B → •d, a
After the dot A comes, hence the rule A → •d will be added. As [S → b•Ac, $] is matching with [A → α•Xβ, a] and [X → γ], FIRST(βa) = FIRST(c$) = FIRST(c) = c. The second component of A → •d is c. In the same way, from [S → b•Ba, $] the second component of B → •d is a.
I4 : goto (I0, B)
S → B•c, $
The goto on B is applied and the second component is carried as it is.
I5 : goto (I0, d)
A → d•, a
B → d•, c
The goto on d in state I0 is applied with the corresponding second components as it is.
I6 : goto (I2, a)
S → Aa•, $
I7 : goto (I3, A)
S → bA•c, $
I8 : goto (I3, B)
S → bB•a, $
I9 : goto (I3, d)
A → d•, c
B → d•, a
I10 : goto (I4, c)
S → Bc•, $
I11 : goto (I7, c)
S → bAc•, $
I12 : goto (I8, a)
S → bBa•, $
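Now using the above set of LR(1) items we will construct the LR(1) parsing table as follows.
State | Action: a | b | c | d | $ | Goto: S | A | B
0 | | s3 | | s5 | | 1 | 2 | 4
1 | | | | | accept | | |
2 | s6 | | | | | | |
3 | | | | s9 | | | 7 | 8
4 | | | s10 | | | | |
5 | r5 | | r6 | | | | |
6 | | | | | r1 | | |
7 | | | s11 | | | | |
8 | s12 | | | | | | |
9 | r6 | | r5 | | | | |
10 | | | | | r3 | | |
11 | | | | | r2 | | |
12 | | | | | r4 | | |
We can parse the string "bda" using the above constructed LR(1) parsing table as :
Stack | Input buffer | Action
$0 | bda$ | Shift 3
$0b3 | da$ | Shift 9
$0b3d9 | a$ | Reduce by B → d
$0b3B8 | a$ | Shift 12
$0b3B8a12 | $ | Reduce by S → bBa
$0S1 | $ | Accept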
Now we will construct a set of LALR(1) items. In this construction we will simply merge the states deriving the same production rules which differ in their second components only. In the above set of LR(1) items, states I5 and I9 are such states, i.e.
I5 : goto (I0, d)
A → d•, a
B → d•, c
I9 : goto (I3, d)
A → d•, c
B → d•, a
We will form only one state by merging state 5 and 9 as
I59 : goto (I0, d)
A → d•, a/c
B → d•, a/c
Hence the LALR(1) set of items are given as :
I0 :
S' → •S, $
S → •Aa, $
S → •bAc, $
S → •Bc, $
S → •bBa, $
A → •d, a
B → •d, c
I1 : goto (I0, S)
S' → S•, $
I2 : goto (I0, A)
S → A•a, $
I3 : goto (I0, b)
S → b•Ac, $
S → b•Ba, $
A → •d, c
B → •d, a
I4 : goto (I0, B)
S → B•c, $
I59 : goto (I0, d)
A → d•, a/c
B → d•, a/c
I6 : goto (I2, a)
S → Aa•, $
I7 : goto (I3, A)
S → bA•c, $
I8 : goto (I3, B)
S → bB•a, $
I10 : goto (I4, c)
S → Bc•, $
I11 : goto (I7, c)
S → bAc•, $
I12 : goto (I8, a)
S → bBa•, $
The LALR parsing table will be -
State | Action: a | b | c | d | $ | Goto: S | A | B
0 | | s3 | | s59 | | 1 | 2 | 4
1 | | | | | accept | | |
2 | s6 | | | | | | |
3 | | | | s59 | | | 7 | 8
4 | | | s10 | | | | |
59 | r5/r6 | | r5/r6 | | | | |
6 | | | | | r1 | | |
7 | | | s11 | | | | |
8 | s12 | | | | | | |
10 | | | | | r3 | | |
11 | | | | | r2 | | |
12 | | | | | r4 | | |
The parsing table shows multiple entries in Action[59, a] and Action[59, c]. This is called a reduce/reduce conflict. Because of this conflict we cannot parse the input.
Thus it is shown that the given grammar is LR(1) but not LALR(1).
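The reduce/reduce conflict produced by the merge can be checked mechanically. The Python sketch below is illustrative only: the two dictionaries list just the reduce items of states I5 and I9 (lookahead → production), and the helper names `merge_reduce_actions` and `conflicts` are assumptions made for the sketch.

```python
# Reduce items of the canonical LR(1) states I5 and I9: lookahead -> production
STATE_5 = {"a": "A -> d", "c": "B -> d"}
STATE_9 = {"c": "A -> d", "a": "B -> d"}

def merge_reduce_actions(*states):
    """Union the reduce actions of several states, keeping all candidates per lookahead."""
    merged = {}
    for state in states:
        for lookahead, production in state.items():
            merged.setdefault(lookahead, set()).add(production)
    return merged

def conflicts(actions):
    """Lookaheads for which more than one reduction is possible."""
    return {la: prods for la, prods in actions.items() if len(prods) > 1}

print(conflicts(merge_reduce_actions(STATE_5)))            # {} : I5 alone is conflict free
print(conflicts(merge_reduce_actions(STATE_5, STATE_9)))   # conflicts on both a and c
```

Each state on its own picks exactly one reduction per lookahead, which is why the grammar is LR(1); only the merged state I59 offers both A → d and B → d on a and on c.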
5.3 Comparison of LR Parsers
It is time to compare SLR, LALR and LR parser for the common factors such as size, class of CFG, efficiency and cost in terms of time and space.
Sr. No. | SLR parser | LALR parser | Canonical LR parser
1 | SLR parser is smallest in size. | The LALR and SLR have the same size. | LR parser or canonical LR parser is largest in size.
2 | It is the easiest method, based on the FOLLOW function. | This method is applicable to a wider class than SLR. | This method is more powerful than SLR and LALR.
3 | This method exposes fewer syntactic features than that of LR parsers. | Most of the syntactic features of a language are expressed in LALR. | This method covers the widest class of grammars of the three.
4 | Error detection is not immediate in SLR. | Error detection is not immediate in LALR. | Immediate error detection is done by LR parser.
5 | It requires less time and space complexity. | The time and space complexity is more in LALR but efficient methods exist for constructing LALR parsers directly. | The time and space complexity is more for canonical LR parser.
Graphical representation for the class of LR family is as given below.
[Figure: nested classes of grammars, SLR inside LALR inside canonical LR]
Fig. 5.3.1 Classification of grammars
5.4 Dangling Else Ambiguity
In the parsing methods, if an ambiguous grammar is used then conflicts occur and then we cannot parse the input string.
• If the two entries that appear in the parsing table M[A, a] are for reduce action then reduce-reduce conflict occurs.
• If one entry is for shift action and another for reduce action in M[A, a] then shift-reduce conflict occurs.
Using dangling else ambiguity
Consider the grammar
S → if expression then Statement else Statement
  | if expression then Statement
  | a, where a = all other productions
We will have this grammar in this manner -
S' → S
S → iSeS | iS | a
Now we will build the LR(0) set of items as
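I0 :
S' → •S
S → •iSeS
S → •iS
S → •a
I1 : goto (I0, S)
S' → S•
I2 : goto (I0, i)
S → i•SeS
S → i•S
S → •iSeS
S → •iS
S → •a
I3 : goto (I0, a)
S → a•
I4 : goto (I2, S)
S → iS•eS
S → iS•
I5 : goto (I4, e)
S → iSe•S
S → •iSeS
S → •iS
S → •a
I6 : goto (I5, S)
S → iSeS•
The FOLLOW(S) = {e, $}. Now we will build the SLR parse table for the above obtained set of items. The rules are numbered 1) S → iSeS, 2) S → iS, 3) S → a.
State | Action: i | e | a | $ | Goto: S
0 | s2 | | s3 | | 1
1 | | | | accept |
2 | s2 | | s3 | | 4
3 | | r3 | | r3 |
4 | | s5/r2 | | r2 |
5 | s2 | | s3 | | 6
6 | | r1 | | r1 |
In the above table at action[4, e] there is a shift/reduce conflict. We will now try to resolve it.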
Consider the input "iiaea$" for processing.
Stack | Input | Action with conflict resolution
$0 | iiaea$ | Shift
$0i2 | iaea$ | Shift
$0i2i2 | aea$ | Shift
$0i2i2a3 | ea$ | Reduce S → a
$0i2i2S4 | ea$ | From the conflict we have chosen Reduce S → iS
$0i2S4 | ea$ | Reduce S → iS
$0S1 | ea$ | Error
That means the choice of r2 in action[4, e] is not valid. Hence we will try it by choosing the shift action.
Stack | Input | Action with conflict resolution
$0 | iiaea$ | Shift
$0i2 | iaea$ | Shift
$0i2i2 | aea$ | Shift
$0i2i2a3 | ea$ | Reduce S → a
$0i2i2S4 | ea$ | From the conflict we have chosen Shift
$0i2i2S4e5 | a$ | Shift
$0i2i2S4e5a3 | $ | Reduce S → a
$0i2i2S4e5S6 | $ | Reduce S → iSeS
$0i2S4 | $ | Reduce S → iS
$0S1 | $ | Accept
Logically also we should favor the shift operation, as by shifting the else we can associate it with the previous "if expression then statement". Therefore the shift/reduce conflict is resolved in favour of shift.
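The effect of the two choices can be reproduced with a small Python sketch. It is an illustration under assumptions: the SLR table is typed in by hand from the item sets I0 to I6 of this section, a conflicting entry is stored as a list of candidate actions, and the `prefer` parameter of the hypothetical function `parse` says which kind of action to pick when an entry has more than one.

```python
# rule number -> (head, length of right-hand side)
RULES = {1: ("S", 4), 2: ("S", 2), 3: ("S", 1)}   # 1: S->iSeS, 2: S->iS, 3: S->a

ACTION = {
    (0, "i"): ["s2"], (0, "a"): ["s3"],
    (1, "$"): ["acc"],
    (2, "i"): ["s2"], (2, "a"): ["s3"],
    (3, "e"): ["r3"], (3, "$"): ["r3"],
    (4, "e"): ["s5", "r2"],            # the shift/reduce conflict
    (4, "$"): ["r2"],
    (5, "i"): ["s2"], (5, "a"): ["s3"],
    (6, "e"): ["r1"], (6, "$"): ["r1"],
}
GOTO = {(0, "S"): 1, (2, "S"): 4, (5, "S"): 6}

def parse(text, prefer):
    """Parse text; prefer is 's' or 'r' and settles entries with two actions."""
    stack, tokens, pos = [0], list(text) + ["$"], 0
    while True:
        choices = ACTION.get((stack[-1], tokens[pos]))
        if not choices:
            return False               # blank entry: error
        # take the preferred kind of action if present, else the only one
        act = next((c for c in choices if c[0] == prefer), choices[0])
        if act == "acc":
            return True
        if act[0] == "s":
            stack.append(int(act[1:]))
            pos += 1
        else:
            head, length = RULES[int(act[1:])]
            del stack[-length:]
            stack.append(GOTO[(stack[-1], head)])

print(parse("iiaea", prefer="r"))   # False
print(parse("iiaea", prefer="s"))   # True
```

With the reduce preference the parser ends in state 1 with e still unread and fails; with the shift preference the else is attached to the nearest if and the input is accepted.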
Thus by resolving the conflict we get the parsing table for the dangling else problem as
State | Action: i | e | a | $ | Goto: S
0 | s2 | | s3 | | 1
1 | | | | accept |
2 | s2 | | s3 | | 4
3 | | r3 | | r3 |
4 | | s5 | | r2 |
5 | s2 | | s3 | | 6
6 | | r1 | | r1 |
5.5 Error Recovery in LR Parsing
The LR parser is a table driven parsing method in which the blank entries are treated as errors. When we compile any program we first get the syntactical errors. These errors are usually denoted by user friendly error messages. To understand how to decode these error messages consider one example.
E → E + E
E → E * E
E → (E)
E → id
The parsing table for the above grammar has an Action part with the columns id, +, *, (, ) and $, and a Goto part with the column E.
During the error detection and recovery we fill up the blank entries of a state that calls for a particular reduction with that reduction rule. This will postpone the error detection until one or more reductions are done, but the error will still be detected before any shift move takes place.
Consider the set of items generated for obtaining the error messages
I0 : E' → •E
It means that there is no symbol before the dot. And being the initial state, the stack is empty. In such a case if + or * or $ comes in the input string then it means that the operand is missing for these operators. In other words, first some operand should precede and then the operator should appear.
Stack | Input | Error could be
$0 | + | Missing operand
$0 | * | Missing operand
$0 | $ | Missing operand
$0 | ) | Unbalanced right parenthesis
I1 : goto (I0, E)
E' → E•
If we ultimately reduce E → id then id id = missing operator, id ( = missing operator, id ) = unbalanced right parenthesis.
Stack | Input | Error could be
$0E1 | id | Missing operator
$0E1 | ( | Missing operator
$0E1 | ) | Unbalanced right parenthesis
I6 : goto (I2, E)
E → (E•)
If we eventually reduce E → id then the rule becomes E → (id•). That means after id we expect ) and if $ comes then the error will be missing right parenthesis. If after id again id or ( comes then it will be missing operator.
Stack | Input | Error could be
$0(2E6 | id | Missing operator
$0(2E6 | ( | Missing operator
$0(2E6 | $ | Missing right parenthesis
From all these situations we conclude that the error messages are :
• E1 : These errors are in states I0, I2, I4 and I5. This indicates that the operand should appear before the operator. Hence the error message will be "missing operand".
• E2 : This error is in the ) column and from states I0, I1, I2, I4 and I5, indicating unbalancing in right parenthesis. Hence the error message will be "unbalanced right parenthesis".
• E3 : The operator is expected in this case of error as it is from state I1 or I6. The error message will be "missing operator".
• E4 : This error occurs at state 6 in the $ column. The state 6 expects ) parenthesis at the end of expression. Hence the error message will be "missing right parenthesis".
Thus the modified table carries the appropriate error messages E1 to E4 in its formerly blank entries.
When we compile our program, the LR parser will refer the parsing table; if it detects an error entry then the respective error message will be reported. This is how we get error messages when we compile our program.
5.6 Automatic Parser Generator
We have discussed the manual method of construction of LR parser. This involves a lot of work for parsing the input string. Hence there is a need for automation of this process in order to achieve efficiency in parsing the input. Certain automation tools for parser generation are available. YACC is one such automatic tool for generating the parser program. YACC stands for Yet Another Compiler Compiler, which is basically a utility available from UNIX. Basically YACC is an LALR parser generator. YACC can report conflicts or ambiguities (if at all) in the form of error messages. In an earlier chapter we have seen one such tool, LEX, for the lexical analyzer. LEX and YACC work together to analyse the program syntactically.
The typical YACC translator can be represented as shown in Fig. 5.6.1.
[Figure: YACC specification file x.y → YACC → y.tab.c; y.tab.c → C compiler (cc) → a.out; input string → a.out (executable program) → output]
Fig. 5.6.1 YACC : Parser generator model
First we write a YACC specification file; let us name it as x.y. This file is given to the YACC compiler by the UNIX command -
yacc x.y
Then it will generate a parser program using your YACC specification file. This parser program has a standard name, y.tab.c. This is basically the parser program in C, generated automatically. You can also give the command with the -d option as
yacc -d x.y
By the -d option two files will get generated : one is y.tab.c and the other is y.tab.h. The header file y.tab.h will store all the tokens, so you need not create y.tab.h explicitly. The generated y.tab.c program will then be compiled by the C compiler, which generates the executable a.out file. Then you can test your YACC program with the help of some valid and invalid input strings.
Writing the YACC specification program is the most logical activity. This specification file contains the context free grammar, and using the production rules of the context free grammar the parsing of the input string can be done by y.tab.c.
Let us learn how to write a YACC program.
5.6.1 YACC Specification
First of all we will see the structure of YACC specification.
The YACC specification file consists of three parts : declaration section, translation rule section and supporting C functions.
Declaration section (Ordinary C declarations)
Translation rule (Context free grammar)
Supporting C functions
Fig. 5.6.2 Parts of YACC specification
The specification file with these sections can be written as
%{
/* declaration section */
%}
%%
/* Translation rule section */
%%
/* Required C functions */
1. Declaration part : In this section ordinary C declarations can be put. Not only this, we can also declare grammar tokens in this section. The declaration of C code should be within %{ and %}.
2. Translation rule section : It consists of the production rules of the context free grammar with the corresponding actions.
For instance
rule 1    action 1
rule 2    action 2
...
rule n    action n
If there is more than one alternative to a single rule then those alternatives should be separated by the | character. The actions are typical C statements. If the CFG is
LHS → alternative 1 | alternative 2 | ... | alternative n
then
LHS : alternative 1 {action 1}
    | alternative 2 {action 2}
    ...
    | alternative n {action n}
    ;
3. C functions section : This section consists of one main function in which the routine yyparse() will be called. And it also consists of the required C functions.
Programming Example : Write a YACC program for implementing a desktop calculator.
Program : We will first create the LEX program named calci.l, then we will write the program for YACC as calci.y. The extension of a LEX program is .l and of a YACC program is .y
/* Program name : calci.l */
%{
#include "y.tab.h"   /* defines the tokens */
#include <stdlib.h>  /* for atof() */
%}
%%
    /* To recognize a valid number */
([0-9]+|([0-9]*\.[0-9]+)([eE][-+]?[0-9]+)?)   { yylval.dval = atof(yytext);
                                                return NUMBER; }
    /* For log no | LOG no (log base 10) */
log |
LOG     { return LOG; }
    /* For ln no (Natural log) */
ln      { return nLOG; }
    /* For sin angle */
sin |
SIN     { return SINE; }
    /* For cos angle */
cos |
COS     { return COS; }
    /* For tan angle */
tan |
TAN     { return TAN; }
    /* For memory */
mem     { return MEM; }
[ \t]   ;   /* Ignore white spaces */
    /* End of input */
\$      { return 0; }
    /* Catch the remaining and return a single character token to the parser */
\n|.    return yytext[0];
%%
/* Program name : calci.y */
%{
#include <stdio.h>
#include <math.h>
int yylex(void);
int yyerror(char *error);
double memvar;
%}
/* To define possible symbol types */
%union
{
    double dval;
}
/* Tokens used which are returned by lexer */
%token <dval> NUMBER
%token MEM
%token LOG SINE nLOG COS TAN
/* Defining the precedence and associativity */
%left '-' '+'                   /* Lowest precedence */
%left '*' '/'
%right '^'
%left LOG SINE nLOG COS TAN     /* Highest precedence */
/* No associativity */
%nonassoc UMINUS                /* Unary minus */
/* Sets the type for non-terminal */
%type <dval> expression
%%
/* Start state */
start : statement '\n'
      | start statement '\n'
      ;
/* For storing the answer (memory) */
statement : MEM '=' expression  { memvar = $3; }
          | expression          { printf("Answer = %g\n", $1); }  /* For printing the answer */
          ;
/* For binary arithmetic operators */
expression : expression '+' expression  { $$ = $1 + $3; }
           | expression '-' expression  { $$ = $1 - $3; }
           | expression '*' expression  { $$ = $1 * $3; }
           | expression '/' expression
             {   /* To handle divide by zero case */
                 if ($3 == 0)
                     yyerror("divide by zero");
                 else
                     $$ = $1 / $3;
             }
           | expression '^' expression  { $$ = pow($1, $3); }
           ;
/* For unary operators */
expression : '-' expression %prec UMINUS  { $$ = -$2; }
             /* %prec UMINUS signifies that unary minus should have the highest precedence */
           | '(' expression ')'   { $$ = $2; }
           | LOG expression       { $$ = log($2) / log(10); }
           | nLOG expression      { $$ = log($2); }
             /* Trigonometric functions */
           | SINE expression      { $$ = sin($2 * 3.141592654 / 180); }
           | COS expression       { $$ = cos($2 * 3.141592654 / 180); }
           | TAN expression       { $$ = tan($2 * 3.141592654 / 180); }
           | NUMBER               { $$ = $1; }
           | MEM                  { $$ = memvar; }  /* Retrieving the memory contents */
           ;
%%
int main()
{
    printf("Enter the expression: ");
    yyparse();
    return 0;
}
int yyerror(char *error)
{
    fprintf(stderr, "%s\n", error);
    return 0;
}
Compiling and running of LEX and YACC programs
The output of the program can be obtained by the following commands
[root@localhost]# lex calci.l
[root@localhost]# yacc -d calci.y
[root@localhost]# cc y.tab.c lex.yy.c -ll -ly -lm
[root@localhost]# ./a.out
Enter the expression : 2+2
Answer = 4
Note that if we use the -d option then y.tab.h gets created automatically and we need not create it explicitly.
How to run LEX and YACC programs together ?
lex calci.l : will create lex.yy.c
yacc -d calci.y : will create y.tab.c
cc y.tab.c lex.yy.c -ll -ly -lm : will compile both lex.yy.c and y.tab.c by linking various library files
./a.out : will run the executable file of the program
Further sample input for the calculator :
mem = cos 45
sin 45 / mem
ln 10
Answer = 2.30259
Declaration part
%token is used to declare tokens in YACC. In the above program token NUMBER can be declared as
%token NUMBER
The precedence and associativity can be declared in the above program. %left, %right and %nonassoc declare the associativity of the operator as being left associative, right associative or nonassociative respectively.
The precedence is declared in increasing order.
The yylval used is the union of type double. YACC can associate a data type with a token by using this yylval.
%token <dval> NUMBER
It means token NUMBER has the data type double. Any number of data types can be declared in the union.
%left '-' '+' /*Lowest precedence*/
%left '*' '/'
%right '^'
%left LOG SINE nLOG COS TAN /*Highest precedence*/
/*No associativity*/
%nonassoc UMINUS /*Unary Minus*/
The operators on the same line have equal precedence. For instance '+' and '-' have the same precedence.
The shift/reduce conflict can be resolved by YACC with the help of these precedence rules. If the input token 'id' has higher precedence than the production rule A → α then the shift action will be performed. If the precedence of production rule A → α is greater than that of 'id' then reduce by A → α will be performed. If there is same precedence then YACC checks for associativity : for left associativity the reduce action will be performed and for right associativity the shift action will be performed.
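The precedence rule described above can be written as a small decision function. The Python sketch below only illustrates the rule: the table `PRECEDENCE` copies the levels of the calculator's binary operators, and the function name `resolve` and its two parameters are assumptions of the sketch, not part of YACC itself. The rule is represented by its operator, since YACC gives a rule the precedence of its last terminal.

```python
# operator -> (precedence level, associativity); a larger level binds tighter
PRECEDENCE = {
    "+": (1, "left"), "-": (1, "left"),
    "*": (2, "left"), "/": (2, "left"),
    "^": (3, "right"),
}

def resolve(rule_operator, lookahead):
    """Settle a shift/reduce conflict between a completed rule and the next token."""
    rule_level, _ = PRECEDENCE[rule_operator]
    token_level, assoc = PRECEDENCE[lookahead]
    if token_level > rule_level:
        return "shift"                 # the incoming operator binds tighter
    if token_level < rule_level:
        return "reduce"                # the rule on the stack binds tighter
    # same level: associativity decides
    return {"left": "reduce", "right": "shift", "nonassoc": "error"}[assoc]

print(resolve("+", "*"))   # shift  : in 1 + 2 * 3 the product is built first
print(resolve("-", "-"))   # reduce : 1 - 2 - 3 groups as (1 - 2) - 3
print(resolve("^", "^"))   # shift  : 2 ^ 3 ^ 2 groups as 2 ^ (3 ^ 2)
```

Reducing on equal precedence is what makes an operator left associative, because the left operand is completed before the next operator is read.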
Rule section
In the rule section : (colon) is used to separate the LHS and RHS of a production rule. The termination of each rule is given by ; (semicolon). The $$ is used as the attribute value at the LHS of the grammar rule.
If there is a rule E → E + E then it has attribute values as $1 + $3. Since E = $1, + is considered for $2 and E = $3. To represent the RHS of the grammar rule the symbols $1, $2, ..., $n are used. Finally the answer can be in $$ = $1 + $3.
Subroutine section
In main the important function yyparse() is called. YACC first invokes yyparse(), which in turn calls yylex() when it requires tokens.
The routine yyerror() is used to print the error message when an error has occurred in parsing of the input.
Thus we have seen how an input string is parsed and syntactically checked.
Q.1 Derive the LALR parsing table for the following grammar :
S → L = R | R
L → *R | id
R → L
Ans. : For this method we have to construct the collection of items using LR(1) items.
The LR(1) items are
I0 :
S' → •S, $
S → •L = R, $
S → •R, $
L → •*R, = | $
L → •id, = | $
R → •L, $
I1 : goto (I0, S)
S' → S•, $
I2 : goto (I0, L)
S → L• = R, $
R → L•, $
I3 : goto (I0, R)
S → R•, $
I4 : goto (I0, *)
L → *•R, = | $
R → •L, = | $
L → •*R, = | $
L → •id, = | $
I5 : goto (I0, id)
L → id•, = | $
I6 : goto (I2, =)
S → L = •R, $
R → •L, $
L → •*R, $
L → •id, $
I7 : goto (I4, R)
L → *R•, = | $
I8 : goto (I4, L)
R → L•, = | $
I9 : goto (I6, R)
S → L = R•, $
I10 : goto (I6, L)
R → L•, $
I11 : goto (I6, *)
L → *•R, $
R → •L, $
L → •*R, $
L → •id, $
I12 : goto (I6, id)
L → id•, $
I13 : goto (I11, R)
L → *R•, $
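From the above set of items we have got -
i) I4 and I11 give the same productions but the lookaheads are different. Hence merge them to form I411.
ii) I5 = I12. Hence I512.
iii) I7 = I13. Hence I713.
iv) I8 = I10. Hence I810.
Therefore the set of items for LALR are
I0 :
S' → •S, $
S → •L = R, $
S → •R, $
L → •*R, = | $
L → •id, = | $
R → •L, $
I1 : S' → S•, $
I2 : S → L• = R, $
R → L•, $
I3 : S → R•, $
I411 : L → *•R, = | $
R → •L, = | $
L → •*R, = | $
L → •id, = | $
I512 : L → id•, = | $
I6 : S → L = •R, $
R → •L, $
L → •*R, $
L → •id, $
I713 : L → *R•, = | $
I810 : R → L•, = | $
I9 : S → L = R•, $
The LALR parsing table can be constructed as follows, with the rules numbered 1) S → L = R, 2) S → R, 3) L → *R, 4) L → id, 5) R → L.
State | Action: = | * | id | $ | Goto: S | L | R
0 | | s411 | s512 | | 1 | 2 | 3
1 | | | | accept | | |
2 | s6 | | | r5 | | |
3 | | | | r2 | | |
411 | | s411 | s512 | | | 810 | 713
512 | r4 | | | r4 | | |
6 | | s411 | s512 | | | 810 | 9
713 | r3 | | | r3 | | |
810 | r5 | | | r5 | | |
9 | | | | r1 | | |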
Q.2 Construct LALR parsing table for the following grammar
S → CC, C → cC | d
Ans. : Refer example 5.2.2 assuming a = c.
Q.3 What are the common conflicts that can be encountered in shift-reduce parser ?
Ans. : The common conflicts that occur in shift-reduce parser are i) Shift-reduce ii) Reduce-reduce.
Example : Refer section 5.4.
Q.4 Explain canonical LR parsing.
Ans. : Refer section 5.1
Semantic Analysis
Syllabus
Semantic analysis, SDT, Evaluation of semantic rules
Contents
6.1 Introduction
6.2 Syntax Directed Translation (SDT) ......... January-10, May-08,09, Set-3, Aug./Sept.-08, Set-1,3, Marks 16
6.3 Bottom-Up Evaluation of S-Attributed Definitions ......... Dec.-05, Set-2,3, May-05,09, Set-4, Marks 8
6.4 Attributed Definitions ......... January-10, Set-2, Aug./Sept.-06,07,08, Set-1, May-06,07, Set-1,3,4, Marks 16
6.5 Bottom-Up Evaluation of Inherited Attributes
6.6 Recursive Evaluation
6.1 Introduction
As we know how an input string is syntactically checked by the syntax analyzer, we should now know how to analyze the input semantically. In semantic analysis the compiler analyzes the meaning of the program. Beyond syntax analysis, programming constructs are analyzed. Hence extra-syntactic rules are imposed in this phase. The semantic analysis is carried out by scanning the input text and it is static in nature. The properties that cannot be captured by a context free grammar are captured by semantic rules. These properties could be name and scope analysis, type checking and type conversion.
6.1.1 Need of Semantic Analysis
The semantic analysis is done in order to obtain the precise meaning of a programming construct. This phase is also known as static semantic analysis. The need for semantic analysis is, firstly, to build a symbol table that keeps track of the names established in declarations. Secondly, the data type of each identifier is obtained and type checking of the whole expression and statements is done in order to follow the type rules of the language. Thirdly, to identify the scope of identifiers.
6.2 Syntax Directed Translation (SDT)
While doing the static analysis of the language we use syntax-directed definitions. That means an augmented context free grammar is generated. In other words a set of attributes is associated with each terminal and non-terminal symbol. The attribute can be a string, a number, a type, a memory location or anything else.
The syntax-directed definition is a kind of abstract specification. The conceptual view of syntax-directed translation can be as shown in Fig. 6.2.1.
[Figure: Input string → Parse tree → Dependency graph → Evaluation order for semantic rules]
Fig. 6.2.1 Syntax-directed translation
Firstly we parse the input token stream and a syntax tree is generated. Then the tree is traversed for evaluating the semantic rules at the parse tree nodes.
The implementation need not follow all the steps given in Fig. 6.2.1. In a single pass implementation, semantic rules can be evaluated during parsing without explicitly constructing a parse tree or dependency graph. In such semantic evaluation, at the nodes of the syntax tree, values of the attributes are defined for the given input string. Such a parse tree containing the values of attributes at each node is called an annotated or decorated parse tree.
Definition : A syntax-directed definition is a generalization of a context free grammar in which each grammar production X → α has associated with it a set of semantic rules of the form a := f(b1, b2, ..., bk), where a is an attribute obtained from the function f.
Attribute : The attribute can be a string, a number, a type, a memory location or anything else.
Consider X → α to be a production of a context free grammar and a := f(b1, b2, ..., bk), where a is the attribute. Then there are two types of attributes :
1. Synthesized attribute : The attribute 'a' is called a synthesized attribute of X, and b1, b2, ..., bk are attributes belonging to the production symbols.
The value of a synthesized attribute at a node is computed from the values of the attributes at the children of that node in the parse tree.
2. Inherited attribute : The attribute 'a' is called an inherited attribute of one of the grammar symbols on the right side of the production (i.e. α), and b1, b2, ..., bk belong to either X or α.
The inherited attributes can be computed from the values of the attributes at the siblings and parent of that node.
1. Synthesized attribute
Let us see how to compute synthesized attributes.
Consider the context free grammar as
S → EN
E → E + T
E → E - T
E → T
T → T * F
T → T / F
T → F
F → (E)
F → digit
N → ;
The syntax-directed definition can be written for the above grammar by using
semantic actions for each production.

Production rule      Semantic actions
S → EN               Print(E.val)
E → E1 + T           E.val := E1.val + T.val
E → E1 - T           E.val := E1.val - T.val
E → T                E.val := T.val
T → T1 * F           T.val := T1.val * F.val
T → T1 / F           T.val := T1.val / F.val
T → F                T.val := F.val
F → (E)              F.val := E.val
F → digit            F.val := digit.lexval
N → ;                Can be ignored by lexical analyzer as ; is
                     terminating symbol.
For the non-terminals E, T and F the values can be obtained using the attribute 'val'.
Here 'val' is an attribute and the semantic rule computes the value of val. (How? That
we will discuss shortly!)
The token digit has synthesized attribute lexval whose value can be obtained from the
lexical analyzer. In the rule S → EN, symbol S is the start symbol. This rule is to print
the final answer of the expression.
• In syntax-directed definition, terminals have synthesized attributes only.
Thus there is no definition of terminal. The synthesized attributes are quite often
used in syntax-directed definition. The syntax-directed definition that uses only
synthesized attributes is called S-attributed definition.
• In a parse tree, at each node the semantic rule is evaluated for annotating
(computing) the S-attributed definition. This processing is in bottom-up fashion,
from leaves to root.
Following steps are followed to compute S-attributed definition.
1. Write the syntax-directed definition using the appropriate semantic actions for the
corresponding production rule of the given grammar.
2. The annotated parse tree is generated and attribute values are computed. The
computation is done in bottom-up manner.
3. The value obtained at the root node is supposed to be the final output.
Let us take an input string for computing the S-attributed definition for the above
given grammar.
Example: Construct parse tree, syntax tree and annotated parse tree for the input
string 5 * 6 + 7;
[Figure: (a) Syntax tree, (b) Parse tree, (c) Annotated parse tree, with values passed from child to parent]
Fig. 6.2.2 Computation of S-attributed definition
For the computation of attributes we start from the leftmost bottommost node. The
rule F → digit is used in order to reduce digit to F. The semantic action that takes
place here is F.val := digit.lexval. The value of digit is obtained from the lexical
analyzer (the parser invokes the lexical analyzer to get the token value), which becomes
the value of F.
Hence F.val = 5.
Since T is the parent node of F and the semantic action suggests that T.val := F.val, we
can get T.val = 5. Thus the computation of S-attributes is done from children to parent.
Then consider the T → T1 * F production; the corresponding semantic action is
T.val := T1.val * F.val. Hence
T.val = T1.val * F.val
      = 5 * 6 = 30
Similarly, the combination of E1.val + T.val becomes the E node.
E.val = E1.val + T.val
      = 30 + 7
E.val = 37
Here we get E1.val from the left child of E and T.val from the right child of E. Finally we
acquire the value of E as 37. Then the production S → EN is applied to reduce
E.val = 37. The rule N → ; indicates termination of the current expression. The semantic
action associated with S → EN suggests us to print the result E.val. Hence the output
will be 37. Thus S-attributed definition can be computed in a bottom-up fashion
using postorder traversal.
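The bottom-up computation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Node` class and the helpers `evaluate`, `digit` and `chain` are hypothetical names, the parse tree for 5 * 6 + 7 is built by hand rather than by a parser, and only the + and * rules are covered.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A parse-tree node; `val` is the synthesized attribute (None until computed)."""
    symbol: str                      # grammar symbol or operator token
    children: List["Node"] = field(default_factory=list)
    lexval: Optional[int] = None     # set only for `digit` leaves (lexer value)
    val: Optional[int] = None


def evaluate(node: Node) -> Optional[int]:
    """Postorder traversal: annotate the children first, then apply the node's rule."""
    for child in node.children:
        evaluate(child)
    kids = node.children
    if node.symbol == "digit":
        node.val = node.lexval                      # value supplied by the lexer
    elif len(kids) == 1:
        node.val = kids[0].val                      # E -> T, T -> F, F -> digit
    elif len(kids) == 3 and kids[1].symbol == "+":
        node.val = kids[0].val + kids[2].val        # E -> E1 + T
    elif len(kids) == 3 and kids[1].symbol == "*":
        node.val = kids[0].val * kids[2].val        # T -> T1 * F
    return node.val


def digit(n: int) -> Node:
    return Node("digit", lexval=n)


def chain(leaf: Node, *symbols: str) -> Node:
    """Wrap a node in unit productions, e.g. chain(d, "F", "T") builds T -> F -> digit."""
    for s in symbols:
        leaf = Node(s, [leaf])
    return leaf


# Hand-built parse tree for 5 * 6 + 7 (the terminating ';' is left out).
t_prod = Node("T", [chain(digit(5), "F", "T"), Node("*"), chain(digit(6), "F")])
root = Node("E", [Node("E", [t_prod]), Node("+"), chain(digit(7), "F", "T")])

print(evaluate(root))   # prints 37
```

Because every rule reads only the `val` of its children, a single postorder walk is enough; no dependency graph has to be built for an S-attributed definition.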
2. Inherited attribute
The value of an inherited attribute at a node in a parse tree is defined using the
attribute values at the parent or siblings. Consider an example and let us compute
inherited attributes.
Annotate the parse tree for the computation of inherited attributes for the given
string: int a, b, c; the grammar is as given below.
S → T L
T → int
T → float
T → char
T → double
L → L1 , id
L → id
For the string int a, b, c we have to distribute the datatype int to all the identifiers
a, b and c, such that a becomes integer, b becomes integer and c becomes integer.
Following steps are to be followed:
1. Construct the syntax-directed definition using semantic action.
2. Annotate the parse tree with inherited attributes by processing in top-down
fashion.
The syntax-directed definition for the above given grammar is

Production rule      Semantic actions
S → T L              L.in := T.type
T → int              T.type := integer
T → float            T.type := float
T → char             T.type := char
T → double           T.type := double
L → L1 , id          L1.in := L.in
                     Enter_type(id.entry, L.in)
L → id               Enter_type(id.entry, L.in)
[Figure: annotated parse tree for int a, b, c with T.type = int; values are obtained from child, from sibling, and from parent to child]
Fig. 6.2.3 Annotated parse tree
The value of the L nodes is first obtained from T.type (the sibling). The T.type is
basically the lexical value obtained as int or float or char or double. Then the L node
gives the type of the identifiers a, b and c. The computation of type is done in top-down
manner or preorder traversal. Using the function Enter_type, the type of identifiers a, b
and c is inserted in the symbol table at the corresponding id.entry (the id.entry is the
address of the corresponding identifier in the symbol table).
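The top-down flow of the inherited attribute can be sketched as follows. This is a minimal illustration, assuming the symbol table is a plain dictionary; `LNode`, `enter_type`, `declare` and `build_l` are hypothetical names that stand in for the L nodes and the Enter_type routine of the text.

```python
from typing import Dict, List, Optional


class LNode:
    """Node for L -> L1 , id  or  L -> id; `inh` holds the inherited attribute L.in."""

    def __init__(self, ident: str, rest: Optional["LNode"] = None) -> None:
        self.ident = ident     # the id at the right end of this production
        self.rest = rest       # L1, or None for the production L -> id
        self.inh: Optional[str] = None


def enter_type(symtab: Dict[str, str], entry: str, type_name: str) -> None:
    """Stand-in for Enter_type(id.entry, L.in): record the type of one identifier."""
    symtab[entry] = type_name


def declare(type_name: str, l_node: LNode, symtab: Dict[str, str]) -> None:
    """S -> T L: L.in := T.type, then push the value down the chain of L nodes."""
    node: Optional[LNode] = l_node
    inherited = type_name                 # L.in := T.type (from the sibling)
    while node is not None:
        node.inh = inherited              # L1.in := L.in (from the parent)
        enter_type(symtab, node.ident, node.inh)
        node = node.rest


def build_l(idents: List[str]) -> LNode:
    """Build the left-recursive L chain for a list such as ['a', 'b', 'c']."""
    node = LNode(idents[0])
    for name in idents[1:]:
        node = LNode(name, node)
    return node


symtab: Dict[str, str] = {}
declare("int", build_l(["a", "b", "c"]), symtab)
print(symtab)   # {'c': 'int', 'b': 'int', 'a': 'int'}
```

Since the grammar is left recursive, the topmost L node carries the last identifier, so c is entered first and a last.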
3. Dependency graph
The directed graph that represents the interdependencies between synthesized and
inherited attributes at nodes in the parse tree is called dependency graph.
For the rule X → YZ, if the semantic action is given by X.x := f(Y.y, Z.z), then the
synthesized attribute is X.x, and X.x depends upon attributes Y.y and Z.z.
Algorithm for constructing dependency graph
for (each node n in the parse tree) do
{
    for (each attribute a of the grammar symbol at node n) do
    {
        build a node in the dependency graph for the attribute a.
    }
}   // All the nodes of the graph are now constructed
for (each node n in the parse tree) do
{
    for (each semantic rule b := f(c1, c2, ..., ck)
         which is associated with the production used at n) do
    {
        for (i := 1 to k) do
            construct an edge from the node for ci to the node for b.
    }
}
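The two passes of the algorithm can be sketched in Python as below. The representation is an assumption made for illustration: each parse-tree node is a tuple of a name, its attribute instances and its semantic rules, and a rule is a pair (b, [c1, ..., ck]); `build_dependency_graph` is a hypothetical name.

```python
from typing import Dict, List, Set, Tuple

# Each parse-tree node is (name, attributes, rules). A rule (b, [c1, ..., ck])
# stands for b := f(c1, ..., ck); attribute instances are plain strings.
ParseNode = Tuple[str, List[str], List[Tuple[str, List[str]]]]


def build_dependency_graph(tree: List[ParseNode]) -> Dict[str, Set[str]]:
    """Return adjacency sets: graph[c] contains b when b depends on c."""
    graph: Dict[str, Set[str]] = {}
    # First pass: one graph node per attribute instance.
    for _name, attributes, _rules in tree:
        for attr in attributes:
            graph.setdefault(attr, set())
    # Second pass: an edge c_i -> b for every argument of every rule.
    for _name, _attributes, rules in tree:
        for target, args in rules:
            for arg in args:
                graph[arg].add(target)
    return graph


# Parse tree for E -> E1 + E2 with the rule E.val := E1.val + E2.val.
tree: List[ParseNode] = [
    ("E", ["E.val"], [("E.val", ["E1.val", "E2.val"])]),
    ("E1", ["E1.val"], []),
    ("E2", ["E2.val"], []),
]
graph = build_dependency_graph(tree)
print(sorted(graph["E1.val"]), sorted(graph["E2.val"]))   # ['E.val'] ['E.val']
```

Creating all graph nodes before any edge means that a rule may refer to an attribute of a node that appears later in the traversal.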
Example: Design the dependency graph for the following grammar.
E → E1 + E2
E → E1 * E2
Solution: The semantic rules for the above grammar are as given below.

Production rule      Semantic rule
E → E1 + E2          E.val := E1.val + E2.val
E → E1 * E2          E.val := E1.val * E2.val

The dependency graph is as shown in Fig. 6.2.4.
Fig. 6.2.4 Dependency graph
The synthesized attributes can be represented by .val. Hence the synthesized
attributes are given by E.val, E1.val and E2.val. The dependencies among the nodes are
shown by solid arrows. The arrows from E1 and E2 show that the value of E depends upon
E1 and E2. We have represented the parse tree using dotted lines.
Example: Design the dependency graph for the following grammar.
S → T List
T → int
T → float
T → char
T → double
List → List1 , id
List → id
Solution: The dotted line is for representing the parse tree.
The semantic rules for the above grammar are as given below.

Production rule        Semantic actions
S → T List             List.in := T.type
T → int                T.type := integer
T → float              T.type := float
T → char               T.type := char
T → double             T.type := double
List → List1 , id      List1.in := List.in
                       Enter_type(id.entry, List.in)
List → id              Enter_type(id.entry, List.in)

[Figure: dependency graph for int a, b, c, with edges from parent to child and from sibling]
Fig. 6.2.5 Dependency graph
The dependencies among the nodes are shown by solid arrows. The dependency graph
above shows clearly how the values are inherited from the parent or sibling
node. Hence the name for the attributes is inherited attributes.
4. Evaluation order
The topological sort of the dependency graph decides the evaluation order in a parse
tree. In deciding the evaluation order, the semantic rules in the syntax-directed
definitions are used. Thus the translation is specified by syntax-directed definitions.
Therefore a precise definition of syntax-directed definition is required.
[Figure: dependency graph of Fig. 6.2.5 with the nodes numbered in evaluation order]
Fig. 6.2.6 Evaluation order
The evaluation order can be decided as follows.
1. The type int is obtained from the lexical analyzer by analyzing the input token.
2. The List.in is assigned the type int from the sibling T.type.
3. The entry in the symbol table for identifier c gets associated with the type int.
Hence variable c becomes of integer type.
4. The List.in is assigned the type int from the parent List.in.
5. The entry in the symbol table for identifier b gets associated with the type int.
Hence variable b becomes of integer type.
6. The List.in is assigned the type int from the parent List.in.
7. The entry in the symbol table for identifier a gets associated with the type int.
Hence variable a becomes of integer type.
Thus evaluating the semantic rules in this order stores the type int in the symbol
table entry for each identifier a, b and c.
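The seven steps above are one topological order of the dependency graph. A short sketch using Python's standard `graphlib` module (Python 3.9 or later) shows how such an order can be derived mechanically. The labels `List3.in`, `List2.in`, `List1.in` and `entry(...)` are hypothetical names for the attribute instances and symbol-table actions of the example.

```python
from graphlib import TopologicalSorter   # standard library, Python 3.9+

# Map each attribute instance to the attributes it depends on (its predecessors).
depends_on = {
    "T.type": [],
    "List3.in": ["T.type"],        # List.in := T.type   (from the sibling)
    "entry(c)": ["List3.in"],      # Enter_type(id.entry, List.in)
    "List2.in": ["List3.in"],      # List1.in := List.in (from the parent)
    "entry(b)": ["List2.in"],
    "List1.in": ["List2.in"],
    "entry(a)": ["List1.in"],
}

order = list(TopologicalSorter(depends_on).static_order())
position = {name: i for i, name in enumerate(order)}

# Any topological order is acceptable, as long as every attribute is
# evaluated after the attributes it depends on.
for node, preds in depends_on.items():
    for pred in preds:
        assert position[pred] < position[node]
print(order)
```

The graph admits more than one valid order (for example, `entry(c)` may be handled before or after `List2.in`), so the printed list can differ from the numbering in the text while still being correct.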
Construction of Syntax Trees
The syntax tree is an abstract representation of the language constructs. The syntax
trees are used to write the translation routines using syntax-directed definitions. Let us
see how to construct a syntax tree for an expression and how to obtain translation routines.
Construction of Syntax Tree for Expression
The grammar considered for the expression is
E → E1 + T
E → E1 - T
E → E1 * T
E → T
T → id
T → num
Constructing a syntax tree for an expression means translation of the expression into
postfix form. A node is created for each operator and operand. Each node can be
implemented as a record with multiple fields. Following are the functions used in
the syntax tree for an expression.
1. mknode(op, left, right): This function creates a node with the field operator
having operator as label, and the two pointers to left and right.
2. mkleaf(id, entry): This function creates an identifier node with label id and a
pointer to the symbol table given by 'entry'.
3. mkleaf(num, val): This function creates a node for a number with label num, and
val is the value of that number.
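A minimal sketch of these functions in Python is given below, assuming a single record type for all nodes. The class `TreeNode`, its field names and the helper `postfix` are illustrative choices and not part of the original text; the symbol-table pointer is replaced by the identifier's name.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class TreeNode:
    label: str                                  # operator, "id" or "num"
    left: Optional["TreeNode"] = None
    right: Optional["TreeNode"] = None
    value: Union[str, int, None] = None         # symbol-table entry or number


def mknode(op: str, left: TreeNode, right: TreeNode) -> TreeNode:
    """Interior node labelled with the operator, pointing at both operands."""
    return TreeNode(op, left, right)


def mkleaf(label: str, value: Union[str, int]) -> TreeNode:
    """Leaf for an identifier (label "id") or a number (label "num")."""
    return TreeNode(label, value=value)


def postfix(node: TreeNode) -> str:
    """Postorder walk; reading the labels back gives the postfix form."""
    if node.left is None and node.right is None:
        return str(node.value)
    return f"{postfix(node.left)} {postfix(node.right)} {node.label}"


# a * 4 + b, built bottom-up in the order a parser would reduce it.
p1 = mkleaf("id", "a")
p2 = mkleaf("num", 4)
p3 = mknode("*", p1, p2)
p4 = mkleaf("id", "b")
p5 = mknode("+", p3, p4)
print(postfix(p5))   # a 4 * b +
```

The calls are issued in the same order as the symbols of the postfix form, which is why the text describes tree construction as a translation into postfix.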
Example: Construct the syntax tree for the expression x*y-5+z.
Solution:
Step 1: Convert the expression from infix to postfix: xy*5-z+
Step 2: Make use of the functions mknode(), mkleaf(id, ptr) and mkleaf(num, val).
Step 3: The sequence of function calls is given.
Postfix expression xy*5-z+

Symbol    Operation
x         p1 = mkleaf(id, ptr to entry x)
y         p2 = mkleaf(id, ptr to entry y)
*         p3 = mknode(*, p1, p2)
5         p4 = mkleaf(num, 5)
-         p5 = mknode(-, p3, p4)
z         p6 = mkleaf(id, ptr to entry z)
+         p7 = mknode(+, p5, p6)
Consider the string x*y-5+z and let us draw the syntax tree.
[Figure: syntax tree for x*y-5+z; the id leaves hold pointers to the symbol table entries for x and y]
Fig. 6.2.7 Syntax tree
The syntax-directed definition for the above grammar is as given below.

Production rule      Semantic operation
E → E1 + T           E.nptr := mknode('+', E1.nptr, T.nptr)
E → E1 - T           E.nptr := mknode('-', E1.nptr, T.nptr)
E → E1 * T           E.nptr := mknode('*', E1.nptr, T.nptr)
E → T                E.nptr := T.nptr
T → id               T.nptr := mkleaf(id, id.ptr_entry)
T → num              T.nptr := mkleaf(num, num.val)

[Figure: parse tree with E.nptr attributes pointing to the constructed syntax tree; the leaves point to the entries for x, y and z]
Fig. 6.2.7 (a) Constructed syntax tree
As we have seen, the function calls generate pointers to the various nodes. Such
pointers are p1, p2, p3 and so on. A synthesized attribute nptr for E and T is used to
keep track of these pointers for the nodes E and T. Thus we get nptr, ptr_entry and
val as synthesized attributes.
Directed Acyclic Graph for Expression
The directed acyclic graph, usually referred to as DAG, is a directed graph drawn by
identifying the common subexpressions. Like the syntax tree, the DAG has nodes
representing the subexpressions in the expression. These nodes have operator, operand1
and operand2, where the operands are the children of that node.
The difference between DAG and syntax tree is that in a DAG a common subexpression
has more than one parent, whereas in a syntax tree the common subexpression would be
represented as a duplicated subtree.
Example: Draw the syntax tree and DAG for the expression (a*b)+(c-d)*(a*b)+b
Solution: This expression can be evaluated as ((((a*b)+(c-d))*(a*b))+b)
The postorder traversal is ab*cd-+ab**b+
From this postorder sequence the syntax tree and DAG can be generated as follows.
Fig. 6.2.8 Syntax tree
The sequence of operations for the syntax tree is
p1 = mkleaf(id, a)
p2 = mkleaf(id, b)
p3 = mknode(*, p1, p2)       → (a*b)
p4 = mkleaf(id, c)
p5 = mkleaf(id, d)
p6 = mknode(-, p4, p5)       → (c-d)
p7 = mknode(+, p3, p6)       → (a*b)+(c-d)
p8 = mkleaf(id, a)
p9 = mkleaf(id, b)
p10 = mknode(*, p8, p9)      → (a*b)
p11 = mknode(*, p7, p10)     → ((a*b)+(c-d))*(a*b)
p12 = mkleaf(id, b)
p13 = mknode(+, p11, p12)    → (((a*b)+(c-d))*(a*b))+b
Fig. 6.2.9 DAG for (((a*b)+(c-d))*(a*b))+b
The sequence of operations for the DAG is as follows.
p1 = mkleaf(id, a)
p2 = mkleaf(id, b)
p3 = mknode(*, p1, p2)       → (a*b)
p4 = mkleaf(id, c)
p5 = mkleaf(id, d)
p6 = mknode(-, p4, p5)       → (c-d)
p7 = mknode(+, p3, p6)       → (a*b)+(c-d)
p8 = mknode(*, p7, p3)       → ((a*b)+(c-d))*(a*b)
p9 = mknode(+, p8, p2)       → (((a*b)+(c-d))*(a*b))+b
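One common way to obtain this sharing automatically is to look up every requested node in a table before creating it. The sketch below assumes that technique; `DagBuilder` and its methods are hypothetical names, and nodes are identified by their index in a list rather than by pointers.

```python
from typing import Dict, List, Optional, Tuple

Key = Tuple[str, Optional[int], Optional[int], Optional[str]]


class DagBuilder:
    """Hands out node numbers, reusing the number of an identical existing node."""

    def __init__(self) -> None:
        self.nodes: List[Key] = []
        self._index: Dict[Key, int] = {}

    def _intern(self, key: Key) -> int:
        if key not in self._index:            # create the node only once
            self._index[key] = len(self.nodes)
            self.nodes.append(key)
        return self._index[key]

    def mkleaf(self, label: str, name: str) -> int:
        return self._intern((label, None, None, name))

    def mknode(self, op: str, left: int, right: int) -> int:
        return self._intern((op, left, right, None))


# (((a*b)+(c-d))*(a*b))+b, requested exactly as for the syntax tree.
dag = DagBuilder()
a, b = dag.mkleaf("id", "a"), dag.mkleaf("id", "b")
ab = dag.mknode("*", a, b)
cd = dag.mknode("-", dag.mkleaf("id", "c"), dag.mkleaf("id", "d"))
total = dag.mknode("+", ab, cd)
ab_again = dag.mknode("*", dag.mkleaf("id", "a"), dag.mkleaf("id", "b"))
prod = dag.mknode("*", total, ab_again)
root = dag.mknode("+", prod, dag.mkleaf("id", "b"))

print(ab_again == ab, len(dag.nodes))   # True 9
```

The same thirteen requests that build the syntax tree produce only nine distinct nodes here, because the second a, the second and third b, and the second a*b are found in the table.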
Construct a syntax tree and DAG for
Solution :
[Figure: (a) Syntax tree, (b) DAG]
6.3 Bottom-up Evaluation of S-Attributed Definitions
We have already discussed how to use syntax-directed definitions to specify
translations. Now in this section we will discuss how to implement a syntax-directed
translation scheme for the syntax-directed definitions, that is, how a translator is
built. The task of building a translator for any arbitrary syntax-directed definition is
very difficult. However, there are large classes of syntax-directed definitions for
which it is easy to construct translators.
S-attributed definition is one such class of syntax-directed definition, with
synthesized attributes only.
Synthesized attributes can be evaluated using the bottom-up parser.
The purpose of the stack is to keep track of the values of the synthesized attributes
associated with the grammar symbols on the stack. This stack is commonly known
as the parser stack.
Synthesized Attributes on the Parser Stack
1. A translator for S-attributed definition is implemented using an LR parser generator.
2. A bottom-up method is used to parse the input string.
3. A parser stack is used to hold the values of the synthesized attributes.
The stack is implemented as a pair of state and value. Each state entry is the
pointer to the LR(1) parsing table. There is no need to store the grammar symbol
explicitly in the parser stack, since it is implied by the state entry. But for ease of
understanding, we will refer to the state by the unique grammar symbol that has been
placed in the parser stack. Hence the parser stack can be denoted as stack[i], and
stack[i] is a combination of state[i] and value[i]. For example, for the production
rule X → ABC the stack can be as shown in Fig. 6.3.1.

Production: X → ABC

         Before reduction          After reduction
         State    Value            State    Value
         A        A.a              X        X.x    ← top
         B        B.b
top →    C        C.c

Fig. 6.3.1 Parser stack
The top symbol on the stack is pointed to by the pointer top.

Production rule      Semantic action
X → ABC              X.x := f(A.a, B.b, C.c)

Before reduction the states A, B and C are in the stack along with the
values A.a, B.b and C.c. value[top] holds the value C.c; similarly B.b is in
value[top - 1] and A.a is in value[top - 2]. After reduction the left hand side
symbol of the production, i.e. X, will be placed in the stack along with the value
X.x at the top. Hence after reduction value[top] = X.x.
4. After reduction top is decremented by 2, the state covering X is placed in
state[top] and the value of the synthesized attribute X.x is put in value[top].
5. If the symbol has no attribute then the corresponding entry in the value array will
be kept undefined.
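A single reduction step on such a stack can be sketched as follows. This is an illustration only: the stack is a Python list of (state, value) pairs, states are shown as grammar symbols as in Fig. 6.3.1, and `reduce_by` is a hypothetical helper, not part of any parser generator.

```python
from typing import Callable, List, Optional, Tuple

Entry = Tuple[str, Optional[float]]   # (state shown as a grammar symbol, value)


def reduce_by(stack: List[Entry], lhs: str, rhs_len: int,
              action: Optional[Callable[..., float]] = None) -> None:
    """Pop `rhs_len` entries and push (lhs, value computed from the popped values)."""
    popped = stack[len(stack) - rhs_len:]
    del stack[len(stack) - rhs_len:]
    values = [value for _state, value in popped]
    # A production with no semantic action leaves the value undefined (None).
    stack.append((lhs, action(*values) if action else None))


# X -> A B C with X.x := f(A.a, B.b, C.c); here f simply adds its arguments.
stack: List[Entry] = [("A", 1.0), ("B", 2.0), ("C", 3.0)]
reduce_by(stack, "X", 3, lambda a, b, c: a + b + c)
print(stack)   # [('X', 6.0)]
```

Three entries are popped and one is pushed, so the index of the top entry drops by 2, which matches point 4 above.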
Example: For the following given grammar construct the syntax-directed definition and
generate the code fragment (translator) using S-attributed definition.
S → EN
E → E + T
E → E - T
E → T
T → T * F
T → T / F
T → F
F → (E)
F → digit
N → ;
Also evaluate the input string 2*3+4; with the parser stack using the LR parsing method.
Solution:
The syntax-directed definition for the given grammar can be written as follows.

Production rule      Semantic actions
S → EN               Print(E.val)
E → E1 + T           E.val := E1.val + T.val
E → E1 - T           E.val := E1.val - T.val
E → T                E.val := T.val
T → T1 * F           T.val := T1.val * F.val
T → T1 / F           T.val := T1.val / F.val
T → F                T.val := F.val
F → (E)              F.val := E.val
F → digit            F.val := digit.lexval
N → ;                Can be ignored by lexical analyzer
                     as ; is terminating symbol.
• The LR parser table can be generated. (We have already discussed the method of
LR parser generator in chapter 5)
• To evaluate the attributes the code fragment can be generated by using the parser
stack. The appropriate reduction of each production and the corresponding code
fragment is as given below.

Production rule      Code fragment
S → EN               Print(value[top])
T → T1 * F           value[top] := value[top-2] * value[top]
T → T1 / F           value[top] := value[top-2] / value[top]
• The sequence of moves made by the parser for the input 2*3+4; is as given
below.
Value    Production rule used
On seeing the first input symbol 2, initially the symbol 2 is recognized as digit and
the parser shifts F onto the state stack. F corresponds to digit, so the semantic action
F.val := digit.lexval will be implemented and value[top] becomes 2. In the next
move the parser reduces by T → F. As no code fragment is associated with this production,
value[top] and state[top] are left unchanged. Continuing in this fashion the evaluation
of the input string is done, and the parser halts successfully when it reaches
state[top] = S, the start state.
In this way the bottom-up evaluation of S-attributed definitions is done.
Write an S-attributed grammar to convert the given grammar with infix
operators to prefix operators.
L → E, E → E+T, E → E-T, E → T,
T → T*F, T → F,
F → F↑P, F → P, P → (E), P → id
Solution: The grammar that contains all the syntactic rules along with the semantic
rules having synthesized attributes only is called an S-attributed grammar. Such a grammar
for converting infix operators to prefix is given by using 'val' as the S-attribute.
Production rule