Unit - 3 Mid - 1


Department of CE

CD : COMPILER DESIGN

Unit no : 3
Parsing
(01CE0714)

Prof. Jaydip Siyara


Outline :
Role of parser
Parse tree
Classification of grammar
Derivation and Reduction
Ambiguous grammar
Left Recursion
Left Factoring
Top-down and Bottom-up parsing
LR Parsers – LR(0), SLR, CLR, LALR
Role of Parser
• In our compiler model, the parser obtains a string of
tokens from the lexical analyzer and verifies that the
string of token names can be generated by the
grammar for the source language.
• It reports any syntax errors in the program. It also
recovers from commonly occurring errors so that it
can continue processing its input.
Scanner – Parser Interaction
[Figure: interaction between the scanner and the parser; the parser's output is labelled "Parse Tree"]
• For well-formed programs, the parser constructs a
parse tree and passes it to the rest of the compiler for
further processing.
• If a compiler had to process only correct programs,
its design and implementation would be greatly
simplified.
• But programmers frequently write incorrect
programs, and a good compiler should assist the
programmer in identifying and locating errors.
Syntax Error Handling
• We know that programs can contain errors at many
different levels. For example, errors can be:
   • Lexical : such as misspelling an identifier, keyword, or
     operator
   • Syntactic : such as an arithmetic expression with
     unbalanced parentheses
   • Semantic : such as an operator applied to an
     incompatible operand
   • Logical : such as an infinitely recursive call
• The error handler in a parser has simple-to-state
goals :
   • It should report the presence of errors clearly and
     accurately.
   • It should recover from each error quickly enough to
     be able to detect subsequent errors.
   • It should not significantly slow down the processing
     of correct programs.
Parse tree
• A parse tree is a graphical representation of a derivation.
Its nodes are labelled with grammar symbols, which can be
terminals as well as non-terminals.
• The root of the parse tree is the start symbol of the grammar.
• A parse tree follows the precedence of operators. The
deepest sub-tree is traversed first, so the operator in
a parent node has lower precedence than the
operator in its sub-tree.
Parse tree V/S Syntax tree
• Example:-
[Figure: a syntax tree and a parse tree shown side by side]
Classification of Grammar
• Grammars are classified on the basis of the productions they use
(Chomsky, 1963).
• Given below are the classes of grammar; each class has its own
characteristics and limitations.
1. Type-0 Grammar:- Recursively Enumerable Grammar
• These grammars are known as phrase structure grammars.
Their productions are of the form
   α → β, where both α and β are strings of terminal and
   non-terminal symbols.
• This type of grammar is not relevant to the specification of
programming languages.
2. Type-1 Grammar:- Context Sensitive Grammar
• These grammars have rules of the form αAβ → αγβ, with A a
nonterminal and α, β, γ strings of terminal and nonterminal
symbols. The strings α and β may be empty, but γ must be
nonempty.
• Eg:- AB → CDB
       Ab → Cdb
       A → b
3. Type-2 Grammar:- Context Free Grammar
• These are defined by rules of the form A → γ, with A a
nonterminal and γ a string of terminal and nonterminal
symbols. These rules can be applied independent of
context, so the grammar is called a Context Free Grammar (CFG).
CFGs are ideally suited for programming language specification.
• Eg:- A → aBc
4. Type-3 Grammar:- Regular Grammar
• It restricts its rules to a single nonterminal on the left-hand side
and a right-hand side consisting of a single terminal, possibly
followed by a single nonterminal. The rule S → ϵ is also
allowed if S does not appear on the right side of any rule.
• Eg:- A → ϵ
       A → a
       A → aB
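The rule forms above can be checked mechanically for a single production. The following is only an illustrative sketch: `classify` is a hypothetical helper, each symbol is assumed to be one character, upper-case letters are taken as nonterminals, lower-case letters as terminals, and ϵ is written as the empty string. It returns the most restrictive type whose rule form the production fits; the side condition on S → ϵ is not checked.

```python
def classify(lhs, rhs):
    """Return 3, 2, 1 or 0: the most restrictive type whose rule form
    the production lhs -> rhs satisfies (per the definitions above)."""
    if len(lhs) == 1 and lhs.isupper():
        if rhs == "":
            return 3  # A -> epsilon (side condition on S is not checked here)
        one_terminal = len(rhs) == 1 and rhs.islower()
        terminal_then_nonterminal = (
            len(rhs) == 2 and rhs[0].islower() and rhs[1].isupper()
        )
        if one_terminal or terminal_then_nonterminal:
            return 3  # A -> a  or  A -> aB
        return 2      # A -> any string of symbols
    # Context-sensitive form: lhs = alpha A beta, rhs = alpha gamma beta,
    # with gamma nonempty (guaranteed by len(rhs) >= len(lhs)).
    for i, symbol in enumerate(lhs):
        if symbol.isupper():
            alpha, beta = lhs[:i], lhs[i + 1:]
            if (len(rhs) >= len(lhs)
                    and rhs.startswith(alpha)
                    and rhs.endswith(beta)):
                return 1
    return 0


for lhs, rhs in [("AB", "CDB"), ("Ab", "Cdb"), ("A", "aBc"), ("A", "aB")]:
    print(lhs, "->", rhs, ": type", classify(lhs, rhs))
```

A grammar as a whole has the type of its least restrictive production, so one would take the minimum of `classify` over all of its rules.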
Derivation
• Let production P1 of grammar G be of the form
   P1 : A ::= α
and let β be a string such that β = γAθ; then
replacement of A by α in string β constitutes a
derivation according to production P1.
• Example
   <Sentence> ::= <Noun Phrase> <Verb Phrase>
   <Noun Phrase> ::= <Article> <Noun>
   <Verb Phrase> ::= <Verb> <Noun Phrase>
   <Article> ::= a | an | the
   <Noun> ::= boy | apple
   <Verb> ::= ate
• The following strings are sentential forms.
   <Sentence>
   <Noun Phrase> <Verb Phrase>
   the boy <Verb Phrase>
   the boy <Verb> <Noun Phrase>
   the boy ate <Noun Phrase>
   the boy ate an apple

String : id + id * id
• The process of deriving a string is called derivation,
and the graphical representation of a derivation is called a
derivation tree or parse tree.
• A derivation is a sequence of applications of production
rules that yields the input string.
• During parsing we take two decisions:
   1) Deciding which non-terminal is to be replaced.
   2) Deciding the production rule by which the
      non-terminal will be replaced.
• For the first decision we have:
   1) Leftmost derivation
   2) Rightmost derivation
Left Derivation
• A derivation of a string w in a grammar G is a leftmost
derivation if at every step the leftmost non-terminal is
replaced.
Example:
• Productions:
   S → S + S
   S → S * S
   S → id
• String:- id+id*id
   S ⇒ S * S
     ⇒ S + S * S
     ⇒ id + S * S
     ⇒ id + id * S
     ⇒ id + id * id
Right Derivation
• A derivation of a string w in a grammar G is a rightmost
derivation if at every step the rightmost non-terminal
is replaced.
Example:
• Productions:
   S → S + S
   S → S * S
   S → id
• String:- id+id*id
   S ⇒ S + S
     ⇒ S + S * S
     ⇒ S + S * id
     ⇒ S + id * id
     ⇒ id + id * id
Reduction
• Let production P1 of grammar G be of the form
   P1 : A ::= α
and let σ be a string such that σ = γαθ; then replacement of
α by A in string σ constitutes a reduction according to
production P1.
Step   String
0      the boy ate an apple
1      <Article> boy ate an apple
2      <Article> <Noun> ate an apple
3      <Article> <Noun> <Verb> an apple
4      <Article> <Noun> <Verb> <Article> apple
5      <Article> <Noun> <Verb> <Article> <Noun>
6      <Noun Phrase> <Verb> <Article> <Noun>
7      <Noun Phrase> <Verb> <Noun Phrase>
8      <Noun Phrase> <Verb Phrase>
9      <Sentence>
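The reduction steps in the table can be replayed programmatically. This is a minimal sketch, assuming each step replaces the leftmost occurrence of a production's right-hand side by its left-hand side; `reduce_once`, `reduce_all` and `STEPS` are hypothetical names, and the order of reductions is taken from the table rather than chosen by a parser.

```python
def reduce_once(form, head, body):
    """Replace the leftmost occurrence of `body` in `form` by `head`."""
    n = len(body)
    for i in range(len(form) - n + 1):
        if form[i:i + n] == body:
            return form[:i] + [head] + form[i + n:]
    raise ValueError(f"{body} does not occur in {form}")


# Reductions in the order used by the table above.
STEPS = [
    ("<Article>", ["the"]),
    ("<Noun>", ["boy"]),
    ("<Verb>", ["ate"]),
    ("<Article>", ["an"]),
    ("<Noun>", ["apple"]),
    ("<Noun Phrase>", ["<Article>", "<Noun>"]),
    ("<Noun Phrase>", ["<Article>", "<Noun>"]),
    ("<Verb Phrase>", ["<Verb>", "<Noun Phrase>"]),
    ("<Sentence>", ["<Noun Phrase>", "<Verb Phrase>"]),
]


def reduce_all(tokens, steps):
    """Apply the reductions in order and return every intermediate string."""
    form = list(tokens)
    history = [form]
    for head, body in steps:
        form = reduce_once(form, head, body)
        history.append(form)
    return history


for number, form in enumerate(reduce_all("the boy ate an apple".split(), STEPS)):
    print(number, " ".join(form))
```

A real bottom-up parser has to decide which substring to reduce next; here that decision is supplied by hand.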
Ambiguous Grammar
• Ambiguity implies the possibility of different interpretations of a
source string.
• Existence of ambiguity at the level of the syntactic
structure of a string would mean that more than one parse
tree can be built for the string. So the string can have more
than one meaning associated with it.
• A grammar that produces more than one parse tree for
some sentence is said to be ambiguous.
Example:
   E → Id | E + E | E * E
   Id → a | b | c
Both trees have the same string : a + b * c

      +                *
     / \              / \
    a   *            +   c
       / \          / \
      b   c        a   b

E → E + E | E * E | id
By parse tree:-

Parse tree-1
        E
      / | \
     E  +  E
     |   / | \
    id  E  *  E
        |     |
        id    id

Parse tree-2
           E
         / | \
        E  *  E
      / | \   |
     E  +  E  id
     |     |
     id    id
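One way to see the ambiguity concretely is to count the parse trees of a token string. The sketch below is an illustration, not a parsing algorithm from the slides: `count_parse_trees` is a hypothetical helper that assumes the grammar has no ε-productions and no unit productions between non-terminals (otherwise the recursion would not terminate), and it treats any symbol that is not a key of the grammar as a terminal.

```python
from functools import lru_cache

# E -> E + E | E * E | id, with alternatives stored as tuples of symbols.
GRAMMAR = {"E": [("E", "+", "E"), ("E", "*", "E"), ("id",)]}


def count_parse_trees(grammar, start, tokens):
    """Count the distinct parse trees by which `start` derives `tokens`."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def derive(symbol, i, j):
        # Number of trees rooted at `symbol` that yield tokens[i:j].
        if symbol not in grammar:  # terminal: must match exactly one token
            return 1 if j - i == 1 and tokens[i] == symbol else 0
        return sum(match(rhs, i, j) for rhs in grammar[symbol])

    @lru_cache(maxsize=None)
    def match(rhs, i, j):
        # Number of ways the symbol sequence `rhs` yields tokens[i:j].
        if len(rhs) == 1:
            return derive(rhs[0], i, j)
        total = 0
        # The first symbol takes tokens[i:k]; every remaining symbol
        # needs at least one token, which bounds k from above.
        for k in range(i + 1, j - len(rhs) + 2):
            left = derive(rhs[0], i, k)
            if left:
                total += left * match(rhs[1:], k, j)
        return total

    return derive(start, 0, len(tokens))


print(count_parse_trees(GRAMMAR, "E", ["id", "+", "id", "*", "id"]))  # prints 2
```

A count greater than one for any string shows the grammar is ambiguous; a count of one for a particular string says nothing about other strings.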
Ambiguous Grammar Example:-
• Prove that the given grammar is ambiguous:
• E → a | Ea | bEE | EEb | EbE
Ans:-
Assume the string baaab.
Leftmost derivation-1
   E ⇒ bEE
     ⇒ baE
     ⇒ baEEb
     ⇒ baaEb
     ⇒ baaab
OR
Leftmost derivation-2
   E ⇒ EEb
     ⇒ bEEEb
     ⇒ baEEb
     ⇒ baaEb
     ⇒ baaab
The string baaab has two different leftmost derivations, so the
grammar is ambiguous.
Left Recursion
• When a top-down parser builds a leftmost derivation while
scanning the input from left to right, productions of the
form A → A x may cause endless recursion.
• Such grammars are called left-recursive and they
must be transformed if we want to use a top-down
parser.
• Example:
   E → Ea | E+b | c
Algorithm
• Assign an ordering A1, ..., An to the non-terminals of
the grammar;
• for i = 1 to n do
   begin
      for j = 1 to i-1 do
      begin
         replace each production of the form Ai → Aj γ
         by the productions Ai → δ1 γ | δ2 γ | ... | δk γ,
         where Aj → δ1 | δ2 | ... | δk are all the current
         Aj productions.
      end
      eliminate the immediate left recursion
      among the Ai productions.
   end
Left Recursion
• There are three types of left recursion:
   direct (A → A x)
   indirect (A → B C, B → A)
   hidden (A → B A, B → ε)
• To eliminate direct left recursion replace
   A → Aα1 | Aα2 | ... | Aαm | β1 | β2 | ... | βn
with
   A → β1 A’ | β2 A’ | ... | βn A’
   A’ → α1 A’ | α2 A’ | ... | αm A’ | ε
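The replacement rule for direct left recursion translates almost line for line into code. This is a minimal sketch under stated assumptions: alternatives are lists of symbols, ε is the empty list, the new non-terminal is named by appending an apostrophe (assumed not to clash with an existing name), and `eliminate_direct_left_recursion` is a hypothetical helper name. It is applied to the earlier example E → Ea | E+b | c.

```python
def eliminate_direct_left_recursion(head, alternatives):
    """Split A -> A a1 | ... | A am | b1 | ... | bn into
    A -> b1 A' | ... | bn A'  and  A' -> a1 A' | ... | am A' | epsilon."""
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == head]
    others = [alt for alt in alternatives if not alt or alt[0] != head]
    if not recursive:
        return {head: alternatives}  # nothing to do
    new_head = head + "'"
    return {
        head: [beta + [new_head] for beta in others],
        new_head: [alpha + [new_head] for alpha in recursive] + [[]],  # [] is epsilon
    }


result = eliminate_direct_left_recursion(
    "E", [["E", "a"], ["E", "+", "b"], ["c"]])
for head, alts in result.items():
    print(head, "->", " | ".join(" ".join(alt) or "ε" for alt in alts))
```

For the example this yields E → c E’ and E’ → a E’ | + b E’ | ε.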
Example
1. E → E + T | T
   T → T * F | F
   F → (E) | id
Ans.
   A → Aα | β
Replace with,
   A → βA’
   A’ → αA’ | ε

   E → TE’
   E’ → +TE’ | ε
   T → FT’
   T’ → *FT’ | ε
   F → (E) | id
Example
1. A → Aad | Afg | b
Ans:-
Remove left recursion
   A → bA’
   A’ → adA’ | fgA’ | ε

2. A → Acd | Ab | jk
   B → Bh | n
Ans:-
Remove left recursion
   A → jkA’
   A’ → cdA’ | bA’ | ε
   B → nB’
   B’ → hB’ | ε
Example
3. E → Aa | b
   A → Ac | Ed | ε
Ans:-
Replace E,
   E → Aa | b
   A → Ac | Aad | bd | ε
Remove left recursion
   E → Aa | b
   A → bdA’ | A’
   A’ → cA’ | adA’ | ε
Left Factoring
• Left factoring is a grammar transformation that is useful
for producing a grammar suitable for predictive parsing.
• Consider,
   S → if E then S else S | if E then S
• Which of the two productions should we use to
expand non-terminal S when the next token is if?
• We can solve this problem by factoring out the
common part in these rules. This way, we are
postponing the decision about which rule to choose
until we have more information (namely, whether
there is an else or not).
• This is called left factoring.
Algorithm
• For each non-terminal A, find the longest prefix α common to
two or more of its alternatives.
• If α ≠ ε, i.e., there is a non-trivial common prefix, replace all the
A productions
   A → αβ1 | αβ2 | ... | αβn | γ,
where γ represents all the alternatives which do not
start with α, by
   A → αA’ | γ
   A’ → β1 | β2 | ... | βn
Here, A’ is a new non-terminal. Repeatedly apply this
transformation until no two alternatives for a non-terminal
have a common prefix.
Left Factoring
   A → αβ1 | αβ2 | ... | αβn | γ

becomes

   A → αA” | γ
   A” → β1 | β2 | ... | βn
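The factoring rule can be sketched in code as follows. This is an illustrative sketch, not a definitive implementation: `left_factor` and `common_prefix` are hypothetical names, alternatives are lists of symbols with ε as the empty list, alternatives are grouped by their first symbol and each group's common prefix is factored out, with new non-terminals named by appending apostrophes. Newly created non-terminals are processed again, which mirrors the "repeatedly apply" step. The grammar used is the if-then-else example from above.

```python
def common_prefix(sequences):
    """Longest prefix shared by all the given symbol sequences."""
    prefix = []
    for symbols in zip(*sequences):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return prefix


def left_factor(grammar):
    g = {head: [list(alt) for alt in alts] for head, alts in grammar.items()}
    worklist = list(g)
    while worklist:
        head = worklist.pop(0)
        groups = {}  # first symbol -> alternatives starting with it
        for alt in g[head]:
            groups.setdefault(alt[0] if alt else None, []).append(alt)
        new_alts = []
        for first, alts in groups.items():
            if first is None or len(alts) < 2:
                new_alts.extend(alts)  # nothing to factor in this group
                continue
            prefix = common_prefix(alts)
            new_head = head + "'"
            while new_head in g:       # pick an unused name
                new_head += "'"
            g[new_head] = [alt[len(prefix):] for alt in alts]
            new_alts.append(prefix + [new_head])
            worklist.append(new_head)  # the new rules may need factoring too
        g[head] = new_alts
    return g


if_grammar = {"S": [["if", "E", "then", "S", "else", "S"],
                    ["if", "E", "then", "S"]]}
for head, alts in left_factor(if_grammar).items():
    print(head, "->", " | ".join(" ".join(alt) or "ε" for alt in alts))
```

For the if-then-else grammar this gives S → if E then S S’ and S’ → else S | ε, so the choice about else is postponed until the common part has been read.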
Example
   E → T+E | T
   T → V*T | V
   V → id
Ans.
   E → TE’
   E’ → +E | ε
   T → VT’
   T’ → *T | ε
   V → id
Example
1. S → cdLk | cdk | cd
   L → mn | ε
Ans.
   S → cdS’
   S’ → Lk | k | ε
   L → mn | ε

2. E → iEtE | iEtEeE | a
   A → b
Ans.
   E → iEtEE’ | a
   E’ → ε | eE
   A → b
Compute First
Example :
   A → bcg | gh
   FIRST(A) = {b, g}

   A → Bcd | gh
   B → m | ε
   FIRST(A) = {m, c, g}
   FIRST(B) = {m, ε}
Compute First
Example :
   A → BCD | Cx
   B → b | ε
   C → c | ε
   D → d | ε
   FIRST(A) = {b, c, d, x, ε}
   FIRST(B) = {b, ε}
   FIRST(C) = {c, ε}
   FIRST(D) = {d, ε}
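The FIRST sets in these examples can be reproduced with a fixed-point iteration. The following is a sketch under stated assumptions: `compute_first` and `first_of_sequence` are hypothetical names, a grammar is a dict from non-terminal to a list of alternatives, each alternative is a list of symbols, ε is the empty list, and any symbol that is not a key of the dict is treated as a terminal.

```python
EPS = "ε"


def first_of_sequence(symbols, first):
    """FIRST of a sequence of symbols, given the FIRST sets found so far."""
    result = set()
    for sym in symbols:
        if sym not in first:          # terminal: it starts the string
            result.add(sym)
            return result
        result |= first[sym] - {EPS}
        if EPS not in first[sym]:     # this non-terminal cannot vanish
            return result
    result.add(EPS)                   # every symbol can vanish (or empty body)
    return result


def compute_first(grammar):
    first = {head: set() for head in grammar}
    changed = True
    while changed:                    # repeat until no set grows any more
        changed = False
        for head, alternatives in grammar.items():
            for alt in alternatives:
                before = len(first[head])
                first[head] |= first_of_sequence(alt, first)
                if len(first[head]) != before:
                    changed = True
    return first


grammar = {
    "A": [["B", "C", "D"], ["C", "x"]],
    "B": [["b"], []],
    "C": [["c"], []],
    "D": [["d"], []],
}
for head, symbols in compute_first(grammar).items():
    print(f"FIRST({head}) =", sorted(symbols))
```

The same function can be used to check answers for the exercises that follow by writing their productions in the same dict form.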
Compute First
Example :
   S → PQr | s
   P → Abc | ε
   Q → d | ε
   A → a | ε

   A → mn | Xy | Z
   X → x | ε
   Z → ε
Compute Follow
Example :
   A → bcg | gh
   FOLLOW(A) = {$}

   A → Bcd | gh
   B → mA | ε
   FOLLOW(A) = {$, c}
   FOLLOW(B) = {c}
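FOLLOW sets can be computed by a similar fixed-point iteration once FIRST is known. In this sketch `compute_follow` and `first_of_sequence` are hypothetical names, ε is the empty list, A is assumed to be the start symbol, and the FIRST sets passed in were worked out by hand for the grammar A → Bcd | gh, B → mA | ε rather than computed by the code.

```python
EPS = "ε"
END = "$"


def first_of_sequence(symbols, first):
    """FIRST of a sequence of symbols, given FIRST of each non-terminal."""
    result = set()
    for sym in symbols:
        if sym not in first:          # terminal
            result.add(sym)
            return result
        result |= first[sym] - {EPS}
        if EPS not in first[sym]:
            return result
    result.add(EPS)                   # the whole sequence can vanish
    return result


def compute_follow(grammar, start, first):
    follow = {head: set() for head in grammar}
    follow[start].add(END)            # $ follows the start symbol
    changed = True
    while changed:
        changed = False
        for head, alternatives in grammar.items():
            for alt in alternatives:
                for i, sym in enumerate(alt):
                    if sym not in grammar:
                        continue      # FOLLOW is defined for non-terminals only
                    rest = first_of_sequence(alt[i + 1:], first)
                    new = rest - {EPS}
                    if EPS in rest:   # what follows `head` also follows `sym`
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


grammar = {"A": [["B", "c", "d"], ["g", "h"]], "B": [["m", "A"], []]}
first = {"A": {"m", "c", "g"}, "B": {"m", EPS}}  # worked out by hand (assumption)
for head, symbols in compute_follow(grammar, "A", first).items():
    print(f"FOLLOW({head}) =", sorted(symbols))
```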
Compute Follow
Example :
   A → BCD | Cx
   B → b | ε
   C → c | ε
   D → d | ε
[Figure: classification of parsers]
Parser
   Top Down Parser
      Parser With Backtracking
         Recursive Descent
      Parser Without Backtracking
         Predictive Parser
            LL(1) Parser
   Bottom Up Parser
      Operator Precedence Parser
      LR Parser
         LR(0)
         SLR(1)
         CLR(1)
         LALR(1)
End of the Mid-1 syllabus.
