Grammar, Ambiguity, Left Recursion, Left Factoring, Recursive Descent & Predictive Parser

Grammar

• L = { He is good,
• He is bad,
• She is good,
• She is bad,
• He was good,
• He was bad,
• She was good,
• She was bad
• }
Grammar
• Grammar = <V, ∑, P, S>
• Where
• V = set of variables / non-terminals
• ∑ = set of alphabet / symbols / terminals
• P = set of production rules
• S = start symbol
• V = {S, SUB, HV, ADJ}
• ∑ = {He, She, is, was, bad, good}
• P = {1. S → SUB HV ADJ
• 2. SUB → He
• 3. SUB → She
• 4. HV → is
• 5. HV → was
• 6. ADJ → bad
• 7. ADJ → good
• }
• S => SUB HV ADJ [PR1]
• => He HV ADJ [PR 2]
• => He is ADJ [PR4]
• => He is good [PR 7]

• S => SUB HV ADJ [PR1]


• => He HV ADJ [PR 2]
• => He is ADJ [PR4]
• => He is bad [PR 6]
• S => SUB HV ADJ [PR1]
• => He HV ADJ [PR 2]
• => He was ADJ [PR5]
• => He was good [PR 7]

• S => SUB HV ADJ [PR1]


• => He HV ADJ [PR 2]
• => He was ADJ [PR5]
• => He was bad [PR 6]
• S => SUB HV ADJ [PR1]
• => She HV ADJ [PR 3]
• => She is ADJ [PR4]
• => She is good [PR 7]
• S => SUB HV ADJ [PR1]
• => She HV ADJ [PR 3]
• => She is ADJ [PR4]
• => She is bad [PR 6]
Leftmost derivation
• S => SUB HV ADJ [PR1]
• => She HV ADJ [PR 3]
• => She was ADJ [PR5]
• => She was good [PR 7]
• S => SUB HV ADJ [PR1]
• => She HV ADJ [PR 3]
• => She was ADJ [PR5]
• => She was bad [PR 6]
Rightmost derivation
• S => SUB HV ADJ [PR1]
• => SUB HV good [PR 7]
• => SUB is good [PR4]
• => He is good [PR 2]

• S => SUB HV ADJ [PR1]


• => SUB HV bad [PR 6]
• => SUB is bad [PR4]
• => He is bad [PR 2]
• If S *=> string, then the string is accepted.
• If a string can be derived from S using the production rules, the string is a valid one.
• Or:
• start with the string and keep on reducing it using the production rules; if S is reached, the string is likewise accepted.
• He is good
• He is ADJ
• He HV ADJ
• SUB HV ADJ
• S
• accepted
• Grammar =<V,∑,P,S>
• Where
• V= set of variable/non terminal={S}
• ∑= set of alphabet/symbols/terminals={(,)}
• P= set of Production rules
• { S → SS
• S → (S)
• S → ε
• }
• S = start symbol
• S => SS
• => (S)S
• => ()S
• => ()(S)
• => ()()
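As an illustration (not from the slides), a small recursive-descent recognizer can check membership in this language. Since S → SS / (S) / ε is ambiguous and left-recursive, the sketch assumes the equivalent grammar S → (S)S / ε, which generates the same strings:

#include <stdio.h>

/* Recognizer for balanced parentheses.  The slide's grammar S -> SS | (S) | e
 * is ambiguous and left-recursive, so this sketch uses the equivalent grammar
 * S -> ( S ) S | e, which generates the same language and suits recursive descent. */
static const char *ip;               /* input pointer */

static int S(void) {
    if (*ip == '(') {                /* S -> ( S ) S */
        ip++;
        if (!S()) return 0;
        if (*ip != ')') return 0;
        ip++;
        return S();
    }
    return 1;                        /* S -> e : match nothing */
}

int main(void) {
    const char *tests[] = { "()()", "(())()", "(()", "" };
    for (int i = 0; i < 4; i++) {
        ip = tests[i];
        int ok = S() && *ip == '\0'; /* the whole input must be consumed */
        printf("\"%s\" %s\n", tests[i], ok ? "accepted" : "rejected");
    }
    return 0;
}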
• Find a context-free grammar for the language of all palindrome strings over {a,b}
• Grammar =<V,∑,P,S>
• Where
• V= {S}
• ∑={a,b}
• P={1. S → aSa
• 2. S →bSb
• 3. S → ε
• }
• S=>aSa
• =>abSba
• =>abaSaba
• =>abaaba
• Find a context-free grammar for the language L = { aⁿbⁿ : n ≥ 0 }
• Grammar =<V,∑,P,S>
• Where
• V= {S}
• ∑={a,b}
• P={1. S → aSb
• 2. S → ε
• }
• S=>aSb
• =>aaSbb
• =>aaaSbbb
• =>aaaaSbbbb
• =>aaaaaSbbbbb
• =>aaaaaaSbbbbbb
• =>aaaaaabbbbbb
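These productions map directly onto a recursive procedure; a minimal C sketch of a recognizer for this language (my own illustration, not from the slides):

#include <stdio.h>

/* Recursive-descent recognizer mirroring S -> aSb | e.
 * Each call to S() consumes one 'a', recurses, then consumes the matching 'b'. */
static const char *ip;

static int S(void) {
    if (*ip == 'a') {        /* S -> aSb */
        ip++;
        if (!S()) return 0;
        if (*ip != 'b') return 0;
        ip++;
        return 1;
    }
    return 1;                /* S -> e */
}

int main(void) {
    const char *tests[] = { "", "ab", "aaabbb", "aab", "abb" };
    for (int i = 0; i < 5; i++) {
        ip = tests[i];
        printf("\"%s\" %s\n", tests[i],
               (S() && *ip == '\0') ? "accepted" : "rejected");
    }
    return 0;
}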
• Find the grammar for the following language L = { aⁿb²ⁿ : n ≥ 1 }
• Grammar =<V,∑,P,S>
• Where
• V= {S}
• ∑={a,b}
• P={1. S → aSbb
• 2. S →abb
• }
• Grammar =<V,∑,P,S>
• Where
• V= {S,A,B}
• ∑={a,b}
• P={1. S → aB
• 2. S →bA
• 3. A →a
• 4. A → aS
• 5. A → bAA
• 6. B →b
• 7. B → bS
• 8. B → aBB
• }
• Find the language generated by this grammar.
• L(G) = the set of all strings over {a,b} containing an equal number of a's and b's
• A represents the set of strings in which the number of a's is one more than the number of b's
• B represents the set of strings in which the number of b's is one more than the number of a's
• Grammar =<V,∑,P,S>
• Where
• V= {S,A,B}
• ∑={a,b}
• P={1. S → bS
• 2. S →b
• 3. S →aA
• 4. A → bA
• 5. A → aB
• 6. B →bB
• 7. B → aS
• 8. B → a
• }
• Find the language generated by this grammar.
• L = the set of strings over {a,b} in which the number of a's is a multiple of 3
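This is a regular language, so it is also accepted by a three-state finite automaton; the sketch below is my own illustration, with state k recording the number of a's read so far modulo 3 (state 0 playing the role of S, 1 of A and 2 of B). Note that the grammar as written derives only non-empty strings, while the automaton also accepts the empty string.

#include <stdio.h>

/* A 3-state DFA for "the number of a's is a multiple of 3" over {a,b}.
 * Reading 'a' advances the state modulo 3, reading 'b' leaves it unchanged;
 * state 0 is accepting.  State names are my own, not from the slides. */
static int accepts(const char *w) {
    int state = 0;                      /* count of a's modulo 3 */
    for (; *w; w++) {
        if (*w == 'a')      state = (state + 1) % 3;
        else if (*w != 'b') return 0;   /* symbol outside {a,b} */
    }
    return state == 0;
}

int main(void) {
    const char *tests[] = { "bbb", "aaa", "ababab", "aab", "ba" };
    for (int i = 0; i < 5; i++)
        printf("%-8s %s\n", tests[i], accepts(tests[i]) ? "accepted" : "rejected");
    return 0;
}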
Types of Grammars - Chomsky hierarchy of languages
• Venn diagram of grammar types: each class properly contains the next, and each is recognized by the machine shown.
• Type 0 – Phrase-structure grammars – TM (Turing machine)
• Type 1 – Context-sensitive – LBA (linear bounded automaton)
• Type 2 – Context-free – PDA (pushdown automaton)
• Type 3 – Regular – FA (finite automaton)
Defining the Grammar Types
• Type 0: Phrase-structure grammars – no restrictions on the production rules.
• Type 1: Context-sensitive grammar:
• the RHS of every production is at least as long as its LHS, or is empty:
• if β → α, then |β| ≤ |α| or α = ε.
• A → ab
• A → aA
• aAb → aBCb
• satisfy the above condition and hence are context-sensitive,
• while
• aA → a and ABc → bc do not satisfy it and hence are not context-sensitive.
• Type 2: Context-free grammar:
• the LHS of every production has length 1 and is a non-terminal:
• if A → α, then |A| = 1 and A ∈ V.
• Type 3: Regular grammar:
• the LHS of every production has length 1 and is a non-terminal, and
• every RHS is either a single terminal, a single terminal followed by a non-terminal (right-linear), or a non-terminal followed by a single terminal (left-linear):
• A →a
• A → aB
• Or
• A →a
• A →Ba
Language Generated by a Grammar
• G = <V, ∑, P, S>
• Where
• V = {S, A}
• ∑ = {a, b}
• P = {S → aA, S → b, A → aa}
• S = start symbol
What is L(G)?
Easy: we can just draw a tree of all possible derivations (the slide shows this derivation tree, or parse tree).
We have: S ⇒ aA ⇒ aaa and S ⇒ b.
Answer: L = {aaa, b}.
Example: Derivation Tree/parse tree/syntax tree

► Let G be a context-free grammar with the productions


P = {S →aAB, A →Bba, B →bB, B →c}.
► The word w = acbabc can be derived from S as follows:
S ⇒ aAB ⇒ a(Bba)B ⇒ acbaB ⇒ acba(bB) ⇒ acbabc
Thus, the derivation tree is as follows: the root S has children a, A and B; A has children B, b and a; the rightmost B has children b and B; and the two remaining B nodes each derive c.
• Context-Sensitive Languages

• The language { aⁿbⁿcⁿ | n ≥ 1 } is context-sensitive but not context-free.
• A grammar for this language is given by the following productions:
• S → aSBC | aBC
• CB → BC
• aB → ab
• bB → bb
• bC → bc
• cC → cc
• A derivation from this grammar is:
• S ⇒ aSBC
• ⇒ aaBCBC (using S → aBC)
• ⇒ aabCBC (using aB → ab)
• ⇒ aabBCC (using CB → BC)
• ⇒ aabbCC (using bB → bb)
• ⇒ aabbcC (using bC → bc)
• ⇒ aabbcc (using cC → cc)
• which derives a²b²c².
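Although the language is not context-free, membership is easy to check procedurally; a minimal C sketch (my own illustration, not part of the slides):

#include <stdio.h>

/* Membership check for { a^n b^n c^n : n >= 1 }: the string must be a block
 * of a's, then b's, then c's, with all three blocks of the same non-zero length. */
static int is_anbncn(const char *w) {
    size_t i = 0, a = 0, b = 0, c = 0;
    while (w[i] == 'a') { a++; i++; }
    while (w[i] == 'b') { b++; i++; }
    while (w[i] == 'c') { c++; i++; }
    return w[i] == '\0' && a >= 1 && a == b && b == c;
}

int main(void) {
    const char *tests[] = { "abc", "aabbcc", "aabbc", "abcabc" };
    for (int i = 0; i < 4; i++)
        printf("%-8s %s\n", tests[i], is_anbncn(tests[i]) ? "in L" : "not in L");
    return 0;
}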
Regular Grammar
• Find the regular grammar for (a+b)*abb
• Grammar =<V,∑,P,S>
• Where
• V= set of variable/non terminal={S,A,B,C}
• ∑= set of alphabet/terminals={a,b}
• P= set of Production rules
• { S → aS / bS / aA
• A → bB
• B → bC
• C → ε }
• S= start symbol
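For comparison, the classic four-state DFA for (a+b)*abb accepts exactly the strings this grammar derives; a minimal C sketch (the state numbering is my own, not from the slides):

#include <stdio.h>

/* The classic 4-state DFA for (a+b)*abb: state k means the last k symbols
 * read match a prefix of "abb"; state 3 is accepting. */
static int accepts(const char *w) {
    static const int next[4][2] = {   /* next[state][0] on 'a', [1] on 'b' */
        {1, 0},   /* state 0: no progress           */
        {1, 2},   /* state 1: seen "a"              */
        {1, 3},   /* state 2: seen "ab"             */
        {1, 0}    /* state 3: seen "abb" (accepting) */
    };
    int state = 0;
    for (; *w; w++) {
        if (*w == 'a')      state = next[state][0];
        else if (*w == 'b') state = next[state][1];
        else return 0;                /* symbol outside {a,b} */
    }
    return state == 3;
}

int main(void) {
    const char *tests[] = { "abb", "aabb", "babb", "ab", "abba" };
    for (int i = 0; i < 5; i++)
        printf("%-6s %s\n", tests[i], accepts(tests[i]) ? "accepted" : "rejected");
    return 0;
}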
Regular Grammar
• Find the regular grammar for (a+b)(a+b+0+1)*
• Grammar =<V,∑,P,S>
• Where
• V = set of variables / non-terminals = {S, A}
• ∑ = set of alphabet / terminals = {a, b, 0, 1}
• P = set of production rules
• { S → aA / bA
• A → aA / bA / 0A / 1A / ε
• }
• S= start symbol
• Every regular set can be described by a context-free grammar.
• Why bother with regular expressions then?
• Reasons:
• 1. Lexical rules are usually quite simple, so a notation as powerful as a CFG is not required,
• and with regular-expression notation it is a bit easier to understand what set of strings is being defined than it is to grasp the language defined by a collection of production rules.
• 2. It is easier to construct an efficient recognizer from regular expressions than from context-free grammars.
• 3. Separating the syntactic structure of a language into lexical and non-lexical parts provides a convenient way of modularizing the front end of the compiler into two manageable-sized components.
• Regular expressions are most useful for describing the structure of lexical constructs such as identifiers, constants and keywords.
• Context-free grammars are most useful for describing nested structures such as balanced parentheses, matching begin-ends and corresponding if-then-else.
• Nested structures cannot be described by regular expressions.
• L1 = { wcw : w is in (a+b)* }
• is not context-free
• L2 = { aⁿbᵐcⁿdᵐ : n ≥ 1 and m ≥ 1 }
• is not context-free
• L3 = { aⁿbⁿcⁿ : n ≥ 0 }
• is not context-free
• L4 = { aⁿbᵐcᵐdⁿ : n ≥ 1 and m ≥ 1 }
• is context-free, as
• S → aSd / aAd
• A → bAc / bc
• L5 = { aⁿbⁿcᵐdᵐ : n ≥ 1 and m ≥ 1 }
• is context-free:
• S → AB
• A → aAb / ab
• B → cBd / cd
• L6 = { wcwᴿ : w is in (a+b)*, where wᴿ is w reversed }
• is context-free:
• S → aSa / bSb / c
• { aⁿbⁿ : n ≥ 1 }
• is context-free:
• S → aSb / ab
• Context-free languages are closed under union, concatenation and Kleene star, but not closed under intersection and complementation.
• L7 = { aᵐbⁿcⁿ : m, n ≥ 0 }
• is context-free, as the following productions generate it:
• S → AB
• A → aA / ε
• B → bBc / ε
• L8 = { aⁿbⁿcᵐ : m, n ≥ 0 }
• S → AB
• A → aAb / ε
• B → cB / ε
• is context-free
• L7 ∩ L8 = { aⁿbⁿcⁿ : n ≥ 0 }
• is not context-free
• Now consider a grammar with the following productions, with E as the start symbol:
• E → E+E / E*E / (E) / -E / id
• Now consider the following string: id+id*id
• Find its leftmost derivation
• E ⇒ E+E
• ⇒ id+E
• ⇒ id+E*E
• ⇒ id+id*E
• ⇒ id+id*id
Parse tree
Another derivation
• E ⇒ E*E
• ⇒ E+E*E
• ⇒ id+E*E
• ⇒ id+id*E
• ⇒ id+id*id
Parse tree
• Hence id+id*id has two parse trees,
• so the grammar is ambiguous.
• A grammar is called ambiguous if it produces more than one leftmost derivation, or more than one rightmost derivation, for some string.
• An ambiguous grammar has more than one parse tree for such a string.
• Now consider a grammar with following productions with E as start
symbol.
• E →E+E / E-E
• E → E*E
• E → E/E
• E → E↑E / (E)
• E→ -E/id
• Is it ambiguous?
• We can disambiguate by specifying the associativity and precedence of the arithmetic operators.
• Precedence of operators (highest to lowest):
• - (unary minus)
• ↑
• * /
• + -
• ↑ is right associative:
• a↑b↑c will be taken as a↑(b↑c)
• The others are left associative:
• a-b-c will be taken as (a-b)-c
Unambiguous grammar
• One non-terminal for each precedence level.
• An element represents an indivisible subexpression:
• an element is either a single identifier or a parenthesized expression.
• element → (expression)/id
• A primary is an element preceded by zero or more occurrences of the highest-precedence operator, unary minus.
• primary → -primary/element
• A factor is a sequence of one or more primaries connected by the exponentiation operator.
• factor → primary ↑ factor / primary
• A term is a sequence of one or more factors connected by multiplicative operators.
• term → term*factor
• term → term/factor
• term → factor
• Lastly, an expression is a sequence of one or more terms connected by + and -.
• expression → expression+term
• expression → expression-term
• expression → term


Final productions
• expression →expression+term
• expression →expression-term
• expression →term
• term →term*factor
• term →term/factor
• term →factor
• factor →primary ↑ factor
• factor → primary
• primary → -primary
• primary → element
• element → (expression)
• element → id
• Find derivation for id+id*id
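As an illustration (not from the slides), the sketch below follows these productions in C: single digits stand in for id and '^' stands in for ↑. The left-recursive expression and term rules are realised as loops, which keeps their left associativity and lets recursive descent terminate (left recursion is discussed later), while factor recurses on the right so that '^' is right associative.

#include <stdio.h>
#include <math.h>

/* Evaluator following the unambiguous precedence grammar above.
 * Single digits stand in for id, '^' stands in for the up-arrow,
 * and error handling is omitted in this sketch. */
static const char *ip;

static double expression(void);

static double element(void) {          /* element -> (expression) | digit */
    if (*ip == '(') {
        ip++;
        double v = expression();
        if (*ip == ')') ip++;
        return v;
    }
    return *ip ? *ip++ - '0' : 0;
}

static double primary(void) {          /* primary -> -primary | element */
    if (*ip == '-') { ip++; return -primary(); }
    return element();
}

static double factor(void) {           /* factor -> primary ^ factor | primary */
    double v = primary();
    if (*ip == '^') { ip++; return pow(v, factor()); }   /* right associative */
    return v;
}

static double term(void) {             /* term -> term (*|/) factor | factor */
    double v = factor();
    while (*ip == '*' || *ip == '/') {                    /* loop = left associative */
        char op = *ip++;
        double r = factor();
        v = (op == '*') ? v * r : v / r;
    }
    return v;
}

static double expression(void) {       /* expression -> expression (+|-) term | term */
    double v = term();
    while (*ip == '+' || *ip == '-') {
        char op = *ip++;
        double r = term();
        v = (op == '+') ? v + r : v - r;
    }
    return v;
}

int main(void) {
    const char *tests[] = { "2+3*4", "2^3^2", "8-3-2", "-(1+2)*3" };
    for (int i = 0; i < 4; i++) {
        ip = tests[i];
        printf("%-10s = %g\n", tests[i], expression());
    }
    return 0;
}

For example, 2^3^2 evaluates to 512 (right associative) and 8-3-2 to 3 (left associative), matching the rules stated above.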
• Find the derivation and parse tree for the statement shown on the slide,
• given the grammar production rules shown there (figures omitted).
Parse tree (figure omitted)
• Now, for the same grammar, find the parse tree for the string shown on the slide.
Parse tree-1 (figure omitted)
Parse tree-2 (figure omitted)
Now consider the following production rules (shown on the slide; from the derivation below they evidently include stmt → matched_stmt, matched_stmt → if exprn then matched_stmt else matched_stmt, and matched_stmt → other).
• For this grammar, find the parse tree and the derivation of the string
if E1 then S1 else if E2 then S2 else S3
• Stmt ⇒ matched_stmt
• ⇒ if exprn then matched_stmt else matched_stmt
• ⇒ if E1 then matched_stmt else matched_stmt
• ⇒ if E1 then other else matched_stmt
• ⇒ if E1 then S1 else matched_stmt
• ⇒ if E1 then S1 else if exprn then matched_stmt else matched_stmt
• ⇒ if E1 then S1 else if E2 then matched_stmt else matched_stmt
• ⇒ if E1 then S1 else if E2 then other else matched_stmt
• ⇒ if E1 then S1 else if E2 then S2 else matched_stmt
• ⇒ if E1 then S1 else if E2 then S2 else other
• ⇒ if E1 then S1 else if E2 then S2 else S3
Top-down parser
• Top-down parsing can be viewed as the problem of constructing a parse tree for the input string, starting from the root and creating the nodes of the parse tree in preorder.

• An example follows.

Top-down parser (Cont.)
• Given the grammar:
• E → TE’
• E’ → +TE’ | ε
• T → FT’
• T’ → *FT’ | ε
• F → (E) | id
• The input: id + id * id

Top-down parser (Cont.)

Top-down parser
• A top-down parsing program consists of a set of
procedures, one for each non-terminal.

• Execution begins with the procedure for the start


symbol, which halts and announces success if its
procedure body scans the entire input string.

Top-down parser
A typical procedure for non-terminal A in a top-down parser:

void A() {
choose an A-production, A → X1 X2 … Xk;
for (i= 1 to k) {
if (Xi is a non-terminal)
call procedure Xi();
else if (Xi matches the current input token “a”)
advance the input to the next token;
else /* an error has occurred */;
}
}
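For instance, instantiating this scheme for the earlier grammar E → TE’, E’ → +TE’ | ε, T → FT’, T’ → *FT’ | ε, F → (E) | id gives one small procedure per non-terminal. A C sketch (my own illustration; 'i' stands for the token id and error recovery is minimal):

#include <stdio.h>
#include <stdlib.h>

/* One procedure per non-terminal; each chooses its production by looking
 * at the current input token.  'i' stands in for the token id. */
static const char *ip;

static void error(void) { printf("error at '%c'\n", *ip); exit(1); }
static void match(char t) { if (*ip == t) ip++; else error(); }

static void E(void);

static void F(void) {                       /* F -> ( E ) | id */
    if (*ip == '(') { match('('); E(); match(')'); }
    else if (*ip == 'i') match('i');
    else error();
}
static void Tprime(void) {                  /* T' -> * F T' | e */
    if (*ip == '*') { match('*'); F(); Tprime(); }
    /* else: T' -> e, consume nothing */
}
static void T(void) { F(); Tprime(); }      /* T -> F T' */

static void Eprime(void) {                  /* E' -> + T E' | e */
    if (*ip == '+') { match('+'); T(); Eprime(); }
    /* else: E' -> e, consume nothing */
}
static void E(void) { T(); Eprime(); }      /* E -> T E' */

int main(void) {
    ip = "i+i*i";                           /* id + id * id */
    E();
    printf(*ip == '\0' ? "accepted\n" : "rejected\n");
    return 0;
}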

Recursive Descent Parser
• Consider the grammar:
S→cAd
A → ab | a

The input string is “cad”

Recursive Descent Parser (Cont.)
• Now, we have a match for the second input symbol “a”, so we
advance the input pointer to “d”, the third input symbol, and compare
d against the next leaf “b”.

• Backtracking
• Since “b” does not match “d”, we report failure and go back to A to see
whether there is another alternative for A that has not been tried - that might
produce a match!
• In going back to A, we must reset the input pointer to “a”.
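A minimal C sketch of this backtracking behaviour (my own illustration, not from the slides):

#include <stdio.h>

/* Backtracking recognizer for  S -> c A d,  A -> a b | a,  illustrating the
 * behaviour described above on the input "cad": A first tries the alternative
 * "ab", fails to find 'b', resets the input pointer, then succeeds with "a". */
static const char *ip;

static int A(void) {
    const char *save = ip;              /* remember the position of 'a'        */
    if (*ip == 'a') {                   /* try A -> ab first                   */
        ip++;
        if (*ip == 'b') { ip++; return 1; }
        ip = save;                      /* 'b' did not match: backtrack to 'a' */
    }
    if (*ip == 'a') { ip++; return 1; } /* then try A -> a                     */
    return 0;
}

static int S(void) {                    /* S -> c A d */
    if (*ip != 'c') return 0;
    ip++;
    if (!A()) return 0;
    if (*ip != 'd') return 0;
    ip++;
    return 1;
}

int main(void) {
    ip = "cad";
    printf("cad %s\n", (S() && *ip == '\0') ? "accepted" : "rejected");
    return 0;
}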

• A left-recursive grammar may cause a top-down parser to go into an infinite loop,
• so left recursion should be removed from the grammar before constructing a top-down parser for it.
• If the productions are
• A → Aα / β
• then after removing left recursion we get
• A → βA'
• A' → αA' / ε
Left Recursion
• Remove left recursion from the following productions of a grammar:
• E → E+T / T
• T → T*F / F
• F → (E) / id
• Productions after removing left recursion:
• E → TE’
• E’ → +TE’ / ε
• T → FT’
• T’ → *FT’ / ε
• F → (E) / id
Removing left recursion
• In general, to eliminate immediate left recursion among all A-productions, we group the A-productions as
• A → Aα₁ / Aα₂ / … / Aαₘ / β₁ / β₂ / … / βₙ
• where no βᵢ begins with A. We then replace the A-productions by
• A → β₁A' / β₂A' / … / βₙA'
• A' → α₁A' / α₂A' / … / αₘA' / ε
• Now consider the following:
• S → Aa / b
• A → Ac / Sd / ε
• Remove left recursion from these productions.
• A → Ac / Aad / bd / ε, after substituting S
• So the grammar productions after removing left recursion are:
• S → Aa / b
• A → bdA’ / εA’
• A’ → cA’ / adA’ / ε
Algorithm to remove left recursion
• Arrange the non-terminals in some order A₁, A₂, …, Aₙ.
• for i = 1 to n:
•   for j = 1 to i-1: replace each production of the form Aᵢ → Aⱼγ by Aᵢ → δ₁γ / δ₂γ / … / δₖγ, where Aⱼ → δ₁ / δ₂ / … / δₖ are the current Aⱼ-productions;
•   then eliminate the immediate left recursion among the Aᵢ-productions.
• Apply the algorithm to the grammar production rules shown on the slide.
• In many practical cases a top-down parser does not need backtracking, because the proper alternative can be detected by looking at only the first symbol it derives (e.g. the grammar shown on the slide).
• Left factoring may be used when some common prefix is shared among different alternatives.
Left factoring
• Left factoring is the process of factoring out the common prefixes of alternatives.
• If the productions are A → αβ₁ / αβ₂, then after left factoring the productions become
• A → αA'
• A' → β₁ / β₂
• Find the left-factored productions for the grammar shown on the slide (figure omitted).
Left Factoring

• A problem occurs when two productions for the same nonterminal


begin with the same token.
• We cannot decide which production to use.
Left Factoring
• Consider the grammar
A →  | .
• We use left factorization to transform it into the form
A → A'
A' →  | .
• Now we can apply the productions immediately and unambiguously.
Example: Left Factoring
• Consider the following productions and perform left factoring
C → id == num | id != num | id < num
• To perform left factoring, introduce a nonterminal C':
C → id C'
C' → == num | != num | < num
Example: Left Factoring
• Consider the grammar of if statements.
S → if C then S else S
| if C then S
• We rewrite it as
S → if C then S S'
S' → else S | .
Parsers
• Recursive Descent Parser
• Predictive Parser – a tabular implementation of a recursive-descent parser
Model of a table-driven predictive parser (figure omitted)

Predictive Parsing Algorithm
• The predictive parsing algorithm uses
• The parse table,
• An input buffer containing a sequence of tokens,
• A stack of grammar symbols.
• Initially
• The input buffer contains the input followed by $.
• The stack contains $ and S, with S on the top.
Predictive Parsing Algorithm
• Consider the top stack symbol X.
• There are three possibilities.
• X is a terminal.
• X is a nonterminal.
• X is $.
Predictive Parsing Algorithm
• If X is a terminal, then
• If X matches the current token,
• Pop X from the stack.
• Advance to the next token.
• If X does not match the current token, then that is an error.
Predictive Parsing Algorithm
• If X is a nonterminal, then
• Use X together with the current token to get the entry from the parse table.
• If the entry is a production,
• Pop X from the stack.
• Push the symbols on the right-hand side of the production, from right to left, onto the
stack.
• If the entry is not a production, then that is an error.
Predictive Parsing Algorithm
• If X is $, then
• If the current token is also $,
• Accept the input.
• If not, then that is an error.
Parsing table for the grammar E → TE’, E’ → +TE’ / ε, T → FT’, T’ → *FT’ / ε, F → (E) / id (blank entries are errors):

         id        +           *           (          )         $
E        E→TE’                             E→TE’
E’                 E’→+TE’                            E’→ε      E’→ε
T        T→FT’                             T→FT’
T’                 T’→ε       T’→*FT’                 T’→ε      T’→ε
F        F→id                              F→(E)
MOVES of predictive parser

• Stack Input Output


• $E id+id*id$
• $E’T id+id*id$ E →TE’
• $E’T’F id+id*id$ T →FT’
• $E’T’id id+id*id$ F →id
• $E’T’ +id*id$
• $E’ +id*id$ T’ → ε
• $E’T+ +id*id$ E’ →+TE’
MOVES of predictive parser

• Stack Input Output


• $E’T id*id$
• $E’T’F id*id$ T →FT’
• $E’T’id id*id$ F → id
• $E’T’ *id$
• $E’T’F* *id$ T’ →*FT’
MOVES of predictive parser

• Stack Input Output


• $E’T’F id$
• $E’T’id id$ F →id
• $E’T’ $
• $E’ $ T’ → ε
• $ $ E’ → ε
• Hence accepted
Predictive parsing algorithm
Set input pointer (ip) to the first token a;
Push $ and start symbol to the stack.
Set X to the top stack symbol;
while (X != $) { /*stack is not empty*/
if (X is token a) pop the stack and advance ip;
else if (X is another token) error();
else if (M[X,a] is an error entry) error();
else if (M[X,a] = X → Y1Y2…Yk) {
output the production X → Y1Y2…Yk;
pop the stack; /* pop X */
/* leftmost derivation*/
push Yk,Yk-1,…, Y1 onto the stack, with Y1 on top;
}
set X to the top stack symbol;
} // end while
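A compact C sketch of this algorithm for the grammar E → TE’, E’ → +TE’ / ε, T → FT’, T’ → *FT’ / ε, F → (E) / id (my own illustration; 'P' and 'Q' stand in for E' and T', 'i' for id, and the lookup function mirrors the parsing table above):

#include <stdio.h>
#include <string.h>

/* Table-driven predictive parser.  Symbols are single characters:
 * 'i' is id, 'P' stands for E', 'Q' stands for T', '$' is the end marker.
 * Each table entry is the right-hand side to push; "" represents e. */
static const char *lookup(char X, char a) {
    switch (X) {
    case 'E': if (a == 'i' || a == '(') return "TP"; break;        /* E  -> TE'  */
    case 'P': if (a == '+') return "+TP";                          /* E' -> +TE' */
              if (a == ')' || a == '$') return "";                 /* E' -> e    */
              break;
    case 'T': if (a == 'i' || a == '(') return "FQ"; break;        /* T  -> FT'  */
    case 'Q': if (a == '*') return "*FQ";                          /* T' -> *FT' */
              if (a == '+' || a == ')' || a == '$') return "";     /* T' -> e    */
              break;
    case 'F': if (a == 'i') return "i";                            /* F -> id    */
              if (a == '(') return "(E)";                          /* F -> (E)   */
              break;
    }
    return NULL;                                                   /* error entry */
}

int main(void) {
    const char *input = "i+i*i$";          /* id + id * id, followed by $ */
    const char *ip = input;
    char stack[64] = "$E";                 /* $ at the bottom, start symbol on top */
    int top = 1;

    while (stack[top] != '$') {
        char X = stack[top], a = *ip;
        if (strchr("i+*()", X)) {          /* X is a terminal */
            if (X == a) { top--; ip++; }   /* match: pop and advance */
            else { printf("error\n"); return 1; }
        } else {                           /* X is a non-terminal: consult the table */
            const char *rhs = lookup(X, a);
            if (!rhs) { printf("error\n"); return 1; }
            top--;                                     /* pop X */
            for (int k = (int)strlen(rhs) - 1; k >= 0; k--)
                stack[++top] = rhs[k];                 /* push RHS right to left */
        }
    }
    printf(*ip == '$' ? "accepted\n" : "error\n");
    return 0;
}

Running it on id+id*id reproduces the sequence of stack configurations shown in the MOVES tables above and ends by accepting the input.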
LL Parsing Methods
• LL parsing methods read the tokens from Left to right and parse them
top-down according to a Leftmost derivation.
Table-Driven LL Parsing
• To build the parsing table, we need the notion of nullability and the
two functions
• FIRST
• FOLLOW
