Unit 3 - Syntax Analysis

This document provides an overview of topics to be covered in a compiler design course, including parsing and grammar. It discusses the role of the parser in obtaining tokens from the lexical analyzer and generating a syntax tree. Context-free grammars are defined, including nonterminals, terminals, the start symbol, and productions. Methods of derivation including leftmost and rightmost are explained. The document also covers ambiguity in grammars and how to eliminate left recursion.

Uploaded by

isaiahethi

Hawassa University

Department of Computer Science


Compiler Design

Course code: CoSc4072


By: Mekonen M.
Topics to be covered
• Role of parser
• Context free grammar
• Derivation & Ambiguity
• Left recursion & Left factoring
• Classification of parsing
• Backtracking
• LL(1) parsing
• Recursive descent parsing
• Shift reduce parsing
• Operator precedence parsing
• LR parsing
Role of parser
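[Figure: Source program → Lexical analyzer → Parser → Rest of front end → IR. The lexical analyzer hands a token to the parser on each "get next token" request, and the parser hands the parse tree to the rest of the front end. A symbol table is drawn below the pipeline.]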
• The parser obtains a string of tokens from the lexical analyzer and reports a syntax error if there is one; otherwise it generates a syntax tree.
• There are two types of parser:
1. Top-down parser
2. Bottom-up parser
Mekonen M. # CoSc4072  Unit 3 – Syntax Analysis 4
Context free grammar
• A context free grammar (CFG) is a 4-tuple 𝐺 = (𝑉, 𝛴, 𝑆, 𝑃) where,
  𝑉 is a finite set of nonterminals,
  𝛴 is a finite set of terminals, disjoint from 𝑉,
  𝑆 is an element of 𝑉 and is the start symbol,
  𝑃 is a finite set of productions of the form 𝐴 → 𝛼 where 𝐴 ∈ 𝑉 and 𝛼 ∈ (𝑉 ∪ 𝛴)∗
• Nonterminal symbol:
  • The name of a syntactic category of a language, e.g., noun, verb, etc.
  • It is written as a single capital letter, or as a name enclosed between < … >, e.g., A or <Noun>.
• Terminal symbol:
  • A symbol in the alphabet.
  • It is denoted by a lowercase letter or by a punctuation mark used in the language.
• Start symbol:
  • The first nonterminal symbol of the grammar is called the start symbol.
• Production:
  • A production, also called a rewriting rule, is a rule of the grammar. It has the form
    A nonterminal symbol → String of terminal and nonterminal symbols
• Example:
  <Noun Phrase> → <Article><Noun>
  <Article> → a | an | the
  <Noun> → boy | apple


Example: Grammar
Write terminals, non terminals, start symbol, and productions for the following grammar.
E → E O E | (E) | -E | id
O → + | - | * | / | ↑

Terminals: id + - * / ↑ ( )
Non terminals: E, O
Start symbol: E
Productions: E → E O E | (E) | -E | id
             O → + | - | * | / | ↑
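The classification in this example can be checked mechanically. The following is a minimal sketch, assuming the grammar is stored as a Python dictionary from each nonterminal to its list of alternatives; the names `GRAMMAR`, `nonterminals` and `terminals` are illustrative and not part of the course material.

```python
# Productions of the example grammar, keyed by nonterminal.
# Each alternative is a list of grammar symbols.
GRAMMAR = {
    "E": [["E", "O", "E"], ["(", "E", ")"], ["-", "E"], ["id"]],
    "O": [["+"], ["-"], ["*"], ["/"], ["↑"]],
}
START = "E"  # the left-hand side of the first production


def nonterminals(grammar):
    """Nonterminals are exactly the symbols that have productions."""
    return set(grammar)


def terminals(grammar):
    """Terminals are right-hand-side symbols that have no productions."""
    symbols = {s for alts in grammar.values() for alt in alts for s in alt}
    return symbols - set(grammar)


print(sorted(nonterminals(GRAMMAR)))
print(sorted(terminals(GRAMMAR)))
```

Storing each alternative as a list of symbols (rather than one string) keeps multi-character terminals such as id intact.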


Derivation
• Derivation is used to find whether the string belongs to a given grammar or not.
• Types of derivations are:
1. Leftmost derivation
2. Rightmost derivation



Leftmost derivation
• A derivation of a string 𝑊 in a grammar 𝐺 is a leftmost derivation if at every step the leftmost nonterminal is replaced.
• Grammar: S→S+S | S-S | S*S | S/S | a      Output string: a*a-a

Leftmost derivation:
S
→S-S
→S*S-S
→a*S-S
→a*a-S
→a*a-a

Parse tree (the parse tree represents the structure of the derivation):
        S
      / | \
     S  -  S
   / | \   |
  S  *  S  a
  |     |
  a     a
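The derivation above can be replayed step by step. This is a small sketch, not a parser: it assumes the caller already knows which alternative to use at each step (given as an index into the list of alternatives), and the function name `leftmost_derivation` is made up for illustration.

```python
def leftmost_derivation(grammar, start, choices):
    """Replace the leftmost nonterminal at every step.

    `choices` gives, step by step, the index of the alternative to apply.
    Returns every sentential form, beginning with the start symbol.
    """
    form = [start]
    steps = ["".join(form)]
    for choice in choices:
        # Position of the leftmost nonterminal in the current form.
        pos = next(i for i, sym in enumerate(form) if sym in grammar)
        form[pos:pos + 1] = grammar[form[pos]][choice]
        steps.append("".join(form))
    return steps


GRAMMAR = {"S": [["S", "+", "S"], ["S", "-", "S"], ["S", "*", "S"],
                 ["S", "/", "S"], ["a"]]}

# Alternatives used on the slide, in order: S-S, S*S, a, a, a.
print(" → ".join(leftmost_derivation(GRAMMAR, "S", [1, 2, 4, 4, 4])))
```

A rightmost derivation would differ only in picking the last nonterminal position instead of the first.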


Rightmost derivation
• A derivation of a string 𝑊 in a grammar 𝐺 is a rightmost derivation if at every step the rightmost nonterminal is replaced.
• It is also called a canonical derivation.
• Grammar: S→S+S | S-S | S*S | S/S | a      Output string: a*a-a

Rightmost derivation:
S
→S*S
→S*S-S
→S*S-a
→S*a-a
→a*a-a

Parse tree:
    S
  / | \
 S  *  S
 |   / | \
 a  S  -  S
    |     |
    a     a
Exercise: Derivation
1. Perform leftmost derivation and draw parse tree.
S→A1B
A→0A | 𝜖
B→0B | 1B | 𝜖
Output string: 1001
2. Perform leftmost derivation and draw parse tree.
S→0S1 | 01 Output string: 000111
3. Perform rightmost derivation and draw parse tree.
E→E+E | E*E | id | (E) | -E
Output string: id + id * id



Ambiguity
• Ambiguity occurs when a word, phrase, or statement has more than one meaning.
• Example: "Chip" can mean a long thin piece of potato, or a small piece of silicon.


Ambiguity
• In a formal language grammar, ambiguity can arise if an identical string occurs on the RHS of two or more productions.
• Grammar:
  𝑁1 → 𝛼
  𝑁2 → 𝛼
• 𝛼 can be derived from either 𝑁1 or 𝑁2, so it is unclear whether 𝛼 should be replaced by 𝑁1 or by 𝑁2.


Ambiguous grammar
• An ambiguous grammar is one that produces more than one leftmost or more than one rightmost derivation for the same sentence.
• Grammar: S→S+S | S*S | (S) | a      Output string: a+a*a

Leftmost derivation 1:        Leftmost derivation 2:
S                             S
→S*S                          →S+S
→S+S*S                        →a+S
→a+S*S                        →a+S*S
→a+a*S                        →a+a*S
→a+a*a                        →a+a*a

Parse tree 1:                 Parse tree 2:
        S                         S
      / | \                     / | \
     S  *  S                   S  +  S
   / | \   |                   |   / | \
  S  +  S  a                   a  S  *  S
  |     |                         |     |
  a     a                         a     a

• Here, two leftmost derivations are possible for the string a+a*a; hence the above grammar is ambiguous.
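One way to confirm this kind of ambiguity is to count the distinct parse trees of a string. The sketch below is specific to the grammar S→S+S | S*S | (S) | a on this slide and assumes single-character tokens; `count_parse_trees` is an illustrative name. A count above 1 means the string has more than one parse tree.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count parse trees of S → S+S | S*S | (S) | a for `tokens`."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        """Number of trees that derive tokens[i:j] from S."""
        if j <= i:
            return 0
        if j - i == 1:
            return 1 if tokens[i] == "a" else 0
        total = 0
        # S → (S)
        if tokens[i] == "(" and tokens[j - 1] == ")":
            total += count(i + 1, j - 1)
        # S → S+S and S → S*S: try every operator position as the root.
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "*"):
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_parse_trees("a+a*a"))  # 2, matching the two trees on the slide
```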


Exercise: Ambiguous Grammar
Check Ambiguity in following grammars:
1. S→ aS | Sa | 𝜖 (output string: aaaa)
2. S→ aSbS | bSaS | 𝜖 (output string: abab)
3. S→ SS+ | SS* | a (output string: aa+a*)
4. <exp> → <exp> + <term> | <term>
<term> → <term> * <letter> | <letter>
<letter> → a|b|c|…|z (output string: a+b*c)
5. Prove that the CFG with productions: S → a | Sa | bSS | SSb | SbS is ambiguous
(Hint: consider output string yourself)



Left recursion
• A grammar is said to be left recursive if it has a nonterminal 𝐴 such that there is a derivation 𝐴 ⇒⁺ 𝐴𝛼 (in one or more steps) for some string 𝛼.
• Top-down parsing techniques cannot handle left-recursive grammars.
• So, we have to convert a left-recursive grammar into an equivalent grammar which is not left-recursive.
• Two types of left recursion:
  • Immediate left recursion: appears in a single step of the derivation.
  • Indirect left recursion: appears after more than one step of the derivation.


Left recursion elimination

𝐴 → 𝐴𝛼 | 𝛽

is replaced by

𝐴 → 𝛽𝐴’
𝐴’ → 𝛼𝐴’ | 𝜖


Examples: Left recursion elimination
E→E+T | T     becomes     E→TE’
                          E’→+TE’ | ε

T→T*F | F     becomes     T→FT’
                          T’→*FT’ | ε

X→X%Y | Z     becomes     X→ZX’
                          X’→%YX’ | ε
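The transformation used in these examples can be sketched as a short function. This is a minimal sketch for immediate left recursion only; it assumes alternatives are lists of symbols with the empty list standing for ε, and that the primed name (A') is not already used in the grammar. The function name is illustrative.

```python
def eliminate_immediate_left_recursion(nt, alternatives):
    """Rewrite A → Aα1 | … | β1 | … as A → βA', A' → αA' | ε.

    Alternatives are lists of symbols; [] stands for ε.
    """
    alphas = [alt[1:] for alt in alternatives if alt[:1] == [nt]]
    betas = [alt for alt in alternatives if alt[:1] != [nt]]
    if not alphas:
        return {nt: alternatives}  # no immediate left recursion
    new_nt = nt + "'"  # assumed to be a fresh name
    return {
        nt: [beta + [new_nt] for beta in betas],
        new_nt: [alpha + [new_nt] for alpha in alphas] + [[]],
    }


print(eliminate_immediate_left_recursion("E", [["E", "+", "T"], ["T"]]))
```

Indirect left recursion is not handled here; the productions would first have to be substituted into each other so that the recursion becomes immediate.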


Examples: Indirect Left recursion elimination
S→Aa | b
A→Ac | Sd | c
A is indirectly left recursive through S (A → Sd and S → Aa).

Substitute the S-productions S→Aa | b into A→Sd and rewrite:
S→Aa | b
A→Ac | Aad | bd | c

Now eliminate the immediate left recursion in A:
S→Aa | b
A→bdA’ | cA’
A’→cA’ | adA’ | ε


Exercise: Left recursion
1. A→Abd | Aa | a
B→Be | b
2. A→AB | AC | a | b
3. S→A | B
A→ABC | Acd | a | aa
B→Bee | b
4. Exp→Exp+term | Exp-term | term



Left factoring
If two or more production rules for a nonterminal have a common prefix string, the parser cannot decide which of the productions it should use to parse the string in hand.
Left factoring is a grammar transformation that is useful for producing a grammar suitable for predictive parsing.
Algorithm to left factor a grammar
Input: Grammar G
Output: An equivalent left factored grammar.
Method:
For each nonterminal A, find the longest prefix 𝛼 common to two or more of its alternatives. If 𝛼 ≠ 𝜖, i.e., there is a nontrivial common prefix, replace all the A productions
𝐴 → 𝛼𝛽1 | 𝛼𝛽2 | … | 𝛼𝛽𝑛 | 𝛾, where 𝛾 represents all alternatives that do not begin with 𝛼, by
𝐴 → 𝛼𝐴′ | 𝛾
𝐴′ → 𝛽1 | 𝛽2 | … | 𝛽𝑛
Here A′ is a new nonterminal. Repeatedly apply this transformation until no two alternatives for a nonterminal have a common prefix.


Left factoring elimination

𝐴 → 𝛼𝛽 | 𝛼δ

is replaced by

𝐴 → 𝛼𝐴′
𝐴′ → 𝛽 | δ


Example: Left factoring elimination
S→aAB | aCD              becomes     S→aS’
                                     S’→AB | CD

A→xByA | xByAzA | a      becomes     A→xByAA’ | a
                                     A’→ε | zA

A→aAB | aA | a           becomes     A→aA’
                                     A’→AB | A | ε
                         and then    A’→AA’’ | ε
                                     A’’→B | ε
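A single left-factoring step can be sketched as follows. This is a minimal sketch that assumes alternatives are lists of symbols with the empty list standing for ε, and that the caller supplies a fresh name for the new nonterminal; `common_prefix` and `left_factor_once` are illustrative names. The function would be applied repeatedly until nothing changes.

```python
from itertools import combinations


def common_prefix(a, b):
    """Longest common prefix of two symbol lists."""
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return a[:n]


def left_factor_once(nt, alternatives, new_nt):
    """Factor out the longest prefix shared by two or more alternatives."""
    prefix = max((common_prefix(a, b) for a, b in combinations(alternatives, 2)),
                 key=len, default=[])
    if not prefix:
        return {nt: alternatives}  # no nontrivial common prefix
    n = len(prefix)
    with_prefix = [alt for alt in alternatives if alt[:n] == prefix]
    gamma = [alt for alt in alternatives if alt[:n] != prefix]
    return {
        nt: [prefix + [new_nt]] + gamma,
        new_nt: [alt[n:] for alt in with_prefix],  # [] stands for ε
    }


print(left_factor_once("S", [["a", "A", "B"], ["a", "C", "D"]], "S'"))
```

Because the sketch always picks the longest shared prefix, for A→aAB | aA | a it factors out aA first, so the intermediate grammar differs from the slide (which factors out a first) even though both describe the same language.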
Exercise
1. S→iEtS | iEtSeS | a
2. A→ ad | a | ab | abc | x



Parsing
• Parsing is a technique that takes an input string and produces either a parse tree, if the string is a valid sentence of the grammar, or an error message indicating that the string is not valid.
• Types of parsing are:
1. Top down parsing: the parser builds the parse tree from the top (root) to the bottom (leaves).
2. Bottom up parsing: the parser starts from the leaves and works up to the root.


Classification of parsing methods
Parsing
• Top down parsing
  • Backtracking
  • Parsing without backtracking (predictive parsing)
    • LL(1)
    • Recursive descent
• Bottom up parsing (Shift reduce)
  • Operator precedence
  • LR parsing
    • SLR
    • CLR
    • LALR


Backtracking
• In backtracking, when expanding a nonterminal symbol we choose one alternative, and if a mismatch occurs we go back and try another alternative.
• Grammar: S→ cAd      Input string: cad
           A→ ab | a
1. Expand S→cAd and match c.
2. Make prediction A→ab: a matches, but b does not match d, so backtrack.
3. Make prediction A→a: a matches, then d matches. Parsing done.
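The try-and-retry behaviour in this example can be sketched with Python generators. This is an illustrative sketch for the grammar S→cAd, A→ab | a with single-character terminals; `match` and `accepts` are made-up names. It would loop forever on a left-recursive grammar, which is one reason left recursion is removed before top-down parsing.

```python
GRAMMAR = {"S": [["c", "A", "d"]], "A": [["a", "b"], ["a"]]}


def match(symbols, text, pos):
    """Yield every input position reachable after matching `symbols` from `pos`."""
    if not symbols:
        yield pos
        return
    first, rest = symbols[0], symbols[1:]
    if first in GRAMMAR:
        # Try each alternative in order; one that fails yields nothing,
        # so control falls through to the next alternative (backtracking).
        for alternative in GRAMMAR[first]:
            for mid in match(alternative, text, pos):
                yield from match(rest, text, mid)
    elif pos < len(text) and text[pos] == first:
        yield from match(rest, text, pos + 1)


def accepts(text, start="S"):
    """True if the whole input can be derived from the start symbol."""
    return any(end == len(text) for end in match([start], text, 0))


print(accepts("cad"))   # True: A→ab fails at b, then A→a succeeds
print(accepts("cabd"))  # True: A→ab succeeds directly
```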


Exercise
1. E→ 5+T | 3-T
T→ V | V*V | V+V
V→ a | b
String: 3-a+b





Predictive parsing or non-recursive predictive parsing
• A non-recursive predictive parser can be built by maintaining a stack explicitly, rather than implicitly via recursive calls.
• Non-recursive predictive parsing is table-driven top-down parsing.
[Figure: model of a table-driven predictive parser. The predictive parsing program reads the INPUT buffer (a + b $), uses a Stack (X Y Z $, with $ at the bottom), consults the Parsing table M, and produces the OUTPUT.]


LL(1) parser (predictive parser)
• LL(1) is a non-recursive top down parser.
1. The first L indicates that the input is scanned from left to right.
2. The second L means that it produces a leftmost derivation of the input string.
3. The 1 means that it uses only one input symbol of lookahead to predict each parsing step.


LL(1) parser (predictive parser) cont…
• Input buffer
  • the string to be parsed. We assume that its end is marked with a special symbol $.
• Output
  • a production rule representing a step of the leftmost derivation of the string in the input buffer.
• Stack
  • contains the grammar symbols.
  • at the bottom of the stack, there is a special end marker symbol $.
  • initially the stack contains only the symbol $ and the start symbol S.
  • when the stack is emptied (i.e. only $ is left in the stack), the parsing is completed.
• Parsing table
  • a two-dimensional array M[A,a]
  • each row is a nonterminal symbol
  • each column is a terminal symbol or the special symbol $
  • each entry holds a production rule.


LL(1) parsing (predictive parsing) cont…
Steps to construct LL(1) parser
1. Remove left recursion / Perform left factoring (if any).
2. Compute FIRST and FOLLOW of non terminals.
3. Construct predictive parsing table.
4. Parse the input string using parsing table.



Rules to compute first of non terminal
1. If 𝐴 → 𝑎𝛼 and 𝑎 is a terminal, add 𝑎 to 𝐹𝐼𝑅𝑆𝑇(𝐴).
2. If 𝐴 → 𝜖, add 𝜖 to 𝐹𝐼𝑅𝑆𝑇(𝐴).
3. If 𝑋 is a nonterminal and 𝑋 → 𝑌1 𝑌2 … 𝑌𝑘 is a production, then place 𝑎 in 𝐹𝐼𝑅𝑆𝑇(𝑋) if for some 𝑖, 𝑎 is in 𝐹𝐼𝑅𝑆𝑇(𝑌𝑖) and 𝜖 is in all of 𝐹𝐼𝑅𝑆𝑇(𝑌1), …, 𝐹𝐼𝑅𝑆𝑇(𝑌𝑖−1); that is, 𝑌1 … 𝑌𝑖−1 ⇒* 𝜖. If 𝜖 is in 𝐹𝐼𝑅𝑆𝑇(𝑌𝑗) for all 𝑗 = 1, 2, …, 𝑘, then add 𝜖 to 𝐹𝐼𝑅𝑆𝑇(𝑋).
Everything in 𝐹𝐼𝑅𝑆𝑇(𝑌1) except 𝜖 is surely in 𝐹𝐼𝑅𝑆𝑇(𝑋). If 𝑌1 does not derive 𝜖, then we add nothing more to 𝐹𝐼𝑅𝑆𝑇(𝑋), but if 𝑌1 ⇒* 𝜖, then we add 𝐹𝐼𝑅𝑆𝑇(𝑌2), and so on.


Rules to compute first of non terminal
Simplification of Rule 3
If 𝐴 → 𝑌1 𝑌2 … 𝑌𝑘,
• If 𝑌1 does not derive 𝜖, then
  𝐹𝐼𝑅𝑆𝑇(𝐴) = 𝐹𝐼𝑅𝑆𝑇(𝑌1)
• If 𝑌1 derives 𝜖, then
  𝐹𝐼𝑅𝑆𝑇(𝐴) = (𝐹𝐼𝑅𝑆𝑇(𝑌1) − {𝜖}) ∪ 𝐹𝐼𝑅𝑆𝑇(𝑌2)
• If 𝑌1 and 𝑌2 derive 𝜖, then
  𝐹𝐼𝑅𝑆𝑇(𝐴) = (𝐹𝐼𝑅𝑆𝑇(𝑌1) − {𝜖}) ∪ (𝐹𝐼𝑅𝑆𝑇(𝑌2) − {𝜖}) ∪ 𝐹𝐼𝑅𝑆𝑇(𝑌3)
• If 𝑌1, 𝑌2 and 𝑌3 derive 𝜖, then
  𝐹𝐼𝑅𝑆𝑇(𝐴) = (𝐹𝐼𝑅𝑆𝑇(𝑌1) − {𝜖}) ∪ (𝐹𝐼𝑅𝑆𝑇(𝑌2) − {𝜖}) ∪ (𝐹𝐼𝑅𝑆𝑇(𝑌3) − {𝜖}) ∪ 𝐹𝐼𝑅𝑆𝑇(𝑌4)
• If 𝑌1, 𝑌2, 𝑌3, …, 𝑌𝑘 all derive 𝜖, then
  𝐹𝐼𝑅𝑆𝑇(𝐴) = (𝐹𝐼𝑅𝑆𝑇(𝑌1) − {𝜖}) ∪ (𝐹𝐼𝑅𝑆𝑇(𝑌2) − {𝜖}) ∪ … ∪ 𝐹𝐼𝑅𝑆𝑇(𝑌𝑘)
  (note: if all of 𝑌1 … 𝑌𝑘 derive 𝜖, then 𝜖 is added to 𝐹𝐼𝑅𝑆𝑇(𝐴))
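The FIRST rules can be turned into a fixed-point computation: keep applying them to every production until no set grows. The sketch below is one possible way to do this, assuming the grammar is a dictionary of symbol lists with the empty list standing for ε; the names `first_of_string` and `compute_first` are illustrative. It uses the expression grammar E→TE’, E’→+TE’ | ε, T→FT’, T’→*FT’ | ε, F→(E) | id.

```python
EPS = "ε"


def first_of_string(symbols, first, grammar):
    """FIRST of a symbol string Y1 Y2 … Yk, given FIRST of each nonterminal."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in grammar else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result  # this symbol cannot vanish, so stop here
    result.add(EPS)  # every Yi derives ε (or the string is empty)
    return result


def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:  # repeat until no FIRST set grows
        changed = False
        for nt, alternatives in grammar.items():
            for alt in alternatives:
                new = first_of_string(alt, first, grammar)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


GRAMMAR = {  # [] is the ε-production
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}

for nt, symbols in compute_first(GRAMMAR).items():
    print(nt, sorted(symbols))
```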
Rules to compute FOLLOW of non terminal
1. Place $ in 𝐹𝑂𝐿𝐿𝑂𝑊(𝑆). (𝑆 is the start symbol)
2. If 𝐴 → 𝛼𝐵𝛽, then everything in 𝐹𝐼𝑅𝑆𝑇(𝛽) except 𝜖 is placed in 𝐹𝑂𝐿𝐿𝑂𝑊(𝐵).
3. If there is a production 𝐴 → 𝛼𝐵, or a production 𝐴 → 𝛼𝐵𝛽 where 𝐹𝐼𝑅𝑆𝑇(𝛽) contains 𝜖, then everything in 𝐹𝑂𝐿𝐿𝑂𝑊(𝐴) is in 𝐹𝑂𝐿𝐿𝑂𝑊(𝐵).


How to apply rules to find FOLLOW of non terminal?

For a production A → 𝛼𝐵𝛽:
• 𝛽 is absent: apply Rule 3
• 𝛽 is present:
  • 𝛽 is a terminal: apply Rule 2
  • 𝛽 is a nonterminal:
    • 𝛽 does not derive 𝜖: apply Rule 2
    • 𝛽 derives 𝜖: apply Rule 2 + Rule 3
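The FOLLOW rules can be applied in the same repeat-until-stable fashion. The following is a sketch under stated assumptions: the FIRST sets are supplied by hand (here for the expression grammar E→TE’, E’→+TE’ | ε, T→FT’, T’→*FT’ | ε, F→(E) | id), the empty list stands for ε, and `compute_follow` is an illustrative name.

```python
EPS, END = "ε", "$"

GRAMMAR = {  # [] is the ε-production
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}
# FIRST sets of the nonterminals, assumed to be computed beforehand.
FIRST = {"E": {"(", "id"}, "E'": {"+", EPS}, "T": {"(", "id"},
         "T'": {"*", EPS}, "F": {"(", "id"}}


def first_of_string(symbols):
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})  # FIRST of a terminal is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    return result | {EPS}


def compute_follow(start):
    follow = {nt: set() for nt in GRAMMAR}
    follow[start].add(END)  # Rule 1
    changed = True
    while changed:  # repeat until no FOLLOW set grows
        changed = False
        for a, alternatives in GRAMMAR.items():
            for alt in alternatives:
                for i, b in enumerate(alt):
                    if b not in GRAMMAR:
                        continue  # FOLLOW is defined for nonterminals only
                    beta_first = first_of_string(alt[i + 1:])
                    new = beta_first - {EPS}  # Rule 2
                    if EPS in beta_first:     # Rule 3: β absent or β ⇒* ε
                        new |= follow[a]
                    if not new <= follow[b]:
                        follow[b] |= new
                        changed = True
    return follow


for nt, symbols in compute_follow("E").items():
    print(nt, sorted(symbols))
```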


Rules to construct predictive parsing table
1. For each production 𝐴 → 𝛼 of the grammar, do steps 2 and 3.
2. For each terminal 𝑎 in 𝐹𝐼𝑅𝑆𝑇(𝛼), add 𝐴 → 𝛼 to 𝑀[𝐴, 𝑎].
3. If 𝜖 is in 𝐹𝐼𝑅𝑆𝑇(𝛼), add 𝐴 → 𝛼 to 𝑀[𝐴, 𝑏] for each terminal 𝑏 in 𝐹𝑂𝐿𝐿𝑂𝑊(𝐴). If 𝜖 is in 𝐹𝐼𝑅𝑆𝑇(𝛼) and $ is in 𝐹𝑂𝐿𝐿𝑂𝑊(𝐴), add 𝐴 → 𝛼 to 𝑀[𝐴, $].
4. Make each undefined entry of 𝑀 an error.


Example-1: LL(1) parsing
S→aBa
B→bB | ϵ

Step 1: Not required

Step 2: Compute FIRST
First(S): S→aBa, Rule 1 adds the leading terminal a.
  FIRST(S)={ a }
First(B): B→bB, Rule 1 adds the leading terminal b. B→𝜖, Rule 2 adds 𝜖.
  FIRST(B)={ b , 𝜖 }

NT   First
S    {a}
B    {b,𝜖}
Step 2 (continued): Compute FOLLOW
Follow(S): Rule 1 places $ in FOLLOW(S).
  Follow(S)={ $ }
Follow(B):
  S→aBa, Rule 2 (A → 𝛂B𝛃 with 𝛃 = a): add First(𝛽) − 𝜖 = { a }.
  B→bB, Rule 3 (A → 𝛂B with A = B): everything in Follow(B) is in Follow(B), which adds nothing new.
  Follow(B)={ a }

NT   First    Follow
S    {a}      {$}
B    {b,𝜖}    {a}
Step 3: Prepare predictive parsing table
S→aBa: Rule 2, FIRST(aBa)={ a }, so M[S,a]=S→aBa
B→bB: Rule 2, FIRST(bB)={ b }, so M[B,b]=B→bB
B→ϵ: Rule 3, FOLLOW(B)={ a }, so M[B,a]=B→𝜖
Rule 4: all remaining entries are Error.

NT   Input Symbol
     a         b         $
S    S→aBa     Error     Error
B    B→ϵ       B→bB      Error
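The table-construction rules used in this example can be sketched in a few lines. This is a minimal sketch, assuming FIRST and FOLLOW for the grammar S→aBa, B→bB | ε are entered by hand as computed above, with the empty list standing for ε; `build_table` is an illustrative name. A cell that would receive two productions raises an error, which is how a non-LL(1) grammar would show up.

```python
EPS, END = "ε", "$"

GRAMMAR = {"S": [["a", "B", "a"]], "B": [["b", "B"], []]}  # [] is ε
FIRST = {"S": {"a"}, "B": {"b", EPS}}
FOLLOW = {"S": {END}, "B": {"a"}}


def first_of_string(symbols):
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})  # FIRST of a terminal is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    return result | {EPS}


def build_table():
    table = {}
    for a, alternatives in GRAMMAR.items():
        for alt in alternatives:
            alt_first = first_of_string(alt)
            targets = alt_first - {EPS}  # Rule 2
            if EPS in alt_first:         # Rule 3; FOLLOW(A) may contain $
                targets |= FOLLOW[a]
            for terminal in targets:
                if (a, terminal) in table:
                    raise ValueError(f"conflict at M[{a},{terminal}]: not LL(1)")
                table[(a, terminal)] = alt
    return table  # keys that are missing are the Error entries (Rule 4)


for (nt, terminal), rhs in sorted(build_table().items()):
    print(f"M[{nt},{terminal}] = {nt}→{''.join(rhs) or EPS}")
```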
Example-2: LL(1) parsing
S→aB | ϵ
B→bC | ϵ
C→cS | ϵ

Step 1: Not required

Step 2: Compute FIRST
First(S): S→aB, Rule 1 adds the leading terminal a. S→𝜖, Rule 2 adds 𝜖.
  FIRST(S)={ a , 𝜖 }
First(B): B→bC, Rule 1 adds the leading terminal b. B→𝜖, Rule 2 adds 𝜖.
  FIRST(B)={ b , 𝜖 }
First(C): C→cS, Rule 1 adds the leading terminal c. C→𝜖, Rule 2 adds 𝜖.
  FIRST(C)={ c , 𝜖 }

NT   First
S    {a,𝜖}
B    {b,𝜖}
C    {c,𝜖}


Step 2 (continued): Compute FOLLOW
Follow(S): Rule 1 places $ in FOLLOW(S).
  C→cS, Rule 3 (A → 𝛂B): everything in Follow(C) is in Follow(S).
  Follow(S)={ $ }
Follow(B):
  S→aB, Rule 3 (A → 𝛂B): everything in Follow(S) is in Follow(B).
  Follow(B)={ $ }
Follow(C):
  B→bC, Rule 3 (A → 𝛂B): everything in Follow(B) is in Follow(C).
  Follow(C)={ $ }

NT   First    Follow
S    {a,𝜖}    {$}
B    {b,𝜖}    {$}
C    {c,𝜖}    {$}


Step 3: Prepare predictive parsing table
S→aB: Rule 2, FIRST(aB)={ a }, so M[S,a]=S→aB
S→𝜖: Rule 3, FOLLOW(S)={ $ }, so M[S,$]=S→𝜖
B→bC: Rule 2, FIRST(bC)={ b }, so M[B,b]=B→bC
B→𝜖: Rule 3, FOLLOW(B)={ $ }, so M[B,$]=B→𝜖
C→cS: Rule 2, FIRST(cS)={ c }, so M[C,c]=C→cS
C→𝜖: Rule 3, FOLLOW(C)={ $ }, so M[C,$]=C→𝜖
Rule 4: all remaining entries are Error.

NT   Input Symbol
     a        b        c        $
S    S→aB     Error    Error    S→𝜖
B    Error    B→bC     Error    B→𝜖
C    Error    Error    C→cS     C→𝜖
Example-3: LL(1) parsing
E→E+T | T
T→T*F | F
F→(E) | id
Step 1: Remove left recursion
E→TE’
E’→+TE’ | ϵ
T→FT’
T’→*FT’ | ϵ
F→(E) | id

Step 2: Compute FIRST
First(E): E→TE’, Rule 3 (A → Y1 Y2, First(A)=First(Y1)).
  FIRST(E)=FIRST(T)={ (, id }
First(T): T→FT’, Rule 3 (A → Y1 Y2, First(A)=First(Y1)).
  FIRST(T)=FIRST(F)={ (, id }
First(F): F→(E), Rule 1 adds the leading terminal (. F→id, Rule 1 adds id.
  FIRST(F)={ ( , id }
First(E’): E’→+TE’, Rule 1 adds the leading terminal +. E’→𝜖, Rule 2 adds 𝜖.
  FIRST(E’)={ + , 𝜖 }
First(T’): T’→*FT’, Rule 1 adds the leading terminal *. T’→𝜖, Rule 2 adds 𝜖.
  FIRST(T’)={ * , 𝜖 }

NT   First
E    { (,id }
E’   { +, 𝜖 }
T    { (,id }
T’   { *, 𝜖 }
F    { (,id }
Step 2 (continued): Compute FOLLOW
FOLLOW(E):
  Rule 1: place $ in FOLLOW(E).
  F→(E), Rule 2 (A → 𝛂B𝛃 with 𝛃 = )): add ).
  FOLLOW(E)={ $, ) }
FOLLOW(E’):
  E→TE’, Rule 3 (A → 𝛂B): everything in FOLLOW(E) is in FOLLOW(E’).
  E’→+TE’, Rule 3 (A → 𝛂B): adds nothing new.
  FOLLOW(E’)={ $, ) }
FOLLOW(T):
  E→TE’, Rule 2 (A → 𝛂B𝛃 with 𝛃 = E’): add FIRST(E’) − 𝜖 = { + }.
  E→TE’, Rule 3 (E’ derives 𝜖): everything in FOLLOW(E) is in FOLLOW(T).
  E’→+TE’, Rule 2 and Rule 3: add { + } and everything in FOLLOW(E’).
  FOLLOW(T)={ +, $, ) }
FOLLOW(T’):
  T→FT’, Rule 3 (A → 𝛂B): everything in FOLLOW(T) is in FOLLOW(T’).
  T’→*FT’, Rule 3 (A → 𝛂B): adds nothing new.
  FOLLOW(T’)={ +, $, ) }
FOLLOW(F):
  T→FT’, Rule 2 (A → 𝛂B𝛃 with 𝛃 = T’): add FIRST(T’) − 𝜖 = { * }.
  T→FT’, Rule 3 (T’ derives 𝜖): everything in FOLLOW(T) is in FOLLOW(F).
  T’→*FT’, Rule 2 and Rule 3: add { * } and everything in FOLLOW(T’).
  FOLLOW(F)={ *, +, $, ) }

NT   First       Follow
E    { (,id }    { $,) }
E’   { +, 𝜖 }    { $,) }
T    { (,id }    { +,$,) }
T’   { *, 𝜖 }    { +,$,) }
F    { (,id }    { *,+,$,) }
Step 3: Construct predictive parsing table
E→TE’: Rule 2, FIRST(TE’)={ (,id }, so M[E,(]=E→TE’ and M[E,id]=E→TE’
E’→+TE’: Rule 2, FIRST(+TE’)={ + }, so M[E’,+]=E’→+TE’
E’→𝜖: Rule 3, FOLLOW(E’)={ $,) }, so M[E’,$]=E’→𝜖 and M[E’,)]=E’→𝜖
T→FT’: Rule 2, FIRST(FT’)={ (,id }, so M[T,(]=T→FT’ and M[T,id]=T→FT’
T’→*FT’: Rule 2, FIRST(*FT’)={ * }, so M[T’,*]=T’→*FT’
T’→𝜖: Rule 3, FOLLOW(T’)={ +,$,) }, so M[T’,+]=T’→𝜖, M[T’,$]=T’→𝜖 and M[T’,)]=T’→𝜖
F→(E): Rule 2, FIRST((E))={ ( }, so M[F,(]=F→(E)
F→id: Rule 2, FIRST(id)={ id }, so M[F,id]=F→id
Rule 4: make each undefined entry of the table Error.

NT   Input Symbol
     id        +          *          (         )        $
E    E→TE’     Error      Error      E→TE’     Error    Error
E’   Error     E’→+TE’    Error      Error     E’→𝜖     E’→𝜖
T    T→FT’     Error      Error      T→FT’     Error    Error
T’   Error     T’→𝜖       T’→*FT’    Error     T’→𝜖     T’→𝜖
F    F→id      Error      Error      F→(E)     Error    Error



Step 4: Parse the string: id + id * id $

STACK       INPUT        OUTPUT
E$          id+id*id$
TE’$        id+id*id$    E→TE’
FT’E’$      id+id*id$    T→FT’
idT’E’$     id+id*id$    F→id
T’E’$       +id*id$
E’$         +id*id$      T’→𝜖
+TE’$       +id*id$      E’→+TE’
TE’$        id*id$
FT’E’$      id*id$       T→FT’
idT’E’$     id*id$       F→id
T’E’$       *id$
*FT’E’$     *id$         T’→*FT’
FT’E’$      id$
idT’E’$     id$          F→id
T’E’$       $
E’$         $            T’→𝜖
$           $            E’→𝜖
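The trace above follows a simple loop: pop the top of the stack, expand it through the table if it is a nonterminal, or match it against the input if it is a terminal. The sketch below hard-codes the Example-3 parsing table as a Python dictionary (missing keys are the Error entries, and the empty list is an ε-production); the name `ll1_parse` and the data layout are illustrative assumptions.

```python
END = "$"
NONTERMINALS = {"E", "E'", "T", "T'", "F"}

# Parsing table from Step 3; [] is an ε-production, missing keys are Error.
TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", END): [],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "+"): [], ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [], ("T'", END): [],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}


def ll1_parse(tokens, start="E"):
    """Return the productions applied, or raise SyntaxError."""
    tokens = list(tokens) + [END]
    stack = [END, start]  # the top of the stack is the end of the list
    pos, output = 0, []
    while stack:
        top, lookahead = stack.pop(), tokens[pos]
        if top in NONTERMINALS:
            rhs = TABLE.get((top, lookahead))
            if rhs is None:
                raise SyntaxError(f"no entry M[{top},{lookahead}]")
            output.append(f"{top}→{''.join(rhs) or 'ε'}")
            stack.extend(reversed(rhs))  # leftmost symbol ends up on top
        elif top == lookahead:
            pos += 1  # match a terminal (or the final $)
        else:
            raise SyntaxError(f"expected {top}, found {lookahead}")
    return output


for production in ll1_parse(["id", "+", "id", "*", "id"]):
    print(production)
```

The printed productions should be the OUTPUT column of the trace, read from top to bottom.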
A Grammar which is not LL(1)
• An entry in the parsing table of a grammar may contain more than one production rule.
• In this case, we say that it is not an LL(1) grammar.

S→iCtSE | a
E→eS | 𝜖
C→b

FIRST(S→iCtSE) = {i}
FIRST(S→a) = {a}
FIRST(E→eS) = {e}
FIRST(E→𝜖) = {𝜖}
FIRST(C→b) = {b}
FOLLOW(S) = { $,e }
FOLLOW(E) = { $,e }
FOLLOW(C) = { t }

     a       b       e             i            t       $
S    S→a                           S→iCtSE
E                    E→eS, E→𝜖                          E→𝜖
C            C→b

Problem ➔ ambiguity: the entry M[E,e] holds two productions.


A Grammar which is not LL(1)
• What do we have to do if the resulting parsing table contains multiply defined entries?
  • Eliminate left recursion in the grammar, if it has not been eliminated.
    • A left recursive grammar cannot be an LL(1) grammar: A → A𝛼 | 𝛽
      ➔ any terminal that appears in FIRST(𝛽) also appears in FIRST(A𝛼), because A𝛼 ⇒ 𝛽𝛼.
      ➔ If 𝛽 is 𝜖, any terminal that appears in FIRST(𝛼) also appears in FIRST(A𝛼) and FOLLOW(A).
  • Left factor the grammar, if it is not left factored.
    • If a grammar is not left factored, it cannot be an LL(1) grammar: A → 𝛼𝛽1 | 𝛼𝛽2
      ➔ any terminal that appears in FIRST(𝛼𝛽1) also appears in FIRST(𝛼𝛽2).
  • If the new grammar's parsing table still contains multiply defined entries, that grammar is ambiguous or it is inherently not an LL(1) grammar.
• An ambiguous grammar cannot be an LL(1) grammar.




Recursive descent parsing
• A top down parser that executes a set of recursive procedures to process the input without backtracking is called a recursive descent parser.
• There is a procedure for each nonterminal in the grammar.
• The RHS of a production rule is taken as the definition of the procedure for its nonterminal.
• As the parser reads the expected input symbol, it advances the input pointer to the next position.


Example: Recursive descent parsing
Procedure E Procedure T Proceduce Match(token t)
{ { {
If lookahead=num If lookahead=’*’ If lookahead=t
{ { lookahead=next_token;
Match(num); Match(‘*’); Else
T(); If lookahead=num Error();
} { }
Else Match(num);
Error(); T(); Procedure Error
If lookahead=$ } {
{ Else Print(“Error”);
Declare success; Error(); }
}
Else }
Error(); Else
} NULL E→ num T
} T→ * num T | 𝜖
3 * 4 $ Success

Mekonen M. # CoSc4072  Unit 3 – Syntax Analysis 86


Example: Recursive descent parsing
Grammar:
E → num T
T → * num T | 𝜖

Procedure E
{
    If lookahead = num
    {
        Match(num);
        T();
    }
    Else
        Error();
    If lookahead = $
    {
        Declare success;
    }
    Else
        Error();
}

Procedure T
{
    If lookahead = '*'
    {
        Match('*');
        If lookahead = num
        {
            Match(num);
            T();
        }
        Else
            Error();
    }
    Else
        NULL
}

Procedure Match(token t)
{
    If lookahead = t
        lookahead = next_token;
    Else
        Error();
}

Procedure Error
{
    Print("Error");
}

Input 3 * 4 $ : Success
Input 3 4 * $ : Error
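The procedures above can be sketched as runnable Python. This is only an illustrative sketch: the names `Parser`, `tokenize`, `accepts` and `ParseError` are assumptions made for this example, and the tokenizer simply treats every whitespace-separated run of digits as the token num.

```python
class ParseError(Exception):
    """Raised when the input does not match the grammar."""


def tokenize(text):
    """Split on whitespace; digit strings become 'num'; '$' is appended as end marker."""
    return ["num" if tok.isdigit() else tok for tok in text.split()] + ["$"]


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def match(self, expected):
        # Advance the input pointer only if the lookahead is the expected token.
        if self.lookahead != expected:
            raise ParseError(f"expected {expected!r}, found {self.lookahead!r}")
        self.pos += 1

    def parse_E(self):
        # E -> num T, after which only the end marker may remain.
        self.match("num")
        self.parse_T()
        if self.lookahead != "$":
            raise ParseError(f"unexpected {self.lookahead!r}")

    def parse_T(self):
        # T -> * num T | epsilon
        if self.lookahead == "*":
            self.match("*")
            self.match("num")
            self.parse_T()
        # Otherwise choose the epsilon alternative and consume nothing.


def accepts(text):
    """Return True if text is derivable from E, False otherwise."""
    try:
        Parser(tokenize(text)).parse_E()
        return True
    except ParseError:
        return False
```

With these definitions, `accepts("3 * 4")` corresponds to the Success case and `accepts("3 4 *")` to the Error case. Raising an exception instead of printing keeps the error path testable; that is a design choice of this sketch, not part of the slide.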


Bottom-up parsing
• A bottom-up parser creates the parse tree of the given input starting
from the leaves and working towards the root.
• A bottom-up parser tries to find the rightmost derivation (RMD) of the given input in
reverse order.
S ⇒ ... ⇒ ω (the rightmost derivation of ω)
← (the bottom-up parser finds the rightmost derivation in reverse order)
• Bottom-up parsing is also known as shift-reduce parsing because its
two main actions are shift and reduce.
• At each shift action, the current symbol in the input string is pushed onto a stack.
• At each reduction step, the symbols at the top of the stack (this symbol
sequence is the right side of a production) are replaced by the non-terminal
at the left side of that production.
• There are also two more actions: accept and error.



Handle & Handle pruning
• Handle: A “handle” of a string is a substring that matches the right side of a
production, and whose reduction to the non-terminal of that production is one step along the
reverse of a rightmost derivation. Not every substring that matches the right side of a
production rule is a handle.
• Handle pruning: The process of discovering a handle and reducing it to the appropriate
left-hand-side non-terminal is known as handle pruning.

Grammar:
E→E+E
E→E*E
E→id
String: id1+id2*id3

Rightmost derivation:
E ⇒ E+E ⇒ E+E*E ⇒ E+E*id3 ⇒ E+id2*id3 ⇒ id1+id2*id3

Handle pruning (the derivation in reverse):
Right sentential form    Handle    Production
id1+id2*id3              id1       E→id
E+id2*id3                id2       E→id
E+E*id3                  id3       E→id
E+E*E                    E*E       E→E*E
E+E                      E+E       E→E+E
E
Shift reduce parser
• A shift-reduce parser tries to reduce the given input string to the starting symbol.
a string ➔ (reduced to) the starting symbol
• The shift-reduce parser performs the following basic operations:
1. Shift: Moving the next symbol from the input buffer onto the stack is called the shift action.
2. Reduce: If a handle appears on the top of the stack, it is reduced by the appropriate rule.
This is called the reduce action.
3. Accept: If the stack contains only the start symbol and the input buffer is empty at the same
time, the parser performs the accept action.
4. Error: If the parser can neither shift nor reduce, and cannot perform the accept action,
it performs the error action.

• The initial stack contains only the end-marker $.
• The end of the input string is marked by the end-marker $.
Example: Shift reduce parser
Grammar:
E→E+T | T
T→T*F | F
F→id
String: id+id*id

Stack      Input Buffer   Action
$          id+id*id$      Shift
$id        +id*id$        Reduce F→id
$F         +id*id$        Reduce T→F
$T         +id*id$        Reduce E→T
$E         +id*id$        Shift
$E+        id*id$         Shift
$E+id      *id$           Reduce F→id
$E+F       *id$           Reduce T→F
$E+T       *id$           Shift
$E+T*      id$            Shift
$E+T*id    $              Reduce F→id
$E+T*F     $              Reduce T→T*F
$E+T       $              Reduce E→E+T
$E         $              Accept
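The trace above can be reproduced with a small Python sketch. It is not a general shift-reduce parser: the choice between shifting and reducing is hard-coded for this one grammar (reduce to E only when the next input symbol is not `*`), whereas a real parser would take that decision from a parsing table. The function name `shift_reduce` and the ASCII arrow `->` in the action strings are assumptions of this example.

```python
def shift_reduce(tokens):
    """Return the list of (stack, input buffer, action) steps; raise ValueError on error."""
    stack = ["$"]
    buf = list(tokens) + ["$"]
    steps = []

    def record(action):
        # Record the configuration as it is before the action is applied.
        steps.append(("".join(stack), "".join(buf), action))

    while True:
        nxt = buf[0]
        top = stack[-1]
        if top == "id":
            record("Reduce F->id")
            stack[-1] = "F"
        elif top == "F" and stack[-3:-1] == ["T", "*"]:
            record("Reduce T->T*F")
            del stack[-3:]
            stack.append("T")
        elif top == "F":
            record("Reduce T->F")
            stack[-1] = "T"
        elif top == "T" and nxt != "*" and stack[-3:-1] == ["E", "+"]:
            record("Reduce E->E+T")
            del stack[-3:]
            stack.append("E")
        elif top == "T" and nxt != "*":
            record("Reduce E->T")
            stack[-1] = "E"
        elif stack == ["$", "E"] and nxt == "$":
            record("Accept")
            return steps
        elif nxt != "$":
            record("Shift")
            stack.append(buf.pop(0))
        else:
            raise ValueError("syntax error")
```

Calling `shift_reduce(["id", "+", "id", "*", "id"])` yields the same fourteen steps as the table, starting with a shift and ending with Accept on stack `$E`.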
Conflicts During Shift-Reduce Parsing
• There are context-free grammars for which shift-reduce parsers cannot
be used.
• For such grammars, the stack contents and the next input symbol may not be enough to decide the action:
• shift/reduce conflict: The parser cannot decide whether to shift or to reduce.
• reduce/reduce conflict: The parser cannot decide which of several reductions to
make.
• If a shift-reduce parser cannot be used for a grammar, that grammar is
called a non-LR(k) grammar.
LR(k): L = left-to-right scanning, R = right-most derivation, k = number of lookahead symbols

• An ambiguous grammar can never be an LR grammar.



Shift reduce parser
• There are two main categories of shift-reduce parsers:

1. Operator-Precedence Parser
• simple, but handles only a small class of grammars.

2. LR-Parsers
• cover a wide range of grammars.
• SLR – simple LR parser
• LR – most general LR parser
• LALR – lookahead LR parser (intermediate LR parser)
• SLR, LR and LALR parsers work the same way; only their parsing tables are different.

Grammar classes (nested sets): SLR ⊂ LALR ⊂ LR ⊂ CFG



LR parser
• LR parsing is the most efficient method of bottom-up parsing and can be used to parse a large
class of context-free grammars.
• The technique is called LR(k) parsing:
1. The “L” is for left-to-right scanning of the input,
2. The “R” is for constructing a rightmost derivation in reverse,
3. The “k” is for the number of input symbols of lookahead that are used in making parsing
decisions.

[Figure: model of an LR parser. The INPUT buffer (a + b $) feeds the LR parsing program, which uses a stack (X, Y, Z above $) and a parsing table (Action and Goto parts) and produces the OUTPUT.]
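LR Parser …
[Figure: the same LR parser model (input a + b $, stack, LR parsing program, OUTPUT), with the parsing table split into two parts.]
• Action table: rows are states, columns are terminals and $; each entry is one of four different actions.
• Goto table: rows are states, columns are non-terminals; each entry is a state number.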


Steps to construct SLR parser
1. Construct Canonical set of LR(0) items
2. Construct SLR parsing table
3. Parse the input string



Construct Canonical set of LR(0) items
• An LR parser makes shift-reduce decisions by maintaining states to keep track of
where we are in a parse.
• An LR(0) item of a grammar G is a production of G with a dot at some position of the
right side.
• Ex: A → aBb has four possible LR(0) items:
A → .aBb
A → a.Bb
A → aB.b
A → aBb.
• An LR(0) item indicates how much of the input has been scanned up to
a given point in the process of parsing.
• Sets of LR(0) items will be the states of the action and goto tables of the SLR parser.
• i.e. states represent sets of "items".
• A collection of sets of LR(0) items (the canonical LR(0) collection) is the basis for
constructing SLR parsers.



Construct Canonical set of LR(0) items …
• To construct the canonical LR(0) collection for a grammar, we define an augmented grammar and
two functions, CLOSURE and GOTO.
Augmented Grammar:
• G’ is G with a new production rule S’→S where S’ is the new starting symbol.
• Purpose: to indicate the acceptance of input. If you reduce by this particular production (to the non-terminal S′),
you are accepting.
Closure:
• If I is a set of LR(0) items for a grammar G, then closure(I) is the set of LR(0) items constructed
from I by the two rules:
1. Initially, every LR(0) item in I is added to closure(I).
2. If A → α.Bβ is in closure(I), then for all production rules B→γ in G, add B→.γ to closure(I).
Apply this rule until no more new LR(0) items can be added to closure(I).
Goto operation:
• If I is a set of LR(0) items and X is a grammar symbol (terminal or non-terminal), then goto(I,X) is
defined as follows:
• If A → α.Xβ is in I, then every item in closure({A → αX.β}) will be in goto(I,X).
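The two functions defined above can be sketched in Python as follows. The representation is an assumption of this example: an LR(0) item is a triple (head, body, dot position), and `GRAMMAR` holds the augmented grammar S' → S, S → AA, A → aA | b used in the SLR example.

```python
# Augmented grammar: S' -> S, S -> AA, A -> aA | b.
GRAMMAR = {
    "S'": [("S",)],
    "S": [("A", "A")],
    "A": [("a", "A"), ("b",)],
}


def closure(items):
    """items: iterable of (head, body, dot) triples; returns the LR(0) closure."""
    result = set(items)
    work = list(result)
    while work:
        head, body, dot = work.pop()
        # Rule 2: the dot stands before a non-terminal B, so add every B -> .gamma
        if dot < len(body) and body[dot] in GRAMMAR:
            for prod in GRAMMAR[body[dot]]:
                item = (body[dot], prod, 0)
                if item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)


def goto(items, symbol):
    """Move the dot over `symbol` wherever possible, then take the closure."""
    moved = {(h, b, d + 1) for (h, b, d) in items if d < len(b) and b[d] == symbol}
    return closure(moved)
```

For instance, `closure({("S'", ("S",), 0)})` gives the four items of the initial state, and `goto` of that state on `a` gives the three items A → a.A, A → .aA, A → .b.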



Computation of closure & go to function
Production: X → Xb
Closure(I):
X → .Xb
Goto(I,X):
X → X.b



Example: SLR(1)- simple LR
Grammar:
S → AA
A → aA | b

Augmented grammar:
S’ → S
S → AA
A → aA | b

LR(0) item sets:
I0:
    S’→.S
    S→.AA
    A→.aA
    A→.b
I1 = Goto(I0, S):
    S’→S.
I2 = Goto(I0, A):
    S→A.A
    A→.aA
    A→.b
I3 = Goto(I0, a):
    A→a.A
    A→.aA
    A→.b
I4 = Goto(I0, b):
    A→b.
I5 = Goto(I2, A):
    S→AA.
I6 = Goto(I3, A):
    A→aA.
Goto(I2, a) = I3, Goto(I2, b) = I4
Goto(I3, a) = I3, Goto(I3, b) = I4



Rules to construct SLR parsing table
1. Construct C = {I0, I1, ..., In}, the collection of sets of LR(0) items for G’.
2. State i is constructed from I_i. The parsing actions for state i are determined as
follows:
a) If [A → α.aβ] is in I_i and GOTO(I_i, a) = I_j, then set ACTION[i, a] to “shift j”. Here
a must be a terminal.
b) If [A → α.] is in I_i, then set ACTION[i, a] to “reduce A → α” for all a in FOLLOW(A);
here A may not be S’.
c) If [S’ → S.] is in I_i, then set ACTION[i, $] to “accept”.
3. The goto transitions for state i are constructed for all non-terminals A using
the rule: if GOTO(I_i, A) = I_j then GOTO[i, A] = j.
4. All entries not defined by rules 2 and 3 are made “error”.
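As a rough illustration of rules 1 to 4, the following Python sketch builds the ACTION and GOTO tables for the grammar S → AA, A → aA | b. Several things are assumptions of the sketch rather than part of the rules: items are (production index, dot) pairs, the FOLLOW sets are written in by hand instead of being computed, and the symbol order in `SYMBOLS` is chosen so that the states come out numbered I0 to I6 as in the example. Missing dictionary entries play the role of “error”.

```python
# Production 0 is the augmenting rule; 1..3 are the numbered grammar rules.
PRODS = [("S'", ("S",)), ("S", ("A", "A")), ("A", ("a", "A")), ("A", ("b",))]
NONTERMS = {"S'", "S", "A"}
SYMBOLS = ["S", "A", "a", "b"]                 # exploration order (assumed)
FOLLOW = {"S": {"$"}, "A": {"a", "b", "$"}}    # given by hand, not computed


def closure(items):
    items = set(items)
    changed = True
    while changed:
        changed = False
        for p, dot in list(items):
            body = PRODS[p][1]
            if dot < len(body) and body[dot] in NONTERMS:
                for q, (head, _) in enumerate(PRODS):
                    if head == body[dot] and (q, 0) not in items:
                        items.add((q, 0))
                        changed = True
    return frozenset(items)


def goto(items, sym):
    return closure({(p, d + 1) for p, d in items
                    if d < len(PRODS[p][1]) and PRODS[p][1][d] == sym})


def set_action(action, state, sym, value):
    # Two different entries in one cell mean the grammar is not SLR(1).
    if action.setdefault((state, sym), value) != value:
        raise ValueError(f"conflict in state {state} on {sym!r}")


def build_slr():
    states = [closure({(0, 0)})]               # rule 1: start from closure of S' -> .S
    action, goto_tab = {}, {}
    i = 0
    while i < len(states):                     # the list grows as new item sets appear
        for sym in SYMBOLS:
            target = goto(states[i], sym)
            if not target:
                continue
            if target not in states:
                states.append(target)
            j = states.index(target)
            if sym in NONTERMS:
                goto_tab[(i, sym)] = j                     # rule 3
            else:
                set_action(action, i, sym, f"S{j}")        # rule 2a
        for p, dot in states[i]:
            head, body = PRODS[p]
            if dot == len(body):
                if p == 0:
                    set_action(action, i, "$", "Accept")   # rule 2c
                else:
                    for a in FOLLOW[head]:
                        set_action(action, i, a, f"R{p}")  # rule 2b
        i += 1
    return action, goto_tab
```

`build_slr()` returns two dictionaries keyed by (state, symbol); for this grammar no call to `set_action` raises, which is consistent with the grammar being SLR(1).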



Example: SLR(1)- simple LR …
Grammar (productions numbered for the reduce entries):
1) S → AA
2) A → aA
3) A → b

Follow(S) = {$}
Follow(A) = {a, b, $}

SLR parsing table, built from the LR(0) item sets I0 to I6:
Item set | Action             | Go to
         | a    b    $        | S   A
0        | S3   S4            | 1   2
1        |           Accept   |
2        | S3   S4            |     5
3        | S3   S4            |     6
4        | R3   R3   R3       |
5        |           R1       |
6        | R2   R2   R2       |



Example : Construct SLR(1)
Construct SLR Parse table for the augmented grammar and
show how the parser accepts the string or input id*id+id
E → E+T
E→T
T → T*F
T→F
F → (E)
F → id



1. Construct Canonical set of LR(0) item



2. Construct SLR parsing table
Productions:
1) E → E+T
2) E → T
3) T → T*F
4) T → F
5) F → (E)
6) F → id

state | Action Table                  | Goto Table
      | id   +    *    (    )    $    | E   T   F
0     | s5             s4             | 1   2   3
1     |      s6                  acc  |
2     |      r2   s7        r2   r2   |
3     |      r4   r4        r4   r4   |
4     | s5             s4             | 8   2   3
5     |      r6   r6        r6   r6   |
6     | s5             s4             |     9   3
7     | s5             s4             |         10
8     |      s6             s11       |
9     |      r1   s7        r1   r1   |
10    |      r3   r3        r3   r3   |
11    |      r5   r5        r5   r5   |
3. Parse the input string
Rules:
1) E → E+T    4) T → F
2) E → T      5) F → (E)
3) T → T*F    6) F → id

stack        symbol   input       action               output
0                     id*id+id$   shift 5
0 5          id       *id+id$     reduce by F→id       F→id
0 3          F        *id+id$     reduce by T→F        T→F
0 2          T        *id+id$     shift 7
0 2 7        T*       id+id$      shift 5
0 2 7 5      T*id     +id$        reduce by F→id       F→id
0 2 7 10     T*F      +id$        reduce by T→T*F      T→T*F
0 2          T        +id$        reduce by E→T        E→T
0 1          E        +id$        shift 6
0 1 6        E+       id$         shift 5
0 1 6 5      E+id     $           reduce by F→id       F→id
0 1 6 3      E+F      $           reduce by T→F        T→F
0 1 6 9      E+T      $           reduce by E→E+T      E→E+T
0 1          E        $           accept
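The table-driven parse can be sketched in Python as below. The dictionaries `ACTION` and `GOTO` are a transcription of the SLR parsing table for this grammar, `RULES` stores the head and right-side length of each numbered production, and the function name `lr_parse` is an assumption of this example. The function returns the numbers of the productions used, in the order of the output column.

```python
# (head, length of right side) for the numbered productions 1..6.
RULES = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3), 4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}

REDUCE_ON = ("+", "*", ")", "$")
ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: {t: "r4" for t in REDUCE_ON},
    4: {"id": "s5", "(": "s4"},
    5: {t: "r6" for t in REDUCE_ON},
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: {t: "r3" for t in REDUCE_ON},
    11: {t: "r5" for t in REDUCE_ON},
}
GOTO = {(0, "E"): 1, (0, "T"): 2, (0, "F"): 3,
        (4, "E"): 8, (4, "T"): 2, (4, "F"): 3,
        (6, "T"): 9, (6, "F"): 3, (7, "F"): 10}


def lr_parse(tokens):
    """Return the rule numbers in order of reduction; raise ValueError on a syntax error."""
    stack = [0]                        # stack of state numbers
    buf = list(tokens) + ["$"]
    output = []
    while True:
        act = ACTION[stack[-1]].get(buf[0])
        if act is None:                # empty table cell: error entry
            raise ValueError(f"unexpected {buf[0]!r} in state {stack[-1]}")
        if act == "acc":
            return output
        if act[0] == "s":
            stack.append(int(act[1:]))     # shift: push the state, consume the token
            buf.pop(0)
        else:
            rule = int(act[1:])
            head, length = RULES[rule]
            del stack[-length:]            # pop one state per right-side symbol
            stack.append(GOTO[(stack[-1], head)])
            output.append(rule)
```

For the input id*id+id the call is `lr_parse(["id", "*", "id", "+", "id"])`. Only state numbers are kept on the stack here; the symbol column of the trace is redundant for the algorithm and is shown on the slide for readability.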



Activity
• Construct the SLR parse table for the augmented grammar and show how
the parser accepts the input string id-id

E → F - E | F
F → id



How to calculate look ahead?
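Grammar:
S → CC
C → cC | d

For an item [A → α . X β , a], the items that closure adds for X get the lookahead First(βa).

Closure(I):
S’ → .S, $
S → .CC, $
C → .cC, c|d
C → .d, c|d

• From S’ → . S , $: β is empty and a = $, so the lookahead for S → .CC is First($) = {$}.
• From S → . C C , $: β = C and a = $, so the lookahead for the C items is First(C$) = {c, d}.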
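A minimal Python sketch of this lookahead computation is given below. It relies on a simplifying assumption that happens to hold for this grammar: there are no ε-productions, so First of a string is First of its first symbol. Items are written as (head, body, dot, lookahead), and the function names are chosen for this example only.

```python
# Augmented grammar: S' -> S, S -> CC, C -> cC | d.
GRAMMAR = {"S'": [("S",)], "S": [("C", "C")], "C": [("c", "C"), ("d",)]}


def first_of_symbol(sym, seen=frozenset()):
    """FIRST of a single symbol; assumes the grammar has no epsilon productions."""
    if sym not in GRAMMAR:                 # a terminal or the end marker $
        return {sym}
    result = set()
    for body in GRAMMAR[sym]:
        if body[0] not in seen | {sym}:    # do not revisit a non-terminal
            result |= first_of_symbol(body[0], seen | {sym})
    return result


def first_of_string(symbols):
    # Without epsilon productions only the first symbol matters.
    return first_of_symbol(symbols[0])


def closure_lr1(items):
    """items: iterable of (head, body, dot, lookahead); returns the LR(1) closure."""
    result = set(items)
    work = list(result)
    while work:
        head, body, dot, la = work.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            beta_a = body[dot + 1:] + (la,)        # the string "beta a"
            for b in first_of_string(beta_a):
                for prod in GRAMMAR[body[dot]]:
                    item = (body[dot], prod, 0, b)
                    if item not in result:
                        result.add(item)
                        work.append(item)
    return result
```

Starting from `closure_lr1({("S'", ("S",), 0, "$")})`, the S item carries the lookahead $ and each C item appears once with c and once with d, which is what the notation c|d abbreviates.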



Example: CLR(1)- canonical LR
Grammar:
S → AA
A → aA | b

LR(1) item sets (augmented grammar):
I0:
    S’→.S, $
    S→.AA, $
    A→.aA, a|b
    A→.b, a|b
I1 = Goto(I0, S):
    S’→S., $
I2 = Goto(I0, A):
    S→A.A, $
    A→.aA, $
    A→.b, $
I3 = Goto(I0, a):
    A→a.A, a|b
    A→.aA, a|b
    A→.b, a|b
I4 = Goto(I0, b):
    A→b., a|b
I5 = Goto(I2, A):
    S→AA., $
I6 = Goto(I2, a):
    A→a.A, $
    A→.aA, $
    A→.b, $
I7 = Goto(I2, b):
    A→b., $
I8 = Goto(I3, A):
    A→aA., a|b
I9 = Goto(I6, A):
    A→aA., $
Goto(I3, a) = I3, Goto(I3, b) = I4
Goto(I6, a) = I6, Goto(I6, b) = I7



Example: CLR(1)- canonical LR …
Productions: 1) S → AA   2) A → aA   3) A → b

CLR parsing table, built from the LR(1) item sets I0 to I9:
Item set | Action             | Go to
         | a    b    $        | S   A
0        | S3   S4            | 1   2
1        |           Accept   |
2        | S6   S7            |     5
3        | S3   S4            |     8
4        | R3   R3            |
5        |           R1       |
6        | S6   S7            |     9
7        |           R3       |
8        | R2   R2            |
9        |           R2       |


Example: LALR(1)- look ahead LR
Grammar:
S → AA
A → aA | b

The LR(1) item sets I0 to I9 are the same as in the CLR example. Item sets that contain the same
items and differ only in their lookaheads are merged, and their lookaheads are combined:
I36 (I3 and I6 merged):
    A→a.A, a|b|$
    A→.aA, a|b|$
    A→.b, a|b|$
I47 (I4 and I7 merged):
    A→b., a|b|$
I89 (I8 and I9 merged):
    A→aA., a|b|$



Example: LALR(1)- look ahead LR …

CLR Parsing Table
Item set | Action             | Go to
         | a    b    $        | S   A
0        | S3   S4            | 1   2
1        |           Accept   |
2        | S6   S7            |     5
3        | S3   S4            |     8
4        | R3   R3            |
5        |           R1       |
6        | S6   S7            |     9
7        |           R3       |
8        | R2   R2            |
9        |           R2       |

LALR Parsing Table
Item set | Action             | Go to
         | a    b    $        | S   A
0        | S36  S47           | 1   2
1        |           Accept   |
2        | S36  S47           |     5
36       | S36  S47           |     89
47       | R3   R3   R3       |
5        |           R1       |
89       | R2   R2   R2       |
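The step from the CLR states to the LALR states can be sketched in Python as a grouping by core, where the core of a state is its set of items with the lookaheads ignored. The data layout is an assumption of this example: each state is a set of (item, lookahead) pairs, items are plain strings, and a merged state is named by concatenating the numbers of the states it combines.

```python
from collections import defaultdict


def merge_by_core(states):
    """states: dict mapping a state number to a set of (item, lookahead) pairs.
    Returns a dict mapping merged state names to the union of the merged sets."""
    groups = defaultdict(list)
    for name, items in states.items():
        core = frozenset(item for item, _ in items)   # drop the lookaheads
        groups[core].append(name)
    merged = {}
    for names in groups.values():
        new_name = "".join(str(n) for n in names)     # e.g. 3 and 6 -> "36"
        merged[new_name] = set().union(*(states[n] for n in names))
    return merged


def items(cores, lookaheads):
    """Helper: pair every item with every lookahead character."""
    return {(c, a) for c in cores for a in lookaheads}


# The six CLR states of the example that have a partner with the same core.
CLR_STATES = {
    3: items(["A->a.A", "A->.aA", "A->.b"], "ab"),
    4: items(["A->b."], "ab"),
    6: items(["A->a.A", "A->.aA", "A->.b"], "$"),
    7: items(["A->b."], "$"),
    8: items(["A->aA."], "ab"),
    9: items(["A->aA."], "$"),
}

LALR_STATES = merge_by_core(CLR_STATES)
```

Here `LALR_STATES` has the keys "36", "47" and "89". The sketch only merges item sets; a full construction would also rename the shift and goto targets in the table, and merging can in general introduce reduce/reduce conflicts that the CLR table did not have.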



Reading Assignment
• Operator Precedence
