
Principles of Compiler Design

Chapter 4: Syntax Analysis
Contents – Plan of attack
• Introduction to syntax analysis
• Role of the parser
• Context-free grammar
• Derivation
• Ambiguity
– Left recursion & left factoring
• Classification of parsing
– Top-down parsing
– Bottom-up parsing
Class Discussion
1. What is the role of syntax analysis in the compilation process?
2. How does syntax analysis differ from lexical analysis?
3. What is a context-free grammar, and how is it used to describe the
syntax of a language?
4. What are the components of a context-free grammar?
5. Can a grammar have multiple derivations for the same sentence? If
so, what are the implications?
6. How can we identify and resolve ambiguity in a grammar?
7. How do top-down and bottom-up parsing differ in their approach?
Syntax Analysis
 The syntax analyzer receives the source code in the form of tokens from the lexical analyzer and performs syntax analysis, which creates a tree-like intermediate representation that depicts the grammatical structure of the token stream.
 Syntax analysis is also called parsing.
Syntax Analysis
• The tokens are then checked for proper syntax:
– the compiler checks that statements and expressions are correctly formed.
– It checks whether the given input follows the syntax of the programming language.
– It constructs the parse tree used for this checking.
– It helps detect syntax errors.
The Role of the Parser
 The parser obtains a string of tokens from the lexical analyzer, reports a syntax error if there is one, and otherwise generates the syntax tree.
The Role of the Parser
 Major tasks conducted during parsing (syntax analysis):
– The parser obtains a stream of tokens and verifies that the token names can be generated by the grammar of the source language.
– It determines the syntactic validity of a source string; if valid, a tree is built for use by the subsequent phases of the compiler.
– It collects information about various tokens into the symbol table.
– It checks that statements and expressions are correctly formed.
Context-Free Grammars (CFG)
• A grammar is a list of rules which can be used to produce or generate all the strings of a language.
• According to Noam Chomsky, there are four types of grammars:
– Type 3 (Regular Grammar),
– Type 2 (Context-Free Grammar),
– Type 1 (Context-Sensitive Grammar), and
– Type 0 (Unrestricted Grammar).
Type - 3 Grammar
• Type-3 grammars generate regular languages.
• Type-3 grammars must have a single non-terminal on the left-hand side, and a single terminal, or a single terminal followed by a single non-terminal, on the right-hand side.
• The productions must be of the form
– X → a, or
– X → aY, where X, Y ∈ N (non-terminals) and a ∈ T (terminals).
• The rule S → ε is allowed if S does not appear on the right side of any rule.
Type - 2 Grammar
• Type-2 grammars generate context-free languages.
• The productions must be of the form
– A → γ, where A ∈ N (non-terminal) and γ ∈ (T ∪ N)* (a string of terminals and non-terminals).
• The languages generated by these grammars are recognized by non-deterministic pushdown automata.

Type - 1 Grammar
• Type-1 grammars generate context-sensitive languages.
• The productions must be of the form αAβ → αγβ, where A ∈ N (non-terminal) and α, β, γ ∈ (T ∪ N)* (strings of terminals and non-terminals), with
– | αAβ | ≤ | αγβ |
• The strings α and β may be empty, but γ must be non-empty.
• The rule S → ε is allowed if S does not appear on the right side of any rule.
Type - 0 Grammar
• Type-0 grammars generate recursively enumerable languages.
• The productions have no restrictions; these are the phrase-structure grammars, including all formal grammars.
• These grammars generate the languages that are recognized by Turing machines.
• The productions are of the form α → β, where α is a string of terminals and non-terminals containing at least one non-terminal (α cannot be null), and β is a string of terminals and non-terminals.
Example
S → ACaB
Bc → acB
CB → DB
Context-Free Grammars
Context-free grammar is a powerful tool for describing the syntax of programming languages.
A context-free grammar is a 4-tuple (V, T, S, P), where
(i) V is a finite set called the variables (non-terminals),
(ii) T is a finite set, disjoint from V, called the terminals,
(iii) S ∈ V is the start variable, and
(iv) P is a finite set of rules, each rule consisting of a variable and a string of variables and terminals.
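The 4-tuple can be written down concretely. Below is a minimal Python sketch (the dictionary encoding and the names V, T, S, P are illustrative choices, with ↑ written as ^):

```python
# The 4-tuple (V, T, S, P) for the expression grammar
# E -> E O E | (E) | -E | id,  O -> + | - | * | / | ^
V = {"E", "O"}                                   # variables (non-terminals)
T = {"id", "+", "-", "*", "/", "^", "(", ")"}    # terminals, disjoint from V
S = "E"                                          # start variable, S in V
P = {                                            # rules: variable -> string over (V U T)*
    "E": [["E", "O", "E"], ["(", "E", ")"], ["-", "E"], ["id"]],
    "O": [["+"], ["-"], ["*"], ["/"], ["^"]],
}

# Sanity checks that the tuple is well-formed.
assert S in V
assert V.isdisjoint(T)
assert all(lhs in V for lhs in P)
assert all(sym in V | T for rhss in P.values() for rhs in rhss for sym in rhs)
```

The list-of-symbol-lists encoding keeps multi-character terminals such as `id` as single symbols.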
Example #1 - Context-Free Grammars
• Assume the grammar G is given as:
G: E → E O E | (E) | -E | id
   O → + | - | * | / | ↑
• Write the terminals, non-terminals, start symbol, and productions for this grammar.
– Terminals: id + - * / ↑ ( )
– Non-terminals: E, O
– Start symbol: E
– Productions: E → E O E | (E) | -E | id
               O → + | - | * | / | ↑
Example #2 - Context-Free Grammars
• G: S → AB
     A → aAA
     A → aA
     A → a
     B → bB
     B → b
1. Q1. Identify the start variable, terminal symbols, non-terminals, and production rules.
2. Q2. Check whether each of the following input strings is accepted by G: ab, aab, aaab, aabba.
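Q2 can be checked mechanically with a small breadth-first search over sentential forms. A sketch (the `derives` helper is hypothetical; the length-based pruning is safe here only because no rule of G shrinks a sentential form):

```python
from collections import deque

# Grammar G from Example #2, uppercase letters are non-terminals.
RULES = {"S": ["AB"], "A": ["aAA", "aA", "a"], "B": ["bB", "b"]}

def derives(target: str, start: str = "S") -> bool:
    seen, queue = {start}, deque([start])
    while queue:
        form = queue.popleft()
        if form == target:
            return True
        # Rewrite the leftmost non-terminal (a leftmost derivation suffices).
        for i, sym in enumerate(form):
            if sym in RULES:
                for rhs in RULES[sym]:
                    new = form[:i] + rhs + form[i + 1:]
                    # Prune: no rule shrinks a form, so longer forms are dead ends.
                    if len(new) <= len(target) and new not in seen:
                        seen.add(new)
                        queue.append(new)
                break
    return False

for w in ["ab", "aab", "aaab", "aabba"]:
    print(w, derives(w))
```

ab, aab and aaab are accepted; aabba is rejected (no a may follow a b in L(G)).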
Context-Free Grammars
• Generally, a context-free grammar:
– Gives a precise syntactic specification of a programming language.
– The design of the grammar is an initial phase of the design of a compiler.
– A grammar can be directly converted into a parser by some tools.
• Parser: a program that takes tokens and a grammar (CFG) as input and validates the token stream against the grammar.
Context-Free Grammars (CFG)
A linear CFG is classified as:
– Right-Linear Grammar
– Left-Linear Grammar
Conversion of Left-Linear to Right-Linear Grammar
• Algorithm
• If the left-linear grammar has a rule with the start symbol S on the right-hand side, first add the rule S0 → S.
• Symbols used by the algorithm:
– Let S denote the start symbol
– Let A, B denote non-terminal symbols
– Let p denote zero or more terminal symbols
– Let ε denote the empty string
Conversion of Left-Linear Grammar into Right-Linear Grammar
1) If the left-linear grammar has a rule S → p, then make that a rule in the right-linear grammar.
2) If the left-linear grammar has a rule A → p, then add the following rule to the right-linear grammar: S → pA.
3) If the left-linear grammar has a rule B → Ap, add the following rule to the right-linear grammar: A → pB.
4) If the left-linear grammar has a rule S → Ap, then add the following rule to the right-linear grammar: A → p.
5) If the left-linear grammar has a rule S → A, then add the following rule to the right-linear grammar: A → ε.
Conversion of Left-Linear Grammar into Right-Linear Grammar
Left linear:
S → Aa
A → ab
2) The left-linear grammar has the rule A → ab (form A → p), so add to the right-linear grammar: S → abA.
4) The left-linear grammar has the rule S → Aa (form S → Ap), so add to the right-linear grammar: A → a.
Right linear:
S → abA
A → a
Both grammars generate the language {aba}.
Convert this left-linear grammar
Original grammar (left linear):
S → Ab
S → Sb
A → Aa
A → a
Make a new start symbol:
S0 → S
S → Ab
S → Sb
A → Aa
A → a
Right-hand side has terminals
2) The left-linear grammar has the rule A → a (form A → p), so add to the right-linear grammar: S0 → aA.
Right linear (so far): S0 → aA
Right-hand side has non-terminal
3) For each rule B → Ap in the left-linear grammar, add A → pB to the right-linear grammar:
– S → Ab gives A → bS
– A → Aa gives A → aA
– S → Sb gives S → bS
Right linear (so far): S0 → aA, A → bS, A → aA, S → bS
Right-hand side of start symbol has non-terminal
4) The left-linear grammar has the rule S0 → S (form S → Ap with p = ε), so add to the right-linear grammar: S → ε.
Right linear (so far): S0 → aA, A → bS, A → aA, S → bS, S → ε
Equivalent!
Left linear:  S0 → S, S → Ab, S → Sb, A → Aa, A → a
Right linear: S0 → aA, A → bS, A → aA, S → bS, S → ε
Both grammars generate the language {a+b+}.
Derivation & Ambiguity
• Derivation: a derivation is used to determine whether a string belongs to the language of a given grammar.
– A derivation is a sequence of applications of production rules.
• Types of derivations:
1. Leftmost derivation
2. Rightmost derivation
Leftmost Derivation
• A derivation of a string W in a grammar G is a leftmost derivation if at every step the leftmost non-terminal is replaced.
• Grammar: S → S+S | S-S | S*S | S/S | a    Output string: a*a-a
Rightmost Derivation
• A derivation of a string W in a grammar G is a rightmost derivation if at every step the rightmost non-terminal is replaced.
• It is also called the canonical derivation.
• Grammar: S → S+S | S-S | S*S | S/S | a    Output string: a*a-a
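The "replace the leftmost non-terminal" rule can be mechanized as a single rewriting step. A small sketch (the helper name and the particular rule sequence producing a*a-a are chosen here for illustration):

```python
# Apply one production to the leftmost occurrence of the non-terminal,
# tracing a leftmost derivation of a*a-a from S -> S+S | S-S | S*S | S/S | a.
def leftmost_step(form: str, rhs: str, nt: str = "S") -> str:
    i = form.index(nt)                       # leftmost occurrence of nt
    return form[:i] + rhs + form[i + 1:]

form = "S"
for rhs in ["S-S", "S*S", "a", "a", "a"]:    # rule chosen at each step
    form = leftmost_step(form, rhs)
    print(form)
# Trace: S-S, S*S-S, a*S-S, a*a-S, a*a-a
```

A rightmost derivation would use `form.rindex(nt)` instead, replacing the rightmost non-terminal.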
Example #2 – Leftmost/Rightmost Derivation
• Consider the grammar S → S+S | S*S | a | b. Find the leftmost and rightmost derivations for the string w = a*a+b.
• Solution:
Leftmost derivation:  S ⇒ S+S ⇒ S*S+S ⇒ a*S+S ⇒ a*a+S ⇒ a*a+b
Rightmost derivation: S ⇒ S+S ⇒ S+b ⇒ S*S+b ⇒ S*a+b ⇒ a*a+b
DERIVATION TREES
 A derivation tree is a graphical representation of a derivation that filters out the order of replacing non-terminals.
– Root node = start symbol
– Interior nodes = non-terminals
– Leaves = terminals
 Example:
 Rules: E → E+E | E*E | -E | (E) | id
 Input: -(id + id)
DERIVATION TREES
• Example -1: A grammar G which is context-free has the
productions
S → aAB
A → Bba
B → bB
B→c
• The word w = acbabc is derived as follows:
S ⇒ aAB
⇒ a(Bba)B
⇒ acbaB
⇒ acba(bB)
⇒ acbabc.
• Obtain the derivation tree.
DERIVATION TREES
Exercise - Derivation
1. Perform leftmost derivation and draw the parse tree.
S → A1B
A → 0A | ε
B → 0B | 1B | ε
Output string: 1001
2. Perform leftmost derivation and draw the parse tree.
S → 0S1 | 01
Output string: 000111
3. Perform rightmost derivation and draw the parse tree.
E → E+E | E*E | id | (E) | -E
Output string: id + id * id
Ambiguity
• Ambiguity: a word, phrase, or statement that has more than one meaning.
Ambiguous grammar
A grammar that produces more than one parse tree for some sentence is said to be ambiguous; equivalently, an ambiguous grammar is one that produces more than one leftmost (or rightmost) derivation for the same sentence.
Grammar: S → S+S | (S) | a    Output string: a+a+a
 Here, two leftmost derivations of the string a+a+a are possible because the grammar does not enforce a rule of associativity.
Ambiguous grammar
 Consider the CFG S → S+S | S*S | a | b and the string w = a*a+b; show the leftmost derivations.
Exercise
Show whether the following grammars are ambiguous or not.
a) S → SS | a | b
b) S → A | B | b,  A → aAB | ab,  B → abB | ϵ
Causes of ambiguity
There are two causes of ambiguity in a grammar:
1. The grammar lacks precedence.
2. Associativity is not preserved in the grammar.
Associativity of Operators
 If an operand has operators on both sides, the side whose operator takes this operand determines the associativity of that operator.
 In (a+b)+c, the operand b is taken by the left operator.
 +, -, *, / are left-associative; ^ and = are right-associative.
 Examples
» 1+2+3: first we evaluate (1+2)+3 (left-associative)
» 1^2^3 => 1^(2^3) (right-associative)
» a=b=c (right-associative)
Precedence of Operator
 Most programming languages have operator
precedence rules that state the order in which
operators are applied.
 Operators precedence rules can be incorporated
directly into a Context Free Grammar to obtain
only one parse tree for the input expression.
Eliminating Ambiguity - Left Recursion
A grammar is said to be left-recursive if it has a non-terminal A such that there is a derivation
A → Aα for some string α.
In other words, if in the derivation process starting from a non-terminal A the sentential form starts with the same non-terminal A, then we say that the grammar has left recursion.
Left Recursion Elimination
A → Aα | β  becomes  A → βA'
                     A' → αA' | ϵ
Example #1:
Eliminate the left recursion from the following
grammars:

E → E+T | T
T → T* F | F
F → (E) | id
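The A → Aα | β scheme can be applied mechanically. A sketch in Python (the function name and the list-of-symbols encoding are illustrative; applying it to Example #1 yields the usual E → TE' form):

```python
# Eliminate immediate left recursion: A -> A a1 | ... | b1 | ... becomes
# A -> b1 A' | ...,  A' -> a1 A' | ... | epsilon  ([] stands for epsilon).
def eliminate_left_recursion(nt, rhss):
    rec = [rhs[1:] for rhs in rhss if rhs and rhs[0] == nt]     # the alpha parts
    nonrec = [rhs for rhs in rhss if not rhs or rhs[0] != nt]   # the beta parts
    if not rec:
        return {nt: rhss}                                       # nothing to do
    new = nt + "'"
    return {
        nt: [beta + [new] for beta in nonrec],
        new: [alpha + [new] for alpha in rec] + [[]],
    }

g = {}
g.update(eliminate_left_recursion("E", [["E", "+", "T"], ["T"]]))
g.update(eliminate_left_recursion("T", [["T", "*", "F"], ["F"]]))
g.update(eliminate_left_recursion("F", [["(", "E", ")"], ["id"]]))
for lhs, rhss in g.items():
    print(lhs, "->", " | ".join(" ".join(r) or "ε" for r in rhss))
```

This handles only immediate left recursion; indirect left recursion (as in Exercise #2 below) first needs the substitution step.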
Left Recursion Elimination
Exercise #2: eliminate left recursion from the following grammar
S → Ab | a
A → Ab | Sa
• Solution: substituting for S in the A-productions eliminates the indirect left recursion through S, so the grammar can be written as
S → Ab | a
A → Ab | Aba | aa
• Eliminating the remaining immediate left recursion in A then gives
S → Ab | a
A → aaA1
A1 → bA1 | baA1 | ϵ
Eliminating Ambiguity - Left Factoring
Left factoring is a grammar transformation that is useful for producing a grammar suitable for predictive parsing.
Two or more productions of a variable A of the grammar G = (V, T, P, S) are said to have left factoring if the A-productions are of the form
A → αβ1 | αβ2 | αβ3 | …… | αβn, where βi ∈ (V ∪ T)*.
All these A-productions have the common left factor α.
Left Factoring - Elimination
Let the variable A have (left-factored) productions as follows:
A → αβ1 | αβ2 | αβ3 | …… | αβn | γ1 | γ2 | …… | γm,
where γ1, γ2, ……, γm do not contain α as a prefix; then we replace the A-productions by:
A → αA' | γ1 | γ2 | …… | γm, where
A' → β1 | β2 | β3 | …… | βn
Left Factoring - Elimination
Example: consider the grammar S → aSa | aa, and remove the left factoring (if any).
Solution:
S → aSa | aa has α = a as left factor, so removing the left factoring we get the productions:
S → aS'
S' → Sa | a
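The common-prefix step can be sketched in a few lines of Python (illustrative only: symbols are single characters here, so `os.path.commonprefix` can find α):

```python
import os

# One left-factoring step: pull out the longest common prefix alpha of A's
# alternatives, giving A -> alpha A' and A' -> (the remainders).
def left_factor(nt, rhss):
    prefix = os.path.commonprefix(rhss)      # character-wise common prefix
    if len(prefix) == 0 or len(rhss) < 2:
        return {nt: rhss}                    # no common left factor
    new = nt + "'"
    return {nt: [prefix + new],
            new: [rhs[len(prefix):] or "ε" for rhs in rhss]}

print(left_factor("S", ["aSa", "aa"]))
# S -> aS',  S' -> Sa | a   (matching the solution above)
```

In general the step is repeated until no two alternatives of any variable share a prefix.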
Basic Parsing Techniques
Parsing is a technique that takes an input string and produces as output either a parse tree, if the string is a valid sentence of the grammar, or an error message indicating that the string is not valid.
Example: the input x=a+b*c; is tokenized as id1 = id2 + id3 * id4 ; and parsed with a grammar such as
S → id = E ;
E → E+T | T
T → T*F | F
F → id
Types of parsing:
1) Top-down parsing: the parser builds the parse tree from the top (root) to the bottom (leaves), tracing a leftmost derivation. Decision: which production to use?
2) Bottom-up parsing: the parser starts from the leaves and works up to the root, tracing a rightmost derivation in reverse. Decision: when to reduce?
Example: S → aABe, A → Abc | a, B → d    Input: aabcde
Classification of Parsing Methods
Summary:

 Definition of parser
 Ways of generating Parse tree
 Classification of parsers
Top-down Parsing - Backtracking
• Backtracking is a top-down parsing method that involves repeated scans of the input.
• If any mismatch occurs, then we try another alternative.
• Backtracking provides flexibility to handle ambiguous grammars or situations where the parser encounters uncertainty in choosing the correct production rule.
• Grammar: S → cAd,  A → ab | a    Input string: cad
Top-down Parsing - LL(1) parser (predictive parser)
• LL(1) is a non-recursive top-down parser.
1. The first L indicates that the input is scanned from left to right.
2. The second L means it uses a leftmost derivation for the input string.
3. The 1 means it uses one input symbol of lookahead to direct the parsing process.
Top down Parsing - LL(1) parsing (predictive parsing)

• Steps to construct LL(1) parser:

1. Remove left recursion / Perform left factoring


(if any).
2. Compute FIRST and FOLLOW of non terminals.
3. Construct predictive parsing table.
4. Parse the input string using parsing table.
Top-down Parsing - LL(1) parsing (predictive parsing)
• Rules to compute FIRST of a non-terminal
• FIRST is the set of terminals that appear at the beginning of some string derived from that non-terminal.
1. If A → α and α is a terminal, add α to FIRST(A).
2. If A → ϵ, add ϵ to FIRST(A).
3. If X is a non-terminal and X → Y1 Y2 … Yk is a production, then place a in FIRST(X) if for some i, a is in FIRST(Yi) and ϵ is in all of FIRST(Y1), …, FIRST(Yi−1); that is, Y1 … Yi−1 ⇒ ϵ. If ϵ is in FIRST(Yj) for all j = 1, 2, …, k, then add ϵ to FIRST(X).
Everything in FIRST(Y1) is surely in FIRST(X). If Y1 does not derive ϵ, we add nothing more to FIRST(X); but if Y1 ⇒ ϵ, we also add FIRST(Y2), and so on.
Top-down Parsing - LL(1) parsing (predictive parsing)
Simplification of Rule 3
• If A → Y1 Y2 …… Yk:
– If Y1 does not derive ϵ, then
  FIRST(A) = FIRST(Y1)
– If Y1 derives ϵ, then
  FIRST(A) = (FIRST(Y1) − {ϵ}) ∪ FIRST(Y2)
– If Y1 and Y2 derive ϵ, then
  FIRST(A) = (FIRST(Y1) − {ϵ}) ∪ (FIRST(Y2) − {ϵ}) ∪ FIRST(Y3)
Top-down Parsing - LL(1) parsing (predictive parsing)
Simplification of Rule 3 (continued)
• If A → Y1 Y2 …… Yk:
– If Y1, Y2 and Y3 derive ϵ, then
  FIRST(A) = (FIRST(Y1) − {ϵ}) ∪ (FIRST(Y2) − {ϵ}) ∪ (FIRST(Y3) − {ϵ}) ∪ FIRST(Y4)
• If Y1, Y2, Y3, ……, Yk all derive ϵ, then
  FIRST(A) = (FIRST(Y1) − {ϵ}) ∪ (FIRST(Y2) − {ϵ}) ∪ …… ∪ FIRST(Yk)
  (note: if all the Yi derive ϵ, then also add ϵ to FIRST(A))
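Rules 1-3 amount to a fixed-point computation. A sketch in Python, on the left-recursion-free expression grammar used later in this chapter (the dictionary encoding and names are illustrative; "ε" is the empty-string marker):

```python
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],      # [] is the epsilon alternative
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"]],
}

def first_sets(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:                      # iterate until no set grows
        changed = False
        for nt, rhss in grammar.items():
            for rhs in rhss:
                nullable = True
                # Scan Y1 Y2 ... Yk left to right while each Yi derives epsilon.
                for sym in rhs:
                    add = (first[sym] - {"ε"}) if sym in grammar else {sym}
                    if not add <= first[nt]:
                        first[nt] |= add
                        changed = True
                    if not (sym in grammar and "ε" in first[sym]):
                        nullable = False
                        break
                if nullable and "ε" not in first[nt]:   # whole RHS derives epsilon
                    first[nt].add("ε")
                    changed = True
    return first

print(first_sets(GRAMMAR))
# FIRST(E) = FIRST(T) = FIRST(F) = {(, id}, FIRST(E') = {+, ε}, FIRST(T') = {*, ε}
```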
Top-down Parsing - LL(1) parsing (predictive parsing)
Rules to compute FOLLOW of a non-terminal
• The FOLLOW set of a non-terminal is the set of terminals that can appear immediately to the right of that non-terminal in some sentential form.
1. Place $ in FOLLOW(S). (S is the start symbol.)
2. If A → αBβ, then everything in FIRST(β) except ϵ is placed in FOLLOW(B).
3. If there is a production A → αB, or a production A → αBβ where FIRST(β) contains ϵ, then everything in FOLLOW(A) is placed in FOLLOW(B).
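The three FOLLOW rules are likewise a fixed point. A sketch (the FIRST sets are taken as given from the previous computation; names are illustrative):

```python
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"]],
}
FIRST = {"E": {"(", "id"}, "E'": {"+", "ε"},
         "T": {"(", "id"}, "T'": {"*", "ε"}, "F": {"(", "id"}}

def first_of(seq):                  # FIRST of a string of symbols beta
    out = set()
    for sym in seq:
        f = FIRST[sym] if sym in GRAMMAR else {sym}
        out |= f - {"ε"}
        if "ε" not in f:
            return out
    out.add("ε")                    # every symbol of beta can vanish
    return out

def follow_sets(grammar, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")                      # rule 1
    changed = True
    while changed:
        changed = False
        for a, rhss in grammar.items():
            for rhs in rhss:
                for i, b in enumerate(rhs):
                    if b not in grammar:
                        continue
                    beta = first_of(rhs[i + 1:])
                    add = beta - {"ε"}          # rule 2
                    if "ε" in beta:
                        add |= follow[a]        # rule 3
                    if not add <= follow[b]:
                        follow[b] |= add
                        changed = True
    return follow

print(follow_sets(GRAMMAR, "E"))
# FOLLOW(E) = FOLLOW(E') = {), $}, FOLLOW(T) = FOLLOW(T') = {+, ), $},
# FOLLOW(F) = {+, *, ), $}
```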
Top-down Parsing - LL(1) parsing (predictive parsing)
Rules to construct the predictive parsing table:
1. For each production A → α of the grammar, do steps 2 and 3.
2. For each terminal a in FIRST(α), add A → α to M[A, a].
3. If ϵ is in FIRST(α), add A → α to M[A, b] for each terminal b in FOLLOW(A). If ϵ is in FIRST(α) and $ is in FOLLOW(A), add A → α to M[A, $].
4. Make each undefined entry of M an error.
Example - predictive parsing - LL(1) parsing
Grammar G:
E → E+T | T
T → T*F | F
F → (E) | id
1) Eliminate left recursion. In general, the A-productions
A → Aα1 | Aα2 | …… | Aαn | β1 | β2 | …… | βm, where no β starts with A,
are replaced by
A → β1A' | β2A' | …… | βmA'
A' → α1A' | α2A' | …… | αnA' | ϵ
Example - predictive parsing - LL(1) parsing
1) After eliminating left recursion:
E → TE'
E' → +TE' | ϵ
T → FT'
T' → *FT' | ϵ
F → (E) | id
2) Compute FIRST:
FIRST(E) = FIRST(T) = FIRST(F) = {(, id}
FIRST(E') = {+, ϵ}
FIRST(T') = {*, ϵ}
3) Compute FOLLOW:
FOLLOW(E) = {), $}
FOLLOW(E') = {), $}
FOLLOW(T) = {+, ), $}
FOLLOW(T') = {+, ), $}
FOLLOW(F) = {*, +, ), $}
(Shortcut for FOLLOW: look at what follows the variable in each production: a terminal is written as it is; for a non-terminal, write its FIRST elements; for the last symbol of a right-hand side, write the FOLLOW of the left-hand side.)
Example - predictive parsing - LL(1) parsing

        +          *          (        )       id       $
E                             E→TE'            E→TE'
E'      E'→+TE'                        E'→ϵ             E'→ϵ
T                             T→FT'            T→FT'
T'      T'→ϵ       T'→*FT'             T'→ϵ            T'→ϵ
F                             F→(E)            F→id
Explanation: FIRST(E), FIRST(T) and FIRST(F) contain {(, id}, hence the E, T and F productions are placed under those terminals; FIRST(E') and FIRST(T') contain ϵ, so E' → ϵ and T' → ϵ are placed under the terminals in FOLLOW(E') and FOLLOW(T').
Top-down Parsing - Recursive Descent Parsing
• A recursive descent parser executes a set of recursive procedures to process the input, without backtracking.
– There is a procedure for each non-terminal in the grammar.
– The right-hand side of each production rule is treated as the definition of the procedure.
– As the parser reads an expected input symbol, it advances the input pointer to the next position.
Example - Recursive Descent Parsing
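A sketch of such a parser for the left-factored expression grammar E → TE', E' → +TE' | ϵ, T → FT', T' → *FT' | ϵ, F → (E) | id (the class layout and method names are illustrative; E' and T' are written Ep and Tp):

```python
# One method per non-terminal; each RHS becomes the body of its method.
class Parser:
    def __init__(self, tokens):
        self.toks = tokens + ["$"]
        self.i = 0

    def look(self):
        return self.toks[self.i]

    def match(self, t):                 # consume the expected terminal
        if self.look() != t:
            raise SyntaxError(f"expected {t}, got {self.look()}")
        self.i += 1

    def E(self):  self.T(); self.Ep()
    def Ep(self):                       # E' -> +TE' | epsilon
        if self.look() == "+":
            self.match("+"); self.T(); self.Ep()
        # else epsilon: do nothing
    def T(self):  self.F(); self.Tp()
    def Tp(self):                       # T' -> *FT' | epsilon
        if self.look() == "*":
            self.match("*"); self.F(); self.Tp()
    def F(self):                        # F -> (E) | id
        if self.look() == "(":
            self.match("("); self.E(); self.match(")")
        else:
            self.match("id")

    def parse(self):
        self.E(); self.match("$")       # the whole input must be consumed
        return True

print(Parser(["id", "+", "id", "*", "id"]).parse())
```

Because the grammar was left-factored and freed of left recursion, one symbol of lookahead is always enough to choose the alternative, so no backtracking occurs.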
Bottom-up parsing
Bottom-up parsing is a technique for analyzing the syntax of programming languages, starting from the leaf nodes (tokens) and working towards the root of the parse tree.
Here, we start from a sentence and then apply production rules in reverse in order to reach the start symbol.
Bottom-up parsers are classified as:
 Shift-Reduce
 LR Parsing
  – SLR Parsing
  – CLR Parsing
  – LALR Parsing
Bottom-up parsing - Shift-Reduce Parsers
Shift-reduce parsers are a type of bottom-up parser used in the syntax analysis of programming languages.
They operate by shifting input symbols onto a stack and reducing them according to grammar rules when a rule can be applied.
Ex. - Consider the grammar:
S → aABe, A → Abc | b, B → d
The sentence to be recognized is abbcde.

Reduction                Rightmost Derivation
abbcde   (A → b)         S ⇒ aABe ⇒ aAde ⇒ aAbcde ⇒ abbcde
aAbcde   (A → Abc)
aAde     (B → d)
aABe     (S → aABe)
S

The reductions trace out the rightmost derivation in reverse.
Bottom-up parsing - Shift-Reduce Parsers
Handle - a substring that matches the right side of a production, and whose reduction to the non-terminal on the left side of the production represents one step along the reverse of a rightmost derivation.
 Example:
– Consider the grammar:
E → E+E
E → E*E
E → (E)
E → id
Input string: id1+id2*id3
In the rightmost derivation of this string, the substrings reduced at each step are the handles.
Bottom-up parsing - SHIFT-REDUCE PARSING
Key operations in a shift-reduce parser:
– Shift: the next input symbol is shifted onto the top of the stack.
– Reduce: the parser replaces the handle within the stack with a non-terminal.
– Accept: the parser announces successful completion of parsing.
– Error: the parser discovers that a syntax error has occurred and calls an error recovery routine.
Bottom-up parsing - SHIFT-REDUCE PARSING

     Stack        Input           Action
1    $            id*(id+id)$     Shift
2    $id          *(id+id)$       Reduce (E → id)
3    $E           *(id+id)$       Shift
4    $E*          (id+id)$        Shift
5    $E*(         id+id)$         Shift
6    $E*(id       +id)$           Reduce (E → id)
7    $E*(E        +id)$           Shift
8    $E*(E+       id)$            Shift
9    $E*(E+id     )$              Reduce (E → id)
10   $E*(E+E      )$              Reduce (E → E+E)
11   $E*(E        )$              Shift
12   $E*(E)       $               Reduce (E → (E))
13   $E*E         $               Reduce (E → E*E)
14   $E           $               Accept
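The trace above can be reproduced with a stack and a greedy "reduce whenever a handle is on top" loop. A sketch (note: greedy handle selection is not a general algorithm; real shift-reduce parsers use an LR automaton to decide when to reduce, but it suffices for this example):

```python
# Grammar E -> E+E | E*E | (E) | id as (RHS, LHS) pairs.
RULES = [(["id"], "E"), (["E", "+", "E"], "E"),
         (["E", "*", "E"], "E"), (["(", "E", ")"], "E")]

def shift_reduce(tokens):
    stack, i = ["$"], 0
    tokens = tokens + ["$"]
    while True:
        # Reduce: replace a handle on top of the stack by its LHS.
        reduced = True
        while reduced:
            reduced = False
            for rhs, lhs in RULES:
                if stack[-len(rhs):] == rhs:
                    print("reduce", "".join(rhs), "->", lhs, "stack:", stack)
                    del stack[-len(rhs):]
                    stack.append(lhs)
                    reduced = True
                    break
        if tokens[i] == "$":
            return stack == ["$", "E"]      # accept iff only E remains
        stack.append(tokens[i])             # Shift the next input symbol.
        print("shift", tokens[i], "stack:", stack)
        i += 1

print(shift_reduce(["id", "*", "(", "id", "+", "id", ")"]))
```

Running it on id*(id+id) prints the same shift/reduce sequence as the table and accepts.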
LR Parser

• LR(0) Parsers: non recursive, shift reduce, bottom up parser.


• LR parsers are also known as LR(k) parsers, where L stands for left
–to – right scanning of the input stream; R stands for the
construction of right –most derivation in reverse, and k denotes the
number of look ahead symbols to make decisions.
• LR(0) parsers can handle a smaller set of grammars.
• There are three widely used algorithms available for constructing an
LR parser as shown:
Types of LR Parser
• SLR(1) - Simple LR Parser:
– Works on the smallest class of grammars.
– Few states, hence a very small table.
– Simple and fast construction.
• CLR(1) - Canonical LR Parser:
– Works on the complete set of LR(1) grammars.
– Generates a large table and a large number of states.
– Slow construction.
• LALR(1) - Look-Ahead LR Parser:
– Works on an intermediate size of grammar.
– The number of states is the same as in SLR(1).

Parser Type    Lookahead     Power         Complexity
LR(0)          0 tokens      Low           Low
SLR            Follow sets   Medium        Medium
CLR (LR(1))    1 token       High          High
LALR           1 token       Medium-High   Medium
Types of LR Parser
– Example #1: LR(0) parsing for the grammar G:
S → AA
A → aA | b
– Augmented grammar: S1 → S, with canonical collection of item sets:
I0: S1 → .S, S → .AA, A → .aA, A → .b
I1: S1 → S.                    (on S from I0)
I2: S → A.A, A → .aA, A → .b   (on A from I0)
I3: A → a.A, A → .aA, A → .b   (on a from I0, I2, I3)
I4: A → b.                     (on b from I0, I2, I3)
I5: S → AA.                    (on A from I2)
I6: A → aA.                    (on A from I3)
Types of LR Parser
– Example #1: LR(0) parsing table for the grammar G:
(1) S → AA
(2) A → aA
(3) A → b

           Action                 Goto
State    a       b       $       A    S
0        s3      s4              2    1
1                        accept
2        s3      s4              5
3        s3      s4              6
4        r3      r3      r3
5        r1      r1      r1
6        r2      r2      r2

(s denotes a shift action followed by the target state number; r denotes a reduce action followed by the production number; the states are I0, I1, …, I6.)
Types of SLR Parser - Simple LR
(SLR places each reduce action only under the terminals in FOLLOW of the left-hand side; here FOLLOW(S) = {$} and FOLLOW(A) = {$}, so the r entries are filled under $.)
– Example #1: SLR parsing for the grammar:
1) S → A
2) S → a
3) A → a
– Augmented grammar: S1 → S, with item sets:
I0: S1 → .S, S → .A, S → .a, A → .a
I1: S1 → S.        (on S)
I2: S → A.         (on A)
I3: S → a., A → a. (on a)
– SLR parsing table:

           Action           Goto
State    a       $         S    A
0        s3                1    2
1                accept
2                r1
3                r2/r3

If there are 2 entries in the same cell, we call it a conflict, meaning the grammar is not SLR.
Types of LR Parser - CLR(1) & LALR(1)
– Example #1: CLR(1) parsing for the grammar:
S → AA
A → aA | b
– Find the augmented grammar.
Operator-precedence Parsing
The operator-precedence parser is a shift-reduce parser that can be easily constructed by hand.
An operator-precedence parser can be constructed from a grammar called an operator grammar.
These grammars have the properties that
 there is no production with ε on the right side, and
 no production has two adjacent non-terminals.
Example: Consider the grammar:
E → EAE | (E) | -E | id
A → + | - | * | / | ↑
Since the right side EAE has three consecutive non-terminals, the grammar is rewritten as follows:
E → E+E | E-E | E*E | E/E | E↑E | -E | id
Operator-precedence Parsing
Operator-precedence relations:
In operator-precedence parsing, there are three disjoint precedence relations, namely:
<· - less than (a <· b means a yields precedence to b)
=· - equal to (a =· b means a has the same precedence as b)
·> - greater than (a ·> b means a takes precedence over b)
Bottom-up parsing - Operator-precedence Parsing
Leading and Trailing:
Leading - the set of terminals that can appear first (before any other terminal) in a sentential form derived from a non-terminal.
Trailing - the set of terminals that can appear last in a sentential form derived from a non-terminal.
Example:
Given the production rules:
1. E → E + T | T       Leading(E): { +, *, (, id }    Trailing(E): { +, *, ), id }
2. T → T * F | F       Leading(T): { *, (, id }       Trailing(T): { *, ), id }
3. F → ( E ) | id      Leading(F): { (, id }          Trailing(F): { ), id }
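Under this sentential-form definition (look past at most one leading or trailing non-terminal), LEADING and TRAILING can be computed as a fixed point over the productions. A sketch on the expression grammar (encoding and names illustrative):

```python
GRAMMAR = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["(", "E", ")"], ["id"]],
}

def leading_trailing(grammar):
    lead = {nt: set() for nt in grammar}
    trail = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, rhss in grammar.items():
            for rhs in rhss:
                # LEADING: first terminal, looking past one leading non-terminal
                # (e.g. + is in LEADING(E) because E => E+T).
                new = set()
                if rhs[0] in grammar:
                    new |= lead[rhs[0]]
                    if len(rhs) > 1 and rhs[1] not in grammar:
                        new.add(rhs[1])
                else:
                    new.add(rhs[0])
                if not new <= lead[nt]:
                    lead[nt] |= new
                    changed = True
                # TRAILING: the mirror image at the end of the right-hand side.
                new = set()
                if rhs[-1] in grammar:
                    new |= trail[rhs[-1]]
                    if len(rhs) > 1 and rhs[-2] not in grammar:
                        new.add(rhs[-2])
                else:
                    new.add(rhs[-1])
                if not new <= trail[nt]:
                    trail[nt] |= new
                    changed = True
    return lead, trail

lead, trail = leading_trailing(GRAMMAR)
print(lead)    # E: {+, *, (, id}  T: {*, (, id}  F: {(, id}
print(trail)   # E: {+, *, ), id}  T: {*, ), id}  F: {), id}
```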
Bottom-up parsing - Operator-precedence Parsing
Example #2: Operator-precedence relations for the grammar
S → a | ^ | (T)
T → T,S | S
are given in the following table.
Step 01: Compute LEADING
– LEADING(S) = { a, ^, ( }
– LEADING(T) = { ,, a, ^, ( }
Step 02: Compute TRAILING
– TRAILING(S) = { a, ^, ) }
– TRAILING(T) = { ,, a, ^, ) }

Operator Precedence Relation Table
        a     ^     (     )     ,     $
a                         ·>    ·>    ·>
^                         ·>    ·>    ·>
(       <·    <·    <·    =·    <·
)                         ·>    ·>    ·>
,       <·    <·    <·    ·>    ·>
$       <·    <·    <·
Syntax Error Handling
 Most programming language specifications do not describe how
a compiler should respond to errors
 Planning the error handling right from the start can both

– simplify the structure of a compiler and improve its handling


of errors.
 Error handler goals:
– Report the presence of errors clearly and accurately
– Recover from each error quickly enough to detect
subsequent errors
– Add minimal overhead to the processing of correct programs
Syntax Error Handling
• A good compiler should assist in identifying and locating errors.
1) Lexical errors: occur when the compiler does not recognize a sequence of characters as a proper lexical token.
– Exceeding the length limit of identifiers or numeric constants.
– The appearance of illegal characters.
– Unmatched strings (e.g. 2ab is not a valid C token).
– Example: printf("Geeksforgeeks");$
2) Syntax errors: misplaced semicolons, extra or missing braces, that is, "{" or "}".
• Typical syntax errors are:
– Errors in structure
– Missing operator
– Misspelled keywords
– Unbalanced parentheses
• Example:
swich(ch)
{
.......
.......
}
The keyword switch is incorrectly written as swich; hence, an "unidentified keyword/identifier" error.
• Example: int 2;
Syntax Error Handling
3) Semantic errors: type mismatches between operators
and operands.

• Typical semantic errors are

– Incompatible type of operands

– Undeclared variables

– Not matching of actual arguments with a formal one

4) Logical errors: hard or impossible to detect.
– In C programming, using the assignment operator = instead of the comparison operator ==.


Error Recovery Strategies
1) Panic mode
– Discard input until a token in a set of designated synchronizing tokens is found.
– On discovering an error, the parser discards input symbols one at a time until a synchronizing token such as a semicolon or end is reached.
• Example: given an error like
a=b + c // no semicolon
d=e + f ;
the parser skips ahead and resumes at the next statement, d=e + f ;
2) Phrase-level recovery

– Perform local correction on the input to repair the


error.
– The correction may be
• Replacing a prefix by some string
• Replacing comma by semicolon
• Inserting missing semicolon
Error Recovery Strategies
3) Error productions - the parser is constructed using a grammar augmented with error productions.
– If an error production is used by the parser, appropriate error diagnostics can be generated to indicate the erroneous construct recognized in the input.
– An augmented grammar is a grammar with added productions whose features can be associated with any non-terminal symbol in a derivation.
– Example: writing 5X instead of 5*X.
– Given the grammar:
S → A
A → aA | bA | a | b
B → cd
– Suppose the input string is abcd. Using the grammar we can't derive the string abcd, so we use the augmented grammar:
E → SB
S → A
Error Recovery Strategies-
4) Global correction
– Choose a minimal sequence of changes to obtain a
global least-cost correction.
– Given an incorrect input string x and grammar G, certain
algorithms can be used to find a parse tree for a string y,
such that the number of insertions, deletions and
changes of tokens is as small as possible.
– However, these methods are in general too costly in
terms of time and space.
Exercises
Question: Consider the following statements about the context-free grammar
G = {S -> SS, S -> ab, S -> ba, S -> ε}

I. G is ambiguous
II. G produces all strings with equal number of a’s and b’s
III. G can be accepted by a deterministic PDA
Which combination below expresses all the true statements about G?

A. I only

B. I and III only

C. II and III only

D. I, II and III
Exercises
Solution : There are different LMD’s for string abab which can be
S => SS => SSS => abSS => ababS => abab
S => SS => abS => abab, So the grammar is ambiguous. Therefore statement I is true.
Statement II states that the grammar G produces all strings with equal number of a’s and b’s but it
can’t generate aabb string. So statement II is incorrect.
Statement III is also correct as it can be accepted by deterministic PDA. So correct option is (B).

Question : Which one of the following statements is FALSE?


A. There exist context-free languages such that all the context-free grammars generating
them are ambiguous.
B. An unambiguous context free grammar always has a unique parse tree for each string of
the language generated by it.
C. Both deterministic and non-deterministic pushdown automata always accept the same set
of languages.
D. A finite set of string from one alphabet is always a regular language.
Exercises

Solution : (A) is correct because for ambiguous CFL’s, all CFG corresponding to it are
ambiguous.

(B) is also correct as unambiguous CFG has a unique parse tree for each string of the
language generated by it.

(C) is false as some languages are accepted by Non – deterministic PDA but not by
deterministic PDA.

(D) is also true as finite set of string is always regular.

So option (C) is correct option.
