Syntax Analysis
6th Semester B.Tech. (CSE)
Course Code: 18CS1T08
• This technique may lead to semantic or runtime errors in later stages.
• There is no guarantee that the parser will not go into an infinite loop.
Syntax Error Recovery Strategies – contd..
2. Phrase Level Recovery
• In this strategy, on discovering an error, the parser performs a local correction on the
remaining input.
• It may replace a prefix of the remaining input with some string that allows the
parser to continue.
• The local correction can be replacing a comma with a semicolon, deleting an extraneous
semicolon, or inserting a missing semicolon.
Every step of a leftmost derivation can be written as wAγ ⇒lm wδγ, where w consists of
terminals only, A → δ is the production applied, and γ is a string of grammar symbols.
If S ⇒*lm α, then we say that α is a left-sentential form of the grammar at hand.
The ambiguous dangling-else grammar permits two distinct leftmost derivations for the following sentence:
if E1 then if E2 then S1 else S2
The corresponding parse trees appear in the following figure:
Note: Every else should match the closest unmatched then.
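Since the parse-tree figure is not reproduced here, the two groupings it illustrates can be sketched as nested structures (a small illustration, assuming the usual dangling-else grammar stmt → if expr then stmt | if expr then stmt else stmt | other):

# Two parse trees for: if E1 then if E2 then S1 else S2
# Grouping 1: the else is attached to the inner if (the conventional choice).
inner_else = ("if", "E1", "then",
              ("if", "E2", "then", "S1", "else", "S2"))
# Grouping 2: the else is attached to the outer if.
outer_else = ("if", "E1", "then",
              ("if", "E2", "then", "S1"),
              "else", "S2")
# Both correspond to leftmost derivations of the same sentence; the rule
# "match each else with the closest unmatched then" selects inner_else.
print(inner_else)
print(outer_else)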
Lexical Versus Syntactic Analysis
Everything that can be described by a regular expression can also be described by a
grammar.
• Why do we use regular expressions to define the lexical syntax of a language?
There are several reasons:
• Separating the syntactic structure of a language into lexical and non-lexical parts
provides a convenient way of modularizing the front end of the compiler into two
manageable-sized components.
• The lexical rules of a language are frequently quite simple, and we do not need a
notation as powerful as grammars to describe them.
• Regular expressions generally provide a more concise and easier-to-understand
notation for tokens than grammars.
• More efficient lexical analyzers can be constructed automatically from regular
expressions than from arbitrary grammars.
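For instance (an illustrative comparison, not taken from the slides), the token id is captured by the single regular expression letter (letter | digit)*, whereas an equivalent grammar needs several productions:

id → letter rest
rest → letter rest | digit rest | ε

The regular-expression form is shorter and maps directly onto an efficient DFA-based lexer.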
Rewriting the ambiguous expression grammar into an equivalent unambiguous form:

Ambiguous grammar:
E → E + E
E → E * E
E → id

Rewritten to make precedence and associativity explicit:
E → E + P
E → P
P → P * Q
P → Q
Q → id

or equivalently:   E → E + P | P,   P → P * Q | Q,   Q → id
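As a quick check (a worked derivation, not on the slides), the rewritten grammar makes * bind tighter than + in id + id * id:

E ⇒ E + P ⇒ P + P ⇒ Q + P ⇒ id + P
  ⇒ id + P * Q ⇒ id + Q * Q ⇒ id + id * Q ⇒ id + id * id

so id * id is derived from a single P and forms a subtree under the right operand of +.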
L → L , S | S
After eliminating left recursion:
L → S L′
L′ → , S L′ | ε
S → ( L ) | a
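A minimal sketch of the transformation in code (the function name and grammar representation are my own choices; it handles only immediate left recursion):

def eliminate_immediate_left_recursion(nt, productions):
    # Rewrite A -> A a1 | ... | A am | b1 | ... | bn  into
    #   A  -> b1 A' | ... | bn A'
    #   A' -> a1 A' | ... | am A' | ε
    # Each production is a list of grammar symbols (strings).
    recursive = [p[1:] for p in productions if p and p[0] == nt]
    others = [p for p in productions if not p or p[0] != nt]
    if not recursive:
        return {nt: productions}          # nothing to do
    new_nt = nt + "'"
    return {
        nt: [beta + [new_nt] for beta in others],
        new_nt: [alpha + [new_nt] for alpha in recursive] + [["ε"]],
    }

# The example above: L -> L , S | S
print(eliminate_immediate_left_recursion("L", [["L", ",", "S"], ["S"]]))
# {'L': [['S', "L'"]], "L'": [[',', 'S', "L'"], ['ε']]}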
Finding FOLLOW Set: Example - 2
CFG:
• S → A a A b | B b B a
• A → ε
• B → ε
FOLLOW:
FOLLOW(S) = { $ }
FOLLOW(A) = { a, b }
FOLLOW(B) = { a, b }

Finding FOLLOW Set: Example - 3
CFG 1:
• C → a | b
FOLLOW(S) = { $ }
FOLLOW(A) = FIRST(C) = { a, b }
FOLLOW(D) = FOLLOW(S) = { $ }
CFG 2:
• A → D E F
• B → ε
• C → ε
• D → ε
• E → ε
• F → ε
FOLLOW(S) = { $ }
FOLLOW(A) = FIRST(B) ∪ FIRST(C) ∪ FOLLOW(S) = { $ }
FOLLOW(D) = FIRST(E) ∪ FIRST(F) ∪ FOLLOW(A) = { $ }
Finding FOLLOW Set: Example - 4
CFG: FIRST:
𝑺 → 𝒊𝑬𝒕𝑺𝑺′ | 𝒂 FIRST(S) = { i, a }
𝑺′ → 𝒆𝑺 | 𝝐 FIRST(S ′ ) = { e, 𝝐 }
𝑬→𝒃 FIRST(E) = { b }
FOLLOW:
FOLLOW(S) = { $ } ∪ ( FIRST(S′) - { ε } ) = { $, e }
FOLLOW(S′) = FOLLOW(S) = { $, e }
FOLLOW(E) = { t }
FOLLOW (for the grammar E → T E′, E′ → + T E′ | ε, T → F T′, T′ → * F T′ | ε, F → ( E ) | id):
FOLLOW(E) = { $, ) }
FOLLOW(E′) = FOLLOW(E) = { $, ) }
FOLLOW(T) = ( FIRST(E′) - { ε } ) ∪ FOLLOW(E) ∪ FOLLOW(E′) = { +, $, ) }
FOLLOW(T′) = FOLLOW(T) = { +, $, ) }
FOLLOW(F) = ( FIRST(T′) - { ε } ) ∪ FOLLOW(T) ∪ FOLLOW(T′) = { *, +, $, ) }
FOLLOW (for the grammar S → ( L ) | a, L → S L′, L′ → , S L′ | ε above):
FOLLOW(S) = { $ } ∪ ( FIRST(L′) - { ε } ) ∪ FOLLOW(L) ∪ FOLLOW(L′) = { $, ',', ) }
FOLLOW(L) = { ) }
FOLLOW(L′) = FOLLOW(L) = { ) }
FOLLOW:
FOLLOW(S) = { $ }
FOLLOW(A) = ( FIRST(C) - { ε } ) ∪ ( FIRST(B) - { ε } ) ∪ FOLLOW(S) = { h, g, $ }
FOLLOW(B) = FOLLOW(S) ∪ FIRST(a) ∪ ( FIRST(C) - { ε } ) ∪ FOLLOW(A) = { $, a, h, g }
FOLLOW(C) = ( FIRST(B) - { ε } ) ∪ FOLLOW(S) ∪ FIRST(b) ∪ FOLLOW(A) = { g, $, b, h }
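The computations in Examples 2–4 follow the usual fixed-point rules; a compact sketch in code (the grammar encoding, ε handling, and symbol names are my own choices):

EPS = "ε"

def first_of_string(symbols, grammar, first):
    # FIRST of a sequence of grammar symbols
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in grammar else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)                       # every symbol in the sequence can derive ε
    return result

def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, prods in grammar.items():
            for prod in prods:
                f = first_of_string(prod, grammar, first)
                if not f <= first[nt]:
                    first[nt] |= f
                    changed = True
    return first

def compute_follow(grammar, start, first):
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")                # rule 1: $ follows the start symbol
    changed = True
    while changed:
        changed = False
        for nt, prods in grammar.items():
            for prod in prods:
                for i, sym in enumerate(prod):
                    if sym not in grammar:
                        continue          # terminals have no FOLLOW set
                    trailer = first_of_string(prod[i + 1:], grammar, first)
                    new = trailer - {EPS}
                    if EPS in trailer:    # rule 3: everything after sym can derive ε
                        new |= follow[nt]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

# Example 2 above: S -> AaAb | BbBa, A -> ε, B -> ε  ([] encodes an ε-production)
g = {"S": [["A", "a", "A", "b"], ["B", "b", "B", "a"]], "A": [[]], "B": [[]]}
print(compute_follow(g, "S", compute_first(g)))
# FOLLOW(S) = { $ }, FOLLOW(A) = { a, b }, FOLLOW(B) = { a, b } (set order may vary)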
Parsers
Input: id + id * id
NOTE: Top-down parsing finds a leftmost derivation for the input string.
Let us construct a parse tree top-down for the input string w = cad
Note that this pseudocode is non-deterministic, since it begins by choosing the A-production to
apply in a manner that is not specified.
Recursive-Descent Parsing – contd…
General recursive descent may require backtracking; that is, it may require
repeated scans over the input.
The previous code needs to be modified to allow backtracking.
In its general form, it cannot easily choose an appropriate production, so we need to try
all the alternatives.
If one fails, the input pointer needs to be reset and another alternative has to be
tried.
Recursive-descent parsers cannot be used for left-recursive grammars, since they
can go into an infinite loop.
• When we try to expand a nonterminal A, we may eventually find ourselves
again trying to expand A without having consumed any input.
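A minimal backtracking recursive-descent parser in code, assuming the classic grammar S → c A d, A → a b | a together with the input w = cad mentioned earlier; saving and restoring the input pointer is the backtracking step:

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0                      # input pointer

    def match(self, terminal):
        if self.pos < len(self.tokens) and self.tokens[self.pos] == terminal:
            self.pos += 1
            return True
        return False

    def S(self):                          # S -> c A d
        saved = self.pos
        if self.match("c") and self.A() and self.match("d"):
            return True
        self.pos = saved                  # backtrack: reset the input pointer
        return False

    def A(self):                          # A -> a b | a (try alternatives in order)
        saved = self.pos
        if self.match("a") and self.match("b"):
            return True
        self.pos = saved                  # first alternative failed: backtrack
        if self.match("a"):
            return True
        self.pos = saved
        return False

p = Parser(list("cad"))
print(p.S() and p.pos == len(p.tokens))   # True: the whole input is consumed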
Parsing Table
Non-Terminal FOLLOW
E $, )
𝑬′ $, )
T +, $, )
Parsing and error recovery moves made by a predictive parser on the erroneous input ) id * + id
Error Recovery in Predictive Parsing
Phrase-Level Recovery:
Phrase-level error recovery is implemented by filling in the blank entries in the predictive parsing
table with pointers to error routines.
These routines may change, insert, or delete symbols on the input and issue appropriate error
messages.
• They may also pop symbols from the stack.
• Altering stack symbols or pushing new symbols onto the stack is questionable for
the following reasons:
The steps carried out by the parser might then not correspond to the derivation of any word
in the language at all.
We must ensure that there is no possibility of an infinite loop.
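A sketch of the idea for the expression grammar E → TE′, E′ → +TE′ | ε, T → FT′, T′ → *FT′ | ε, F → (E) | id: blank table entries trigger an error routine. The specific corrections used here (delete the offending input token on a blank entry, pretend the expected terminal was present on a mismatch) are simple illustrative choices, not necessarily the routines from the slides:

TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "+"): [], ("T'", "*"): ["*", "F", "T'"], ("T'", ")"): [], ("T'", "$"): [],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}

def parse(tokens):
    tokens = tokens + ["$"]
    stack = ["$", "E"]
    i = 0
    while stack[-1] != "$":
        top, a = stack[-1], tokens[i]
        if top not in NONTERMINALS:            # terminal on top of the stack
            if top == a:
                stack.pop(); i += 1            # matched: advance the input
            else:                              # error routine: pretend 'top' was present
                print(f"error: missing '{top}' inserted"); stack.pop()
        elif (top, a) in TABLE:                # table entry: expand the nonterminal
            stack.pop()
            stack.extend(reversed(TABLE[(top, a)]))
        elif a != "$":                         # blank entry: delete the input symbol
            print(f"error: skipping '{a}'"); i += 1
        else:                                  # blank entry at end of input: pop instead
            print(f"error: popping {top}"); stack.pop()
    return tokens[i] == "$"

print(parse([")", "id", "*", "+", "id"]))      # the erroneous input ) id * + id
# error: skipping ')'
# error: skipping '+'
# True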