
Module-2 1

The document discusses syntax analysis and parsing. It covers topics like context free grammars, derivation trees, ambiguity, elimination of left recursion, predictive parsing and LL(1) grammars. The goal of parsing is to determine if a string can be generated by a grammar by constructing a parse tree.

Uploaded by

AMAL KRISHNA A P

• Role of the Syntax Analyser - Syntax error handling.
• Review of Context Free Grammars - Derivation and Parse Trees, Eliminating Ambiguity.
• Basic parsing approaches - Eliminating left recursion, left factoring.
• Top-Down Parsing - Recursive Descent parsing, Predictive Parsing, LL(1) Grammars.
SYNTAX ANALYSIS
• The second phase of a compiler is the syntax analyzer, or parser.

• The parser receives a stream of tokens from the lexical analyzer and verifies that the string can be generated by the grammar for the source language by constructing a parse tree.

• The term parsing comes from the Latin word pars, meaning "part", as in pars orationis ("part of speech").
[Figure: Scanner (Lexical Analyzer) → Tokens → Parser (Syntax Analyzer)]
INTERACTION BETWEEN LEXICAL ANALYZER AND PARSER
CONTEXT FREE GRAMMAR (CFG)
• A context free grammar is a grammar whose productions are of the form

A → α

where A is a nonterminal and α is a string of terminals and nonterminals (α can also be empty).

 A formal grammar is "context free" if its production rules can be applied regardless
of the context of a nonterminal.
 No matter which symbols surround it, the single nonterminal on the left hand side
can always be replaced by the right hand side.
• A CFG consists of four components (NTPS):

• Terminals
   • the basic symbols from which strings are formed
   • they correspond to tokens
• Nonterminals
   • syntactic variables that define sets of strings, which help define the language generated by the grammar
• Productions
• Start symbol
Grammar for simple arithmetic expression
DERIVATION
• A derivation is a sequence of production rule applications that generates the input string from the start symbol.

• Beginning with the start symbol, each step replaces a nonterminal by the body of one of its productions.

• Types:

• Left Most Derivation - In left most derivation, the left most non terminal is replaced in each step

• Right Most Derivation - In right most derivation, the right most non terminal is replaced in each
step
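
The two derivation orders can be sketched in a few lines of Python. The grammar E → E + E | id and the helper name `derive` are assumptions made for this illustration only; they are not taken from the slides.

```python
def derive(start, steps, nonterminals, leftmost=True):
    """Apply productions in order, rewriting the leftmost (or rightmost)
    nonterminal of the current sentential form at each step."""
    form = [start]
    history = [" ".join(form)]
    for head, body in steps:
        positions = [i for i, sym in enumerate(form) if sym in nonterminals]
        pos = positions[0] if leftmost else positions[-1]
        if form[pos] != head:
            raise ValueError(f"cannot apply a production for {head} to {form[pos]}")
        form[pos:pos + 1] = body          # replace the nonterminal by the body
        history.append(" ".join(form))
    return history


# Hypothetical grammar: E -> E + E | id
NONTERMINALS = {"E"}
STEPS = [("E", ["E", "+", "E"]), ("E", ["id"]), ("E", ["id"])]

leftmost_forms = derive("E", STEPS, NONTERMINALS, leftmost=True)
rightmost_forms = derive("E", STEPS, NONTERMINALS, leftmost=False)
print(" => ".join(leftmost_forms))
print(" => ".join(rightmost_forms))
```

Both runs end in the same sentence, id + id; they differ only in the intermediate sentential forms (id + E in the leftmost derivation, E + id in the rightmost one).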
Consider the grammar
PARSE TREE
• A parse tree is a hierarchical structure that represents the derivation of an input string from the grammar.

• Simply put, it is the graphical representation of a derivation.

• It is also called a derivation tree.

• Yield of the parse tree

• The leaves of the parse tree are labeled by nonterminals or terminals; read from left to right, they constitute a sentential form, called the yield or frontier of the tree.
• Parsing is the process of determining whether a string of tokens can be generated by a grammar.

• 2 approaches
   • Top Down Parsing - In top down parsing, the parse tree is constructed from the top (root) to the bottom (leaves).

   • Bottom Up Parsing - In bottom up parsing, the parse tree is constructed from the bottom (leaves) to the top (root).
[Figure: Top Down Parsing / Bottom Up Parsing]
• Top down parsing can be viewed as an attempt to find a leftmost derivation for an input string (that is, expanding the leftmost nonterminal at every step).

 TDP approaches:

 Recursive Descent Parser

 Predictive Parser
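RECURSIVE DESCENT PARSING
IMPLEMENTATION
// The procedures correspond to a grammar of the form S → cAd, A → ab | a
// (inferred from the code). advance() is assumed to move to the next
// input symbol; the consume steps were missing from the original.
Procedure S()
{  if nextsymbol = 'c'
   {  advance();               // consume 'c'
      A();
      if nextsymbol = 'd'
      {  advance();            // consume 'd'
         return success;
      }
   }
   error;
}
Procedure A()
{  if nextsymbol = 'a'
   {  advance();               // consume 'a'
      if nextsymbol = 'b'
         advance();            // consume the optional 'b'
      return;
   }
   error;
}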
• It is the most general form of top-down parsing.

• It may involve backtracking, that is, making repeated scans of the input, to obtain the correct expansion of the leftmost nonterminal.

• Unless the grammar is ambiguous or left-recursive, it finds a suitable parse tree.
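
A minimal Python sketch of recursive descent with backtracking follows. It assumes the grammar S → cAd, A → ab | a, which is inferred from the procedures S() and A() above rather than stated in the slides; the function names `parse`, `match_body` and `accepts` are hypothetical. Generators are used so that, when a later symbol fails to match, the parser falls back to the next alternative of an earlier nonterminal.

```python
# Grammar assumed from the S()/A() procedures: S -> c A d ; A -> a b | a
GRAMMAR = {
    "S": [["c", "A", "d"]],
    "A": [["a", "b"], ["a"]],
}


def parse(symbol, text, pos):
    """Yield every input position reachable after matching `symbol` at `pos`."""
    if symbol not in GRAMMAR:              # terminal: must match the input
        if text.startswith(symbol, pos):
            yield pos + len(symbol)
        return
    for body in GRAMMAR[symbol]:           # nonterminal: try each alternative in order
        yield from match_body(body, text, pos)


def match_body(body, text, pos):
    """Match a sequence of symbols; earlier choices are retried on failure."""
    if not body:
        yield pos
        return
    for mid in parse(body[0], text, pos):
        yield from match_body(body[1:], text, mid)


def accepts(text):
    """True if the whole input can be derived from the start symbol S."""
    return any(end == len(text) for end in parse("S", text, 0))


print(accepts("cad"), accepts("cabd"), accepts("cab"))
```

On the input cad, the alternative A → ab is tried first and fails at b, so the parser backtracks and succeeds with A → a. A left-recursive grammar would make `parse` call itself without consuming input, which is the infinite loop described under the drawbacks.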
Drawbacks of RDP

• A left-recursive grammar can cause a recursive-descent parser to go into an infinite loop. That is, when we try to expand A, we may eventually find ourselves again trying to expand A without having consumed any input.

• Backtracking recursive-descent parsers are not very common, because programming language constructs can usually be parsed without backtracking.

• Not suitable for ambiguous grammars.


PREDICTIVE PARSER
• A predictive parser is able to predict which alternative production should be used to expand the current nonterminal.

• Predictive parsing is a special form of recursive-descent parsing in which the current input token unambiguously determines the production to be applied at each step.

• The goal of predictive parsing is to construct a top-down parser that never backtracks.
 It is possible to build a non-recursive predictive parser by maintaining a stack explicitly, rather
than implicitly via recursive calls.

Model of non-recursive predictive parser


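• Input buffer:
   • contains the string to be parsed, followed by $ (used to indicate the end of the input string)

• Stack:
   • holds grammar symbols; it is initialized with $ to indicate the bottom of the stack, with the start symbol on top of it

• Parsing table:
   • a two-dimensional array M[A,a], where A is a nonterminal and a is a terminal or the symbol $

• The parser is controlled by a program that looks at the symbol on top of the stack and the current input symbol to decide each move.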


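Note: when the nonterminal on top of the stack is expanded by a production X → Y1Y2...Yk, the body is reversed and pushed onto the stack, so that Y1 ends up on top.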
EXAMPLE:
Input : id + id * id
Grammar :
E → TE'
E' → +TE' | є
T → FT'
T' → *FT' | є
F → (E) | id

Moves made by predictive parser for the input id+id*id
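
The table of moves itself is not reproduced here. As a rough sketch, the Python code below drives a stack with a hard-coded parsing table for the example grammar and returns the productions applied on the input id + id * id. The table entries are the ones the usual FIRST/FOLLOW construction yields for this grammar; the names `TABLE`, `predictive_parse`, `E1` and `T1` (standing for E' and T') are choices made for this sketch.

```python
# Parsing table M[A, a] for the example grammar; [] denotes an є-production.
TABLE = {
    ("E", "id"): ["T", "E1"], ("E", "("): ["T", "E1"],
    ("E1", "+"): ["+", "T", "E1"], ("E1", ")"): [], ("E1", "$"): [],
    ("T", "id"): ["F", "T1"], ("T", "("): ["F", "T1"],
    ("T1", "+"): [], ("T1", "*"): ["*", "F", "T1"],
    ("T1", ")"): [], ("T1", "$"): [],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E1", "T", "T1", "F"}


def predictive_parse(tokens, start="E"):
    """Return the productions applied, or raise SyntaxError on a bad input."""
    tokens = list(tokens) + ["$"]          # end marker on the input
    stack = ["$", start]                   # $ marks the bottom of the stack
    pos = 0
    output = []
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            body = TABLE.get((top, look))
            if body is None:
                raise SyntaxError(f"no table entry M[{top}, {look}]")
            output.append(f"{top} -> {' '.join(body) or 'є'}")
            stack.extend(reversed(body))   # reverse and push: body[0] ends on top
        elif top == look:
            pos += 1                       # matched a terminal (or the final $)
        else:
            raise SyntaxError(f"expected {top}, found {look}")
    return output


moves = predictive_parse(["id", "+", "id", "*", "id"])
for move in moves:
    print(move)
```

Because every decision is a single table lookup on the current input symbol, the parser never backtracks.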

• Construction of the predictive parsing table uses 2 functions:
   • FIRST()
   • FOLLOW()
• These functions allow us to fill the entries of the predictive parsing table.

RULES TO COMPUTE FIRST SET

1) If X is a terminal, then FIRST(X) is {X}.

2) If X → є is a production, then add є to FIRST(X).
3) If X is a nonterminal and X → Y1Y2Y3...Yn is a production, then put 'a' in FIRST(X) if for some i, a is in FIRST(Yi) and є is in all of FIRST(Y1),...,FIRST(Yi-1). If є is in FIRST(Yj) for every j = 1,...,n, then add є to FIRST(X).
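
The three rules can be applied repeatedly until no FIRST set changes. The following Python sketch does this for the expression grammar of the earlier example (E → TE', E' → +TE' | є, T → FT', T' → *FT' | є, F → (E) | id). The dictionary encoding of the grammar and the name `compute_first` are assumptions of this sketch.

```python
EPS = "є"


def compute_first(grammar):
    """grammar maps each nonterminal to a list of bodies (lists of symbols);
    an empty list stands for the production X -> є."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:                          # repeat until no FIRST set grows
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                before = len(first[head])
                all_nullable = True
                for sym in body:
                    # Rule 1: FIRST of a terminal is the terminal itself.
                    sym_first = first[sym] if sym in grammar else {sym}
                    first[head] |= sym_first - {EPS}          # rule 3
                    if EPS not in sym_first:
                        all_nullable = False
                        break
                if all_nullable:            # rule 2, or every Yi derives є
                    first[head].add(EPS)
                changed |= len(first[head]) != before
    return first


GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"]],
}
FIRST = compute_first(GRAMMAR)
for nt, symbols in FIRST.items():
    print(nt, sorted(symbols))
```

For this grammar the sketch gives FIRST(E) = FIRST(T) = FIRST(F) = { (, id }, FIRST(E') = { +, є } and FIRST(T') = { *, є }.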
FOLLOW

• FOLLOW is defined only for the nonterminals of the grammar G.

• FOLLOW(A) can be defined as the set of terminals of G that can immediately follow the nonterminal A in some sentential form derived from the start symbol.
• In other words, if A is a nonterminal, then FOLLOW(A) is the set of terminals 'a' that can appear immediately to the right of A in some sentential form. If A can be the rightmost symbol of a sentential form, then $ is in FOLLOW(A).

RULES TO COMPUTE FOLLOW SET

1. If S is the start symbol, then add $ to FOLLOW(S).

2. If there is a production rule A → αBβ, then everything in FIRST(β) except є is placed in FOLLOW(B).

3. If there is a production A → αB, or a production A → αBβ where FIRST(β) contains є, then everything in FOLLOW(A) is in FOLLOW(B).
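
As with FIRST, the FOLLOW rules are applied until nothing changes. The sketch below again uses the expression grammar E → TE', E' → +TE' | є, T → FT', T' → *FT' | є, F → (E) | id, and takes its FIRST sets as given (written out by hand) so that the code can focus on the three FOLLOW rules. The names `first_of_string` and `compute_follow` are hypothetical.

```python
EPS = "є"

GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"]],
}
# FIRST sets of the nonterminals, supplied by hand for this sketch.
FIRST = {
    "E": {"(", "id"}, "T": {"(", "id"}, "F": {"(", "id"},
    "E'": {"+", EPS}, "T'": {"*", EPS},
}


def first_of_string(symbols):
    """FIRST of a string of symbols; contains є if the whole string can vanish."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})   # a terminal's FIRST set is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)                         # every symbol (or none) derives є
    return result


def compute_follow(grammar, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")                                  # rule 1
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                for i, sym in enumerate(body):
                    if sym not in grammar:
                        continue            # FOLLOW is only for nonterminals
                    before = len(follow[sym])
                    beta_first = first_of_string(body[i + 1:])
                    follow[sym] |= beta_first - {EPS}       # rule 2
                    if EPS in beta_first:                   # rule 3
                        follow[sym] |= follow[head]
                    changed |= len(follow[sym]) != before
    return follow


FOLLOW = compute_follow(GRAMMAR, "E")
for nt, symbols in FOLLOW.items():
    print(nt, sorted(symbols))
```

Rule 3 covers both of its cases at once here, because FIRST of an empty β is treated as { є }.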

• Calculate FIRST and FOLLOW for the given grammar:
S → aBDh
B → cC
C → bC | є
D → EF
E → g | є
F → f | є
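
One way to check an answer to this exercise is to compute both sets mechanically. The Python sketch below runs a single fixed-point loop over the FIRST and FOLLOW rules for the grammar above; the encoding and the names `first_of` and `first_and_follow` are assumptions of the sketch, not part of the slides.

```python
EPS = "є"

GRAMMAR = {
    "S": [["a", "B", "D", "h"]],
    "B": [["c", "C"]],
    "C": [["b", "C"], []],
    "D": [["E", "F"]],
    "E": [["g"], []],
    "F": [["f"], []],
}


def first_of(symbols, first):
    """FIRST of a string of symbols, using the FIRST sets computed so far."""
    out = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})   # terminals are not keys of `first`
        out |= sym_first - {EPS}
        if EPS not in sym_first:
            return out
    return out | {EPS}


def first_and_follow(grammar, start):
    first = {nt: set() for nt in grammar}
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")

    def total_size():
        return sum(len(s) for s in first.values()) + sum(len(s) for s in follow.values())

    while True:
        size_before = total_size()
        for head, bodies in grammar.items():
            for body in bodies:
                first[head] |= first_of(body, first)
                for i, sym in enumerate(body):
                    if sym in grammar:      # only nonterminals have FOLLOW sets
                        rest = first_of(body[i + 1:], first)
                        follow[sym] |= rest - {EPS}
                        if EPS in rest:
                            follow[sym] |= follow[head]
        if total_size() == size_before:     # nothing grew: fixed point reached
            return first, follow


FIRST, FOLLOW = first_and_follow(GRAMMAR, "S")
for nt in GRAMMAR:
    print(nt, "FIRST =", sorted(FIRST[nt]), "FOLLOW =", sorted(FOLLOW[nt]))
```

The result is FIRST(S) = { a }, FIRST(B) = { c }, FIRST(C) = { b, є }, FIRST(D) = { g, f, є }, FIRST(E) = { g, є }, FIRST(F) = { f, є }, and FOLLOW(S) = { $ }, FOLLOW(B) = FOLLOW(C) = { g, f, h }, FOLLOW(D) = FOLLOW(F) = { h }, FOLLOW(E) = { f, h }.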
• A context-free grammar G whose parsing table has no multiply-defined entries is said to be LL(1).

• LL(1) grammars are the class of grammars from which predictive parsers can be constructed.

• In the name LL(1),

   • the first L stands for scanning the input from left to right,

   • the second L stands for producing a leftmost derivation,

   • and the 1 stands for using one input symbol of lookahead at each step to make parsing action decisions.
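
The "no multiple entries" condition can be checked mechanically. The Python sketch below fills the table with the usual construction (for each production A → α, enter it under every terminal in FIRST(α), and under every symbol in FOLLOW(A) when α can derive є) and then looks for cells holding more than one production. FIRST(α) and FOLLOW are written out by hand here; the second grammar, S → ab | ac, is a hypothetical example chosen because both alternatives start with a.

```python
from collections import defaultdict

EPS = "є"


def build_table(productions, follow):
    """productions: list of (head, body, FIRST(body)) triples.
    Returns a dict mapping (A, a) to the list of bodies entered in M[A, a]."""
    table = defaultdict(list)
    for head, body, first_body in productions:
        for a in first_body - {EPS}:
            table[(head, a)].append(body)
        if EPS in first_body:               # body can vanish: use FOLLOW(head)
            for b in follow[head]:
                table[(head, b)].append(body)
    return table


def is_ll1(table):
    """LL(1) means every filled cell holds exactly one production."""
    return all(len(entries) == 1 for entries in table.values())


# Expression grammar; FIRST of each body and FOLLOW supplied by hand.
EXPR = [
    ("E", ["T", "E'"], {"(", "id"}),
    ("E'", ["+", "T", "E'"], {"+"}),
    ("E'", [], {EPS}),
    ("T", ["F", "T'"], {"(", "id"}),
    ("T'", ["*", "F", "T'"], {"*"}),
    ("T'", [], {EPS}),
    ("F", ["(", "E", ")"], {"("}),
    ("F", ["id"], {"id"}),
]
EXPR_FOLLOW = {
    "E": {")", "$"}, "E'": {")", "$"},
    "T": {"+", ")", "$"}, "T'": {"+", ")", "$"},
    "F": {"+", "*", ")", "$"},
}

# Hypothetical grammar with a common prefix: S -> a b | a c
PREFIX = [("S", ["a", "b"], {"a"}), ("S", ["a", "c"], {"a"})]
PREFIX_FOLLOW = {"S": {"$"}}

expr_table = build_table(EXPR, EXPR_FOLLOW)
prefix_table = build_table(PREFIX, PREFIX_FOLLOW)
print(is_ll1(expr_table), is_ll1(prefix_table))
```

The expression grammar fills 13 cells with one production each, so it is LL(1). The second grammar places two productions in M[S, a], which is the kind of conflict that left factoring is meant to remove.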
[Example: a grammar that is not LL(1) (figure not recovered)]
• The goal of predictive parsing is to construct a top-down parser that never backtracks. To do so, we must transform the grammar in two ways:
   • Eliminate left recursion
   • Perform left factoring

• These transformations eliminate the most common causes of backtracking.


• A grammar is left recursive if it has a production of the form A → Aα. The problem is that if we use such a production in a top-down derivation, we will fall into an infinite derivation chain. This is called left recursion.

Eliminating Left Recursion


• The left-recursive pair of productions A → Aα | β can be replaced by productions that are not left recursive:
A → βA'
A' → αA' | є
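
The replacement of A → Aα | β by A → βA' and A' → αA' | є can be sketched in Python as below. The function name and the list-of-symbols encoding are assumptions of this sketch, and it handles only immediate left recursion (a production whose body starts with its own head), not recursion that passes through other nonterminals.

```python
def eliminate_immediate_left_recursion(head, bodies):
    """Rewrite A -> A alpha | beta as A -> beta A', A' -> alpha A' | є.
    Bodies are lists of symbols; an empty list stands for є."""
    alphas = [body[1:] for body in bodies if body and body[0] == head]
    betas = [body for body in bodies if not body or body[0] != head]
    if not alphas:
        return {head: bodies}               # no left recursion: leave unchanged
    new_head = head + "'"
    return {
        head: [beta + [new_head] for beta in betas],
        new_head: [alpha + [new_head] for alpha in alphas] + [[]],
    }


# E -> E + T | T  (here alpha is "+ T" and beta is "T")
rewritten = eliminate_immediate_left_recursion("E", [["E", "+", "T"], ["T"]])
for nt, bodies in rewritten.items():
    print(nt, "->", " | ".join(" ".join(body) or "є" for body in bodies))
```

Applied to E → E + T | T, the sketch produces E → T E' and E' → + T E' | є, which has the same shape as the first two productions of the expression grammar used in the predictive parsing example.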
AMBIGUITY

An ambiguous sentence has two or more possible meanings within a single sentence or sequence
of words. This can confuse the reader and make the meaning of the sentence unclear.
AMBIGUOUS GRAMMAR
• An ambiguous grammar is one that produces more than one leftmost or more than one rightmost derivation for the same sentence.

• For most parsers, it is desirable that the grammar be made unambiguous, for if it is not, we cannot uniquely determine which parse tree to select for a sentence.
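
To make the definition concrete, the Python sketch below counts the parse trees of a token string under the grammar E → E + E | E * E | id. This grammar is a hypothetical example, not one given in the slides. Since each parse tree corresponds to exactly one leftmost derivation, a count above 1 shows that the sentence, and therefore the grammar, is ambiguous.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count parse trees of `tokens` under E -> E + E | E * E | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(lo, hi):
        """Number of trees deriving tokens[lo:hi] from E."""
        if hi - lo == 1:
            return 1 if tokens[lo] == "id" else 0
        total = 0
        for k in range(lo + 1, hi - 1):     # pick the operator at the root
            if tokens[k] in ("+", "*"):
                total += count(lo, k) * count(k + 1, hi)
        return total

    return count(0, len(tokens))


trees = count_parse_trees(["id", "+", "id", "*", "id"])
print(trees)
```

For id + id * id the count is 2: one tree has + at the root and the other has * at the root, so the parser has no unique tree to select.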
