Unit 2-Part B
◦ Parsers with the viable prefix property detect that an error has occurred as soon as
they see a prefix of the input that cannot be completed to form a string of the language.
◦ Syntactic errors are detected when the stream of tokens coming from the lexical
analyzer does not match the grammatical rules.
Functions of the error handler in the parser:
1. Should report presence of errors clearly and accurately by specifying line
number of the error in source program
2. Should recover from each error as fast as possible so that subsequent errors
can be detected
3. Should not slow down processing of correct programs.
3. Error Production
◦ If user has knowledge of common errors that can be encountered then, these errors can
be incorporated by augmenting the grammar with error productions that generate
erroneous constructs.
◦ If this is used then, during parsing appropriate error messages can be generated and
parsing can be continued.
◦ Disadvantage is that it is difficult to maintain.
4. Global correction
◦ The parser examines the whole program and tries to find out the closest match for it
which is error free.
◦ The closest match program is the one that needs the fewest insertions, deletions and
changes of tokens to recover from the erroneous input.
◦ Due to high time and space complexity, this method is not implemented practically.
Construction of parse tree for the i/p string starting from root and
creating nodes of the parse tree in preorder (depth first)
Leftmost derivation for an i/p string
Terminals are basic symbols from which strings are formed
Recursive Descent Parsing
Recursive descent is a top-down parsing technique that constructs the parse tree
from the top and the input is read from left to right.
It uses a procedure for every non-terminal; terminals are matched directly against the input.
This parsing technique recursively parses the input to make a parse tree, which may
or may not require back-tracking.
But the grammar associated with it (if not left factored) cannot avoid back-tracking.
A form of recursive-descent parsing that does not require any back-tracking is
known as predictive parsing.
This parsing technique is regarded as recursive because it uses a context-free grammar,
which is recursive in nature.
Back-tracking
Top-down parsers start from the root node (start symbol) and match the input
string against the production rules, replacing non-terminals when a rule matches.
If a chosen rule later fails to match, the parser backs up and tries another rule.
Predictive Parser
A predictive parser is a recursive descent parser that can predict which
production is to be used to expand the current non-terminal.
The predictive parser does not suffer from backtracking.
To accomplish its tasks, the predictive parser uses a look-ahead pointer, which
points to the next input symbols.
To make the parser back-tracking free, the predictive parser puts some constraints
on the grammar and accepts only a class of grammar known as LL(k) grammar.
Predictive parsing uses a stack and a parsing table to parse the input and generate a
parse tree.
Both the stack and the input contain an end symbol $ to denote that the stack is
empty and the input is consumed.
The parser refers to the parsing table to take any decision on the input and stack
element combination.
In recursive descent parsing, the parser may have more than one production to
choose from for a single instance of input, whereas in a predictive parser each step
has at most one production to choose. There might be instances where no
production matches the input string, causing the parsing procedure to fail.
LL Parser
An LL Parser accepts LL grammar. LL grammar is a subset of context-free
grammar but with some restrictions to get the simplified version, in order to
achieve easy implementation. LL grammar can be implemented by means of both
algorithms namely, recursive-descent or table-driven.
An LL parser is denoted as LL(k). The first L in LL(k) stands for parsing the input from
left to right, the second L stands for left-most derivation, and k is the number of
look-ahead symbols. Generally k = 1, so the parser is usually written as LL(1).
Grammar
E → T E’
E’ → +T E’ | ε
T → F T’
T’ → * F T’ | ε
F → ( E) | id
Top down parser for id+id*id
Advantage
◦ Parser can be constructed easily by hand using top down methods
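As an illustration of how directly such a parser can be written by hand, here is a minimal sketch of a predictive recursive descent recognizer for the grammar above, with one procedure per non-terminal. The class name, the method names and the token list format are assumptions made for this example only.

```python
class Parser:
    """Recognizer for E -> T E', E' -> + T E' | ε, T -> F T', T' -> * F T' | ε, F -> ( E ) | id."""

    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]   # $ marks end of input
        self.pos = 0

    @property
    def look(self):
        return self.tokens[self.pos]         # single look-ahead symbol

    def match(self, terminal):
        if self.look != terminal:
            raise SyntaxError(f"expected {terminal}, found {self.look}")
        self.pos += 1

    def E(self):
        self.T()
        self.E_prime()

    def E_prime(self):
        if self.look == "+":                 # E' -> + T E'
            self.match("+")
            self.T()
            self.E_prime()
        # otherwise E' -> ε: consume nothing

    def T(self):
        self.F()
        self.T_prime()

    def T_prime(self):
        if self.look == "*":                 # T' -> * F T'
            self.match("*")
            self.F()
            self.T_prime()
        # otherwise T' -> ε

    def F(self):
        if self.look == "(":                 # F -> ( E )
            self.match("(")
            self.E()
            self.match(")")
        elif self.look == "id":              # F -> id
            self.match("id")
        else:
            raise SyntaxError(f"unexpected {self.look}")


def accepts(tokens):
    parser = Parser(tokens)
    try:
        parser.E()
        parser.match("$")                    # all input must be consumed
    except SyntaxError:
        return False
    return True


print(accepts(["id", "+", "id", "*", "id"]))
```

Each procedure decides which alternative to use by looking only at the next token, so no backtracking is needed.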
Difficulties in Top Down Parsing
1. Left Recursion
◦ Grammar G is left recursive if it has a non terminal A such that there is a derivation A ⇒+ Aα
for some string α
◦ Causes top down parser to go into infinite loop
◦ Left recursion must be eliminated from grammar before parsing using top down parser
2. Backtracking
◦ Caused by choosing a wrong alternative
◦ If erroneous expansions are made and a mismatch is subsequently discovered, all these
erroneous expansions have to be undone
◦ Involves exponential time complexity with respect to the length of the input
◦ To overcome this, use top down parsing that does not backtrack
◦ eg : predictive parsers
3. Order of Alternatives
◦ Order in which alternatives are tried affects the language accepted
◦ S → cAd
◦ A → ab | a
◦ String is cabd
If alternative a is tried before ab and the parser does not return to A, it fails to accept
string cabd: ca is already matched, but failure occurs when d is matched against b.
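The effect of ordering can be reproduced with a small sketch. It assumes a parser that commits to the first alternative of A that matches and never revisits that choice; the function name and its arguments are made up for this example.

```python
def parse_S(text, a_alternatives):
    """Recognize S -> c A d, committing to the first alternative of A that matches."""
    pos = 0
    if not text.startswith("c", pos):
        return False
    pos += 1
    for alternative in a_alternatives:
        if text.startswith(alternative, pos):
            pos += len(alternative)
            break                      # commit: later alternatives are never tried
    else:
        return False                   # no alternative of A matched
    # d must follow and must be the last symbol of the input
    return text.startswith("d", pos) and pos + 1 == len(text)


print(parse_S("cabd", ["ab", "a"]))    # ab tried first
print(parse_S("cabd", ["a", "ab"]))    # a tried first
```

With ab tried first the string cabd is accepted; with a tried first, a is matched and the parser then compares d with b and fails, although the grammar generates cabd.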
4. Report of failures
◦ When failure is reported, there is very little idea of where the error actually occurred.
◦ A TD parser with backtracking returns failure no matter what the error is
◦ 2. Within the newly expanded string, the next (leftmost) nonterminal is selected for
expansion and its first production is applied
◦ 3. Step 2 is repeated for all subsequent nonterminals that are selected, until
step 2 cannot be continued
◦ 5. If no other productions are available to replace the production that caused the error,
the error-causing expansion is replaced by the nonterminal itself and the process is backed
up again to undo the next most recently applied production.
ex: Consider the grammar
S → aAd | aB
A → b | c
B → ccd | ddc
String :- accd
Mismatch between second symbol c of input string and second symbol in sentential form abd
4. Backup
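The backup behaviour on this example can be sketched with a small backtracking recognizer. This is an illustrative implementation using Python generators, not the exact procedure from the steps above; the names GRAMMAR, derive and accepts are assumptions. It records every production it tries, so the undone expansions are visible in the trace.

```python
GRAMMAR = {
    "S": [["a", "A", "d"], ["a", "B"]],
    "A": [["b"], ["c"]],
    "B": [["c", "c", "d"], ["d", "d", "c"]],
}


def derive(symbols, text, pos, trace):
    """Yield every input position reachable after matching `symbols` starting at `pos`."""
    if not symbols:
        yield pos
        return
    head, rest = symbols[0], symbols[1:]
    if head in GRAMMAR:                        # nonterminal: try alternatives in order
        for body in GRAMMAR[head]:
            trace.append(f"{head} -> {''.join(body)}")
            for middle in derive(body, text, pos, trace):
                yield from derive(rest, text, middle, trace)
        # falling out of the loop means every alternative failed: back up
    elif text.startswith(head, pos):           # terminal: must match the input
        yield from derive(rest, text, pos + len(head), trace)


def accepts(text):
    trace = []
    ok = any(end == len(text) for end in derive(["S"], text, 0, trace))
    return ok, trace


ok, trace = accepts("accd")
print(ok)
print(trace)
```

For accd the parser first expands S → aAd, tries A → b (mismatch with c), then A → c (mismatch of d with c), backs up to S, and succeeds with S → aB and B → ccd. The sketch terminates only because this grammar is not left recursive.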
S → iEtS | iEtSeS | a
E→b
Here α = iEtS (the common prefix) and α2 = eS
The equivalent left factored grammar
S → iEtSS’ | a
S’ → eS | ε
E→b
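One step of left factoring can be sketched as follows. The sketch assumes every grammar symbol is a single character, represents ε as the empty string, and writes the new non-terminal with an ASCII apostrophe; the function name left_factor is made up for this example.

```python
from os.path import commonprefix  # character-wise common prefix of a list of strings


def left_factor(nonterminal, alternatives):
    """Factor out the longest common prefix of alternatives that start with the same symbol."""
    groups = {}
    for alternative in alternatives:
        groups.setdefault(alternative[:1], []).append(alternative)

    result = {nonterminal: []}
    fresh = nonterminal + "'"
    for first_symbol, group in groups.items():
        if len(group) == 1 or first_symbol == "":
            result[nonterminal].extend(group)          # nothing to factor
            continue
        prefix = commonprefix(group)
        result[nonterminal].append(prefix + fresh)     # A -> prefix A'
        result[fresh] = [alt[len(prefix):] for alt in group]   # "" stands for ε
        fresh += "'"
    return result


print(left_factor("S", ["iEtS", "iEtSeS", "a"]))
```

For S → iEtS | iEtSeS | a this yields S → iEtSS' | a and S' → ε | eS, the same grammar as above up to the order of alternatives.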
Left Recursion Example
Rule
A-> Aα | β
Then
A -> βA’
A’ -> αA’ | ε
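The rule can be applied mechanically. Below is a minimal sketch for immediate left recursion only. It assumes productions are given as lists of symbols, uses an empty list for ε, and writes the new non-terminal with an ASCII apostrophe; the function name is made up for this example.

```python
def eliminate_immediate_left_recursion(nonterminal, alternatives):
    """Rewrite A -> Aα | β as A -> βA', A' -> αA' | ε."""
    new_nonterminal = nonterminal + "'"
    # α parts: alternatives that start with A itself, with the leading A removed
    alphas = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    # β parts: all other alternatives
    betas = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not alphas:
        return {nonterminal: alternatives}     # no immediate left recursion
    return {
        nonterminal: [beta + [new_nonterminal] for beta in betas],
        new_nonterminal: [alpha + [new_nonterminal] for alpha in alphas] + [[]],
    }


print(eliminate_immediate_left_recursion("E", [["E", "+", "T"], ["T"]]))
print(eliminate_immediate_left_recursion("T", [["T", "*", "F"], ["F"]]))
```

Applied to E → E+T | T and T → T*F | F, this produces E → TE', E' → +TE' | ε and T → FT', T' → *FT' | ε.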
Preprocessing steps required for predictive parsing
Predictive parsing relies on two functions associated with a grammar G
FIRST and FOLLOW :- used by top down and bottom up parsers
In top down parsing, FIRST and FOLLOW allow us to choose which
production to apply based on the next input symbol
To compute FIRST set
If α is any string of grammar symbols, then FIRST(α) is the set of terminals that
begin the strings derived from α
If α ⇒* ε, then ε is also in FIRST(α)
To compute FIRST(X) for all grammar symbols X, apply the following rules until
no more terminals or ε can be added to any FIRST set
To compute FIRST
1. FIRST( E)
As E is a non terminal symbol, the 4th rule has to be applied. Production for E is E → TE’
Rule : If X → Y1Y2…Yk is a production, then for all i such that Y1, Y2, …, Yi-1 are all
non terminals and FIRST(Yj) contains ε for j = 1, 2, …, i-1 (i.e. Y1Y2…Yi-1 ⇒* ε),
add every non ε symbol in FIRST(Yi) to FIRST(X)
FIRST(E) = FIRST(T)
Again T is a non terminal; the production for T, T → FT’, is considered
FIRST(T) = FIRST(F)
Again F is a non terminal; consider F → (E)
FIRST(F) = FIRST( ( )
Here ‘(’ is a terminal symbol. FIRST applied to a terminal is the terminal itself, so ( is in
FIRST(F). The other alternative F → id adds id, giving FIRST(F) = { ( , id }
3. FIRST(T’)
T’ is a non terminal symbol. Consider the production T’ → *FT’ | ε
FIRST(T’) = FIRST( * ) = { * }
Another alternative production T’ → ε
FIRST(T’) = FIRST(ε) = { ε }
FIRST(T’) = { * , ε }
FIRST SET
FIRST(E) = { ( , id }
FIRST(E’) = { + , ε }
FIRST(T) = { ( , id }
FIRST(T’) = { * , ε }
FIRST(F) = { ( , id }
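The FIRST sets above can be checked with a short fixed-point computation. This is a sketch under the assumption that the grammar is stored as a dictionary from non-terminal to a list of right-hand sides; the names GRAMMAR, first_of_string and compute_first are chosen for this example, and E', T' are written with ASCII apostrophes.

```python
EPS = "ε"
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F":  [["(", "E", ")"], ["id"]],
}


def first_of_string(symbols, first):
    """FIRST of a string Y1 Y2 ... Yk, given the FIRST sets computed so far."""
    result = set()
    for symbol in symbols:
        # FIRST of a terminal (or of ε) is the symbol itself
        symbol_first = first[symbol] if symbol in GRAMMAR else {symbol}
        result |= symbol_first - {EPS}
        if EPS not in symbol_first:
            return result              # Yi cannot vanish, so later symbols are not seen
    result.add(EPS)                    # every Yi can derive ε
    return result


def compute_first():
    first = {nonterminal: set() for nonterminal in GRAMMAR}
    changed = True
    while changed:                     # repeat until no set grows any further
        changed = False
        for nonterminal, bodies in GRAMMAR.items():
            for body in bodies:
                new = first_of_string(body, first)
                if not new <= first[nonterminal]:
                    first[nonterminal] |= new
                    changed = True
    return first


for nonterminal, symbols in compute_first().items():
    print(nonterminal, sorted(symbols))
```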
To compute FOLLOW
1. FOLLOW(E)
E is the start symbol, so $ is in FOLLOW(E)
Check for the presence of non terminal symbol E on the right side of a production. If present,
consider the production
F → (E)
Here E is followed by ), so ) is in FOLLOW(E)
FOLLOW(E) = { ) , $ }
2. FOLLOW(E’)
Search for E’ on right side of the production
E → TE’ of the form A → αB where A = E, α = T and B = E’
So FOLLOW(E’) = FOLLOW(E) = { ) , $ }
3. FOLLOW(T)
Search for T on right side of the production
E → TE’ of the form A → αBβ where A = E, α = ε, B = T, β = E’; applying FIRST(β),
i.e. FIRST(E’) = { + , ε }
So FOLLOW(T) = { + , ε }
But it contains ε and it should be removed
**As E’ derives ε (E’ → ε), add FOLLOW(E) to the non ε symbols
RULE : If there is a production A → αB, or a production A → αBβ where FIRST(β)
contains ε, then everything in FOLLOW(A) is in FOLLOW(B)
FOLLOW(T) = { + , ) , $ }
4. FOLLOW(T’)
Search for T’ on the right side
T → FT’ of the form A → αB where A = T, α = F and B = T’
FOLLOW(T) = { + , ) , $ }
So
FOLLOW(T’) = FOLLOW(T) = { + , ) , $ }
5. FOLLOW(F)
Search for F on right side of production
T → FT’ of the form A → αBβ where A = T, α = ε, B = F, β = T’; applying FIRST(β),
i.e. FIRST(T’) = { * , ε }
So FOLLOW(F) = { * , ε }
But it contains ε and it should be removed
**As T’ derives ε (T’ → ε), add FOLLOW(T) to the non ε symbols
FOLLOW(F) = { + , * , ) , $ }
FOLLOW SET
Follow(E) = { ) , $ }
Follow(E’) = { ) , $ }
Follow(T) = { + , ) , $ }
Follow(T’) = { + , ) , $ }
Follow(F) = { * , + , ) , $ }
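The FOLLOW sets can be cross-checked in the same way. The sketch below takes the FIRST sets as given (copied from the FIRST SET listed earlier) and iterates the FOLLOW rules until nothing changes. The dictionary layout and the function names are assumptions for this example.

```python
EPS, END = "ε", "$"
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F":  [["(", "E", ")"], ["id"]],
}
FIRST = {
    "E": {"(", "id"}, "E'": {"+", EPS},
    "T": {"(", "id"}, "T'": {"*", EPS},
    "F": {"(", "id"},
}


def first_of_string(symbols):
    """FIRST of a string of grammar symbols; the empty string gives {ε}."""
    result = set()
    for symbol in symbols:
        symbol_first = FIRST.get(symbol, {symbol})   # terminal: the symbol itself
        result |= symbol_first - {EPS}
        if EPS not in symbol_first:
            return result
    result.add(EPS)
    return result


def compute_follow(start="E"):
    follow = {nonterminal: set() for nonterminal in GRAMMAR}
    follow[start].add(END)                           # $ follows the start symbol
    changed = True
    while changed:
        changed = False
        for head, bodies in GRAMMAR.items():
            for body in bodies:
                for i, symbol in enumerate(body):
                    if symbol not in GRAMMAR:
                        continue                     # FOLLOW is defined for nonterminals only
                    beta_first = first_of_string(body[i + 1:])
                    new = beta_first - {EPS}         # A -> αBβ: FIRST(β) without ε
                    if EPS in beta_first:
                        new |= follow[head]          # β can vanish: add FOLLOW(A)
                    if not new <= follow[symbol]:
                        follow[symbol] |= new
                        changed = True
    return follow


for nonterminal, symbols in compute_follow().items():
    print(nonterminal, sorted(symbols))
```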
Consider the Grammar
E → E+T | T
T→T*F|F
F → (E ) | id
Input : Grammar G
Output : Parsing table M
Method : For each production A → α of the grammar do the following
PARSING TABLE M
M[ E, ( ] = E → TE’
M[ E, id ] = E → TE’
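Entries such as the two above follow from the usual LL(1) table construction: for a production A → α, put it in M[A, a] for every terminal a in FIRST(α), and if FIRST(α) contains ε, also in M[A, b] for every b in FOLLOW(A). The sketch below applies this to the left-recursion-free grammar, taking the FIRST and FOLLOW sets listed earlier as given. The data layout and function names are assumptions for this example.

```python
EPS = "ε"
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F":  [["(", "E", ")"], ["id"]],
}
FIRST = {
    "E": {"(", "id"}, "E'": {"+", EPS},
    "T": {"(", "id"}, "T'": {"*", EPS},
    "F": {"(", "id"},
}
FOLLOW = {
    "E": {")", "$"}, "E'": {")", "$"},
    "T": {"+", ")", "$"}, "T'": {"+", ")", "$"},
    "F": {"*", "+", ")", "$"},
}


def first_of_string(symbols):
    result = set()
    for symbol in symbols:
        symbol_first = FIRST.get(symbol, {symbol})   # terminal: the symbol itself
        result |= symbol_first - {EPS}
        if EPS not in symbol_first:
            return result
    result.add(EPS)
    return result


def build_table():
    table = {}
    for head, bodies in GRAMMAR.items():
        for body in bodies:
            body_first = first_of_string(body)
            lookaheads = body_first - {EPS}
            if EPS in body_first:
                lookaheads |= FOLLOW[head]           # ε-production: use FOLLOW(A)
            for terminal in lookaheads:
                if (head, terminal) in table:        # two productions in one cell
                    raise ValueError(f"grammar is not LL(1) at M[{head}, {terminal}]")
                table[(head, terminal)] = body
    return table


table = build_table()
print(table[("E", "(")], table[("E", "id")])
```

Cells that receive no production stay absent from the dictionary; they correspond to the blank (error) entries of the table.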
2. Top element = T
i/p symbol = id; the table is searched for a production
M[ T, id ] = T → FT’
Thus FT’ is pushed, replacing T
Top element is F
3. Top element = F
i/p symbol = id
Thus M[ F, id ] = F → id
Terminal id is pushed onto the stack, replacing F
4. Top element = id
i/p symbol = id
They match, so id is popped from the stack and the input pointer advances.
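The stack moves traced above can be reproduced with a small table-driven parser. This is a sketch: the table is written out by hand for the grammar E → TE’, E’ → +TE’ | ε, T → FT’, T’ → *FT’ | ε, F → (E) | id, an empty list stands for an ε-production, and the function name parse is an assumption.

```python
TABLE = {
    ("E", "("): ["T", "E'"], ("E", "id"): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "("): ["F", "T'"], ("T", "id"): ["F", "T'"],
    ("T'", "*"): ["*", "F", "T'"],
    ("T'", "+"): [], ("T'", ")"): [], ("T'", "$"): [],
    ("F", "("): ["(", "E", ")"], ("F", "id"): ["id"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}


def parse(tokens):
    """Return (accepted, list of productions applied) for a list of tokens."""
    tokens = list(tokens) + ["$"]
    stack = ["$", "E"]                     # $ at the bottom, start symbol on top
    pos = 0
    output = []
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            body = TABLE.get((top, look))
            if body is None:               # blank table entry: syntax error
                return False, output
            output.append(f"{top} -> {' '.join(body) or 'ε'}")
            stack.extend(reversed(body))   # leftmost symbol ends up on top
        elif top == look:                  # terminal (or $) matches the input
            pos += 1
        else:
            return False, output
    return pos == len(tokens), output


accepted, productions = parse(["id", "+", "id", "*", "id"])
print(accepted)
for production in productions:
    print(production)
```

For id+id*id the first three productions printed are E -> T E', T -> F T' and F -> id, matching steps 1 to 3 of the trace.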
1. Panic-mode error recovery is based on the idea of skipping symbols on the input
until a token in a selected set of synchronizing tokens appears.
Synchronizing sets are chosen in such a way that a parser recovers quickly from
errors.
Panic mode recovery mechanism is effective
D. If a nonterminal can generate the empty string, then the production deriving ε
can be used as a default. This may postpone some error detection, but cannot cause
an error to be missed. This approach reduces the number of nonterminals that have
to be considered during error recovery.
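A simple variant of panic-mode recovery for the table-driven parser can be sketched as follows. It assumes FOLLOW(A) is used as the synchronizing set of nonterminal A: on a blank entry the parser pops A if the input token is in FOLLOW(A) and skips the token otherwise, and an unmatched terminal on the stack is popped. The table, the message texts and the function name are assumptions for this example.

```python
FOLLOW = {
    "E": {")", "$"}, "E'": {")", "$"},
    "T": {"+", ")", "$"}, "T'": {"+", ")", "$"},
    "F": {"*", "+", ")", "$"},
}
TABLE = {
    ("E", "("): ["T", "E'"], ("E", "id"): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "("): ["F", "T'"], ("T", "id"): ["F", "T'"],
    ("T'", "*"): ["*", "F", "T'"],
    ("T'", "+"): [], ("T'", ")"): [], ("T'", "$"): [],
    ("F", "("): ["(", "E", ")"], ("F", "id"): ["id"],
}


def parse_with_panic_mode(tokens):
    """Parse to the end of the input and return the list of recovery actions taken."""
    tokens = list(tokens) + ["$"]
    stack, pos, errors = ["$", "E"], 0, []
    while stack:
        top, look = stack[-1], tokens[pos]
        if top in FOLLOW:                               # nonterminal on top
            body = TABLE.get((top, look))
            if body is not None:
                stack.pop()
                stack.extend(reversed(body))
            elif look in FOLLOW[top]:                   # synchronizing token: give up on top
                errors.append(f"pop {top} at token {pos}")
                stack.pop()
            else:                                       # not synchronizing: skip the input token
                errors.append(f"skip {look} at token {pos}")
                pos += 1
        elif top == look:
            stack.pop()
            pos += 1
        elif top == "$":                                # stack exhausted, input left over
            errors.append(f"skip {look} at token {pos}")
            pos += 1
        else:                                           # terminal on stack does not match
            errors.append(f"missing {top} at token {pos}")
            stack.pop()
    return errors


print(parse_with_panic_mode(["id", "*", "+", "id"]))
```

On id * + id the parser finds no entry for M[F, +]; since + is in FOLLOW(F) it pops F, reports one error, and continues parsing the rest of the input.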
Blank entries in predictive parsing tables are filled with pointers to error routines.
These error routines may change, insert or delete symbols on the input and issue
appropriate error messages.
RECURSIVE PREDICTIVE DESCENT PARSER | NON-RECURSIVE PREDICTIVE DESCENT PARSER
It is a technique which may or may not require a backtracking process. | It is a technique that does not require any kind of backtracking.
It uses procedures for every non terminal entity to parse strings. | It finds the production to use from the parsing table, given the stack top and the next input symbol.

TOP-DOWN PARSING | BOTTOM-UP PARSING
It is a parsing strategy that first looks at the highest level of the parse tree and works down the parse tree by using the rules of grammar. | It is a parsing strategy that first looks at the lowest level of the parse tree and works up the parse tree by using the rules of grammar.
This parsing technique uses Left Most Derivation. | This parsing technique uses Right Most Derivation (in reverse).
Its main decision is to select what production rule to use in order to construct the string. | Its main decision is to select when to use a production rule to reduce the string to get the starting symbol.