CD - Unit 2
Unit-2
TOP-DOWN PARSING
Role of Parser - Grammars - Error Handling - Context-Free Grammars -
Writing a grammar - Elimination of Ambiguity - Left Recursion - Left
Factoring - Top Down Parsing - Recursive Descent Parser - Predictive Parser -
LL(1) Parser - Computation of FIRST - Computation of FOLLOW -
Construction of a predictive parsing table - Predictive Parsers LL(1)
Grammars - Predictive Parsing Algorithm - Problems related to Predictive
Parser - Error Recovery in Predictive Parsing.
• Grammars are capable of describing most, but not all, of the syntax
of programming languages.
• A grammar defines the rules that specify the structure of a language.
• A grammar G is defined as a 4-tuple G = (V, T, P, S)
where,
G − Grammar
V − Set of variables
T − Set of terminals
P − Set of productions
S − Start symbol
Types of Grammar
1. Regular Grammar
2. Context Free Grammar
3. Context Sensitive Grammar
4. Phrase Structure Grammar
COMPILER DESIGN _UNIT-2 6
ERROR HANDLING
Common programming errors
● Lexical errors - such as misspelling an identifier, keyword, or operator
● Syntactic errors - such as an arithmetic expression with unbalanced
parentheses
● Semantic errors - such as an operator applied to an incompatible operand
● Logical errors - such as an infinitely recursive call
Error handler goals
The syntax analyzer is expected to do the following when syntax errors
occur:
● Report the presence of errors clearly and accurately
● Recover from each error quickly enough to detect subsequent errors
● Add minimal overhead to the processing of correct programs
ERROR RECOVERY STRATEGIES
• Error recovery strategies are used by the parser to recover from errors
once they are detected.
• The simplest strategy is to quit parsing with an error message at the
first error.
• Recovery strategies
○ Panic mode recovery
○ Phrase level recovery
○ Error production
○ Global Correction
• A context-free grammar
– gives a precise syntactic specification of a programming language.
– is designed in an initial phase of the design of a compiler.
• A context-free grammar is a set of recursive rewriting rules (or production
rules) used to generate patterns of strings.
• It describes a larger class of languages than a regular grammar does.
• It is a recursive notation for defining the language.
• A context-free grammar G is defined as a 4-tuple G = (V, T, P, S)
where,
G − Grammar
V − Set of variables
T − Set of terminals
P − Set of productions
S − Start symbol
CONTEXT FREE GRAMMAR
• Terminals are symbols from which strings are formed.
○ Lowercase letters, e.g., a, b, c.
○ Operators, e.g., +, −, ∗.
○ Punctuation symbols, e.g., comma, parentheses.
○ Digits, e.g., 0, 1, 2, ..., 9.
○ Boldface strings, e.g., id, if.
• Non-terminals are syntactic variables that denote a set of strings.
○ Uppercase letters, e.g., A, B, C.
○ Lowercase italic names, e.g., expr, stmt.
• Start symbol is the head of the production stated first in the grammar
• A production has the form LHS → RHS or head → body, where the head is a single
non-terminal and the body is a sequence of zero or more terminals and
non-terminals.
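The four components can be written down directly as a data structure. The
following is a minimal Python sketch, not a standard API: the class name
Grammar, its field names, and the method is_context_free are assumptions made
for illustration. The example instance encodes the expression grammar
E → E + E | E * E | id.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    """A grammar G = (V, T, P, S). All names here are illustrative."""
    variables: frozenset    # V: the non-terminals
    terminals: frozenset    # T: the terminals
    productions: tuple      # P: (head, body) pairs; body is a tuple of symbols
    start: str              # S: the start symbol

    def is_context_free(self) -> bool:
        # Context-free: every head is a single non-terminal and every
        # body symbol is a known terminal or non-terminal.
        symbols = self.variables | self.terminals
        return all(
            head in self.variables and all(s in symbols for s in body)
            for head, body in self.productions
        )


# E -> E + E | E * E | id
expr = Grammar(
    variables=frozenset({"E"}),
    terminals=frozenset({"+", "*", "id"}),
    productions=(
        ("E", ("E", "+", "E")),
        ("E", ("E", "*", "E")),
        ("E", ("id",)),
    ),
    start="E",
)
```

An empty body tuple () can stand for an ε-production in this representation.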
LEFTMOST DERIVATION
Example grammar: E → E + E | E * E | id
• Exercise: given the productions S → SS+ | SS* | a, use a leftmost derivation
to derive the string w = aa+a*.
• Exercise: given the same productions S → SS+ | SS* | a, use a rightmost
derivation to derive the string w = aa+a*.
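One way to check a derivation of w = aa+a* is to replay it mechanically. The
sketch below is an illustration only: the production numbering and the helper
derive are assumptions, not part of the slides. It rewrites either the leftmost
or the rightmost S at each step and records every sentential form.

```python
# Bodies of S -> SS+ | SS* | a, numbered for reference (numbering is arbitrary).
PRODUCTIONS = {1: "SS+", 2: "SS*", 3: "a"}


def derive(choices, leftmost=True):
    """Apply the numbered productions in order, always expanding the
    leftmost (or rightmost) S, and return every sentential form."""
    form = "S"
    steps = [form]
    for n in choices:
        i = form.find("S") if leftmost else form.rfind("S")
        if i < 0:
            raise ValueError("no non-terminal left to expand")
        form = form[:i] + PRODUCTIONS[n] + form[i + 1:]
        steps.append(form)
    return steps


left = derive([2, 1, 3, 3, 3], leftmost=True)
right = derive([2, 3, 1, 3, 3], leftmost=False)
print(" => ".join(left))    # S => SS* => SS+S* => aS+S* => aa+S* => aa+a*
print(" => ".join(right))   # S => SS* => Sa* => SS+a* => Sa+a* => aa+a*
```

Both derivations end in the same string and correspond to the same parse tree;
only the order in which the non-terminals are expanded differs.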
AMBIGUITY
ELIMINATION OF AMBIGUITY
if E1 then if E2 then S1 else S2
if E1 then S1 else if E2 then S2 else S3
ELIMINATION OF AMBIGUITY CONT..
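• Idea:
– A statement appearing between a then and an else must be matched, i.e., it
must not end with an unmatched then. Each else is therefore paired with the
closest previous unmatched then.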
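The effect of pairing each else with the closest unmatched then can be seen in
a small parser. This is a sketch under simplifying assumptions: conditions and
simple statements are single tokens such as E1 or S1, and the function name
parse_stmt and the tuple-shaped trees are invented for illustration.

```python
def parse_stmt(tokens, i=0):
    """Parse 'if' E 'then' stmt ['else' stmt] or a one-token statement.
    Returns (tree, index of the next unread token)."""
    if tokens[i] == "if":
        cond = tokens[i + 1]
        if tokens[i + 2] != "then":
            raise SyntaxError("expected 'then'")
        then_part, i = parse_stmt(tokens, i + 3)
        # Taking the 'else' here, in the innermost open 'if', is what
        # pairs it with the closest previous unmatched 'then'.
        if i < len(tokens) and tokens[i] == "else":
            else_part, i = parse_stmt(tokens, i + 1)
            return ("if", cond, then_part, else_part), i
        return ("if", cond, then_part), i
    return tokens[i], i + 1


tree1, _ = parse_stmt("if E1 then if E2 then S1 else S2".split())
tree2, _ = parse_stmt("if E1 then S1 else if E2 then S2 else S3".split())
print(tree1)   # ('if', 'E1', ('if', 'E2', 'S1', 'S2'))
print(tree2)   # ('if', 'E1', 'S1', ('if', 'E2', 'S2', 'S3'))
```

In the first statement the else S2 ends up inside the inner if, which is the
single parse tree the disambiguated grammar allows.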
ELIMINATING LEFT-RECURSION
• A grammar is left recursive if it has a production of the form A → Aα, for
some string α.
• Rule: to eliminate left recursion from A → Aα | β, replace the productions
with
A → βA'
A' → αA' | ε
• Example
Left-recursive grammar        After eliminating left recursion
E → E+T | T                   E → TE'
                              E' → +TE' | ε
T → T*F | F                   T → FT'
                              T' → *FT' | ε
F → (E) | id                  F → (E) | id
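The standard transformation for immediate left recursion turns A → Aα | β into
A → βA' and A' → αA' | ε. A minimal Python sketch of that rule follows. The
function name and the representation (bodies as tuples of symbols, the empty
tuple for ε, a prime appended for the new non-terminal) are assumptions made
for illustration, and indirect left recursion is not handled.

```python
def eliminate_immediate_left_recursion(head, bodies):
    """bodies: list of symbol tuples for the non-terminal `head`.
    Returns a dict from non-terminal to its list of bodies.
    The empty tuple () stands for epsilon."""
    # Split A -> A alpha | beta into the alphas and the betas.
    alphas = [b[1:] for b in bodies if b and b[0] == head]
    betas = [b for b in bodies if not b or b[0] != head]
    if not alphas:
        return {head: list(bodies)}          # nothing to do
    new = head + "'"                         # fresh non-terminal A'
    return {
        head: [beta + (new,) for beta in betas],            # A  -> beta A'
        new: [alpha + (new,) for alpha in alphas] + [()],   # A' -> alpha A' | eps
    }


print(eliminate_immediate_left_recursion("E", [("E", "+", "T"), ("T",)]))
print(eliminate_immediate_left_recursion("T", [("T", "*", "F"), ("F",)]))
```

Applied to E → E+T | T this produces E → TE' and E' → +TE' | ε, matching the
expression grammar example.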
LEFT FACTORING
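• When a non-terminal has two or more alternatives with a common prefix, the
parser cannot tell from the next input symbol which alternative to choose.
Left factoring defers the choice until enough input has been seen.
• Rule: to left-factor A → αβ1 | αβ2, replace the productions with
A → αA'
A' → β1 | β2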
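The usual left-factoring rule rewrites A → αβ1 | αβ2 as A → αA' with
A' → β1 | β2. The sketch below applies one round of that rule in Python. The
function names, the tuple representation of bodies, and the priming scheme for
new non-terminals are assumptions for illustration; a full implementation
would repeat the step until no common prefixes remain.

```python
def common_prefix(seqs):
    """Longest prefix shared by all the given symbol tuples."""
    prefix = []
    for column in zip(*seqs):
        if len(set(column)) != 1:
            break
        prefix.append(column[0])
    return tuple(prefix)


def left_factor(head, bodies):
    """One round of left factoring for the non-terminal `head`.
    bodies: list of symbol tuples; () stands for epsilon."""
    groups = {}
    for body in bodies:
        key = body[0] if body else None      # group by first symbol
        groups.setdefault(key, []).append(body)
    result = {head: []}
    count = 0
    for key, group in groups.items():
        if key is None or len(group) < 2:
            result[head].extend(group)       # no shared prefix here
            continue
        alpha = common_prefix(group)
        count += 1
        new = head + "'" * count             # A', A'', ...
        result[head].append(alpha + (new,))                  # A  -> alpha A'
        result[new] = [body[len(alpha):] for body in group]  # A' -> beta1 | beta2
    return result


print(left_factor("A", [("a", "b", "c"), ("a", "b", "d")]))
# {'A': [('a', 'b', "A'")], "A'": [('c',), ('d',)]}
```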
• Top-down parsing constructs the parse tree for the input string, starting
from the root node and creating the nodes of the parse tree in pre-order.
• Top-down parsing is characterized by the following methods:
○ Brute-force method, accompanied by a parsing algorithm. All possible
combinations are attempted before the failure to parse is recognized.
○ Recursive descent, a parsing technique built from one recursive procedure
per non-terminal. In its general form it may involve backtracking, and it
cannot handle left-recursive grammars directly; the predictive form does
not allow backup.
○ Top-down parsing with limited or partial backup.
• Limitation of recursive descent:
– When the grammar has a left-recursive production, the parser may enter an
infinite loop.
• Limitation of backtracking:
– If the grammar has a large number of alternatives, the cost of
backtracking will be high.
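Backtracking can be made visible with a very small recursive descent parser.
The sketch below uses the textbook grammar S → cAd, A → ab | a, chosen here as
an assumption since it is not taken from these slides. Each procedure yields
every input position at which its non-terminal could end, so a failed
alternative simply falls through to the next one. The trace list and all
function names are illustrative.

```python
def parse_A(s, i, trace):
    # A -> ab | a : try the alternatives in order.
    for body in ("ab", "a"):
        trace.append(f"A -> {body} at {i}")
        if s.startswith(body, i):
            yield i + len(body)


def parse_S(s, i, trace):
    # S -> c A d
    if s.startswith("c", i):
        for j in parse_A(s, i + 1, trace):
            if s.startswith("d", j):
                yield j + 1


def accepts(s):
    """Return (accepted?, list of alternatives that were tried)."""
    trace = []
    ok = any(end == len(s) for end in parse_S(s, 0, trace))
    return ok, trace


print(accepts("cad"))    # (True, ['A -> ab at 1', 'A -> a at 1'])
print(accepts("cabd"))   # (True, ['A -> ab at 1'])
```

For the input cad the first alternative A → ab fails and the parser backs up
to try A → a. With many alternatives per non-terminal the number of such
retries grows quickly. A procedure for a left-recursive production such as
A → Aα would call itself without consuming any input, which is the infinite
loop mentioned above.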
• Initially the stack contains $ to indicate bottom of the stack and the start
symbol of grammar on top of $.
• The input string is placed in input buffer with $ at the end to indicate the
end of the string.
• The parsing algorithm examines the grammar symbol on top of the stack and
the input symbol under the pointer. If the top of the stack is a non-terminal
A and the input symbol is a, it consults the entry M[A, a] of the parsing
table.
• If the entry holds a production, A is popped and the body of the production
is pushed onto the stack in reverse order, so that its leftmost symbol ends
up on top of the stack.
• If the top of the stack is a terminal that matches the input symbol, it is
popped and the pointer advances.
• The process repeats until the entire string is processed.
PARSING OF INPUT - PROCESS
• When the stack contains $ (bottom end marker) and the pointer reads $
(end of input string), successful parsing occurs.
• If the entry M[A, a] is empty, the parser reports an error stating that the
input string cannot be parsed by the grammar.
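The stack-and-table procedure described above can be sketched in a few lines
of Python. The table below is written by hand for the grammar E → TE',
E' → +TE' | ε, T → FT', T' → *FT' | ε, F → (E) | id; the names TABLE,
NONTERMINALS and ll1_parse are assumptions for illustration, and the input is
assumed to be already split into tokens.

```python
NONTERMINALS = {"E", "E'", "T", "T'", "F"}

# M[A, a] -> body of the production to apply; () stands for epsilon.
TABLE = {
    ("E", "id"): ("T", "E'"),       ("E", "("): ("T", "E'"),
    ("E'", "+"): ("+", "T", "E'"),  ("E'", ")"): (),  ("E'", "$"): (),
    ("T", "id"): ("F", "T'"),       ("T", "("): ("F", "T'"),
    ("T'", "+"): (),                ("T'", "*"): ("*", "F", "T'"),
    ("T'", ")"): (),                ("T'", "$"): (),
    ("F", "id"): ("id",),           ("F", "("): ("(", "E", ")"),
}


def ll1_parse(tokens, start="E"):
    """Return the productions applied, in order; raise SyntaxError on failure."""
    tokens = list(tokens) + ["$"]    # end marker on the input
    stack = ["$", start]             # bottom marker, then the start symbol
    pos = 0
    applied = []
    while True:
        top, a = stack[-1], tokens[pos]
        if top == "$" and a == "$":
            return applied           # stack and input both exhausted
        if top in NONTERMINALS:
            body = TABLE.get((top, a))
            if body is None:
                raise SyntaxError(f"no entry M[{top}, {a}]")
            stack.pop()
            stack.extend(reversed(body))   # leftmost body symbol ends on top
            applied.append((top, body))
        elif top == a:
            stack.pop()              # terminal matches: consume it
            pos += 1
        else:
            raise SyntaxError(f"expected {top}, found {a}")


for head, body in ll1_parse(["id", "+", "id", "*", "id"]):
    print(head, "->", " ".join(body) or "ε")
```

The printed sequence is a leftmost derivation of id + id * id. A missing table
entry, for example on the input id +, raises SyntaxError.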
Step 2: Left-factoring. No productions with the same head share a common
prefix, so left-factoring is not needed.
SOLUTION
Step 3: Compute FIRST