2nd Phase Syntax Analyzer - 1

Syntax Analyzer (Parser)

Input: list of tokens produced by the scanner (lexical analyzer)

Output: a syntax tree showing the structure of the program

• Topics to be covered
• Role of parser
• Context free grammar
• Derivation & Ambiguity
• Left recursion & Left factoring
• Classification of parsing
• Backtracking
• LL(1) parsing
• Recursive descent parsing
• Shift reduce parsing
• Operator precedence parsing
• LR parsing
• Parser generator
Role of parser

The parser obtains a string of tokens from the lexical analyzer and reports a
syntax error if there is one; otherwise it generates the parse tree.

Parsing is a technique that takes an input string and produces as output
either a parse tree, if the string is a valid sentence of the grammar, or an
error message indicating that the string is not valid.

There are two types of parser:

• Top-down parser- In top-down parsing the parser builds the parse tree from
the top (root) down to the leaves.

Grammar:                String: abbcde
S -> aABe
A -> Abc | b
B -> d

• Bottom-up parser- A bottom-up parser starts from the leaves and works up
to the root.
Recap: Overview

temp1 := inttoreal(60)
temp2 := id3 * temp1
temp3 := id2 + temp2
id1 := temp3

Code optimization
temp1 := id3 * 60.0
id1 := id2 + temp1

Code generator
MOVF id3, R2
MULF #60.0, R2
MOVF id2, R1
ADDF R2, R1
MOVF R1, id1
Introduction

A program represented by a sequence of tokens is given to the Parser; if it
is a legal program, the parser outputs some abstract representation of the
program.

• Abstract representations of the input program:
- abstract-syntax tree + symbol table
- intermediate code
- object code
• Context free grammar (CFG) is used to specify the structure of legal
programs
From text to abstract syntax

program text: 5 + (7 * x)
↓ Lexical Analyzer
token stream: num + ( num * id )
↓ Parser (reports a syntax error if the token stream is not valid)
parse tree, then abstract syntax tree

Grammar:
E → id
E → num
E → E + E
E → E * E
E → (E)

(In the parse tree for the example, E + E is at the root, with num on the
left and ( E ) containing num * id on the right; the abstract syntax tree is
+ over 5 and * over 7 and x.)
Goals of parsing
• Programming language has syntactic rules
– Context-Free Grammars
• Decide whether program satisfies syntactic structure
– Error detection
– Error recovery
– Simplification: rules on tokens
• Build Abstract Syntax Tree
Classes of Grammars
(The Chomsky Hierarchy)
• Type-0: Phrase structured (unrestricted) grammars
– generate recursively enumerable (unrestricted) languages
– include all formal grammars
– implemented with Turing machines
• Type-1 : Context-sensitive grammars
– generate context-sensitive languages
– implemented with linear-bounded automata
• Type-2 : Context-free grammars
– generate context-free languages
– single non-terminal on left
– non-terminals & terminals on right
– implemented with pushdown automata
• Type-3 : Regular grammars
– generate regular languages
– productions have a single non-terminal on the left and at most one
non-terminal on the right-hand side
– implemented with finite state automata
Classes of Grammars
(The Chomsky Hierarchy)
Type 0, Phrase Structure (same as the basic grammar definition)
Type 1, Context Sensitive
(1) α → β, where α ∈ (N ∪ Σ)* N (N ∪ Σ)*, β ∈ (N ∪ Σ)+, and
length(α) ≤ length(β)
(2) γAδ → γβδ, where A ∈ N, β ∈ (N ∪ Σ)+, and γ, δ ∈ (N ∪ Σ)*
Type 2, Context Free
A → β, where A ∈ N and β ∈ (N ∪ Σ)*
Linear
A → x or A → xBy, where A, B ∈ N and x, y ∈ Σ*
Type 3, Regular
(1) left linear: A → Ba or A → a, where A, B ∈ N and a ∈ Σ
(2) right linear: A → aB or A → a, where A, B ∈ N and a ∈ Σ
Type 3 grammar
A grammar is said to be a type 3 (regular) grammar if all productions in the
grammar are of the form A → a or A → aB (right linear), or equivalently all
of the form A → a or A → Ba (left linear).

In other words, in any production the left-hand side is a single
non-terminal and the right-hand side is either a terminal or a terminal
together with a single non-terminal.
Type 2 grammar
A grammar is said to be a type 2 (context free) grammar if every production
in the grammar is of the form A → α.

In other words, in any production the left-hand side is always a single
non-terminal and the right-hand side is any string over (T ∪ N)*.
• Example : A → aBc
Type 1 grammar
A grammar is said to be a type 1 (context sensitive) grammar if, for every
production α → β, the length of β is larger than or equal to the length of
α.

For example:
• A → ab
• A → aA
• aAb → aBCb
Type 0 grammar
A grammar with no restrictions is referred to as a type 0 grammar. Type 0
grammars generate exactly all languages that can be recognized by a Turing
machine. These languages are also known as the recursively enumerable
languages.

Type 0 grammars are too general to describe the syntax of programming
languages and natural languages.
The Chomsky Hierarchy and the Block Diagram of
a Compiler

Source program → Scanner (Type 3) → tokens → Parser (Type 2) → tree →
Intermediate Code Generator → intermediate code → Code Optimizer →
Code Generator → object program

The Symbol Table Manager (handling the context-sensitive, Type 1, aspects)
and the Error Handler (error messages) interact with all of these phases
through the symbol table.
CFG vs. Regular Expressions
A regular grammar puts the following restrictions on the productions:
• The LHS can only be a single non terminal
• The RHS can be any number of terminals, with (at most) a single non terminal as its last
symbol.
A CFG puts the following restrictions on the productions:
• The LHS can only be a single non terminal (just like the regular grammar)
• The RHS can be any combination of terminals and non terminals (this is the new part).

CFG is more expressive than regular expressions:

– Every language that can be described by a regular expression can also be
described by a CFG.
Example: languages that are context-free but not regular
– if-then-else statements, {a^n b^n | n >= 1}
Non-context-free languages:
– L1 = {wcw | w is in (a|b)*}
– L2 = {a^n b^m c^n d^m | n >= 1 and m >= 1}
Context Free Grammars

• CFGs
– Add recursion to regular expressions
• Nested constructions

– Notation
expression → identifier | number | - expression
| ( expression )
| expression operator expression
operator → + | - | * | /

• Terminal symbols
• Non-terminal symbols
• Production rule (i.e. substitution rule)
non-terminal symbol → terminal and non-terminal symbols
Context Free Grammars
Formal Definition of Grammar :
Any Grammar can be represented by 4 tuples – <N, T, P, S>

N – Finite Non-Empty Set of Non-Terminal Symbols.


T – Finite Set of Terminal Symbols.
P – Finite Non-Empty Set of Production Rules.
S – Start Symbol (Symbol from where we start producing our sentences or
strings).

Consider Grammar G2 = <N, T, P, S>

N = {A}                                 # Set of non-terminal symbols
T = {a}                                 # Set of terminal symbols
P = {A -> Aa, A -> AAa, A -> a, A -> ε} # Set of all production rules
S = A                                   # Start symbol
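The 4-tuple definition of G2 can be written out directly in code. This is an illustrative sketch, not part of the slides; the set/dict encoding is a choice, with "" standing for ε:

```python
# Sketch of the 4-tuple <N, T, P, S> for grammar G2 from the text.
# "" stands for epsilon; the dict encoding of P is an illustrative choice.

N = {"A"}                          # finite non-empty set of non-terminals
T = {"a"}                          # finite set of terminals
P = {"A": ["Aa", "AAa", "a", ""]}  # production rules: A -> Aa | AAa | a | epsilon
S = "A"                            # start symbol

G2 = (N, T, P, S)

# Sanity check: every LHS is a non-terminal and every RHS symbol is in N or T.
for lhs, alternatives in P.items():
    assert lhs in N
    for alt in alternatives:
        assert all(symbol in N or symbol in T for symbol in alt)
```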
Context Free Grammars (CFG) can be classified on the basis of
following two properties:

1) Based on the number of strings it generates:

• If the CFG generates a finite number of strings, then it is non-recursive
(the grammar is said to be a non-recursive grammar).
• If the CFG can generate an infinite number of strings, then the grammar is
said to be a recursive grammar.

2) Based on the number of derivation trees:

• If there is only one derivation tree, then the CFG is unambiguous.
• If there is more than one leftmost derivation, rightmost derivation, or
parse tree, then the CFG is ambiguous.
Recursive Grammars Example

S->SaS
S->b
The language(set of strings) generated by the above grammar is :{b, bab,
babab,…}, which is infinite.

Non-Recursive Grammars Example

S->Aa
A->b|c
The language generated by the above grammar is :{ba, ca}, which is finite.
Example:
• Recursive CFG generate infinite no. of strings.
1. S -> Sa
S -> b

2. S -> aS
S -> b

3. S -> Aa
A -> Ab | c
Note: a grammar is recursive when the same non-terminal appears on both the
left-hand side and the right-hand side of one of its productions.

• Non recursive CFG generate finite no. of strings.


1. S -> Aa
A -> b | c
Note: Only one parse tree
Derivation
A derivation is basically a sequence of production rules, in order to get the
input string.
To decide which non-terminal to be replaced with production rule, we can
have two options:
• Leftmost derivation
• Rightmost derivation
Example:

• Ambiguous Grammar –

S-> Sa | aS | a Input string (W) = aaa

Note: 2 Left Most Derivation, 2 Right Most Derivation

• Unambiguous Grammar –

S -> AB
A -> aA | b
B -> bB | a

Note: If the given grammar is unambiguous (only one tree is generated), then
the leftmost and rightmost derivations produce the same parse tree
(LMDT = RMDT).
Example:
Derive LMD & RMD
1. S -> aAB
A -> bBb
B -> A | ε
Input String (W) = abbbb

2. S -> aB | bA
A -> a | aS | bAA
B -> b | bS | aBB
Input String (W) = aaabbabbba
Derivations
• A derivation shows how to generate a syntactically valid string
– Given a CFG
– Example:
• CFG

expression → identifier
| number
| - expression
| ( expression )
| expression operator expression
operator → + | - | * | /

• Derivation of

slope * x + intercept
Derivation Example
• Derivation of slope * x + intercept

expression → expression operator expression


→ expression operator intercept
→ expression + intercept
→ expression operator expression + intercept
→ expression operator x + intercept
→ expression * x + intercept
→ slope * x + intercept

expression ⇒* slope * x + intercept

• For simplicity, the steps deriving the identifiers slope, x and intercept
are not shown.


Parse Trees
• A parse tree is any tree in which
– The root is labeled with S
– Each leaf is labeled with a token a or with ε
– Each interior node is labeled by a nonterminal
– If an interior node is labeled A and has children labeled
X1,...,Xn, then A ::= X1...Xn is a production.
Parse Trees and Derivations

E ::= E + E | E * E | E - E | - E | ( E ) | id
Parse Trees
• A parse tree is a graphical representation of a derivation
• Example
Ambiguous Grammars
• Alternative parse tree
– same expression
– same grammar

– This grammar is ambiguous


Problem with Top Down Parsing

• Backtracking- Backtracking is a technique in which, to expand a
non-terminal symbol, we try all the possible alternatives of expansion and
pick the one that matches.
• Left Recursion- A grammar G = (V, T, S, P) is said to be left recursive if
it has production rules of the form A → Aα | β. In such a rule, the variable
on the left side occurs at the first position on the right side, which
causes left recursion (an infinite loop for a top-down parser).
• Left Factoring- Left factoring is used to convert a grammar whose
productions share common prefixes into an equivalent left-factored grammar,
removing the uncertainty for the top-down parser. In left factoring, we
separate the common prefixes from the production rules.
• Ambiguity- A grammar is ambiguous when some sentence has more than one
possible parse tree, and therefore more than one possible meaning, which the
parser cannot resolve.
Ambiguity
• A grammar that produces more than one parse tree for some sentence is
said to be ambiguous.
• The test string has to be chosen carefully, because even an ambiguous
grammar generates some strings that have only one parse tree.

• Removal of Ambiguity :

• Precedence –
If different operators are used, we will consider the precedence of the
operators. The three important characteristics are :
1. The level at which the production is present denotes the priority of
the operator used.
2. The production at higher levels will have operators with less priority.
In the parse tree, the nodes which are at top levels or close to the root
node will contain the lower priority operators.
3. The production at lower levels will have operators with higher
priority. In the parse tree, the nodes which are at lower levels or close
to the leaf nodes will contain the higher priority operators.
• Associativity –
1. If the same precedence operators are in production, then we
will have to consider the associativity.
2. If the associativity is left to right, then we have to prompt a
left recursion in the production. The parse tree will also be
left recursive and grow on the left side.
3. +, -, *, / are left associative operators.
4. If the associativity is right to left, then we have to prompt
the right recursion in the productions. The parse tree will
also be right recursive and grow on the right side.
5. ^ is a right associative operator.
Ambiguous grammar to unambiguous
E -> E + E | id

Where, w = id+id+id / id+id*id

E -> E + T | T
T -> id

Determine Associativity and precedence of CFG


E -> E + T | T
T -> T * F | F
F -> G ^ F | G
G -> id

Note: the operator whose production has the recursive non-terminal on the
left is left associative; with the recursive non-terminal on the right it is
right associative.

Associativity: + and * are left associative, ^ is right associative.

Precedence (high to low): ^ , * , +
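The precedence and associativity encoded in this grammar can be seen in action with a small evaluator sketch, using numbers in place of id (the tokenizer and function names are illustrative, not from the slides): the left-associative levels (+ and *) loop over their operator, while the right-associative level (^) recurses on its right operand.

```python
# Minimal evaluator for  E -> E + T | T,  T -> T * F | F,
# F -> G ^ F | G,  G -> id  (numbers stand in for id).

def tokenize(s):
    return s.replace("+", " + ").replace("*", " * ").replace("^", " ^ ").split()

def parse(tokens):
    pos = 0
    def peek():
        return tokens[pos] if pos < len(tokens) else None
    def expr():                 # E -> E + T | T   (left associative: loop)
        nonlocal pos
        value = term()
        while peek() == "+":
            pos += 1
            value = value + term()
        return value
    def term():                 # T -> T * F | F   (left associative: loop)
        nonlocal pos
        value = factor()
        while peek() == "*":
            pos += 1
            value = value * factor()
        return value
    def factor():               # F -> G ^ F | G   (right associative: recurse)
        nonlocal pos
        base = atom()
        if peek() == "^":
            pos += 1
            return base ** factor()
        return base
    def atom():                 # G -> id (a number here)
        nonlocal pos
        value = int(tokens[pos])
        pos += 1
        return value
    return expr()

print(parse(tokenize("2^3^2")))   # 512, i.e. 2^(3^2): right associative
print(parse(tokenize("2+3*4")))   # 14: * binds tighter than +
```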
Ambiguity
• A grammar is ambiguous when something it generates has more than one
possible meaning, which causes confusion.
• Example:
• S → aSb | SS | ε
• For the string aabb, the above grammar generates two parse trees.

• If the grammar has ambiguity then it is not good for compiler
construction. No method can automatically detect and remove the ambiguity,
but you can remove ambiguity by re-writing the whole grammar without
ambiguity.
Converting ambiguous to unambiguous grammar
Consider the grammar shown below, which has two different operators :
E -> E + E | E * E | id
Clearly, the above grammar is ambiguous as we can draw two parse trees
for the string “id+id*id” as shown below. Consider the expression :
3+2*5 // “*” has more priority than “+”
The correct answer is : (3+(2*5))=13
Converting ambiguous to unambiguous grammar
• The “+” having the least priority has to be at the upper level and has to
wait for the result produced by the “*” operator which is at the lower
level. So, the first parse tree is the correct one and gives the same result
as expected.

• The unambiguous grammar will contain the productions having the


highest priority operator (“*” in the example) at the lower level and vice
versa. The associativity of both the operators are Left to Right. So, the
unambiguous grammar has to be left recursive. The grammar will be :
• E -> E + P | P    // + is at the higher level and left associative
• P -> P * Q | Q    // * is at the lower level and left associative
• Q -> id
The Precedence And Associativity Of Operators
Now, we parenthesize the given expression based on the precedence
and associativity of operators as-

(2+(3x(5x6)))+2

Now, we evaluate the parenthesized expression as-

=(2+(3x(5x6)))+2

= ( 2 + ( 3 x 30 ) ) + 2

= ( 2 + 90 ) + 2

= 92 + 2

= 94
Eliminating Ambiguity
• There is no deterministic way of finding out whether a grammar is ambiguous and
how to fix it. In order to remove ambiguity, we follow some heuristics.
• There are three parts to this:
1. Add a non-terminal for each precedence level
2. Isolate the corresponding part of the grammar
3. Force the parser to recognize the high-precedence sub
expressions first

E -> E + E | E − E | E * E | E / E | (E) | var

E -> E + T | E − T | T
T -> T * F | T / F | F
F -> (E) | id
Example:
Consider the following ambiguous CFG on Boolean expression

bExp -> bExp AND bExp | bExp OR bExp | NOT bExp |True | False

The precedence of the Boolean operators is NOT, AND, OR (high to low).
AND and OR have left-to-right associativity.
Left Recursion
• Non-recursive grammar – generates a finite number of strings; it is not
suitable for a programming language.

• Recursive grammar – generates an infinite number of strings.

S -> SaS | b : this grammar is both left recursive and right recursive, and
it is ambiguous.

• Modifying the grammar – Left Recursion

A grammar G (V, T, P, S) is left recursive if it has a production of the
form
A → Aα | β.

The grammar above is left recursive because the non-terminal on the left
side of the production occurs at the first position on the right side of a
production. Left recursion can be eliminated by replacing the pair of
productions with
A → βA′
A′ → αA′ | ϵ
Left Recursion
Left recursion is not suitable for the top-down approach because the parser
might go into an infinite loop.

A → A α |β.

A → βA′
A’ → αA′|ϵ

Example:

T -> T*F | F

T -> FT’
T’ -> *FT’ | ε
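The A → βA′, A′ → αA′ | ϵ transformation can be mechanized. A minimal sketch, not from the slides, assuming productions are lists of symbol lists with [] for ε and that the new non-terminal is named by appending a prime:

```python
# Eliminate direct left recursion from one non-terminal's productions.

def eliminate_direct_left_recursion(nonterminal, alternatives):
    """alternatives: list of right-hand sides; each RHS is a list of symbols."""
    alphas = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    betas = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not alphas:
        return {nonterminal: alternatives}   # no direct left recursion: unchanged
    prime = nonterminal + "'"                # assumed naming convention
    return {
        nonterminal: [beta + [prime] for beta in betas],
        prime: [alpha + [prime] for alpha in alphas] + [[]],   # [] is epsilon
    }

# T -> T * F | F   becomes   T -> F T',  T' -> * F T' | epsilon
result = eliminate_direct_left_recursion("T", [["T", "*", "F"], ["F"]])
```

Running it on T -> T*F | F reproduces T -> FT', T' -> *FT' | ε from the example above.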
Left Recursion Examples
1. A -> Abd | Aa | a
B -> Be | b

2. S -> Aa | b

3. A -> Ac | Sd | ε

4. E → E + E / E x E / a

5. E → E + T / T
T→TxF/F
F → id

6. S → (L) / a
L→L,S/S
Eliminating Left-Recursion
• Direct left-recursion

A ::= Aα | β
becomes
A ::= βA'
A' ::= αA' | ε

In general,
A ::= Aα1 | ... | Aαm | β1 | ... | βn
becomes
A ::= β1A' | ... | βnA'
A' ::= α1A' | ... | αmA' | ε
Left Recursion Examples
Consider the following grammar and eliminate left recursion-

S → S0S1S / 01

Solution:

The grammar after eliminating left recursion is-

S → 01A

A → 0S1SA / ∈
Right Recursion
• A production of grammar is said to have right recursion if the rightmost
variable of its RHS is same as variable of its LHS.
• A grammar containing a production having right recursion is called as
Right Recursive Grammar.

• Example:

S → aS / ∈ ------------------(Right Recursive Grammar)

• Right recursion does not create any problem for the Top down parsers.
• Therefore, there is no need of eliminating right recursion from the
grammar.
General Recursion
The recursion which is neither left recursion nor right recursion is called as
general recursion.

Example-

S → aSb / ∈
Eliminating Indirect Left-Recursion
• Indirect left-recursion

• Algorithm
S ::= Aa | b
A ::= Ac | Sd | ε

Arrange the nonterminals in some order A1,...,An.
for (i in 1..n) {
  for (j in 1..i-1) {
    replace each production of the form Ai ::= Aj γ by the
    productions Ai ::= δ1γ | δ2γ | ... | δkγ, where
    Aj ::= δ1 | δ2 | ... | δk are the current Aj productions
  }
  eliminate the immediate left recursion among the Ai productions
}
Example:
Algorithm to remove Indirect Recursion with help of an example:

A1 ⇒ A2 A3
A2 ⇒ A3 A1 | b
A3 ⇒ A1 A1 | a
Where A1, A2, A3 are non terminals and a, b are terminals.

• Identify the productions which can cause indirect left recursion. In our
case,

A3 ⇒ A1 A1 | a

• Substitute its production wherever that non-terminal appears in another
production: substitute A1 ⇒ A2 A3 in the production of A3.

A3 ⇒ A2 A3 A1 | a
• Now in this production substitute A2 ⇒ A3 A1 | b

A3 ⇒ (A3 A1 | b) A3 A1 | a
and then distributing,

A3 ⇒ A3 A1 A3 A1 | b A3 A1 | a

• Now the new production is converted in the form of direct left recursion,
solve this by the direct left recursion method.

Eliminating the direct left recursion as above, we introduce a new
nonterminal A' and append it to the end of every non-left-recursive
alternative. The new productions are:

A3 ⇒ b A3 A1 A' | aA'
A' ⇒ ε | A1 A3 A1 A'
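The whole procedure (order the non-terminals, substitute the earlier ones, then eliminate the direct left recursion) can be sketched in code. The grammar encoding and the prime naming are assumptions; the sketch is checked on the S → Aa | b, A → Ac | Sd | ε example that appears earlier in these slides.

```python
# Sketch of indirect left-recursion elimination. A grammar maps each
# non-terminal to a list of right-hand sides (symbol lists); [] is epsilon.

def eliminate_direct(g, a):
    """Apply A -> beta A', A' -> alpha A' | epsilon to non-terminal `a`."""
    alphas = [alt[1:] for alt in g[a] if alt and alt[0] == a]
    betas = [alt for alt in g[a] if not alt or alt[0] != a]
    if alphas:
        prime = a + "'"
        g[a] = [beta + [prime] for beta in betas]
        g[prime] = [alpha + [prime] for alpha in alphas] + [[]]

def eliminate_left_recursion(g, order):
    for i, ai in enumerate(order):
        for aj in order[:i]:
            new_alts = []
            for alt in g[ai]:
                if alt and alt[0] == aj:
                    # replace Ai -> Aj gamma by one alternative per Aj production
                    new_alts += [delta + alt[1:] for delta in g[aj]]
                else:
                    new_alts.append(alt)
            g[ai] = new_alts
        eliminate_direct(g, ai)
    return g

# S -> Aa | b ;  A -> Ac | Sd | epsilon
g = {"S": [["A", "a"], ["b"]],
     "A": [["A", "c"], ["S", "d"], []]}
eliminate_left_recursion(g, ["S", "A"])
# Result: S -> Aa | b,  A -> bdA' | A',  A' -> cA' | adA' | epsilon
```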
Example:
1. C -> A | B | f
A -> Cd
B -> Ce
Indirect Left Recursion Examples
Consider the following grammar and eliminate left recursion-

X → XSb / Sa / b
S → Sb / Xa / a

Solution-
This is a case of indirect left recursion.

Step-01:

First let us eliminate left recursion from X → XSb / Sa / b


Eliminating left recursion from here, we get-

X → SaX’ / bX’
X’ → SbX’ / ∈
Now, given grammar becomes-

X → SaX’ / bX’
X’ → SbX’ / ∈
S → Sb / Xa / a

Step-02:

Substituting the productions of X in S → Xa, we get the following


grammar-

X → SaX’ / bX’
X’ → SbX’ / ∈
S → Sb / SaX’a / bX’a / a
Step-03:

Now, eliminating left recursion from the productions of S, we get the


following grammar-

X → SaX’ / bX’

X’ → SbX’ / ∈

S → bX’aS’ / aS’

S’ → bS’ / aX’aS’ / ∈

This is the final grammar after eliminating left recursion.


Left Factoring

A ::= αβ1 | ... | αβn | γ

becomes

A ::= αA' | γ
A' ::= β1 | ... | βn
Left Factoring
If the grammar is of the form:

A -> αβ1 | αβ2 | αβ3 | …… | αβn | γ

we separate the productions with a common prefix and then add a new
production rule in which the new non-terminal we introduce derives the parts
that follow the common prefix:

A ⇒ αA’ | γ
A’ ⇒ β1 | β2 | β3 | …… | βn

The top-down parser can easily parse this grammar to derive a given string.
This is how left factoring in compiler design is performed on a given
grammar.
Left Factoring
In left factoring,

• We make one new production for each common prefix.


• The common prefix may be a terminal or a non-terminal or a
combination of both.
• Rest of the derivation is added by new productions.
• The grammar obtained after the process of left factoring is called as Left
Factored Grammar.
Example of Left Factoring
Do left factoring in the following grammar-

S → iEtS / iEtSeS / a

E→b

Solution-

The left factored grammar is-

S → iEtSS’ / a

S’ → eS / ∈

E→b
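A single left-factoring step can also be sketched in code (not from the slides; the grouping-by-first-symbol strategy and the prime naming are assumptions, and the sketch handles at most one shared prefix per non-terminal):

```python
# One left-factoring pass: alternatives are symbol lists; [] is epsilon.

def left_factor(nonterminal, alternatives):
    groups = {}
    for alt in alternatives:               # group alternatives by first symbol
        head = alt[0] if alt else None
        groups.setdefault(head, []).append(alt)
    new_rules = {nonterminal: []}
    prime = nonterminal + "'"              # assumed name for the new non-terminal
    for head, group in groups.items():
        if head is None or len(group) == 1:
            new_rules[nonterminal] += group     # nothing to factor here
            continue
        prefix = []                        # longest common prefix of the group
        for symbols in zip(*group):
            if len(set(symbols)) == 1:
                prefix.append(symbols[0])
            else:
                break
        new_rules[nonterminal].append(prefix + [prime])
        new_rules[prime] = [alt[len(prefix):] for alt in group]  # may contain []
    return new_rules

# S -> iEtS | iEtSeS | a
rules = left_factor("S", [["i", "E", "t", "S"],
                          ["i", "E", "t", "S", "e", "S"],
                          ["a"]])
```

On S → iEtS | iEtSeS | a it produces S → iEtSS' | a and S' → ε | eS, matching the example above.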
Example of Left Factoring
1. S -> C+E | C*E | C/E

2. S -> aAd | Ab
A -> a | ab
B -> ccd | ddc

3. S -> aab | aaA | c


A -> C
Parsing

Top-down Parsing vs. Bottom-up Parsing:
1. Top-down parsing starts with the root; bottom-up parsing starts with the
leaves.
2. Top-down parsing starts with the start symbol of the grammar; bottom-up
parsing ends with the start symbol of the grammar.
3. Top-down parsing uses leftmost derivation; bottom-up parsing uses
rightmost derivation (in reverse).
4. Top-down parsing is categorized into recursive descent parsers and
predictive parsers; bottom-up parsing into operator precedence parsers and
shift-reduce parsers.
5. Top-down parsers do not accept ambiguous grammars; bottom-up parsers can
accept ambiguous grammars.
6. Top-down parsers are less powerful than bottom-up parsers.
7. Top-down parsers are simple to produce; bottom-up parsers are difficult
to produce.
8. Top-down parsing uses LL(1) grammars; bottom-up parsing uses SLR, CLR,
and LALR grammars.
9. Error detection is weak in top-down parsing and strong in bottom-up
parsing.
Types of Parsers

Parsers:
• Universal
• Top Down (requires a grammar without left recursion, left factored)
– With backtracking: Recursive Descent
– Without backtracking (Predictive): Recursive Descent, and
Table Driven (Nonrecursive)
• Bottom Up
– Operator Precedence
– LR Parsing: SLR, CLR, LALR
Top-Down Parsing
• Start from the start symbol and build the parse tree top-down
• Apply a production to a nonterminal. The right-hand of the
production will be the children of the nonterminal
• Match terminal symbols with the input
• May require backtracking
• Some grammars are backtrack-free (predictive)
TDP
• The parse tree is created top to bottom.
• Top-down parser
– Recursive-Descent Parsing
• Backtracking is needed (If a choice of a production rule does not work, we
backtrack to try other alternatives.)
• It is a general parsing technique, but not widely used.
• Not efficient
– Predictive Parsing
• no backtracking
• efficient
• needs a special form of grammars (LL(1) grammars).
• Recursive Predictive Parsing is a special form of Recursive Descent
parsing without backtracking.
• Non-Recursive (Table Driven) Predictive Parser is also known as LL(1)
parser.
Construct Parse Trees Top-Down

– Start with the tree of one node labeled with the start
symbol and repeat the following steps until the fringe
of the parse tree matches the input string
1.At a node labeled A, select a production with A on
its LHS and for each symbol on its RHS, construct
the appropriate child
2.When a terminal is added to the fringe that doesn't
match the input string, backtrack
3.Find the next node to be expanded
– Minimize the number of backtracks
Example

Left-recursive Right-recursive

E ::= T|E+T|E-T E ::= T E'


T ::= F|T*F|T/F E'::= + T E'
F ::= id | number | (E) | - T E'
|e
T::= F T'
T' ::= * F T'
| / F T'
|e
F ::= id
| number
x-2*y | (E)
Recursive-Descent Parsing (uses Backtracking)
• Backtracking is needed.
• It tries to find the left-most derivation.
• The grammar rule of a non-terminal “A” is viewed as the definition of a
procedure that will recognize “A”.

S → cAd
A → ab | a

input: cad

Trying A → ab first produces the fringe c a b d, which fails to match the
input, so the parser backtracks and tries A → a, producing c a d, which
matches.
Recursive Descent Parser- Example
• A separate recursive procedure is written for every non-terminal.
Procedure S()
{
  if input = ‘c’
  {
    Advance(); // procedure that advances the input pointer to the next position
    A();
    if input = ‘d’
    {
      Advance();
      return true;
    }
    else return false;
  }
  else return false;
}
Cont.
Procedure A()
{
  isave = in-ptr; // save the input pointer position before each alternative, to facilitate backtracking
  if input = ‘a’
  {
    Advance();
    if input = ‘b’
    {
      Advance();
      return true; // A -> ab matched
    }
  }
  in-ptr = isave; // backtrack
  if input = ‘a’
  {
    Advance();
    return true; // A -> a matched
  }
  return false;
}
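The two pseudocode procedures above can be made runnable. A Python sketch that mirrors them (the global input pointer and the saved position isave correspond to in-ptr and isave in the pseudocode; the names are illustrative):

```python
# Backtracking recursive descent for  S -> cAd,  A -> ab | a.

text = ""
ptr = 0

def match(ch):
    """Compare the current input symbol and Advance() on success."""
    global ptr
    if ptr < len(text) and text[ptr] == ch:
        ptr += 1
        return True
    return False

def A():
    global ptr
    isave = ptr               # save pointer before trying the first alternative
    if match('a') and match('b'):
        return True           # A -> ab
    ptr = isave               # backtrack
    return match('a')         # A -> a

def S():
    return match('c') and A() and match('d')

def parse(s):
    global text, ptr
    text, ptr = s, 0
    return S() and ptr == len(s)

print(parse("cad"))    # True: A -> ab fails on 'd', backtrack, A -> a succeeds
print(parse("cabd"))   # True
print(parse("cab"))    # False
```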
Cont.
• Problems?
– Left recursion: ambiguity about how many times to call; solution –
eliminate it.
– Backtracking: arises when there is more than one alternative in a rule;
solution – left factoring.
– It is very difficult to identify the position of errors.
Brute force technique

• Whenever a non-terminal is being expanded for the first time, go with the
first alternative and compare it with the input string. If it does not
match, go with the second alternative and compare with the input string; if
that does not match, go with the third alternative, and continue through
each and every alternative.
• If the match succeeds for at least one alternative, then the parsing is
successful; otherwise parsing fails.
Brute force technique
• Backtracking is needed.

S → cAd
A → ab | a

inputs: cad, cada

Trying A → ab gives the fringe c a b d, which fails to match; the parser
backtracks and tries A → a, which matches cad.
Advantages of a brute-force algorithm

• This algorithm finds all the possible solutions, and it also


guarantees that it finds the correct solution to a problem.
• This type of algorithm is applicable to a wide range of
domains.
• It is mainly used for solving simpler and small problems.
• It can be considered a comparison benchmark to solve a
simple problem and does not require any particular domain
knowledge.
Disadvantages of a brute-force algorithm

• It is an inefficient algorithm as it requires solving each and


every state.
• It is a very slow algorithm to find the correct solution as it
solves each state without considering whether the solution is
feasible or not.
• The brute force algorithm is neither constructive nor creative
as compared to other algorithms.
Top-Down Parsing

• Recursive descent parser is also known as the Brute force


parser or the backtracking parser.

• It basically generates the parse tree by using brute force


and backtracking.

• Non-recursive descent parser is also known as LL(1)


parser or predictive parser or without backtracking parser
or dynamic parser.
Predictive Parser

a grammar → eliminate left recursion → left factor → a grammar suitable for
predictive parsing (an LL(1) grammar)

• When re-writing a non-terminal in a derivation step, a predictive parser
can uniquely choose a production rule by just looking at the current symbol
in the input string.

A → α1 | ... | αn      input: ... a .......
                                  (current token)
Predictive Parser (example)

stmt → if ...... |
while ...... |
begin ...... |
for .....

• When we are trying to write the non-terminal stmt, if the current token
is if we have to choose first production rule.
• When we are trying to write the non-terminal stmt, we can uniquely
choose the production rule by just looking the current token.
• Even after we eliminate the left recursion in the grammar and left factor
it, the result may not be suitable for predictive parsing (not an LL(1)
grammar).
Predictive Parser

Recursive descent parsing (with backtracking):
• The parser may have more than one production to choose from for a single
instance of input.
• Example:
Input string: cad
S -> cAd
A -> ab | a

Predictive parsing:
• Each step has at most one production to choose.
• Example:
Input string: acdb
S -> aABb
A -> c | epsilon
B -> d | epsilon
Recursive Predictive Parsing
• Each non-terminal corresponds to a procedure.

Ex: A → aBb (This is the only production rule for A)

proc A {
- match the current token with a, and move to the next token;
- call ‘B’;
- match the current token with b, and move to the next token;
}
Recursive Predictive Parsing (cont.)
A → aBb | bAB

proc A {
case of the current token {
‘a’: - match the current token with a, and move to the next token;
- call ‘B’;
- match the current token with b, and move to the next token;
‘b’: - match the current token with b, and move to the next token;
- call ‘A’;
- call ‘B’;
}
}
Recursive Predictive Parsing (cont.)
• When to apply ε-productions.

A → aA | bB | ε

• If all other productions fail, we should apply an ε-production. For
example, if the current token is not a or b, we may apply the ε-production.
• Most correct choice: we should apply an ε-production for a non-terminal A
when the current token is in the FOLLOW set of A (the terminals that can
follow A in the sentential forms).
Recursive Predictive Parsing (Example)
A → aBe | cBd | C
B → bB | ε
C → f

proc A {
  case of the current token {
    a: - match the current token with a, and move to the next token;
       - call B;
       - match the current token with e, and move to the next token;
    c: - match the current token with c, and move to the next token;
       - call B;
       - match the current token with d, and move to the next token;
    f: - call C          // f is in the first set of C
  }
}

proc B {
  case of the current token {
    b: - match the current token with b, and move to the next token;
       - call B;
    e, d: do nothing     // e and d are in the follow set of B
  }
}

proc C {
  match the current token with f, and move to the next token;
}
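The three procedures can be made runnable. A Python sketch (the token handling is illustrative, not from the slides), where each non-terminal function selects its single alternative by inspecting the current token, with no backtracking:

```python
# Recursive predictive parser for  A -> aBe | cBd | C,  B -> bB | eps,  C -> f.

def parse(s):
    tokens = list(s)
    pos = 0
    def peek():
        return tokens[pos] if pos < len(tokens) else '$'
    def match(ch):
        nonlocal pos
        if peek() != ch:
            raise ValueError(f"expected {ch!r}, got {peek()!r}")
        pos += 1
    def A():
        if peek() == 'a':          # A -> aBe
            match('a'); B(); match('e')
        elif peek() == 'c':        # A -> cBd
            match('c'); B(); match('d')
        elif peek() == 'f':        # A -> C, chosen because FIRST(C) = {f}
            C()
        else:
            raise ValueError(f"unexpected {peek()!r}")
    def B():
        if peek() == 'b':          # B -> bB
            match('b'); B()
        # otherwise B -> epsilon, valid when peek() is in FOLLOW(B) = {e, d}
    def C():
        match('f')                 # C -> f
    A()
    return pos == len(tokens)

print(parse("abbe"))   # True
print(parse("cd"))     # True (B -> epsilon)
print(parse("f"))      # True
```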
Non-Recursive Predictive Parsing - LL(1) Parser

• A form of top-down parsing that does not require any backtracking is
known as predictive parsing.
• Predictive parsing has the capability to predict which production is to
be used to replace the input string.
• The predictive parser does not suffer from backtracking.
• To accomplish its task, the predictive parser uses a lookahead pointer,
which points to the next input symbol.
• To make the parser backtracking-free, the predictive parser puts some
constraints on the grammar and accepts only a class of grammars known as
LL(k) grammars.
Non-Recursive Predictive Parsing - LL(1) Parser
• An LL parser is a top-down parser for a subset of the context-free
grammars. It parses the input from Left to right, and constructs a Leftmost
derivation of the sentence.
• Non-recursive predictive parsing is a table-driven parser.
• It is a top-down parser.
• It is also known as an LL(1) parser.
• An LL parser is called an LL(k) parser if it uses k tokens of lookahead
when parsing a sentence.
Non-Recursive Predictive Parsing - LL(1) Parser
• Predictive parsing uses a “stack” and a “parsing table” to parse the
input and generate a parse tree.
• Both the “stack” and the “input” contain an end symbol $ to denote that
the stack is empty and the input is consumed.
• The parser refers to the parsing table to take any decision on the
combination of the input symbol and the stack element.
• The stack is used to maintain the relationship between the symbols.
Compute FIRST
• If α is a terminal symbol ‘a’, then FIRST(α) = {a}.
For example, for grammar rule A -> a, FIRST(a) = {a}.
• If X is a non-terminal and X -> aα (the right side begins with a
terminal), then FIRST(X) = FIRST(aα) = {a}.
For example, for grammar rule A -> aBC, FIRST(A) = FIRST(aBC) = {a}.
• If X is a non-terminal and X -> Ɛ, then Ɛ is in FIRST(X).
For example, for grammar rule A -> Ɛ, FIRST(A) = {Ɛ}.
• If X -> Y1 Y2 ... Yn, then add to FIRST(X) all the non-Ɛ symbols of
FIRST(Y1). Also add the non-Ɛ symbols of FIRST(Y2) if Ɛ is in FIRST(Y1), the
non-Ɛ symbols of FIRST(Y3) if Ɛ is in both FIRST(Y1) and FIRST(Y2), and so
on. Finally, add Ɛ to FIRST(X) if, for all i, FIRST(Yi) contains Ɛ.

For example, for the rules X -> Yb and Y -> a | Ɛ:
FIRST(X) = FIRST(Yb) = {a, b}
(a comes from FIRST(Y); since Y can derive Ɛ, b is added as well)
FIRST Example
E → TE’
E’ → +TE’ | ε
T → FT’
T’ → *FT’ | ε
F → (E) | id

FIRST(F) = {(, id}
FIRST(T’) = {*, ε}
FIRST(T) = {(, id}
FIRST(E’) = {+, ε}
FIRST(E) = {(, id}
Finding First()
• First(): contains all terminals that appear in the first place of some
string derived from a non-terminal.
• For example, First(A) contains all terminals present in the first place
of every string derived from the non-terminal A.

• S -> abc | def | ghi

First(S) = {a, d, g}

• First(terminal) = terminal

• First(∈) = ∈
Rules to derive the First set

1. If A -> a, then First(A) = {a}

2. If A -> ∈, then First(A) = {∈}

3. If A -> X1 X2 X3 ... Xn, then:

i) First(A) = First(X1)

ii) if First(X1) contains ∈, then also add First(X2) to First(A), and so on
for each subsequent Xi as long as all the preceding symbols can derive ∈

iii) if First(Xi) contains ∈ for all i = 1 to n, then add ∈ to First(A)

Example of First()
1. S -> ABC | ghi | jkl ……….{a, b, c, g, j}
A -> a | b | c ………….{a, b, c}
B -> b ……………{b}
D -> d …………....{d}

First(S) = {a, b, c, g, j}
First(A) = {a, b, c}
First(B) = {b}
First(D) = {d}

2. S -> ABC ……….{a, b, c, d, e, f, ∈}


A -> a | b | ∈ ………….{a, b, ∈}
B -> c | d | ∈ ……………{c, d, ∈}
C -> e | f | ∈ …………....{e, f, ∈}

First(S) = {a, b, c, d, e, f, ∈ }
First(A) = {a, b, ∈}
First(B) = {c, d, ∈}
First(C) = {e, f, ∈}
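The rules above can be turned into a small fixed-point computation. The sketch below is illustrative (the grammar representation and names such as `first_sets` are our own, not from the slides); ∈ is represented by the empty string:

```python
EPS = ""  # epsilon is represented as the empty string

def first_sets(grammar):
    """Compute FIRST for every non-terminal by iterating to a fixed point."""
    first = {nt: set() for nt in grammar}

    def first_of(symbol):
        # a terminal (or EPS itself) is its own FIRST set
        return first[symbol] if symbol in grammar else {symbol}

    changed = True
    while changed:                        # repeat until no set grows
        changed = False
        for nt, alts in grammar.items():
            for alt in alts:
                new = set()
                for sym in alt:           # scan X1 X2 ... Xn left to right
                    f = first_of(sym)
                    new |= f - {EPS}
                    if EPS not in f:      # Xi cannot vanish: stop here
                        break
                else:
                    new.add(EPS)          # every Xi can derive epsilon
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first

# Grammar 2 from the slides: S -> ABC, A -> a|b|eps, B -> c|d|eps, C -> e|f|eps
g = {
    "S": [("A", "B", "C")],
    "A": [("a",), ("b",), (EPS,)],
    "B": [("c",), ("d",), (EPS,)],
    "C": [("e",), ("f",), (EPS,)],
}
print(sorted(first_sets(g)["S"]))  # ['', 'a', 'b', 'c', 'd', 'e', 'f']
```

First(S) contains ∈ (shown as `''`) because every one of A, B, C can derive ∈, matching the worked example above.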
Example of First()
1. S -> aA | AB
A -> b | ∈
B -> d | ∈

2. S -> aA | Bb
A -> BC
B -> d | ∈
C -> e | ∈

3. S -> Ab | Ba
A -> c | d
B -> b | ∈

4. S -> AA
A -> aA | ∈
Compute FOLLOW (for non-terminals)
FOLLOW of a non-terminal A is a set of terminals that follow or occur to
the right of A

• If S is the start symbol ➔ $ is in FOLLOW(S)

• if A → αBβ is a production rule

➔ everything in FIRST(β) except ε is in FOLLOW(B)

• If ( A → αB is a production rule ) or
( A → αBβ is a production rule and ε is in FIRST(β) )
➔ everything in FOLLOW(A) is in FOLLOW(B).
We apply these rules until nothing more can be added to any follow set.
FOLLOW Example
E → TE’
E’ → +TE’ | ε
T → FT’
T’ → *FT’ | ε
F → (E) | id

FOLLOW(E) = { $, ) }
FOLLOW(E’) = { $, ) }
FOLLOW(T) = { +, ), $ }
FOLLOW(T’) = { +, ), $ }
FOLLOW(F) = {+, *, ), $ }
FIRST(E’) = {+, ε}
FIRST(T’) = {*, ε}
Finding Follow()
• Follow(A) : for non terminal A is the set of terminal that can appear
immediately to right of A.

• Example:

• S -> Aa

Follow(A) = {a}

• S -> AB
B -> d

Follow(A) = {d}
Rule to derive Follow set

• Follow(S) = {$} where S is the starting non-terminal.

• A -> αBβ

i) add First(β) − {∈} to Follow(B)

ii) if First(β) contains ∈ (or β is empty), then

add Follow(A) to Follow(B)
Example of Follow()
1. S -> ACD
C -> a | b

Follow(A) = First(C) = {a, b}


Follow(D) = Follow(S) = {$}

2. S -> aSbS | bSaS | ∈

Follow(S) = {$, b, a}

Note: A Follow set never contains ∈.

3. S-> AaAb | BbBa


A -> ∈
B -> ∈
Follow(S) = { $ }
Follow(A) = First(a) ∪ First(b) = { a , b }
Follow(B) = First(b) ∪ First(a) = { a , b }
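The Follow rules above can also be sketched as a fixed-point computation on top of FIRST. This is an illustrative sketch (names like `compute_follow` are our own); ∈ is the empty string:

```python
EPS = ""  # epsilon

def first_of_seq(seq, first, grammar):
    """FIRST of a symbol string, given FIRST sets of the non-terminals."""
    out = set()
    for sym in seq:
        f = first[sym] if sym in grammar else {sym}
        out |= f - {EPS}
        if EPS not in f:
            return out
    out.add(EPS)                      # the whole sequence can vanish
    return out

def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alts in grammar.items():
            for alt in alts:
                new = first_of_seq(alt, first, grammar)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first

def compute_follow(grammar, start):
    first = compute_first(grammar)
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")            # rule: $ is in Follow(start symbol)
    changed = True
    while changed:
        changed = False
        for a, alts in grammar.items():
            for alt in alts:
                for i, b in enumerate(alt):
                    if b not in grammar:
                        continue      # Follow is defined for non-terminals only
                    fb = first_of_seq(alt[i + 1:], first, grammar)
                    new = fb - {EPS}  # rule i: First(beta) minus epsilon
                    if EPS in fb:
                        new |= follow[a]   # rule ii: beta can vanish
                    if not new <= follow[b]:
                        follow[b] |= new
                        changed = True
    return follow

# Example 3 from the slides: S -> AaAb | BbBa, A -> eps, B -> eps
g = {"S": [("A", "a", "A", "b"), ("B", "b", "B", "a")],
     "A": [(EPS,)], "B": [(EPS,)]}
fol = compute_follow(g, "S")
print(fol["S"], sorted(fol["A"]), sorted(fol["B"]))  # {'$'} ['a', 'b'] ['a', 'b']
```

This reproduces Follow(S) = {$}, Follow(A) = {a, b}, Follow(B) = {a, b} from example 3 above.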
Example of Follow()
1. S ->Aa | Ac
A ->b

2. E -> TE’
E’ -> +T E’|Є
T -> F T’
T’ -> *F T’ | Є
F -> (E) | id

3. S- > ACB | Cbb | Ba


A -> da | BC
B -> g | Є
C -> h | Є
E -> TE’
E’ -> +T E’|Є
T -> F T’
T’ -> *F T’ | Є
F -> (E) | id

FIRST set
FIRST(E) = FIRST(T) = { ( , id }
FIRST(E’) = { +, Є }
FIRST(T) = FIRST(F) = { ( , id }
FIRST(T’) = { *, Є }
FIRST(F) = { ( , id }

FOLLOW Set
FOLLOW(E) = { $ , ) }
FOLLOW(E’) = FOLLOW(E) = { $, ) }
FOLLOW(T) = { FIRST(E’) – Є } U FOLLOW(E’) U FOLLOW(E) = { + , $ , ) }
FOLLOW(T’) = FOLLOW(T) = { + , $ , ) }
FOLLOW(F) = { FIRST(T’) – Є } U FOLLOW(T’) U FOLLOW(T) = { *, +, $, ) }
Construction of Predictive Parse Table
1. Compute First and Follow set for each non terminal.
2. If A-> α, then add A-> α into M [A, First(α)]
Example: S-> aA | b
a b
S S->aA S->b

3. If A -> α is a production and First(α) contain epsilon, then add A -> α to


M [A, Follow (A)]

Example: S-> aSb | epsilon

a b $
S S->aSb S->epsilon S-> epsilon
Construction of Predictive Parse Table

• Step 1.1 − Eliminate left recursion (if the grammar has none, proceed as is)

• Step 1.2 − Perform left factoring (if needed)

• Step 2 − Compute FIRST

• Step 3 − Compute FOLLOW

• Step 4 − Construct the predictive parsing table

• Step 5 − Parse the input string

Construction of Predictive Parse Table – Example

Example :
E -> E+T | T
T -> T*F | F
F -> (E) | id

• Step1− Elimination of Left Recursion & perform Left Factoring


E → TE′
E′ → +TE′ | ε
T → FT′
T′ → *FT′ | ε
F → (E) | id

• Step2− Computation of FIRST

FIRST(E) = FIRST(T) = FIRST(F) = {(, id}


FIRST (E′) = {+, ε}
FIRST (T′) = {*, ε}
Construction of Predictive Parse Table – Example

• Step3− Computation of FOLLOW

FOLLOW (E) = FOLLOW(E′) = {), $}


FOLLOW (T) = FOLLOW(T′) = {+, ), $}
FOLLOW (F) = {+,*,),$}

• Step4− Construction of Predictive Parsing Table


Construction of Predictive Parse Table – Example

• Step4− Construction of Predictive Parsing Table

• E → TE′
Comparing E → TE′ with A → α

E→ TE′
A→ α

∴ A = E, α = TE′

∴ FIRST(α) = FIRST(TE′) = FIRST(T) = {(, id}


Construction of Predictive Parse Table – Example

Applying Rule (1) of Predictive Parsing Table

∴ ADD E → TE′ to M[E, ( ] and M[E, id]

∴ write E → TE′ in front of Row (E) and Columns {(, id} (1)

• Step4− Construction of Predictive Parsing Table


• E′ → +TE′|𝛆
Comparing E′ → +TE′ with A → α

E′ → +TE′
A → α
∴ A = E′, α = +TE′ ∴ FIRST(α) = FIRST(+TE′) = {+}

∴ ADD E′ → +TE′ to M[E′, +]

∴ write production E′ → +TE′ in front of Row (E′) and Column (+) (2)


Construction of Predictive Parse Table – Example

• Step4− Construction of Predictive Parsing Table


• E′ → 𝛆
Comparing E′ → ε with A → α

E′ → ε
A → α
∴ A = E′, α = ε

∴ FIRST(α) = {ε}

∴ Applying Rule (2)of the Predictive Parsing Table.

Find FOLLOW (E′) = { ), $}

∴ ADD Production E′ → ε to M[E′, )] and M[E′, $]

∴ write E′ → ε in front of Row (E′)and Column {$, )}


(3)
Construction of Predictive Parse Table – Example
• Step4− Construction of Predictive Parsing Table

• T → FT′
Comparing it with A → α

T→ FT′
A→ α
∴ A = T, α = FT′

∴ FIRST(α) = FIRST (FT′) = FIRST (F) = {(, id}

∴ ADD Production T → FT′ to M[T, (] and M[T, id]

∴ write T → FT′ in front of Row (T)and Column {(, id}


(4)
Construction of Predictive Parse Table – Example
• Step4− Construction of Predictive Parsing Table
• T′ → *FT′
Comparing it with A → α

T′ → *FT′
A → α
∴ A = T′, α = *FT′
∴ FIRST(α) = FIRST(*FT′) = {*} ∴ ADD Production T′ → *FT′ to M[T′, *]

∴ write T′ → *FT′ in front of Row (T′) and Column {*} (5)

• T′ → ε
Comparing it with A → α
T′ → ε
A → α
∴ A = T′, α = ε ∴ FIRST(α) = FIRST(𝜀) = {ε}
∴ Applying Rule (2)of the Predictive Parsing Table.

Find FOLLOW (A) = FOLLOW (T′) = {+, ), $}

∴ ADD T′ → ε to M[T′, +], M[T′, )] and M[T′, $]


∴ write T′ → ε in front of Row (T′)and Column {+, ), $} (6)
Construction of Predictive Parse Table – Example
• Step4− Construction of Predictive Parsing Table
• F →(E)
Comparing it with A → α

F → (E)
A → α
∴ A = F, α = (E)

∴ FIRST(α) = FIRST((E)) = {(}

∴ ADD F → (E) to M[F, (] ∴ write F → (E) in front of Row (F) and Column ( ( ) (7)

• F → id
Comparing it with A → α

F → id
A → α
∴ A = F, α = id
∴ FIRST(α) = FIRST(id) = {id}

∴ ADD F → id to M[F, id] ∴ write F → id in front of Row (F)and Column (id) (8)
Construction of Predictive Parse Table – Example

• Construction of Predictive Parsing Table


Construction of Predictive Parse Table – Example
• Step5- Parse the input string

Checking Acceptance of String id + id * id using Predictive Parsing


Program

Initially, the stack will contain the starting symbol E and $ at the bottom of
the stack. Input Buffer will contain a string attached with $ at the right end.

If the top of stack = Current Input Symbol, then symbol from the top of the
stack will be popped, and also input pointer advances or reads next symbol.

The sequence of Moves by Predictive Parser


Construction of Predictive Parse Table – Example
• Step5- Parse the input string
Construction of Predictive Parse Table – Example

1. S -> aA | AB
A -> b | 𝛆
B -> c | 𝛆
Input String: a, ab, bc

2. S -> AB
A -> aA | 𝛆
B -> b | 𝛆
Input String: aab

3. S -> AB | a
A -> aA | 𝛆
B -> b
Input String: a, ab
LL(1) Parser
input buffer
– our string to be parsed. We will assume that its end is marked with a special symbol $.

output
– a production rule representing a step of the derivation sequence (left-most derivation) of the
string in the input buffer.

stack
– contains the grammar symbols
– at the bottom of the stack, there is a special end marker symbol $.
– initially the stack contains only the symbol $ and the starting symbol S ($S = initial stack)
– when the stack is emptied (ie. only $ left in the stack), the parsing is completed.

parsing table
– a two-dimensional array M[A,a]
– each row is a non-terminal symbol
– each column is a terminal symbol or the special symbol $
– each entry holds a production rule.
LL(1) Parser – Parser Actions
• The symbol at the top of the stack (say X) and the current symbol in the input string
(say a) determine the parser action.
• There are four possible parser actions.
1. If X and a are $ ➔ parser halts (successful completion)
2. If X and a are the same terminal symbol (different from $)
➔ parser pops X from the stack, and moves the next symbol in the input buffer.
3. If X is a non-terminal
➔ parser looks at the parsing table entry M[X,a]. If M[X,a] holds a production rule
X→Y1Y2...Yk, it pops X from the stack and pushes Yk,Yk-1,...,Y1 into the stack. The
parser also outputs the production rule X→Y1Y2...Yk to represent a step of the
derivation.
4. If none of the above ➔ error
– all empty entries in the parsing table are errors.
– If X is a terminal symbol different from a, this is also an error case.
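These four actions can be sketched as a short driver loop. The code below is an illustrative sketch (names like `ll1_parse` are our own); the hard-coded table is the one for the grammar S → aBa, B → bB | 𝛆 used in Example1, with 𝛆 written as the empty string:

```python
# Parsing table for S -> aBa, B -> bB | eps, keyed by (non-terminal, lookahead).
table = {
    ("S", "a"): ("S", "aBa"),
    ("B", "b"): ("B", "bB"),
    ("B", "a"): ("B", ""),      # B -> eps on lookahead a
}

def ll1_parse(table, start, tokens):
    stack = ["$", start]                 # $ at the bottom, start symbol on top
    tokens = list(tokens) + ["$"]
    i, output = 0, []
    while True:
        x, a = stack[-1], tokens[i]
        if x == "$" and a == "$":        # action 1: accept
            return output
        if x == a:                       # action 2: match terminal, advance
            stack.pop()
            i += 1
        elif (x, a) in table:            # action 3: expand non-terminal
            head, body = table[(x, a)]
            stack.pop()
            stack.extend(reversed(body)) # push Yk ... Y1 so Y1 is on top
            output.append((head, body))
        else:                            # action 4: error
            raise SyntaxError(f"unexpected {a!r} with {x!r} on stack")

print(ll1_parse(table, "S", "abba"))
# [('S', 'aBa'), ('B', 'bB'), ('B', 'bB'), ('B', '')]
```

The printed list of productions is exactly the left-most derivation traced in Example1 below.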
LL(1) Parser – Example1
S → aBa a b $ LL(1) Parsing
B → bB | 𝛆 S S → aBa Table

B B→ s B → bB

stack input output


$S abba$ S → aBa
$aBa abba$
$aB bba$ B → bB
$aBb bba$
$aB ba$ B → bB
$aBb ba$
$aB a$ B→ 𝛆
$a a$
$ $ accept, successful completion
LL(1) Parser – Example1 (cont.)

Outputs: S → aBa B → bB B → bB B → 𝛆

Derivation(left-most): S⇒aBa⇒abBa⇒abbBa⇒abba

S
parse tree
a B a

b B

b B

𝛆
LL(1) Parser – Example2
Input = id+id$

E → TE’
E’ → +TE’ | 𝛆
T → FT’
T’ → *FT’ | 𝛆
F → (E) | id

        id         +           *           (          )         $
E    E → TE’                            E → TE’
E’             E’ → +TE’                           E’ → 𝛆   E’ → 𝛆
T    T → FT’                            T → FT’
T’             T’ → 𝛆      T’ → *FT’               T’ → 𝛆   T’ → 𝛆
F    F → id                             F → (E)
LL(1) Parser – Example2
1. E → TE’       FIRST(E) = {(, id}     FOLLOW(E) = { $, ) }
2. E’ → +TE’     FIRST(E’) = {+, 𝛆}     FOLLOW(E’) = { $, ) }
3. E’ → 𝛆        FIRST(T) = {(, id}     FOLLOW(T) = { +, ), $ }
4. T → FT’       FIRST(T’) = {*, 𝛆}     FOLLOW(T’) = { +, ), $ }
5. T’ → *FT’     FIRST(F) = {(, id}     FOLLOW(F) = {+, *, ), $ }
6. T’ → 𝛆
7. F → (E)
8. F → id

        id    +    *    (    )    $
E       1               1
E’            2              3    3
T       4               4
T’            6    5         6    6
F       8               7
LL(1) Parser – Example2
stack       input      output
$E          id+id$     E → TE’
$E’T        id+id$     T → FT’
$E’T’F      id+id$     F → id
$E’T’id     id+id$
$E’T’       +id$       T’ → 𝛆
$E’         +id$       E’ → +TE’
$E’T+       +id$
$E’T        id$        T → FT’
$E’T’F      id$        F → id
$E’T’id     id$
$E’T’       $          T’ → 𝛆
$E’         $          E’ → 𝛆
$           $          accept
Constructing LL(1) Parsing Tables
1. Eliminate left recursion in grammar G
2. Perform left factoring on the grammar G
3. Find FIRST and FOLLOW for each NT of grammar G
4. Construct the predictive parse table OR LL(1) parse table
5. Check if the given input string can be accepted by the parser
Constructing LL(1) Parsing Table -- Algorithm
• for each production rule A → α of a grammar G
– for each terminal a in FIRST(α)
➔ add A → α to M[A,a]
– If 𝛆 in FIRST(α)
➔ for each terminal a in FOLLOW(A) add A → α to M[A,a]
– If 𝛆 in FIRST(α) and $ in FOLLOW(A)
➔ add A → α to M[A,$]

• All other undefined entries of the parsing table are error entries.
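A sketch of this table-filling algorithm, with FIRST and FOLLOW taken as given (hard-coded to the values computed earlier for the expression grammar); the representation and names are our own, and 𝛆 is the empty string:

```python
EPS = ""  # epsilon
FIRST = {"E": {"(", "id"}, "E'": {"+", EPS}, "T": {"(", "id"},
         "T'": {"*", EPS}, "F": {"(", "id"}}
FOLLOW = {"E": {")", "$"}, "E'": {")", "$"}, "T": {"+", ")", "$"},
          "T'": {"+", ")", "$"}, "F": {"+", "*", ")", "$"}}

productions = [("E", ["T", "E'"]), ("E'", ["+", "T", "E'"]), ("E'", [EPS]),
               ("T", ["F", "T'"]), ("T'", ["*", "F", "T'"]), ("T'", [EPS]),
               ("F", ["(", "E", ")"]), ("F", ["id"])]

def first_of(alpha):
    """FIRST of a right-hand side, using the precomputed FIRST sets."""
    out = set()
    for sym in alpha:
        f = FIRST.get(sym, {sym})     # a terminal is its own FIRST set
        out |= f - {EPS}
        if EPS not in f:
            return out
    out.add(EPS)
    return out

def build_table(productions):
    table = {}
    for head, body in productions:
        f = first_of(body)
        cells = f - {EPS}             # rule: add A -> alpha to M[A,a], a in FIRST(alpha)
        if EPS in f:
            cells |= FOLLOW[head]     # rule: epsilon case uses FOLLOW(A), incl. $
        for a in cells:
            if (head, a) in table:    # multiply-defined entry => not LL(1)
                raise ValueError(f"conflict at M[{head},{a}]")
            table[(head, a)] = (head, body)
    return table

M = build_table(productions)
print(M[("E'", "$")])   # E' -> eps, stored as ("E'", [''])
```

All thirteen filled entries match the table of Example2; every other (non-terminal, terminal) pair is left undefined, i.e. an error entry.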
Constructing LL(1) Parsing Table -- Example
E → TE’     FIRST(TE’) = {(, id}   ➔ E → TE’ into M[E,(] and M[E,id]
E’ → +TE’   FIRST(+TE’) = {+}      ➔ E’ → +TE’ into M[E’,+]

E’ → 𝛆      FIRST(𝛆) = {𝛆}         ➔ none
            but since 𝛆 is in FIRST(𝛆)
            and FOLLOW(E’) = {$, )} ➔ E’ → 𝛆 into M[E’,$] and M[E’,)]

T → FT’     FIRST(FT’) = {(, id}   ➔ T → FT’ into M[T,(] and M[T,id]

T’ → *FT’   FIRST(*FT’) = {*}      ➔ T’ → *FT’ into M[T’,*]

T’ → 𝛆      FIRST(𝛆) = {𝛆}         ➔ none
            but since 𝛆 is in FIRST(𝛆)
            and FOLLOW(T’) = {$, ), +} ➔ T’ → 𝛆 into M[T’,$], M[T’,)] and M[T’,+]

F → (E)     FIRST((E)) = {(}       ➔ F → (E) into M[F,(]

F → id      FIRST(id) = {id}       ➔ F → id into M[F,id]
LL(1) Grammars
• A grammar whose parsing table has no multiply-defined entries is said
to be an LL(1) grammar.

L : input scanned from left to right
L : left-most derivation
(1) : one input symbol used as a look-ahead symbol to determine the parser action

• If the parsing table of a grammar contains an entry with more than one
production rule, we say that the grammar is not LL(1).
A Grammar which is not LL(1)
S → iCtSE | a
E → eS | 𝛆
C → b

FIRST(iCtSE) = {i}     FOLLOW(S) = { $, e }
FIRST(a) = {a}         FOLLOW(E) = { $, e }
FIRST(eS) = {e}        FOLLOW(C) = { t }
FIRST(𝛆) = {𝛆}
FIRST(b) = {b}

      a       b       e               i           t    $
S    S → a                        S → iCtSE
C           C → b
E                  E → eS, E → 𝛆                      E → 𝛆

two production rules for M[E,e]
Problem ➔ ambiguity
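The multiply-defined entry can be reproduced mechanically with the table-filling rule, using the FIRST and FOLLOW sets on this slide. A small sketch (the representation is our own, with "eps" standing for 𝛆):

```python
# Fill the E row of the table: E -> eS via FIRST(eS), E -> eps via FOLLOW(E).
FOLLOW_E = {"$", "e"}
cells = {}
for prod, first in [("E -> eS", {"e"}), ("E -> eps", {"eps"})]:
    targets = first - {"eps"}
    if "eps" in first:
        targets |= FOLLOW_E          # epsilon production: use FOLLOW(E)
    for a in targets:
        cells.setdefault(("E", a), []).append(prod)

print(cells[("E", "e")])   # ['E -> eS', 'E -> eps'] -- two rules, not LL(1)
```

M[E,e] receives both productions, which is exactly the dangling-else conflict described above.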
A Grammar which is not LL(1) (cont.)
• What do we have to do if the resulting parsing table contains multiply
defined entries?
– If we didn’t eliminate left recursion, eliminate the left recursion in the grammar.
– If the grammar is not left factored, we have to left factor the grammar.
– If its (new grammar’s) parsing table still contains multiply defined entries, that grammar is
ambiguous or it is inherently not a LL(1) grammar.
• A left recursive grammar cannot be an LL(1) grammar.
– A → Aα | β
➔ any terminal that appears in FIRST(β) also appears in FIRST(Aα) because Aα ⇒ βα.
➔ If β is 𝛆, any terminal that appears in FIRST(α) also appears in FIRST(Aα) and FOLLOW(A).

• If a grammar is not left factored, it cannot be an LL(1) grammar

• A → αβ1 | αβ2
➔ any terminal that appears in FIRST(αβ1) also appears in FIRST(αβ2).

• An ambiguous grammar cannot be a LL(1) grammar.


Properties of LL(1) Grammars
• A grammar G is LL(1) if and only if the following conditions hold for
two distinctive production rules A → α and A → β

1. Both α and β cannot derive strings starting with the same terminals.

2. At most one of α and β can derive 𝛆.

3. If β can derive 𝛆, then α cannot derive any string starting

with a terminal in FOLLOW(A).
Example
• Construct predictive parse table for the following grammar. Also show
parser actions for the input string - (a,a)
S->a | ↑ | (T)
T->T,S | S

- Eliminate left recursion


- left factor
- First
- Follow
- Construct parsing table – check multiple entries
- Show Actions
Cont.
• Eliminate left recursion
S-> a | ↑ | (T)
T-> ST’
T’-> ,ST’ | Ɛ
• Its not needed to left factor
• FIRST
FIRST(S)={a, ↑, ( }
FIRST(T)=FIRST(ST’)=FIRST(S)={a, ↑,( }
FIRST(T’)={, , Ɛ}
• FOLLOW
FOLLOW(S) = { $, ), , }
FOLLOW(T) = { ) }
FOLLOW(T’) = { ) }
• Is following grammar LL(1)? Also trace input string - ibtaea

S -> iCtSS` | a
S` -> eS | Ɛ
C ->b
• Is following grammar LL(1)? Also trace input string – int*int

E→T+E|T
T → int | int * T | ( E )

E→TX
X→+E|ε
T → ( E ) | int Y
Y→*T|ε

First( T ) = {int, ( }      First( E ) = {int, ( }
First( X ) = {+, ε }        First( Y ) = {*, ε }

Follow( + ) = { int, ( }    Follow( * ) = { int, ( }
Follow( ( ) = { int, ( }    Follow( E ) = { ), $ }
Follow( X ) = { $, ) }      Follow( T ) = { +, ), $ }
Follow( ) ) = { +, ), $ }   Follow( Y ) = { +, ), $ }
Follow( int ) = { *, +, ), $ }
Motivation Behind First & Follow
First: Is used to help find the appropriate production to follow
given the top-of-the-stack non-terminal and the current input
symbol.
Example: If A → α , and a is in First(α), then when
a=input, replace A with α (in the stack).
( a is one of first symbols of α, so when A is on the stack and
a is input, POP A and PUSH α.

Follow: Is used when First has a conflict, to resolve choices, or when
First gives no suggestion. When A → α and α ⇒* ϵ, then
what follows A dictates the next choice to be made.
Example: If A → α, and b is in Follow(A), then when
α ⇒* ϵ and b is an input character, we expand A with α,
which will eventually expand to ϵ, which b follows!
(α ⇒* ϵ, i.e., First(α) contains ϵ.)
