Atif Ishaq - Lecturer GC University, Lahore

Compiler Construction
CS-4207
Lecture – 10

Left Factoring
Left factoring is a grammar transformation technique. It "factors out" a prefix that is common to two or more productions of the same non-terminal. It is done so that a predictive parser can choose a production without backtracking.
For example, consider the two productions
stmt → if exp then stmt else stmt
| if exp then stmt
In this grammar, if the input symbol is ‘if’, we cannot immediately tell which production to
choose to expand stmt.
Consider another grammar
E → T + E | T
T → int | int T | (E)
For T, it is impossible to predict which production to use, because two productions start with int. For E, it is also unclear how to predict, because both productions start with the non-terminal T. A grammar must be left factored before it is used for predictive parsing. The procedure to left factor a grammar is as follows.
If α ≠ ε, replace all productions of the form
A → αβ1 | αβ2 | αβ3 | αβ4 | … | αβn | γ
with
A → αZ | γ
Z → β1 | β2 | β3 | β4 | … | βn
where Z is a new non-terminal and γ stands for the alternatives that do not begin with α.
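The factoring rule can be sketched in Python. This is only an illustration under assumptions made here: each alternative is a tuple of symbols, the empty tuple stands for ε, and the helper names `common_prefix` and `left_factor_once`, as well as the primed name used for the new non-terminal Z, are hypothetical. The sketch performs a single factoring step and does not re-factor the new non-terminal.

```python
from collections import defaultdict


def common_prefix(seqs):
    """Longest prefix shared by every sequence in seqs."""
    prefix = []
    for symbols in zip(*seqs):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return tuple(prefix)


def left_factor_once(head, alternatives):
    """Apply one left-factoring step to the alternatives of `head`.

    Returns a dict mapping each non-terminal to its list of alternatives.
    """
    groups = defaultdict(list)
    for alt in alternatives:
        groups[alt[:1]].append(alt)  # group by first symbol; epsilon maps to ()

    result = {head: []}
    fresh = head
    for first, group in groups.items():
        if len(group) < 2 or first == ():
            result[head].extend(group)  # nothing to factor: the gamma part
            continue
        alpha = common_prefix(group)  # the shared prefix alpha
        fresh += "'"  # assumed naming scheme for the new non-terminal Z
        result[head].append(alpha + (fresh,))  # A -> alpha Z
        result[fresh] = [alt[len(alpha):] for alt in group]  # Z -> beta_1 | ... | beta_n
    return result


factored = left_factor_once(
    "stmt",
    [("if", "exp", "then", "stmt", "else", "stmt"),
     ("if", "exp", "then", "stmt")],
)
```

For the stmt grammar this yields stmt → if exp then stmt stmt' and stmt' → else stmt | ε, so the parser can postpone the decision until it has seen whether an else follows.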
A graphical explanation can be represented as below
We have already discussed recursive descent parsing. Let's have a look at the function
(algorithm) again.

LL(k) Grammar
An LL(k) grammar is one for which the parser scans the input from left to right, generates the leftmost
derivation, and can decide which production to apply by looking at only the next k input symbols. That is why
it is called an LL(k) grammar.

Constructing LL(1) Tables


If A → α and A → β both appear in the grammar, we would like
FIRST(α) ∩ FIRST(β) = ∅
This LL(1) property allows the parser to make a correct choice with a lookahead of exactly one
symbol. If ε-productions are used, the definition of LL(1) becomes more complicated:
if A → α and A → β and ε ∈ FIRST(α), then we need to make sure that FIRST(β) is disjoint
from FOLLOW(A), too.
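The two conditions can be written as a small check. This is a sketch under assumptions made here: sets of terminals are Python sets of strings, the string "ε" marks the empty string, and the function name `ll1_compatible` is hypothetical.

```python
EPS = "ε"


def ll1_compatible(first_alpha, first_beta, follow_a):
    """Check the LL(1) condition for two alternatives A -> alpha | beta."""
    if first_alpha & first_beta:
        return False  # both alternatives can start with the same symbol
    if EPS in first_alpha and first_beta & follow_a:
        return False  # alpha may vanish, and beta clashes with FOLLOW(A)
    if EPS in first_beta and first_alpha & follow_a:
        return False  # the symmetric case
    return True


# E' -> +TE' | ε with FOLLOW(E') = {$, )} satisfies the condition.
ok = ll1_compatible({"+"}, {EPS}, {"$", ")"})
# The two "if" productions of stmt do not.
clash = ll1_compatible({"if"}, {"if"}, {"$"})
```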
LL(1) Parsing Algorithms
The input buffer contains the string to be parsed; $ is the end-of-input marker. The stack contains
a sequence of grammar symbols. Initially, the stack contains the start symbol of the grammar on
top of $. The parser is controlled by a program that behaves as follows:
The program considers X, the symbol on top of the stack, and a, the current input symbol. These
two symbols, X and a determine the action of the parser.
There are three possibilities.
1. X = a = $, the parser halts and announces successful completion.
2. X = a ≠ $ the parser pops X off the stack and advances input pointer to next input symbol.
3. If X is a non-terminal, the program consults entry M[X,a] of parsing table M.
a. If the entry is a production, M[X,a] = {X → UVW}, the parser replaces X on top of the
stack by WVU (with U on top). As output, the parser just prints the production used:
X → UVW. However, any other code could be executed here.
b. If M[X,a] = error, the parser calls an error recovery routine.
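The driver loop described above can be sketched in Python. Several things here are assumptions rather than part of the lecture: the table is a dict keyed by (non-terminal, terminal), a missing key plays the role of an error entry, error recovery is reduced to raising `SyntaxError`, and the table itself was written by hand for the usual expression grammar E → TE', E' → +TE' | ε, T → FT', T' → *FT' | ε, F → id | (E). A terminal on the stack that does not match the input is also treated as an error.

```python
NONTERMINALS = {"E", "E'", "T", "T'", "F"}

# Hand-written LL(1) table; the empty tuple is the production X -> ε.
TABLE = {
    ("E", "id"): ("T", "E'"), ("E", "("): ("T", "E'"),
    ("E'", "+"): ("+", "T", "E'"), ("E'", ")"): (), ("E'", "$"): (),
    ("T", "id"): ("F", "T'"), ("T", "("): ("F", "T'"),
    ("T'", "+"): (), ("T'", "*"): ("*", "F", "T'"),
    ("T'", ")"): (), ("T'", "$"): (),
    ("F", "id"): ("id",), ("F", "("): ("(", "E", ")"),
}


def ll1_parse(table, start, tokens, nonterminals):
    """Return the productions used, in order, or raise SyntaxError."""
    stack = ["$", start]            # start symbol on top of $
    tokens = list(tokens) + ["$"]   # append the end-of-input marker
    pos = 0
    output = []
    while True:
        top, look = stack[-1], tokens[pos]
        if top == look == "$":      # case 1: accept
            return output
        if top not in nonterminals:
            if top != look:         # mismatched terminal
                raise SyntaxError(f"expected {top!r}, found {look!r}")
            stack.pop()             # case 2: match and advance
            pos += 1
        else:
            rhs = table.get((top, look))
            if rhs is None:         # case 3b: empty table entry
                raise SyntaxError(f"no entry for M[{top}, {look}]")
            stack.pop()             # case 3a: replace X by its right-hand side,
            stack.extend(reversed(rhs))  # pushed in reverse so U ends up on top
            output.append((top, rhs))


productions = ll1_parse(TABLE, "E", ["id", "+", "id", "*", "id"], NONTERMINALS)
```

The returned list is the sequence of productions of a leftmost derivation of the input, which is what the parser "prints" in step 3a.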

FIRST and FOLLOW


First(α) is the set of terminal symbols that can begin the strings derived from α. For example, for the
production rule A → abc | def | ghi
we have
First(A) = { a , d , g }
Rules for calculating First Function
Rule-01
For a production rule X → ε, First(X) = { ε }
Rule-02
For any terminal symbol ‘a’, First(a) = { a }
Rule-03
For a production rule X → Y1Y2Y3,
Calculating First(X)
• If ε ∉ First(Y1), then First(X) = First(Y1)
• If ε ∈ First(Y1), then First(X) = { First(Y1) – ε } ∪ First(Y2Y3)
Calculating First(Y2Y3)
• If ε ∉ First(Y2), then First(Y2Y3) = First(Y2)
• If ε ∈ First(Y2), then First(Y2Y3) = { First(Y2) – ε } ∪ First(Y3)
Similarly, we can expand any production rule X → Y1Y2Y3…Yn. When X has several productions, First(X) is the union of the results for each production.
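The three rules can be turned into a fixed-point computation: keep applying them to every production until no First set grows. The sketch below is one possible way to do this, not the lecture's own algorithm. The names `first_of_sequence` and `compute_first`, the dict-of-tuples grammar encoding, and the use of the empty tuple for ε are assumptions made here. The sample grammar is the one from Example - 01 of this lecture.

```python
EPS = "ε"


def first_of_sequence(symbols, first):
    """First of a string Y1 Y2 ... Yn, given the First set of each non-terminal."""
    result = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})  # Rule-02: a terminal's First is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result                  # Yi cannot vanish, so stop here
    result.add(EPS)  # every Yi can vanish (covers the empty string, Rule-01)
    return result


def compute_first(grammar):
    """Apply Rule-03 to every production until nothing changes."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for head, alternatives in grammar.items():
            for rhs in alternatives:
                new = first_of_sequence(rhs, first)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first


GRAMMAR = {
    "S": [("A", "B", "C", "D", "E")],
    "A": [("a",), ()],
    "B": [("b",), ()],
    "C": [("c",)],
    "D": [("d",), ()],
    "E": [("e",), ()],
}
FIRST = compute_first(GRAMMAR)
```

Here `FIRST["S"]` comes out as {a, b, c}: A and B can vanish, but C cannot, so nothing beyond c is reachable.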
Follow Function
Follow(A) is the set of terminal symbols that can appear immediately to the right of the non-terminal A in some sentential form.
Rules for calculating Follow Function
Rule-01
For the start symbol S, place $ in Follow(S).
Rule-02
For any production rule A → αB, everything in Follow(A) is also in Follow(B).
Rule-03
For any production rule A → αBβ,
• If ε ∉ First(β), then everything in First(β) is in Follow(B)
• If ε ∈ First(β), then everything in { First(β) – ε } ∪ Follow(A) is in Follow(B)
When B occurs in several places, Follow(B) is the union of the contributions from every occurrence.

Important Points to Note

• ε may appear in the First set of a non-terminal.
• ε will never appear in the Follow set of a non-terminal.
• Before calculating the First and Follow sets, eliminate left recursion from the grammar,
if present.
• We calculate the Follow set of a non-terminal by looking at where it appears on the RHS
of production rules.
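The Follow rules can also be sketched as a fixed-point loop that visits every occurrence of a non-terminal on a right-hand side. As before, this is an illustration under assumptions made here: the function names and the grammar encoding are hypothetical, and the First sets are supplied by hand (they are the ones listed in Example – 03 of this lecture) rather than computed.

```python
EPS, END = "ε", "$"


def first_of_sequence(symbols, first):
    """First of a string of symbols; a terminal's First set is itself."""
    result = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def compute_follow(grammar, first, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)                     # Rule-01
    changed = True
    while changed:
        changed = False
        for head, alternatives in grammar.items():
            for rhs in alternatives:
                for i, sym in enumerate(rhs):
                    if sym not in grammar:
                        continue               # Follow is for non-terminals only
                    beta_first = first_of_sequence(rhs[i + 1:], first)
                    new = beta_first - {EPS}   # Rule-03: First(beta) without ε
                    if EPS in beta_first:      # beta is empty (Rule-02) or can vanish
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


GRAMMAR = {
    "E": [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T": [("F", "T'")],
    "T'": [("*", "F", "T'"), ()],
    "F": [("id",), ("(", "E", ")")],
}
FIRST = {"E": {"id", "("}, "E'": {"+", EPS}, "T": {"id", "("},
         "T'": {"*", EPS}, "F": {"id", "("}}
FOLLOW = compute_follow(GRAMMAR, FIRST, "E")
```

Because ε is removed before anything is added, no Follow set can ever contain ε, which matches the second point in the list above.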
Example to understand First and Follow Computation
Example - 01
Consider the Grammar
S → ABCDE
A → a | ε
B → b | ε
C → c
D → d | ε
E → e | ε

We will compute First(S) and then Follow(S); the First sets are needed before any Follow set can be computed. To find First(S) we start with First(A), which is {a, ε}. Likewise, First(B) is {b, ε}, First(C) is {c}, First(D) is {d, ε} and First(E) is {e, ε}.
First(S) is {a, b, c}. First(A) contains ε, so if we substitute ε for A we are left with BCDE and must look at First(B). First(B) also contains ε, so we are left with CDE and must look at First(C), which is {c} only. Since C cannot derive ε, we stop here. This is an application of Rule-03.
Next we compute the Follow sets. The Follow of the start symbol always contains $, and S does not appear on any right-hand side, so Follow(S) is {$}. For Follow(A) we look at what follows A in S → ABCDE, which is BCDE. First(B) is {b, ε}, so b is in Follow(A). If we replace B with ε we are left with C, and First(C) is {c}, so Follow(A) is {b, c}. The complete First and Follow sets are shown in the table below.
Grammar         FIRST        FOLLOW
S → ABCDE       {a, b, c}    {$}
A → a | ε       {a, ε}       {b, c}
B → b | ε       {b, ε}       {c}
C → c           {c}          {d, e, $}
D → d | ε       {d, ε}       {e, $}
E → e | ε       {e, ε}       {$}

Note one thing about the table above: whenever a non-terminal is the last symbol on the right-hand side of a production, its Follow set includes the Follow set of the non-terminal on the left-hand side. In our example the symbol E has nothing after it, so Follow(E) is Follow(S).
Example - 02
Grammar          FIRST           FOLLOW
S → Bb | Cd      {a, b, c, d}    {$}
B → aB | ε       {a, ε}          {b}
C → cC | ε       {c, ε}          {d}

Example – 03
Grammar            FIRST      FOLLOW
E → TE’            {id, (}    {$, )}
E’ → +TE’ | ε      {+, ε}     {$, )}
T → FT’            {id, (}    {+, $, )}
T’ → *FT’ | ε      {*, ε}     {+, $, )}
F → id | (E)       {id, (}    {*, +, $, )}

Explanation of Example – 03

This is the grammar we discussed in previous lectures. It has no left recursion, because we already removed it with the elimination method. E is the start symbol, so we begin with First(E). First(E) is First(T), and First(T) is First(F), so we compute the easy one first: First(F) is {id, (}. Therefore First(T) is {id, (}, and First(E) is {id, (} as well. The remaining First sets are computed similarly.
Now we compute Follow(E). As already mentioned, the Follow of the start symbol always contains $. Next we check whether E appears on the right-hand side of any production. It does, in the last production F → (E), where it is followed by ). So Follow(E) is {$, )}. The rest are computed in the same way.
Example – 04
Grammar                 FIRST                  FOLLOW
S → ACB | CbB | Ba      {d, g, h, ε, b, a}     {$}
A → da | BC             {d, g, h, ε}           {h, g, $}
B → g | ε               {g, ε}                 {$, a, h, g}
C → h | ε               {h, ε}                 {g, $, b, h}

Explanation
To compute First(S) we first need First(A). From A → da we get d, and from A → BC we need First(B), which is {g, ε}. If we replace B with ε we are left with C, so we need First(C), which is {h, ε}. The important point is that if both B and C are replaced with ε nothing is left, so ε is also in First(A), giving First(A) = {d, g, h, ε}. In S → ACB every symbol can derive ε, so ε is in First(S) as well. Similarly, for productions 2 and 3 (S → CbB and S → Ba), replacing the leading non-terminal with ε leaves b and a. The computed First(S) is therefore {d, g, h, ε, b, a}.
We discuss only one Follow computation; the rest are left for you to work through, since the results are already given. To compute Follow(C), we look at every right-hand side in which C appears and see what follows it. In production 1 (S → ACB), C is followed by B, so we add First(B) = {g, ε} without the ε. If we replace B with ε nothing is left after C, so we add Follow(S), which is {$} in our case. In production 2 (S → CbB), C is followed by b. C also appears in A → BC, where nothing follows it, so we add Follow(A), which is {h, g, $}. Altogether, Follow(C) is {g, $, b, h}.
Solve the following exercises to compute First and Follow
Exercise-1
S → aABb
A → c | ε
B → d | ε
Exercise-2
S → aBDh
B → cC
C → bC | ε
D → EF
E → g | ε
F → f | ε

Constructing LL(1) Parsing Table

The method to construct the LL(1) parse table is summarized below, and an explanation of the
construction follows. First understand what an LL(1) table is.
Explanation
The input is a grammar G and the output is the parsing table M. For each production A → α of the grammar, do the following:
1. For each terminal a in FIRST(α), add A → α to M[A,a].
2. If ε is in FIRST(α), then for each terminal b in FOLLOW(A), add A → α to M[A,b].
3. If ε is in FIRST(α) and $ is in FOLLOW(A), add A → α to M[A,$].
An empty entry in the parse table represents an error.
Example

Consider the production E → TE’. Since
FIRST(TE’) = FIRST(T) = {id, (}
this production is added to M[ E , id ] and M[ E , ( ].
The production E’ → +TE’ is added to M[ E’ , + ], since FIRST(+TE’) is {+}. For E’ → ε, FOLLOW(E’)
is {$, )}, so we need to add the production E’ → ε to M[ E’ , ) ] and M[ E’ , $ ].
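The three construction steps can be sketched as follows. To keep the example short, FIRST(α) of each right-hand side and the FOLLOW sets are written out by hand from the Example – 03 table instead of being computed. The function name `build_table`, the triple encoding of productions, and the choice to raise `ValueError` when two productions land in the same cell are assumptions made here.

```python
EPS = "ε"

# (head, right-hand side, FIRST of that right-hand side); () stands for ε.
PRODUCTIONS = [
    ("E", ("T", "E'"), {"id", "("}),
    ("E'", ("+", "T", "E'"), {"+"}),
    ("E'", (), {EPS}),
    ("T", ("F", "T'"), {"id", "("}),
    ("T'", ("*", "F", "T'"), {"*"}),
    ("T'", (), {EPS}),
    ("F", ("id",), {"id"}),
    ("F", ("(", "E", ")"), {"("}),
]
FOLLOW = {
    "E": {"$", ")"}, "E'": {"$", ")"},
    "T": {"+", "$", ")"}, "T'": {"+", "$", ")"},
    "F": {"*", "+", "$", ")"},
}


def build_table(productions, follow):
    """Fill M[A, a] for every production A -> alpha."""
    table = {}
    for head, rhs, first_rhs in productions:
        targets = first_rhs - {EPS}      # step 1: terminals in FIRST(alpha)
        if EPS in first_rhs:
            targets |= follow[head]      # steps 2 and 3: FOLLOW(A), including $
        for terminal in targets:
            if (head, terminal) in table:
                raise ValueError(f"conflict at M[{head}, {terminal}]: not LL(1)")
            table[(head, terminal)] = rhs
    return table


TABLE = build_table(PRODUCTIONS, FOLLOW)
```

Cells that are never filled, such as M[E, +], stay absent from the dict and correspond to the empty (error) entries of the table.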
Arithmetic Expression
