Lecture 10
Compiler Construction
CS-4207
Lecture – 10
Left Factoring
Left factoring is a grammar transformation technique. It consists of “factoring out” a prefix
that is common to two or more productions of the same non-terminal. The choice between
those productions is deferred until enough of the input has been seen, so the parser can
avoid backtracking.
For example, suppose we have the two productions
stmt → if exp then stmt else stmt
| if exp then stmt
In this grammar, if the input symbol is ‘if’, we cannot immediately tell which production to
choose to expand stmt.
Consider another grammar
E → T + E | T
T → int | int T | (E)
Prediction is impossible here: for T, two productions start with int, and for E, both
productions start with the non-terminal T. A grammar must be left factored before it is
used for predictive parsing. The procedure to left factor a grammar is as follows.
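If α ≠ ∈, replace all productions of the form
A → αβ1 | αβ2 | αβ3 | αβ4 | … | αβn | γ
with
A → αZ | γ
Z → β1 | β2 | β3 | β4 | … | βn
where Z is a new non-terminal and γ stands for the alternatives that do not begin with α.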
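The procedure can be sketched in Python as follows. This is a minimal illustration and not a definitive implementation. It assumes a grammar stored as a dictionary from a non-terminal to a list of alternatives, where each alternative is a list of symbols and an empty list stands for ∈. The names `common_prefix` and `left_factor` are hypothetical, and the new non-terminal (Z in the rule) is named by appending a prime to the original one.

```python
def common_prefix(alts):
    """Longest prefix (alpha) shared by every alternative in alts."""
    prefix = []
    for symbols in zip(*alts):
        if all(s == symbols[0] for s in symbols):
            prefix.append(symbols[0])
        else:
            break
    return prefix


def left_factor(grammar):
    """Return a left-factored copy of grammar.

    grammar maps a non-terminal to a list of alternatives; each
    alternative is a list of symbols, and [] stands for the empty string.
    """
    result = {}
    work = list(grammar.items())
    while work:
        head, alts = work.pop(0)
        # Group the alternatives by their first symbol.
        groups = {}
        for alt in alts:
            key = alt[0] if alt else None
            groups.setdefault(key, []).append(alt)
        new_alts = []
        for key, group in groups.items():
            if key is None or len(group) == 1:
                new_alts.extend(group)      # nothing to factor (the gamma part)
                continue
            alpha = common_prefix(group)
            # Pick a fresh name for the new non-terminal Z.
            fresh = head + "'"
            while (fresh in grammar or fresh in result
                   or any(fresh == h for h, _ in work)):
                fresh += "'"
            new_alts.append(alpha + [fresh])                  # A -> alpha Z
            work.append((fresh, [alt[len(alpha):] for alt in group]))  # Z -> beta_i
        result[head] = new_alts
    return result


stmt = {"stmt": [["if", "exp", "then", "stmt", "else", "stmt"],
                 ["if", "exp", "then", "stmt"]]}
for head, alts in left_factor(stmt).items():
    print(head, "->", " | ".join(" ".join(a) or "∈" for a in alts))
```

For the stmt productions above, the sketch yields stmt → if exp then stmt stmt' and stmt' → else stmt | ∈. New non-terminals are put back on the work list, so prefixes that remain common after one round are factored again.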
A graphical explanation can be represented as below
Atif Ishaq - Lecturer GC University, Lahore
We have already discussed recursive descent parsing. Let’s have a look at the function
(algorithm) again.
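As a reminder of the idea, here is a minimal sketch of a recursive descent recognizer. It is not necessarily the function from the earlier lecture. It assumes the left-recursion-free expression grammar E → TE’, E’ → +TE’ | ∈, T → FT’, T’ → *FT’ | ∈, F → id | (E), with one method per non-terminal. The class name `Parser` and the helper `accepts` are hypothetical.

```python
class Parser:
    """One method per non-terminal; each consumes the tokens it derives."""

    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]   # $ marks the end of input
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def match(self, expected):
        if self.peek() != expected:
            raise SyntaxError(f"expected {expected!r}, found {self.peek()!r}")
        self.pos += 1

    def E(self):                 # E -> T E'
        self.T()
        self.E_prime()

    def E_prime(self):           # E' -> + T E' | empty
        if self.peek() == "+":
            self.match("+")
            self.T()
            self.E_prime()

    def T(self):                 # T -> F T'
        self.F()
        self.T_prime()

    def T_prime(self):           # T' -> * F T' | empty
        if self.peek() == "*":
            self.match("*")
            self.F()
            self.T_prime()

    def F(self):                 # F -> id | ( E )
        if self.peek() == "id":
            self.match("id")
        else:
            self.match("(")
            self.E()
            self.match(")")


def accepts(tokens):
    """True if the token list is a sentence of the grammar."""
    parser = Parser(tokens)
    try:
        parser.E()
        parser.match("$")
    except SyntaxError:
        return False
    return True


print(accepts(["id", "+", "id", "*", "id"]))
```

Each method chooses its alternative by looking at one input symbol only, which works because no two alternatives of the same non-terminal start with the same symbol.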
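LL(k) Grammar
An LL(k) grammar is one for which the parser scans the input from left to right and produces
the leftmost derivation, deciding which production to apply by looking at only the next k
input symbols. That is why it is called an LL(k) grammar: the first L stands for the
left-to-right scan, the second L for the leftmost derivation, and k for the number of
lookahead symbols.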
a. If the entry is a production M[X,a] = {X → UVW}, the parser replaces X on top of the
stack by WVU (with U on top). As output, the parser just prints the production used:
X → UVW. However, any other code could be executed here.
b. If M[X,a] = error, the parser calls an error recovery routine.
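The two cases above can be sketched as a small table-driven parser loop. This is an illustration under stated assumptions: the table M is written by hand for the grammar S → Bb | Cd, B → aB | ∈, C → cC | ∈ (Example – 02 in these notes), the names `TABLE`, `NONTERMINALS` and `parse` are hypothetical, and raising `SyntaxError` stands in for a real error recovery routine.

```python
# Hand-written parse table M[X, a] (an assumption for this sketch).
# A missing key plays the role of an error entry; () is the empty body.
TABLE = {
    ("S", "a"): ("B", "b"),
    ("S", "b"): ("B", "b"),
    ("S", "c"): ("C", "d"),
    ("S", "d"): ("C", "d"),
    ("B", "a"): ("a", "B"),
    ("B", "b"): (),
    ("C", "c"): ("c", "C"),
    ("C", "d"): (),
}
NONTERMINALS = {"S", "B", "C"}


def parse(tokens, start="S"):
    """Return the list of productions used, or raise SyntaxError."""
    tokens = list(tokens) + ["$"]
    stack = ["$", start]          # top of the stack is the end of the list
    pos = 0
    output = []
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            body = TABLE.get((top, look))
            if body is None:      # case b: error entry
                raise SyntaxError(f"no entry M[{top},{look}]")
            output.append((top, body))       # case a: record the production
            stack.extend(reversed(body))     # push W, V, U so that U is on top
        elif top == look:
            pos += 1              # terminal (or $) matches the input
        else:
            raise SyntaxError(f"expected {top!r}, found {look!r}")
    return output


for head, body in parse("aab"):
    print(head, "->", " ".join(body) or "∈")
```

Pushing the body in reverse order is what puts the leftmost symbol U on top, so the parser always expands the leftmost non-terminal first.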
Rule-03
For any production rule A → αBβ,
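If ∈ ∉ First(β), then Follow(B) includes First(β)
If ∈ ∈ First(β), then Follow(B) includes { First(β) – ∈ } ∪ Follow(A)
When B occurs in several productions, the contributions of all occurrences are combined.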
We will compute First(S) and then Follow(S); the First sets are needed before any Follow set
can be computed. To find First(S) we need First(A), which is {a, ∈}. Similarly, First(B) is
{b, ∈}, First(C) is {c}, First(D) is {d, ∈} and First(E) is {e, ∈}.
First(S) is {a,b,c}. To compute First(S) we start with First(A). First(A) contains ∈, so if
we substitute ∈ for A we are left with BCDE and must also include First(B). First(B) again
contains ∈, so we are left with CDE and must include First(C), which is {c} only. Since C
cannot derive ∈, we stop there. Here we make use of rule-03 to compute First(S).
Next we need to compute Follow(S). The Follow of the start symbol always contains $. Then we
find Follow(A). A is followed by BCDE, so Follow(A) is built from First(B) and First(C).
First(B) is {b, ∈}; if we replace B with ∈ we are left with C, and First(C) is {c}, so
Follow(A) is {b,c}. The complete results for First and Follow are shown in the table below.
Note one thing in the above table: whenever a variable on the right-hand side of a production
has nothing after it, the Follow of that variable includes the Follow of the symbol on the
left-hand side. In our example the symbol E has nothing after it, so Follow(E) is Follow(S).
Example - 02
Grammar            FIRST          FOLLOW
S → Bb | Cd        {a,b,c,d}      {$}
B → aB | ∈         {a, ∈}         {b}
C → cC | ∈         {c, ∈}         {d}
Example – 03
Grammar            FIRST          FOLLOW
E → TE’            {id, (}        {$,)}
E’ → +TE’ | ∈      {+, ∈}         {$,)}
T → FT’            {id, (}        {+,$,)}
T’ → *FT’ | ∈      {*, ∈}         {+,$,)}
F → id | (E)       {id, (}        {*,+,$,)}
Explanation of Example – 03
This is the example that we have already discussed in our previous lectures. The grammar has
no left recursion, as we removed it with the elimination method. We need to compute First(E),
as E is the start symbol. First(E) is First(T), and First(T) is in turn First(F), so we start
with the easy term: First(F) is {id, (}. First(T) is therefore {id, (}, and First(E) is
{id, (} as well. The others can be computed similarly.
Now we need to compute Follow(E). As already mentioned, the Follow of the start symbol always
contains $. Next we check whether E appears on the right-hand side of any production. It does,
in the last production F → (E), where E is followed by ‘)’. So ‘)’ is added as well, and
Follow(E) is {$,)}. The rest can be computed in the same way.
Example – 04
Grammar                  FIRST               FOLLOW
S → ACB | CbB | Ba       {d,g,h, ∈,b,a}      {$}
A → da | BC              {d,g,h, ∈}          {h,g,$}
B → g | ∈                {g, ∈}              {$,a,h,g}
C → h | ∈                {h, ∈}              {g,$,b,h}
Explanation
To compute First(S) we first need to compute First(A). From A → da we get d, and from A → BC
we need First(B), since B is a non-terminal. First(B) is {g, ∈}. If we put ∈ in place of B we
are left with C, so we then need First(C), which is {h, ∈}. The important thing is that if
both B and C are replaced with ∈ nothing is left, so ∈ also belongs to First(A). Hence
First(A) is {d,g,h, ∈}. In the alternative ACB, the symbols A, C and B can all derive ∈, so it
contributes {d,g,h, ∈} to First(S). Similarly, in alternatives 2 and 3 (CbB and Ba), replacing
the leading C or B with ∈ leaves b and a, so the computed First(S) is {d,g,h, ∈,b,a}.
We will discuss only one case of computing Follow; the rest is left to you, as the results are
already given. We compute Follow(C), so we look at every right-hand side that contains C and
see what follows it. In S → ACB, C is followed by B, so we add First(B) = {g, ∈} without ∈,
which gives g. If we replace B with ∈, nothing is left after C, so Follow(S), which is {$}, is
added too. In S → CbB, C is followed by b, so b is added. Finally, C appears in the production
with A on the left-hand side, A → BC. There is nothing after C there, so Follow(A) = {h,g,$}
is added. What we have at the end is Follow(C) = {g,$,b,h}.
Solve the following exercises to compute First and Follow
Exercise-1
S → aABb
A→c|∈
B→d|∈
Exercise-2
S → aBDh
B → cC
C → bC | ∈
D → EF
E→g|∈
F→f|∈
The method to construct the LL(1) parse table is summarized below. An explanation of the
table construction follows it. First understand what an LL(1) table is.
Explanation
The input is a grammar G and the output is the parse table M. For each production A → α of
the grammar, do the following:
1. For each terminal a in FIRST(α), add A → α to M[A,a].
2. If ∈ is in FIRST(α), then for each terminal b in FOLLOW(A), add A → α to M[A,b].
3. If ∈ is in FIRST(α) and $ is in FOLLOW(A), add A → α to M[A,$].
An empty entry in the parse table represents an error.
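The three steps can be sketched as follows. This is a minimal illustration that assumes the grammar and the FIRST and FOLLOW sets of Example – 03, copied in by hand. The names `first_of` and `build_table` are hypothetical, an empty list stands for an ∈ body, and a key that is absent from the dictionary represents an error entry.

```python
EPS = "∈"

# Grammar and sets taken from Example - 03 (typed in by hand for this sketch).
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["id"], ["(", "E", ")"]],
}
FIRST = {
    "E": {"id", "("}, "E'": {"+", EPS}, "T": {"id", "("},
    "T'": {"*", EPS}, "F": {"id", "("},
}
FOLLOW = {
    "E": {"$", ")"}, "E'": {"$", ")"}, "T": {"+", "$", ")"},
    "T'": {"+", "$", ")"}, "F": {"*", "+", "$", ")"},
}


def first_of(symbols):
    """FIRST of the right-hand side alpha (a list of symbols)."""
    out = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})    # FIRST of a terminal is itself
        out |= sym_first - {EPS}
        if EPS not in sym_first:
            return out
    out.add(EPS)
    return out


def build_table(grammar):
    """Fill M[A, a] for every production A -> alpha."""
    table = {}
    for head, alts in grammar.items():
        for alt in alts:
            alpha_first = first_of(alt)
            targets = alpha_first - {EPS}          # step 1
            if EPS in alpha_first:
                targets |= FOLLOW[head]            # steps 2 and 3 ($ included)
            for terminal in targets:
                if (head, terminal) in table:
                    raise ValueError(f"conflict at M[{head},{terminal}]")
                table[(head, terminal)] = alt
    return table


table = build_table(GRAMMAR)
for (head, terminal), body in sorted(table.items()):
    print(f"M[{head},{terminal}] = {head} -> {' '.join(body) or EPS}")
```

Because $ is stored in the FOLLOW sets like any other symbol, steps 2 and 3 collapse into a single union here. The `ValueError` on a doubly defined entry is a design choice of this sketch: a cell with two productions means the grammar is not LL(1).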
Example
Arithmetic Expression