Lecture CFG 02
A grammar is said to be ambiguous if there exists more than one leftmost derivation, more
than one rightmost derivation or more than one parse tree for the given input string. If the
grammar is not ambiguous, then it is called unambiguous.
An ambiguous grammar is not well suited to compiler construction, because one input string can
be given more than one structure. No general algorithm can detect ambiguity in an arbitrary
context-free grammar (the problem is undecidable), but a particular grammar can often be made
unambiguous by rewriting it.
Example 1:
1. E → I
2. E → E + E
3. E → E * E
4. E → (E)
5. I → ε | 0 | 1 | 2 | ... | 9
Solution:
For the string "3 * 2 + 5", the above grammar can generate two parse trees: one groups the
string as (3 * 2) + 5 and the other groups it as 3 * (2 + 5).
Since there are two parse trees for a single string "3 * 2 + 5", the grammar G is ambiguous.
Example 2:
1. E → E + E
2. E → E - E
3. E → id
Solution:
From the above grammar, the string "id + id - id" can be derived by two different leftmost
derivations.
First leftmost derivation:
1. E → E + E
2. → id + E
3. → id + E - E
4. → id + id - E
5. → id + id - id
Second leftmost derivation:
1. E → E - E
2. → E + E - E
3. → id + E - E
4. → id + id - E
5. → id + id - id
Since there are two leftmost derivations for the single string "id + id - id", the grammar G is
ambiguous.
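The claim above can be checked mechanically by counting parse trees. The following is a minimal
sketch, not part of the lecture: the function name count_parse_trees and the dictionary encoding
of the grammar are assumptions made for illustration, and the counter only works for grammars
without ε productions and without cycles of unit productions.

```python
from functools import lru_cache

# Grammar of Example 2; each right-hand side is a tuple of symbols.
GRAMMAR = {
    "E": [("E", "+", "E"), ("E", "-", "E"), ("id",)],
}


def count_parse_trees(grammar, start, tokens):
    """Count the parse trees of `tokens` rooted at `start`.

    Assumes no ε productions and no unit-production cycles, so every
    symbol covers at least one token and the recursion terminates.
    """
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def sym(symbol, i, j):
        # Number of ways `symbol` derives tokens[i:j].
        if symbol not in grammar:  # terminal symbol
            return 1 if j - i == 1 and tokens[i] == symbol else 0
        return sum(seq(rhs, i, j) for rhs in grammar[symbol])

    @lru_cache(maxsize=None)
    def seq(rhs, i, j):
        # Number of ways the symbol sequence `rhs` derives tokens[i:j].
        if len(rhs) == 1:
            return sym(rhs[0], i, j)
        # The first symbol takes tokens[i:k]; every remaining symbol
        # must still be left with at least one token.
        return sum(sym(rhs[0], i, k) * seq(rhs[1:], k, j)
                   for k in range(i + 1, j - len(rhs) + 2))

    return sym(start, 0, len(tokens))


print(count_parse_trees(GRAMMAR, "E", "id + id - id".split()))
```

A count greater than 1 means the string has more than one parse tree, and therefore more than
one leftmost derivation.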
Example 3:
1. S → aSb | SS
2. S → ε
Solution:
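For the string "aabb" the above grammar can generate two parse trees; for example, one uses
only S → aSb and S → ε, and another starts with S → SS and derives ε from one of the two S: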
Since there are two parse trees for a single string "aabb", the grammar G is ambiguous.
Example 4:
1. A → AA
2. A → (A)
3. A → a
Solution:
For the string "a(a)aa" the above grammar can generate two parse trees:
Since there are two parse trees for a single string "a(a)aa", the grammar G is ambiguous.
A grammar is unambiguous if it does not contain ambiguity, that is, if no input string has more
than one leftmost derivation, more than one rightmost derivation or more than one parse tree.
To convert ambiguous grammar to unambiguous grammar, we will apply the following rules:
1. If left associative operators (+, -, *, /) are used in the production rule, then apply left
recursion in the production rule. Left recursion means that the leftmost symbol on the right side
is the same as the non-terminal on the left side. For example,
1. X → Xa
2. If a right associative operator (^) is used in the production rule, then apply right recursion
in the production rule. Right recursion means that the rightmost symbol on the right side is the
same as the non-terminal on the left side. For example,
1. X → aX
Example 1:
1. S → AB | aaB
2. A → a | Aa
3. B → b
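Solution:
The string aab has two parse trees, one using S → aaB and one using S → AB with A → Aa → aa.
Removing the redundant production S → aaB gives an unambiguous grammar: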
1. S → AB
2. A → Aa | a
3. B → b
Example 2:
Show that the given grammar is ambiguous. Also, find an equivalent unambiguous grammar.
1. S → ABA
2. A → aA | ε
3. B → bB | ε
Solution:
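The given grammar is ambiguous because we can derive more than one parse tree for the string aa:
both a's can come from the first A, both from the second A, or one from each.
The language of the grammar is a*b*a*. An equivalent unambiguous grammar must fix which A
produces the a's when no b is present. One such grammar is:
1. S → A | AbBA
2. A → aA | ε
3. B → bB | ε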
Example 3:
Show that the given grammar is ambiguous. Also, find an equivalent unambiguous grammar.
1. E → E + E
2. E → E * E
3. E → id
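Solution:
The given grammar is ambiguous because the string id + id * id has two parse trees, grouped as
(id + id) * id or as id + (id * id). Giving * higher precedence than + and making both
operators left associative yields the unambiguous grammar:
1. E → E + T
2. E → T
3. T → T * F
4. T → F
5. F → id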
Example 4:
Check whether the given grammar is ambiguous. Also, find an equivalent unambiguous
grammar.
1. S → S + S
2. S → S * S
3. S → S ^ S
4. S → a
Solution:
The given grammar is ambiguous because a string such as a + a * a has more than one parse
tree: it can be grouped as (a + a) * a or as a + (a * a).
Unambiguous grammar will be:
1. S → S + A | A
2. A → A * B | B
3. B → C ^ B | C
4. C → a
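To see how the layered grammar fixes precedence and associativity, here is a small hand-written
parser for it. This is only a sketch under assumptions: the function names are made up, tokens
are single characters, and the two left-recursive rules are implemented as loops (the usual way
to handle left recursion in a recursive-descent parser), while B → C ^ B stays recursive.

```python
def parse(text):
    """Return a fully parenthesised form of `text` that mirrors its parse tree."""
    tokens = list(text.replace(" ", ""))
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(tok):
        nonlocal pos
        if peek() != tok:
            raise SyntaxError(f"expected {tok!r} at position {pos}")
        pos += 1

    def parse_S():  # S -> S + A | A  (left recursion written as a loop)
        node = parse_A()
        while peek() == "+":
            expect("+")
            node = f"({node}+{parse_A()})"
        return node

    def parse_A():  # A -> A * B | B  (left recursion written as a loop)
        node = parse_B()
        while peek() == "*":
            expect("*")
            node = f"({node}*{parse_B()})"
        return node

    def parse_B():  # B -> C ^ B | C  (right recursion)
        left = parse_C()
        if peek() == "^":
            expect("^")
            return f"({left}^{parse_B()})"
        return left

    def parse_C():  # C -> a
        expect("a")
        return "a"

    tree = parse_S()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected {tokens[pos]!r} at position {pos}")
    return tree


print(parse("a + a * a"))  # * binds tighter than +
print(parse("a + a + a"))  # + groups to the left
print(parse("a ^ a ^ a"))  # ^ groups to the right
```

Each input has exactly one grouping, which is what the rewritten grammar is meant to guarantee.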
As we have seen, various languages can efficiently be represented by a context-free grammar.
Not every grammar is optimized: a grammar may contain extra symbols (non-terminals) that
unnecessarily increase its length. Simplification of a grammar means reducing the grammar by
removing useless symbols. The properties of a reduced grammar are given below:
1. Each variable (i.e. non-terminal) and each terminal of G appears in the derivation of some word
in L.
2. There should not be any production of the form X → Y where X and Y are non-terminals.
3. If ε is not in the language L, then there need not be any production X → ε.
A symbol is useless if it cannot be reached from the start symbol through the right-hand sides
of the production rules, or if it does not take part in the derivation of any string. Such a
symbol is known as a useless symbol. Similarly, a variable is useless if it does not take part
in the derivation of any string. Such a variable is known as a useless variable.
For example:
A production such as A → aA is useless when A has no other production, because there is no way
to terminate it. If it never terminates, then it can never produce a string. Hence this
production can never take part in any derivation. To remove the useless production A → aA, we
first find all the variables which will never lead to a terminal string, such as variable 'A'.
Then we remove all the productions in which the variable 'A' occurs.
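The removal of useless symbols can be sketched in two passes: first keep the variables that can
derive a terminal string, then keep what is reachable from the start symbol. The grammar below
is a hypothetical one chosen for illustration (it is not taken from the lecture), and the
function name remove_useless is an assumption. Each right-hand side is a string with one
character per symbol, and every key of the dictionary is a variable.

```python
def remove_useless(grammar, start):
    """Drop variables that derive no terminal string or are unreachable."""

    def usable(body, good):
        # A body is usable if every symbol is a terminal or a known-good variable.
        return all(s not in grammar or s in good for s in body)

    # Pass 1: variables that can derive a string of terminals.
    generating = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head not in generating and any(usable(b, generating) for b in bodies):
                generating.add(head)
                changed = True
    pruned = {head: [b for b in bodies if usable(b, generating)]
              for head, bodies in grammar.items() if head in generating}

    # Pass 2: variables reachable from the start symbol.
    reachable, stack = set(), [start]
    while stack:
        head = stack.pop()
        if head in reachable or head not in pruned:
            continue
        reachable.add(head)
        for body in pruned[head]:
            stack.extend(s for s in body if s in pruned)
    return {head: bodies for head, bodies in pruned.items() if head in reachable}


# Hypothetical grammar: A never terminates, C is never reached from S.
GRAMMAR = {"S": ["aB", "bA"], "A": ["aA"], "B": ["b"], "C": ["c"]}
print(remove_useless(GRAMMAR, "S"))
```

The order of the two passes matters: removing non-terminating variables first can make further
variables unreachable, which the second pass then removes.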
Elimination of ε Production
The productions of type A → ε are called ε productions. Productions of this type can be removed
completely only from grammars that do not generate ε.
Step 1: First find all nullable non-terminals, i.e. the variables which derive ε.
Step 2: For each production A → α, construct all productions A → x, where x is obtained from α
by removing one or more of the nullable non-terminals found in step 1.
Step 3: Now combine the result of step 2 with the original productions and remove the ε
productions.
Example:
Remove the ε productions from the following CFG while preserving its meaning.
1. S → XYX
2. X → 0X | ε
3. Y → 1Y | ε
Solution:
Now, while removing ε production, we are deleting the rule X → ε and Y → ε. To preserve the
meaning of CFG we are actually placing ε at the right-hand side whenever X and Y have
appeared.
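Let us take
1. S → XYX
If the first X on the right-hand side is ε, then
1. S → YX
Similarly, if the last X on the right-hand side is ε, then
1. S → XY
If Y = ε then
1. S → XX
If Y and one X are ε, then
1. S → X
If both X are ε, then
1. S → Y
Now, keeping the original production S → XYX as well,
1. S → XYX | XY | YX | XX | X | Y
Now let us consider
1. X → 0X
If we place ε on the right-hand side for X, then
1. X → 0
So,
1. X → 0X | 0
Similarly Y → 1Y | 1
Collectively, the CFG with the ε productions removed is
1. S → XYX | XY | YX | XX | X | Y
2. X → 0X | 0
3. Y → 1Y | 1
Since the original grammar also derives ε, the new grammar generates the same language except
for the empty string.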
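The three steps can be sketched in code as follows. This is an illustrative sketch, not the
lecture's own procedure: the function name eliminate_epsilon is an assumption, each right-hand
side is a string with one character per symbol, and ε is written as the empty string. Run on the
grammar of this example, it keeps the original production S → XYX alongside the shorter
variants.

```python
from itertools import product


def eliminate_epsilon(grammar):
    """Return the grammar with ε productions removed (ε itself is lost)."""
    # Step 1: find the nullable variables.
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head not in nullable and any(
                    all(s in nullable for s in body) for body in bodies):
                nullable.add(head)
                changed = True

    # Steps 2 and 3: each nullable symbol may be kept or dropped;
    # the empty right-hand side is discarded.
    result = {}
    for head, bodies in grammar.items():
        new_bodies = []
        for body in bodies:
            choices = [(s, "") if s in nullable else (s,) for s in body]
            for combo in product(*choices):
                candidate = "".join(combo)
                if candidate and candidate not in new_bodies:
                    new_bodies.append(candidate)
        result[head] = new_bodies
    return result


GRAMMAR = {"S": ["XYX"], "X": ["0X", ""], "Y": ["1Y", ""]}
for head, bodies in eliminate_epsilon(GRAMMAR).items():
    print(head, "→", " | ".join(bodies))
```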
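The unit productions are the productions in which one non-terminal gives another non-terminal,
such as X → Y. Use the following steps to remove unit productions:
Step 1: To remove X → Y, add the production X → α to the grammar whenever Y → α occurs in the
grammar.
Step 2: Now delete X → Y from the grammar.
Step 3: Repeat step 1 and step 2 until all unit productions are removed.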
For example:
1. S → 0A | 1B | C
2. A → 0S | 00
3. B → 1 | A
4. C → 01
Solution:
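S → C is a unit production. But while removing S → C we have to consider what C gives. So,
we can add a rule to S.
1. S → 0A | 1B | 01
Similarly, B → A is also a unit production, so we can modify it as
1. B → 1 | 0S | 00
Thus, finally, we can write the CFG without unit productions as
1. S → 0A | 1B | 01
2. A → 0S | 00
3. B → 1 | 0S | 00
4. C → 01
Note that C is no longer reachable from S, so C → 01 has become a useless production.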
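The same procedure can be sketched in code. The function name remove_unit_productions and the
string encoding (one character per symbol, dictionary keys are the variables) are assumptions
made for this illustration. For every variable, the sketch collects the variables it can reach
through unit productions alone and copies their non-unit right-hand sides.

```python
def remove_unit_productions(grammar):
    """Replace every unit production X → Y by the non-unit bodies of Y."""

    def is_unit(body):
        return len(body) == 1 and body in grammar

    result = {}
    for head in grammar:
        # Variables reachable from `head` using unit productions only.
        closure, stack = [head], [head]
        while stack:
            for body in grammar[stack.pop()]:
                if is_unit(body) and body not in closure:
                    closure.append(body)
                    stack.append(body)
        # Copy the non-unit bodies, the head's own bodies first.
        bodies = []
        for var in closure:
            for body in grammar[var]:
                if not is_unit(body) and body not in bodies:
                    bodies.append(body)
        result[head] = bodies
    return result


GRAMMAR = {
    "S": ["0A", "1B", "C"],
    "A": ["0S", "00"],
    "B": ["1", "A"],
    "C": ["01"],
}
for head, bodies in remove_unit_productions(GRAMMAR).items():
    print(head, "→", " | ".join(bodies))
```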
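CNF stands for Chomsky normal form. A CFG (context-free grammar) is in CNF if every production
rule satisfies one of the following conditions:
1. The start symbol generates ε, for example S → ε.
2. A non-terminal generates two non-terminals, for example S → AB.
3. A non-terminal generates a single terminal, for example S → a.
For example: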
1. G1 = {S → AB, S → c, A → a, B → b}
2. G2 = {S → aA, A → a, B → c}
The production rules of grammar G1 satisfy the rules specified for CNF, so the grammar G1 is
in CNF. However, the production rules of grammar G2 do not all satisfy the rules specified for
CNF, as S → aA contains a terminal followed by a non-terminal. So the grammar G2 is not in CNF.
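A check of this kind is easy to automate. The sketch below assumes a convention that is not
stated in the lecture: upper-case letters are non-terminals, lower-case letters are terminals,
each symbol is one character, and ε is written as the empty string. The function name is_cnf is
also an assumption.

```python
def is_cnf(productions, start="S"):
    """Check every (head, body) pair against the CNF production shapes."""
    for head, body in productions:
        if body == "":
            if head != start:       # ε is allowed only for the start symbol
                return False
        elif len(body) == 1:
            if not body.islower():  # a single symbol must be a terminal
                return False
        elif len(body) == 2:
            if not body.isupper():  # two symbols must both be non-terminals
                return False
        else:
            return False            # more than two symbols is never allowed
    return True


G1 = [("S", "AB"), ("S", "c"), ("A", "a"), ("B", "b")]
G2 = [("S", "aA"), ("A", "a"), ("B", "c")]
print(is_cnf(G1), is_cnf(G2))
```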
Step 1: Eliminate the start symbol from the RHS. If the start symbol S is at the right-hand side
of any production, create a new start symbol S1 with the production:
1. S1 → S
Step 2: In the grammar, remove the null, unit and useless productions. You can refer to the
Simplification of CFG.
Step 3: Eliminate terminals from the RHS of a production if they occur together with other
non-terminals or terminals. For example, production S → aA can be decomposed as:
1. S → RA
2. R → a
Step 4: Eliminate RHS with more than two non-terminals. For example, S → ASB can be
decomposed as:
1. S → RB
2. R → AS
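Steps 3 and 4 can be sketched in code as follows. This is an illustration under assumptions, not
a definitive implementation: the function name to_binary_form and the fresh variable names T_a
and R1, R2, ... are made up here and are assumed not to clash with names already in the grammar.
The input is assumed to be free of ε and unit productions (step 2), and each right-hand side is
a tuple of symbols.

```python
def to_binary_form(grammar):
    """Apply steps 3 and 4 to a grammar without ε or unit productions."""
    result = {head: [] for head in grammar}
    term_var = {}  # terminal -> fresh variable that stands for it
    counter = 0    # used to number the fresh variables R1, R2, ...

    def var_for(terminal):
        if terminal not in term_var:
            term_var[terminal] = f"T_{terminal}"
            result[term_var[terminal]] = [(terminal,)]
        return term_var[terminal]

    for head, bodies in grammar.items():
        for body in bodies:
            if len(body) >= 2:
                # Step 3: replace terminals that occur next to other symbols.
                body = tuple(s if s in grammar else var_for(s) for s in body)
            while len(body) > 2:
                # Step 4: fold the two leftmost symbols into a fresh variable.
                counter += 1
                fresh = f"R{counter}"
                result[fresh] = [body[:2]]
                body = (fresh,) + body[2:]
            result[head].append(body)
    return result


GRAMMAR = {
    "S": [("a", "A"), ("A", "S", "B")],
    "A": [("a",)],
    "B": [("b",)],
}
for head, bodies in to_binary_form(GRAMMAR).items():
    print(head, "→", " | ".join(" ".join(body) for body in bodies))
```

For S → ASB the sketch introduces R1 → AS and rewrites the production as S → R1 B, so the
original right-hand side can still be recovered by expanding R1.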
Example:
Convert the given CFG to CNF. Consider the given grammar G1:
1. S → a | aA | B
2. A → aBB | ε
3. B → Aa | b
Solution:
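Step 1: The start symbol S does not appear on the RHS of any production in G1, so this step is
not strictly needed here. Following the general procedure, we still create a new production
S1 → S. The grammar will be: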
1. S1 → S
2. S → a | aA | B
3. A → aBB | ε
4. B → Aa | b
Step 2: As grammar G1 contains the null production A → ε, its removal from the grammar yields:
1. S1 → S
2. S → a | aA | B
3. A → aBB
4. B → Aa | b | a
The grammar now contains the unit production S → B; its removal yields:
1. S1 → S
2. S → a | aA | Aa | b
3. A → aBB
4. B → Aa | b | a
Also remove the unit production S1 → S; its removal from the grammar yields:
1. S1 → a | aA | Aa | b
2. S → a | aA | Aa | b
3. A → aBB
4. B → Aa | b | a
Step 3: In the production rules S1 → aA | Aa, S → aA | Aa, A → aBB and B → Aa, the terminal a
exists on the RHS together with non-terminals. So we introduce a new non-terminal X with X → a
and replace the terminal a with X in these rules:
1. S1 → a | XA | AX | b
2. S → a | XA | AX | b
3. A → XBB
4. B → AX | b | a
5. X → a
Step 4: In the production rule A → XBB, the RHS has more than two symbols. Replacing XB with a
new non-terminal R yields:
1. S1 → a | XA | AX | b
2. S → a | XA | AX | b
3. A → RB
4. B → AX | b | a
5. X → a
6. R → XB