Flat Module 3
Flat Module 3
Context free grammar is a formal grammar which is used to generate all possible strings in a
given formal language.
G= (V, T, P, S)
Where,
In CFG, the start symbol is used to derive the string. You can derive the string by repeatedly
replacing a non-terminal by the right hand side of the production, until all non-terminal have
been replaced by terminal symbols.
Example:
Production rules:
1. S → aSa
2. S → bSb
3. S → c
Now check that abbcbba string can be derived from the given CFG.
1. S ⇒ aSa
2. S ⇒ abSba
3. S ⇒ abbSbba
4. S ⇒ abbcbba
By applying the production S → aSa, S → bSb recursively and finally applying the production S
→ c, we get the string abbcbba.
Capabilities of CFG
CFL
Derivation is a sequence of production rules. It is used to get the input string through these
production rules. During parsing we have to take two decisions. These are as follows:
We have two options to decide which non-terminal to be replaced with production rule.
Left-most Derivation
In the left most derivation, the input is scanned and replaced with the production rule from left to
right. So in left most derivatives we read the input string from left to right.
Example:
Production rules:
1. S = S + S
2. S = S - S
3. S = a | b |c
Input:
a-b+c
1. S = S + S
2. S = S - S + S
3. S = a - S + S
4. S = a - b + S
5. S = a - b + c
Right-most Derivation
In the right most derivation, the input is scanned and replaced with the production rule from right
to left. So in right most derivatives we read the input string from right to left.
Example:
1. S = S + S
2. S = S - S
3. S = a | b |c
Input:
a-b+c
1. S = S - S
2. S = S - S + S
3. S = S - S + c
4. S = S - b + c
5. S = a - b + c
Parse tree
o Parse tree is the graphical representation of symbol. The symbol can be terminal or non-
terminal.
o In parsing, the string is derived using the start symbol. The root of the parse tree is that
start symbol.
o It is the graphical representation of symbol that can be terminals or non-terminals.
o Parse tree follows the precedence of operators. The deepest sub-tree traversed first. So,
the operator in the parent node has less precedence over the operator in the sub-tree.
Example:
Production rules:
T= T + T | T * T
1. T = a|b|c
Input:
a*b+c
Step 1:
Step 2:
Step 3:
Step 4:
Step 5:
Ambiguity
A grammar is said to be ambiguous if there exists more than one leftmost derivation or more
than one rightmost derivative or more than one parse tree for the given input string. If the
grammar is not ambiguous then it is called unambiguous.
Example:
1. S = aSb | SS
2. S = ∈
For the string aabb, the above grammar generates two parse trees:
If the grammar has ambiguity then it is not good for a compiler construction. No method can
automatically detect and remove the ambiguity but you can remove ambiguity by re-writing the
whole grammar without ambiguity.
For example:
1. G1 = {S → AB, S → c, A → a, B → b}
2. G2 = {S → aA, A → a, B → c}
The production rules of Grammar G1 satisfy the rules specified for CNF, so the grammar G1 is
in CNF. However, the production rule of Grammar G2 does not satisfy the rules specified for
CNF as S → aZ contains terminal followed by non-terminal. So the grammar G2 is not in CNF.
Step 1: Eliminate start symbol from the RHS. If the start symbol T is at the right-hand side of
any production, create a new production as:
1. S1 → S
Step 2: In the grammar, remove the null, unit and useless productions. You can refer to
the Simplification of CFG.
Step 3: Eliminate terminals from the RHS of the production if they exist with other non-
terminals or terminals. For example, production S → aA can be decomposed as:
1. S → RA
2. R → a
Step 4: Eliminate RHS with more than two non-terminals. For example, S → ASB can be
decomposed as:
1. S → RS
2. R → AS
Example:
Convert the given CFG to CNF. Consider the given grammar G1:
1. S → a | aA | B
2. A → aBB | ε
3. B → Aa | b
Solution:
Step 1: We will create a new production S1 → S, as the start symbol S appears on the RHS. The
grammar will be:
1. S1 → S
2. S → a | aA | B
3. A → aBB | ε
4. B → Aa | b
Step 2: As grammar G1 contains A → ε null production, its removal from the grammar yields:
1. S1 → S
2. S → a | aA | B
3. A → aBB
4. B → Aa | b | a
1. S1 → S
2. S → a | aA | Aa | b
3. A → aBB
4. B → Aa | b | a
Also remove the unit production S1 → S, its removal from the grammar yields:
1. S0 → a | aA | Aa | b
2. S → a | aA | Aa | b
3. A → aBB
4. B → Aa | b | a
Step 3: In the production rule S0 → aA | Aa, S → aA | Aa, A → aBB and B → Aa, terminal a
exists on RHS with non-terminals. So we will replace terminal a with X:
1. S0 → a | XA | AX | b
2. S → a | XA | AX | b
3. A → XBB
4. B → AX | b | a
5. X → a
Step 4: In the production rule A → XBB, RHS has more than two symbols, removing it from
grammar yield:
1. S0 → a | XA | AX | b
2. S → a | XA | AX | b
3. A → RB
4. B → AX | b | a
5. X → a
6. R → XB
GNF stands for Greibach normal form. A CFG(context free grammar) is in GNF(Greibach
normal form) if all the production rules satisfy one of the following conditions:
For example:
The production rules of Grammar G1 satisfy the rules specified for GNF, so the grammar G1 is
in GNF. However, the production rule of Grammar G2 does not satisfy the rules specified for
GNF as A → ε and B → ε contains ε(only start symbol can generate ε). So the grammar G2 is
not in GNF.
If the context free grammar contains left recursion, eliminate it. You can refer the following
topic to eliminate left recursion: Left Recursion
Step 3: In the grammar, convert the given production rule into GNF form.
If any production rule in the grammar is not in GNF form, convert it.
Example:
1. S → XB | AA
2. A → a | SA
3. B → b
4. X → a
Solution:
As the given grammar G is already in CNF and there is no left recursion, so we can skip step 1
and step 2 and directly go to step 3.
1. S → XB | AA
2. A → a | XBA | AAA
3. B → b
4. X → a
1. S → aB | AA
2. A → a | aBA | AAA
3. B → b
4. X → a
1. S → aB | AA
2. A → aC | aBAC | a | aBA
3. C → AAC | AA
4. B → b
5. X → a
The CFG which accepts deterministic PDA accepts non-deterministic PDAs as well. Similarly,
there are some CFGs which can be accepted only by NPDA and not by DPDA. Thus NPDA is
more powerful than DPDA.
Example:
Suppose the language consists of string L = {aba, aa, bb, bab, bbabb, aabaa, ......]. The string can
be odd palindrome or even palindrome. The logic for constructing PDA is that we will push a
symbol onto the stack till half of the string then we will read each symbol and then perform the
pop operation. We will compare to see whether the symbol which is popped is similar to the
symbol which is read. Whether we reach to end of the input, we expect the stack to be empty.
This PDA is a non-deterministic PDA because finding the mid for the given string and reading
the string from left and matching it with from right (reverse) direction leads to non-deterministic
moves. Here is the ID.
Simulation of abaaba
1. δ(q1, abaaba, Z) Apply rule 1
2. ⊢ δ(q1, baaba, aZ) Apply rule 5
3. ⊢ δ(q1, aaba, baZ) Apply rule 4
4. ⊢ δ(q1, aba, abaZ) Apply rule 7
5. ⊢ δ(q2, ba, baZ) Apply rule 8
6. ⊢ δ(q2, a, aZ) Apply rule 7
7. ⊢ δ(q2, ε, Z) Apply rule 11
8. ⊢ δ(q2, ε) Accept
CFG to PDA Conversion
The first symbol on R.H.S. production must be a terminal symbol. The following steps are used
to obtain PDA from CFG is:
Step 3: The initial symbol of CFG will be the initial symbol in the PDA.
1. δ(q, ε, A) = (q, α)
Example 1:
Convert the following grammar to a PDA that accepts the same language.
1. S → 0S1 | A
2. A → 1A0 | S | ε
Solution:
1. S → 0S1 | 1S0 | ε
Now we will convert this CFG to GNF:
1. S → 0SX | 1SY | ε
2. X → 1
3. Y → 0
Example 2:
Construct PDA for the given CFG, and test whether 0104 is acceptable by this PDA.
1. S → 0BB
2. B → 0S | 1S | 0
Solution:
Example 3:
1. S → aSb
2. S → a | b | ε
Solution:
|vxy| >=n
|vy| # ε
For all k>=0, the string uvkxyyz∈L
The steps to prove that the language is not a context free by using pumping lemma are explained
below −
The Deterministic Pushdown Automata is a variation of pushdown automata that accepts the
deterministic context-free languages.
A PDA is said to be deterministic if its transition function δ(q,a,X) has at most one member for -
a ∈ Σ U {ε}
So, for a deterministic PDA, there is at most one transition possible in any combination of state,
input symbol and stack top.
Union
Concatenation
Kleene Star operation
Union
Let L1 and L2 be two context free languages. Then L1 ∪ L2 is also context free.
Example
Let L1 = { anbn , n > 0}. Corresponding grammar G1 will have P: S1 → aAb|ab
Let L2 = { cmdm , m ≥ 0}. Corresponding grammar G2 will have P: S2 → cBb| ε
Union of L1 and L2, L = L1 ∪ L2 = { anbn } ∪ { cmdm }
The corresponding grammar G will have the additional production S → S1 | S2
Concatenation
If L1 and L2 are context free languages, then L1L2 is also context free.
Example
Union of the languages L1 and L2, L = L1L2 = { anbncmdm }
The corresponding grammar G will have the additional production S → S1 S2
Kleene Star
If L is a context free language, then L* is also context free.
Example
Let L = { anbn , n ≥ 0}. Corresponding grammar G will have P: S → aAb| ε
Kleene Star L1 = { anbn }*
The corresponding grammar G1 will have additional productions S1 → SS1 | ε
Context-free languages are not closed under −
Intersection − If L1 and L2 are context free languages, then L1 ∩ L2 is not necessarily
context free.
Intersection with Regular Language − If L1 is a regular language and L2 is a context
free language, then L1 ∩ L2 is a context free language.
Complement − If L1 is a context free language, then L1’ may not be context free.