TPL Lect 17-20
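A context-free grammar (CFG) is a formal grammar that is used to generate all possible strings in
a given formal language. It is defined by four components:
G = (V, T, P, S)
Where,
V is a finite set of non-terminal symbols,
T is a finite set of terminal symbols,
P is a set of production rules, each of the form A → α, where A is in V and α is a string of symbols from V and T,
S is the start symbol, which is a member of V.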
In a CFG, the start symbol is used to derive the string. You derive the string by
repeatedly replacing a non-terminal with the right-hand side of one of its productions, until all
non-terminals have been replaced by terminal symbols.
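The four components can be written down directly as a small data structure. The following is only a sketch: it assumes the usual reading of G = (V, T, P, S) (non-terminals, terminals, productions, start symbol), and the class name Grammar, its field names and the sample grammar S → aS | ε are chosen here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    non_terminals: frozenset   # V
    terminals: frozenset       # T
    productions: tuple         # P, as (left-hand side, right-hand side) pairs
    start: str                 # S

    def is_well_formed(self):
        """Check that V and T are disjoint, that S is in V, and that every
        production rewrites one non-terminal into a string over V and T."""
        symbols = self.non_terminals | self.terminals
        return (
            self.non_terminals.isdisjoint(self.terminals)
            and self.start in self.non_terminals
            and all(
                lhs in self.non_terminals and all(ch in symbols for ch in rhs)
                for lhs, rhs in self.productions
            )
        )


# Illustrative grammar: S -> aS | ε, with ε written as the empty string.
g = Grammar(frozenset("S"), frozenset("a"), (("S", "aS"), ("S", "")), "S")
print(g.is_well_formed())
```

Each symbol is a single character in this sketch, which keeps right-hand sides as plain strings.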
Capabilities of CFG
A CFG has various capabilities:
Example:
Construct a CFG for the language having any number of a's over the alphabet Σ = {a}.
Solution
Regular expression: a*
Production rules:
1. S → aS
2. S → ε
Now, to derive the string "aaaaaa", we start with the start symbol S and apply the rules as follows:
Sentential form    Rule applied
S                  (start)
aS                 1
aaS                1
aaaS               1
aaaaS              1
aaaaaS             1
aaaaaaS            1
aaaaaa             2
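The derivation above can be replayed mechanically. The sketch below assumes the two rules are numbered as in the example (rule 1 is S → aS, rule 2 is S → ε); the names RULES and derive are illustrative only.

```python
# Rule 1 is S -> aS, rule 2 is S -> ε (written as the empty string).
RULES = {1: ("S", "aS"), 2: ("S", "")}


def derive(rule_sequence, start="S"):
    """Apply the numbered rules in order, each time rewriting the first
    occurrence of the rule's left-hand side, and return every sentential form."""
    current = start
    forms = [current]
    for number in rule_sequence:
        lhs, rhs = RULES[number]
        if lhs not in current:
            raise ValueError(f"rule {number} cannot be applied to {current!r}")
        current = current.replace(lhs, rhs, 1)
        forms.append(current)
    return forms


# Six applications of rule 1, then rule 2 to remove the remaining S.
steps = derive([1, 1, 1, 1, 1, 1, 2])
for form in steps:
    print(form)
```

The last sentential form contains no non-terminal, so the derivation is complete.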
Example:
Production rules:
1. S → aSa
2. S → bSb
3. S → c
Now check whether the string abbcbba can be derived from the given CFG.
1. S ⇒ aSa (rule 1)
2. ⇒ abSba (rule 2)
3. ⇒ abbSbba (rule 2)
4. ⇒ abbcbba (rule 3)
By applying the productions S → aSa and S → bSb recursively and finally applying the
production S → c, we get the string abbcbba.
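The same check can be automated. The sketch below is a hand-written recognizer for this one grammar (S → aSa | bSb | c), not a general parser; the function name derivable is an illustrative choice.

```python
def derivable(s):
    """Return True if s can be derived from S with S -> aSa | bSb | c."""
    if s == "c":
        return True                    # S -> c ends the derivation
    if len(s) >= 3 and s[0] == s[-1] and s[0] in "ab":
        return derivable(s[1:-1])      # S -> aSa or S -> bSb wraps a shorter string
    return False


print(derivable("abbcbba"))
print(derivable("abcab"))
```

Each recursive call undoes one application of S → aSa or S → bSb, mirroring the derivation steps from the outside in.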
Exercise
Grammar: S → aS | bS | a | b
Derive: abbab
Grammar: S → aA | bB
A → aS | a
B → bS | b
Derive: bbaaaa
Derivation
A derivation is a sequence of applications of production rules that produces the input string
from the start symbol. During parsing we have to make two decisions:
o Which non-terminal is to be replaced.
o Which production rule is used to replace it.
There are two options for deciding which non-terminal to replace: the leftmost one or the rightmost one.
Left-most Derivation
In a leftmost derivation, the leftmost non-terminal in the sentential form is replaced at
each step, so the input string is produced from left to right.
Example:
Production rules:
1. S → S + S
2. S → S - S
3. S → a | b | c
Input:
a - b + c
The leftmost derivation is:
1. S ⇒ S + S
2. ⇒ S - S + S
3. ⇒ a - S + S
4. ⇒ a - b + S
5. ⇒ a - b + c
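A leftmost derivation step is easy to express in code: find the first non-terminal and replace it. The sketch below assumes S is the only non-terminal, as in the example grammar; the helper name leftmost_step and the list-of-symbols representation are illustrative.

```python
def leftmost_step(form, rhs, non_terminals=("S",)):
    """Replace the leftmost non-terminal in form (a list of symbols) by rhs."""
    for i, symbol in enumerate(form):
        if symbol in non_terminals:
            return form[:i] + rhs + form[i + 1:]
    raise ValueError("no non-terminal left to replace")


# Right-hand sides chosen at each step of the example derivation of a - b + c.
choices = [["S", "+", "S"], ["S", "-", "S"], ["a"], ["b"], ["c"]]

form = ["S"]
trace = [" ".join(form)]
for rhs in choices:
    form = leftmost_step(form, rhs)
    trace.append(" ".join(form))

print("\n".join(trace))
```

The code only decides where to rewrite; which production to use at each step is still supplied by hand in choices.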
Right-most Derivation
In a rightmost derivation, the rightmost non-terminal in the sentential form is replaced at
each step, so the input string is produced from right to left.
Example:
Production rules:
1. S → S + S
2. S → S - S
3. S → a | b | c
Input:
a - b + c
The rightmost derivation is:
1. S ⇒ S - S
2. ⇒ S - S + S
3. ⇒ S - S + c
4. ⇒ S - b + c
5. ⇒ a - b + c
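The rightmost case differs only in the direction of the scan. This sketch again assumes S is the only non-terminal; rightmost_step is an illustrative name.

```python
def rightmost_step(form, rhs, non_terminals=("S",)):
    """Replace the rightmost non-terminal in form (a list of symbols) by rhs."""
    for i in range(len(form) - 1, -1, -1):
        if form[i] in non_terminals:
            return form[:i] + rhs + form[i + 1:]
    raise ValueError("no non-terminal left to replace")


# Right-hand sides chosen at each step of the example derivation of a - b + c.
choices = [["S", "-", "S"], ["S", "+", "S"], ["c"], ["b"], ["a"]]

form = ["S"]
trace = [" ".join(form)]
for rhs in choices:
    form = rightmost_step(form, rhs)
    trace.append(" ".join(form))

print("\n".join(trace))
```

Note that the terminals appear in the order c, b, a, because the rightmost non-terminal is always rewritten first.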
Parse tree
o A parse tree is a graphical representation of a derivation. Each node is labeled with a
symbol, which can be a terminal or a non-terminal.
o In parsing, the string is derived using the start symbol. The root of the parse tree is that
start symbol.
o A parse tree reflects the precedence of operators. The deepest sub-tree is evaluated first, so
the operator in a parent node has lower precedence than the operator in its sub-tree.
A parse tree has the following properties:
o All leaf nodes have to be terminals.
o All interior nodes have to be non-terminals.
o Reading the leaves from left to right (an in-order traversal) gives the original input string.
Example:
Production rules:
1. T → T + T | T * T
2. T → a | b | c
Input:
a * b + c
Steps 1 to 5: [figures showing the parse tree being built one step at a time]
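A parse tree for this input can also be written as a nested data structure and checked against the properties listed earlier. The tree below is one possible parse tree for a * b + c, with + at the root and * in the deeper sub-tree, which matches the precedence rule stated above; it is an assumption about the intended tree. The helper names leaf, T, leaves and interior_labels are illustrative.

```python
# A node is a pair (label, children); a leaf has an empty list of children.
def leaf(symbol):
    return (symbol, [])


def T(*children):
    return ("T", list(children))


# '+' is at the root and '*' is in the deeper sub-tree, so a * b is grouped first.
tree = T(T(T(leaf("a")), leaf("*"), T(leaf("b"))), leaf("+"), T(leaf("c")))


def leaves(node):
    """Collect the leaf labels from left to right."""
    label, children = node
    if not children:
        return [label]
    result = []
    for child in children:
        result.extend(leaves(child))
    return result


def interior_labels(node):
    """Collect the labels of all nodes that have children."""
    label, children = node
    if not children:
        return []
    labels = [label]
    for child in children:
        labels.extend(interior_labels(child))
    return labels


print(" ".join(leaves(tree)))
print(set(interior_labels(tree)))
```

The leaves spell out the input string, and every interior node carries the non-terminal T.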
Ambiguity
A grammar is said to be ambiguous if there exists more than one leftmost derivation,
more than one rightmost derivation, or more than one parse tree for some input string.
If the grammar is not ambiguous, then it is called unambiguous.
Example:
1. S → aSb | SS
2. S → ε
For the string aabb, the above grammar generates two parse trees:
An ambiguous grammar is not suitable for compiler construction. No general method
can automatically detect and remove ambiguity from an arbitrary grammar, but ambiguity can
often be removed by rewriting the grammar.
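The ambiguity of the grammar S → aSb | SS | ε can be demonstrated by replaying two different leftmost derivations of aabb. The sketch below assumes S is the only non-terminal and writes ε as the empty string; the function name leftmost_derivation and the two rule sequences are chosen here for illustration.

```python
def leftmost_derivation(rule_choices, start="S"):
    """Rewrite the leftmost S with each chosen right-hand side in turn
    and return all sentential forms."""
    form = start
    steps = [form]
    for rhs in rule_choices:
        i = form.index("S")                  # position of the leftmost non-terminal
        form = form[:i] + rhs + form[i + 1:]
        steps.append(form)
    return steps


# Derivation 1 uses only S -> aSb and S -> ε.
first = leftmost_derivation(["aSb", "aSb", ""])
# Derivation 2 starts with S -> SS and later erases the second S with S -> ε.
second = leftmost_derivation(["SS", "aSb", "aSb", "", ""])

print(" => ".join(first))
print(" => ".join(second))
```

Both sequences are leftmost derivations and both end in aabb, so the string has more than one leftmost derivation and the grammar is ambiguous.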