Module 4
Context-Free and Non-Context-Free Languages
Introduction
The language L = { a^n b^n | n ≥ 1 } is context-free.
The language L = { a^n b^n c^n | n ≥ 1 } is not a context-free language, because a PDA's stack cannot count all three of the letter regions and compare them.
L = { ww^R : w ∈ {a, b}* } is context-free.
The language L = { ww : w ∈ {a, b}* } is not context-free, because a stack cannot pop the characters of w off in the same order in which they were pushed.
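As a minimal sketch of why one stack is enough for a^n b^n, the Python function below pushes one marker per a and pops one per b. The function name and the marker symbol are illustrative assumptions, not part of the original notes.

```python
def accepts_anbn(s: str) -> bool:
    """Recognize { a^n b^n | n >= 1 } with a single stack, as a PDA would."""
    stack = []
    i = 0
    # Phase 1: push one marker for every leading 'a'.
    while i < len(s) and s[i] == "a":
        stack.append("A")
        i += 1
    if not stack:
        return False  # n >= 1 requires at least one 'a'
    # Phase 2: pop one marker for every 'b'.
    while i < len(s) and s[i] == "b" and stack:
        stack.pop()
        i += 1
    # Accept only if the whole input is consumed and the stack is empty.
    return i == len(s) and not stack
```

Once the b's have emptied the stack, no record of n remains to compare against a third block of c's, which is the intuition behind the a^n b^n c^n claim.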
Given a new language L, how can we know whether or not it is context-
free?
The regular languages are a proper subset of the context-free
languages.
• Every regular language is a context-free language.
• Not every CFL is a regular language.
• There is a countably infinite number of context-free languages.
Showing That a Language is Context-Free
A language L is a CFL if we can do either of the following:
• Exhibit a context-free grammar for it.
• Exhibit a (possibly nondeterministic) PDA for it.
The Pumping Theorem for Context-Free Languages
• Height of a tree (h): the number of edges on the longest downward path between the root and a leaf.
• Branching factor (b) of a tree: the largest number of daughters of any node in the tree. For a parse tree of a grammar G = (V, T, P, S), b is at most the length of the longest right-hand side of any rule in P.
• Yield of a tree: the ordered sequence of its leaf nodes (the string or word generated from G).
E →E+E
E →E*E
E → (E) | a| b |c
For the parse tree of a+b*c:
Height of the tree (h) = 3
Branching factor (b) = 3
Length of the yield of the tree = 5, i.e. yield = a+b*c
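The three quantities can be computed mechanically. The sketch below assumes a parse tree stored as nested (label, children) tuples; that representation and the function names are illustrative choices, not from the notes.

```python
# A parse tree node is (label, [children]); a leaf has an empty child list.
def leaf(sym):
    return (sym, [])

# One parse tree for a+b*c using E -> E+E, E -> E*E, E -> a | b | c.
tree = ("E", [
    ("E", [leaf("a")]),
    leaf("+"),
    ("E", [
        ("E", [leaf("b")]),
        leaf("*"),
        ("E", [leaf("c")]),
    ]),
])

def height(node):
    """Number of edges on the longest root-to-leaf path."""
    _, children = node
    return 0 if not children else 1 + max(height(c) for c in children)

def branching_factor(node):
    """Largest number of daughters of any node."""
    _, children = node
    return max([len(children)] + [branching_factor(c) for c in children])

def tree_yield(node):
    """Leaves read from left to right."""
    label, children = node
    return label if not children else "".join(tree_yield(c) for c in children)

h, b, w = height(tree), branching_factor(tree), tree_yield(tree)
print(h, b, w, len(w) <= b ** h)
```

For this tree the values agree with the figures quoted above (h = 3, b = 3, yield a+b*c of length 5), and the yield length stays below b^h = 27.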
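The length of the yield of any tree T with height h and branching factor b is ≤ b^h.
Basis:
If the height of the tree is h = 1, a single rule is applied, say S → a; then the yield length is 1.
In general the root has at most b daughters, all of them leaves, so the longest yield has length ≤ b = b^1.
Induction:
If the bound (longest yield of length ≤ b^h) is true for h = n, then it is true for h = n+1 also: the root has at most b subtrees, each of height ≤ n and yield length ≤ b^n, so the yield has length ≤ b · b^n = b^(n+1).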
Let G = (V, T, P, S) be a CFG and let n be the number of variables (non-terminal symbols) used in G, i.e. n = |V|.
Let b be the branching factor of G.
Any parse tree constructed from G in which no non-terminal appears more than once on any one path from the root node to a leaf must have height ≤ n.
E →E+E
E →E*E
E → (E) | a| b |c
So the longest string (yield of a tree) generated from G with no non-terminal appearing more than once on any one path has length ≤ b^n:
|w| ≤ b^n
Suppose a string w generated from G has |w| > b^n.
Then any parse tree that generates w from G must contain at least one path that contains at least one repeated non-terminal.
Equivalently, grammar G must have at least one recursive rule.
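For example, X is a repeated non-terminal in:
X → aYb
Y → bXa
X → ab
The generated string w is split into 5 pieces, w = uvxyz, where the upper occurrence of X on the path yields vxy and the lower occurrence yields x.
1. There is another derivation in G, in which the subtree rooted at the upper X is replaced by the subtree rooted at the lower X.
The resulting string uxz is also in L(G).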
Show that L = { a^n b^n c^n | n ≥ 1 } is not a context-free language.
Assume that the language L is context-free. Then there exists some constant k ≥ 1 such that any string w ∈ L with |w| ≥ k must satisfy the conditions of the theorem.
Let w = a^k b^k c^k; since |w| ≥ k we can write w = uvxyz, where |vxy| ≤ k and |vy| ≥ 1.
Let us consider the case where the string vxy lies within a^k,
i.e. uvxy = a^k and z = b^k c^k
v = a^i
y = a^j, so |vy| = i + j ≥ 1
According to the pumping lemma, u v^q x y^q z ∈ L for all q ≥ 0.
When q = 2, u v^2 x y^2 z ∈ L,
i.e. a^(k+i+j) b^k c^k where i + j ≥ 1
= a^(k+1) b^k c^k when i + j = 1, which is not in L.
The other positions of vxy are handled in the same way: since |vxy| ≤ k, vxy cannot contain all three letters, so pumping changes the number of at most two of them.
But according to the pumping lemma, for L to be context-free we need u v^2 x y^2 z ∈ L, which is a contradiction. So the language L = { a^n b^n c^n | n ≥ 1 } is not a context-free language.
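The case analysis can be checked by brute force for one small constant. The sketch below assumes k = 4 and hypothetical helper names; it enumerates every split w = uvxyz with |vxy| ≤ k and |vy| ≥ 1 and looks for one that survives pumping with q = 0 and q = 2. It illustrates the proof for a single k and is not a substitute for the general argument.

```python
def in_L(s: str) -> bool:
    """Membership in { a^n b^n c^n | n >= 1 }."""
    n = len(s) // 3
    return n >= 1 and s == "a" * n + "b" * n + "c" * n

def pumpable_splits(w: str, k: int):
    """Yield every split w = uvxyz with |vxy| <= k and |vy| >= 1
    for which u v^q x y^q z stays in L for both q = 0 and q = 2."""
    n = len(w)
    for i in range(n + 1):                        # u = w[:i]
        for end in range(i, min(i + k, n) + 1):   # vxy = w[i:end]
            for j in range(i, end + 1):           # v = w[i:j]
                for m in range(j, end + 1):       # x = w[j:m], y = w[m:end]
                    u, v, x, y, z = w[:i], w[i:j], w[j:m], w[m:end], w[end:]
                    if len(v) + len(y) == 0:
                        continue                  # violates |vy| >= 1
                    if all(in_L(u + v * q + x + y * q + z) for q in (0, 2)):
                        yield (u, v, x, y, z)

k = 4                                             # assumed pumping constant
w = "a" * k + "b" * k + "c" * k
survivors = list(pumpable_splits(w, k))
print(len(survivors))
```

No split survives, which matches the contradiction reached in the proof.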
Closure Properties of Context-Free Languages
1. Context free languages are closed under Union,
Concatenation, star closure and homomorphism.
2. Context free languages are not closed under intersection
3. Context free languages are not closed under
complementation.
4. Intersection of a context-free language and a regular
language is context-free.
5. The difference of a context-free language and a regular
language is context-free
Context free languages are closed under Union
If L1 and L2 are CFLs, then there exist context-free grammars
G1 = (V1, T1, P1, S1) and G2 = (V2, T2, P2, S2) such that
L1 = L(G1) and L2 = L(G2). Assume V1 and V2 are disjoint (rename non-terminals if necessary).
We will build a new grammar G such that
L(G) = L(G1) ∪ L(G2)
= L1 ∪ L2
Construct a grammar G = (V, T, P, S), where S is a new start symbol,
P = P1 ∪ P2 ∪ { S → S1 | S2 },
V = V1 ∪ V2 ∪ {S}, T = T1 ∪ T2
Grammar G is context-free and generates L1 ∪ L2, so the language L1 ∪ L2 is a context-free language.
Context free languages are closed under Concatenation
If L1 and L2 are CFLs, then there exist context-free grammars
G1 = (V1, T1, P1, S1) and G2 = (V2, T2, P2, S2) such that
L1 = L(G1) and L2 = L(G2). Assume V1 and V2 are disjoint (rename non-terminals if necessary).
We will build a new grammar G such that
L(G) = L(G1) . L(G2)
= L1 . L2
Construct a grammar G = (V, T, P, S), where S is a new start symbol,
P = P1 ∪ P2 ∪ { S → S1 S2 },
V = V1 ∪ V2 ∪ {S}, T = T1 ∪ T2
Grammar G is context-free and generates L1 . L2, so the language L1 . L2 is a context-free language.
Context free languages are closed under star closure
If L1 is a CFL, then there exists a context-free grammar
G1 = (V1, T1, P1, S1) such that L1 = L(G1).
We will build a new grammar G such that L(G) = L(G1)*.
G will contain all the rules of G1; we add to G a new start symbol S and two new rules, S → ε and S → S S1.
So G = (V1 ∪ {S}, T1, P1 ∪ { S → ε, S → S S1 }, S)
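The union, concatenation and star constructions above can be sketched in Python. The dictionary representation of a rule set, the function names and the two small example grammars are assumptions made for illustration. The helper `language` is a brute-force enumerator that terminates for these examples; a grammar such as S → SS | ε would need extra pruning.

```python
from collections import deque

# A rule set is a dict: non-terminal -> list of right-hand sides (tuples of symbols).
# Any symbol that is not a key of the dict is treated as a terminal.

def union(p1, s1, p2, s2, s="S"):
    """P1 ∪ P2 ∪ {S -> S1 | S2}; assumes disjoint non-terminals and a fresh S."""
    return {**p1, **p2, s: [(s1,), (s2,)]}

def concatenation(p1, s1, p2, s2, s="S"):
    """P1 ∪ P2 ∪ {S -> S1 S2}."""
    return {**p1, **p2, s: [(s1, s2)]}

def star(p1, s1, s="S"):
    """P1 ∪ {S -> ε, S -> S S1}; the empty tuple stands for ε."""
    return {**p1, s: [(), (s, s1)]}

def nullable(p):
    """Non-terminals that can derive the empty string."""
    result, changed = set(), True
    while changed:
        changed = False
        for a, rhss in p.items():
            if a not in result and any(all(x in result for x in r) for r in rhss):
                result.add(a)
                changed = True
    return result

def language(p, start, max_len):
    """All terminal strings of length <= max_len derivable from start."""
    null = nullable(p)
    seen, out, queue = {(start,)}, set(), deque([(start,)])
    while queue:
        form = queue.popleft()
        idx = next((i for i, x in enumerate(form) if x in p), None)
        if idx is None:                    # no non-terminal left: a word
            out.add("".join(form))
            continue
        for rhs in p[form[idx]]:           # expand the leftmost non-terminal
            new = form[:idx] + rhs + form[idx + 1:]
            # Prune forms that must already yield more than max_len terminals.
            if sum(x not in null for x in new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return out

P1 = {"S1": [("a", "S1", "b"), ("a", "b")]}   # a^n b^n, n >= 1
P2 = {"S2": [("c", "S2"), ("c",)]}            # c^m, m >= 1

print(sorted(language(union(P1, "S1", P2, "S2"), "S", 4)))
print(sorted(language(concatenation(P1, "S1", P2, "S2"), "S", 4)))
print(sorted(language(star(P1, "S1"), "S", 4)))
```

Each construction only adds rules for the new start symbol and leaves P1 and P2 untouched, which is why the resulting grammar is still context-free.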
Context free languages are closed under homomorphism
Suppose L is a CFL over alphabet Σ, and h is a homomorphism on Σ.
Let s be the substitution that replaces each symbol a in Σ by the language consisting of the one string that is h(a). That is,
s(a) = { h(a) }, for all a in Σ. Then h(L) = s(L).
Since CFLs are closed under substitution, h(L) is a CFL.
Example: for the grammar G with rules S → 0S0 | 1S1 | ε and h(0) = ab, h(1) = bb,
h(G) is given by
S → abSab | bbSbb | ε
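Applying a homomorphism to a grammar amounts to rewriting the terminals in every right-hand side. The sketch below assumes single-character symbols and a dictionary for h; both are illustrative choices rather than anything fixed by the notes.

```python
def apply_homomorphism(productions, h):
    """Replace every terminal a in each right-hand side by the string h(a).
    Symbols without an entry in h (the non-terminals) are left unchanged."""
    return {
        lhs: ["".join(h.get(sym, sym) for sym in rhs) for rhs in rhss]
        for lhs, rhss in productions.items()
    }

G = {"S": ["0S0", "1S1", ""]}   # the empty string stands for ε
h = {"0": "ab", "1": "bb"}
hG = apply_homomorphism(G, h)
print(hG)
```

The result has the rules S → abSab | bbSbb | ε, matching the example above.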
Context free languages are not closed under Intersection
Let L1 and L2 be CFLs:
L1 = { 0^n 1^n 2^i | n ≥ 1, i ≥ 1 }
L2 = { 0^i 1^n 2^n | n ≥ 1, i ≥ 1 }
A grammar for L1 is:
S → AB
A → 0A1 | 01
B → 2B| 2
A grammar for L2 is:
S → AB
A → 0A | 0
B → 1B2| 12
Let L = L1 ∩ L2. L1 requires that there be the same number of 0s and 1s, while L2 requires the numbers of 1s and 2s to be equal.
A string in both languages must have equal numbers of all three symbols, that is,
L = { 0^n 1^n 2^n | n ≥ 1 }
But according to the pumping lemma, L = { 0^n 1^n 2^n | n ≥ 1 } is not a context-free language.
So L1 ∩ L2 is not a CFL.
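The claim that L1 ∩ L2 is exactly { 0^n 1^n 2^n | n ≥ 1 } can be sanity-checked on all short strings. The membership predicates and the length bound below are assumptions made for this check; it is a finite test, not a proof.

```python
import re
from itertools import product

BLOCKS = re.compile(r"(0+)(1+)(2+)")

def in_L1(s):
    """0^n 1^n 2^i with n, i >= 1."""
    m = BLOCKS.fullmatch(s)
    return bool(m) and len(m.group(1)) == len(m.group(2))

def in_L2(s):
    """0^i 1^n 2^n with n, i >= 1."""
    m = BLOCKS.fullmatch(s)
    return bool(m) and len(m.group(2)) == len(m.group(3))

def in_L(s):
    """0^n 1^n 2^n with n >= 1."""
    n = len(s) // 3
    return n >= 1 and s == "0" * n + "1" * n + "2" * n

MAX_LEN = 9  # assumed bound for the exhaustive check
words = ("".join(t) for n in range(MAX_LEN + 1) for t in product("012", repeat=n))
mismatches = [w for w in words if (in_L1(w) and in_L2(w)) != in_L(w)]
print(mismatches)
```

An empty list of mismatches means the two descriptions agree on every string over {0, 1, 2} up to the chosen length.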
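Context free languages are not closed under Complementation
Suppose CFLs were closed under complementation. Since they are closed under union, L1 ∩ L2 = ¬(¬L1 ∪ ¬L2) would then be context-free for all CFLs L1 and L2, which contradicts the fact that CFLs are not closed under intersection.
Intersection of a context-free language and a regular language is context-free
If L is a CFL and R is a regular language, then show that L ∩ R is a CFL.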
Let L be a CFL, and let R be a regular language.
Let P be the PDA that accepts the language L by final state, and let M be the DFA that accepts the language R:
L = L(P) for PDA P = (KL, Σ, Γ, δL, sL, Z0, AL), and
R = L(M) for DFA M = (KR, Σ, δR, sR, AR).
Construct a PDA P' for L ∩ R by simulating P and M in parallel.
The states of the new machine P', recognizing the language L ∩ R, are pairs of states, where one element of the pair corresponds to the state of P and the other corresponds to the state of M. That means whenever P' reads an input symbol, both P and M are simulated.
P' = (KL × KR, Σ, Γ, δ, [sL, sR], Z0, AL × AR), where
δ([q, p], a, X) contains ([q', p'], γ) if and only if
δR(p, a) = p' and δL(q, a, X) contains (q', γ).
On an ε-move of P (a = ε), the DFA state does not change, i.e. p' = p.
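The product construction can be tried out on a small example. The sketch below assumes a PDA for { a^n b^n | n ≥ 1 } accepting by final state and a DFA for strings with an even number of a's; the encoding of transitions, the state names and the simple configuration search are illustrative assumptions. The search terminates for PDAs like this one that have no stack-growing ε-loops.

```python
def pda_accepts(delta, start, z0, finals, w):
    """Search all configurations (state, input position, stack); accept by final state.
    delta maps (state, symbol or "" for ε, stack top) to a set of (state, pushed string)."""
    frontier = [(start, 0, z0)]
    seen = set(frontier)
    while frontier:
        state, i, stack = frontier.pop()
        if i == len(w) and state in finals:
            return True
        if not stack:
            continue
        top, rest = stack[0], stack[1:]
        moves = [("", i)]                       # ε-move: input position unchanged
        if i < len(w):
            moves.append((w[i], i + 1))         # read one input symbol
        for sym, j in moves:
            for new_state, push in delta.get((state, sym, top), ()):
                cfg = (new_state, j, push + rest)
                if cfg not in seen:
                    seen.add(cfg)
                    frontier.append(cfg)
    return False

def product_pda(pda_delta, dfa_delta, dfa_states):
    """Transitions of P': run P and M in parallel; M stays put on ε-moves."""
    delta = {}
    for (q, a, X), targets in pda_delta.items():
        for p in dfa_states:
            p2 = p if a == "" else dfa_delta[(p, a)]
            delta[((q, p), a, X)] = {((q2, p2), push) for q2, push in targets}
    return delta

# Example PDA P for { a^n b^n | n >= 1 }, start q0, initial stack symbol Z, final state qf.
PDA = {
    ("q0", "a", "Z"): {("q0", "AZ")},
    ("q0", "a", "A"): {("q0", "AA")},
    ("q0", "b", "A"): {("q1", "")},
    ("q1", "b", "A"): {("q1", "")},
    ("q1", "", "Z"): {("qf", "Z")},
}
# Example DFA M for strings with an even number of a's, start "e", final state "e".
DFA = {("e", "a"): "o", ("o", "a"): "e", ("e", "b"): "e", ("o", "b"): "o"}

PROD = product_pda(PDA, DFA, {"e", "o"})

def in_intersection(w):
    return pda_accepts(PROD, ("q0", "e"), "Z", {("qf", "e")}, w)

print([w for w in ("ab", "aabb", "aaabbb", "aaaabbbb") if in_intersection(w)])
```

The stack behaviour of P' is exactly that of P; only the state component is enlarged, so P' is still a PDA and L ∩ R is context-free.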
The difference of a context-free language and a regular language
is context-free
If L is a CFL and R is a regular language, then show that L − R is a CFL.
• We know that complement of a regular language is regular, so
the complement of R is also regular.
• But L – R = L∩ ¬R
• Also we know that intersection of context free language and
regular language is context free language.
• Therefore L – R = L∩ ¬ R is context free.
• So the difference of context free language and regular
language is context free.
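The identity L − R = L ∩ ¬R relies on complementing the DFA for R, which only requires flipping its accepting states when the DFA is complete. The sketch below assumes a small example DFA and a direct membership predicate for L = { a^n b^n | n ≥ 1 }; all names are hypothetical.

```python
from itertools import product

def dfa_accepts(delta, start, finals, w):
    """Run a complete DFA on w."""
    state = start
    for ch in w:
        state = delta[(state, ch)]
    return state in finals

# Example DFA M for R = strings over {a, b} with an even number of a's.
STATES = {"even", "odd"}
DELTA = {("even", "a"): "odd", ("odd", "a"): "even",
         ("even", "b"): "even", ("odd", "b"): "odd"}
START, FINALS = "even", {"even"}

# DFA for ¬R: same states and transitions, accepting states flipped.
CO_FINALS = STATES - FINALS

def in_L(w):
    """L = { a^n b^n | n >= 1 }."""
    n = len(w) // 2
    return n >= 1 and w == "a" * n + "b" * n

words = ["".join(t) for n in range(7) for t in product("ab", repeat=n)]
difference = [w for w in words if in_L(w) and not dfa_accepts(DELTA, START, FINALS, w)]
via_complement = [w for w in words if in_L(w) and dfa_accepts(DELTA, START, CO_FINALS, w)]
print(difference, via_complement)
```

Both lists contain the same words, those a^n b^n with n odd, for every string up to length 6.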
Deterministic CFL
A language L is said to be a deterministic CFL (DCFL) if it is accepted by some deterministic PDA (DPDA).
Prove that if L is a regular language then L = L(P) for some DPDA P.
The regular languages are a subset of the DCFLs.
L = a* b* is a regular language.
L = { a^n b^n } is a DCFL: it is accepted by a DPDA.
L = a* b* is also accepted by a DPDA.
Every regular language is accepted by a DFA, and every DFA is a DPDA that never uses its stack.
Hence for every regular language L there exists some DPDA P with L = L(P).
Algorithms and Decision Procedures for CFL
Decidable Questions.
Various decision properties of CFLs are:
1. Membership
2. Emptiness
3. Finiteness
Membership
• There exists an algorithm which tells whether a given string belongs to the language generated by a given grammar G.
• To check whether the given grammar generates the desired string, a derivation tree (parse tree) can be drawn.
S → PQa | b
P → Xb
Q → bPP
X→ aaa
Check whether it generates the string aaabbaaabaaaba.
The leaves of the parse tree, read from left to right, give the string aaabbaaabaaaba, so the grammar generates it.
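The same membership check can be done by a simple top-down search instead of drawing the tree by hand. The sketch below is one possible procedure, not the algorithm of the notes; it assumes a grammar with no ε-rules and no unit cycles (true for the grammar above), and the function name is hypothetical.

```python
from functools import lru_cache

GRAMMAR = {
    "S": [("P", "Q", "a"), ("b",)],
    "P": [("X", "b")],
    "Q": [("b", "P", "P")],
    "X": [("a", "a", "a")],
}

@lru_cache(maxsize=None)
def derives(form: tuple, s: str) -> bool:
    """True if the sentential form (a tuple of symbols) derives the string s."""
    if not form:
        return s == ""
    if len(form) > len(s):
        return False  # valid because no rule here has an empty right-hand side
    head, rest = form[0], form[1:]
    if head not in GRAMMAR:  # terminal: must match the next input character
        return s.startswith(head) and derives(rest, s[len(head):])
    # Non-terminal: try every rule for it (leftmost derivation).
    return any(derives(rhs + rest, s) for rhs in GRAMMAR[head])

print(derives(("S",), "aaabbaaabaaaba"))
```

For larger grammars a standard alternative is the CYK algorithm on a grammar in Chomsky normal form.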
Emptiness
There exists an algorithm which can determine whether or not the
given CFG can generate any word or string at all.
1. Put a dot (.) above every terminal symbol on the RHS of every rule (also put a dot over ε if the null string appears).
2. If all the symbols (V or T) on the RHS of a production rule are dotted, then dot the corresponding non-terminal on the LHS wherever it occurs. Repeat this step as long as possible.
3. If the start symbol of the CFG gets dotted then the answer is YES; otherwise the answer is NO.
Check whether the following CFG generates any string
S → PQ
P → AP | AA
A→a
Q→ BQ | BB
B→b
Since the start symbol of the CFG gets dotted, the answer is YES.
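The dotting procedure translates directly into a fixed-point loop. In the sketch below the set `dotted` plays the role of the dots; the dictionary representation is an assumption, and every non-terminal is assumed to appear as a key (a symbol without rules is treated as a terminal).

```python
def generates_some_string(productions, start):
    """Dotting procedure: a non-terminal gets dotted once one of its
    right-hand sides consists only of terminals and dotted non-terminals."""
    dotted = set()
    changed = True
    while changed:                      # repeat the step as long as possible
        changed = False
        for lhs, rhss in productions.items():
            if lhs in dotted:
                continue
            for rhs in rhss:
                # Terminals (symbols with no rules) count as dotted from the start;
                # an empty right-hand side (ε) is trivially fully dotted.
                if all(sym not in productions or sym in dotted for sym in rhs):
                    dotted.add(lhs)
                    changed = True
                    break
    return start in dotted

G = {
    "S": [("P", "Q")],
    "P": [("A", "P"), ("A", "A")],
    "A": [("a",)],
    "Q": [("B", "Q"), ("B", "B")],
    "B": [("b",)],
}
print(generates_some_string(G, "S"))
```

For the grammar of the example, A and B are dotted first, then P and Q, and finally S.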
Finiteness
There exists an algorithm to determine whether the given CFG
generates a finite or infinite language
1. Simplify the grammar G by eliminating unit productions and useless symbols.
2. Check whether there is any self-embedded non-terminal by the following steps:
a) Underline some non-terminal symbol, say A, on the LHS of a rule.
b) Put a dot over every A on the RHS throughout the grammar. The dotted A and the underlined A are treated as two different symbols.
c) Put a dot over any non-terminal on the LHS whose RHS contains any dotted symbol. Dot this non-terminal throughout the grammar. Repeat as long as possible.
d) If the underlined A gets dotted then A is called self-embedded; otherwise it is not self-embedded.
3. If the grammar contains any self-embedded non-terminal then it generates an infinite language; otherwise it generates a finite language.
Check whether the given CFG generates a finite or infinite language:
S → PQa | bPZ | b
P → Xb | bZa
Q → bPP
X → aZa | aaa
Z → ZPbP
Z is a useless symbol. After eliminating it, the grammar becomes:
S → PQa | b
P → Xb
Q → bPP
X → aaa
The grammar generates a finite language, since it does not contain any self-embedded non-terminal.
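The finiteness test can be sketched as "remove useless symbols, then look for a non-terminal that can reach itself". The code below assumes the grammar has no ε-rules and no unit productions (as in the example); the function name and the representation are illustrative.

```python
def is_finite(productions, start):
    """Finiteness test for a CFG without ε-rules and unit productions."""
    nts = set(productions)

    # Step 1a: non-terminals that generate some terminal string.
    gen, changed = set(), True
    while changed:
        changed = False
        for a, rhss in productions.items():
            if a not in gen and any(all(x not in nts or x in gen for x in r) for r in rhss):
                gen.add(a)
                changed = True
    # Keep only rules built entirely from terminals and generating non-terminals.
    useful = {a: [r for r in rhss if all(x not in nts or x in gen for x in r)]
              for a, rhss in productions.items() if a in gen}

    def successors(a):
        return [x for r in useful[a] for x in r if x in useful]

    # Step 1b: non-terminals reachable from the start symbol.
    reach = set()
    todo = [start] if start in useful else []
    while todo:
        a = todo.pop()
        if a not in reach:
            reach.add(a)
            todo.extend(successors(a))

    # Step 2: A is self-embedded if it can reach itself.
    def reaches_itself(a):
        seen, stack = set(), successors(a)
        while stack:
            b = stack.pop()
            if b == a:
                return True
            if b not in seen:
                seen.add(b)
                stack.extend(successors(b))
        return False

    # Step 3: finite exactly when no useful non-terminal is self-embedded.
    return not any(reaches_itself(a) for a in reach)

G = {
    "S": [("P", "Q", "a"), ("b", "P", "Z"), ("b",)],
    "P": [("X", "b"), ("b", "Z", "a")],
    "Q": [("b", "P", "P")],
    "X": [("a", "Z", "a"), ("a", "a", "a")],
    "Z": [("Z", "P", "b", "P")],
}
print(is_finite(G, "S"))
```

For the example grammar Z is discarded as non-generating, the remaining non-terminals S, P, Q and X form no cycle, and the language is reported as finite.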