Module-4 Normal Forms
TRUST’S
HIRASUGAR INSTITUTE OF TECHNOLOGY, NIDASOSHI
Accredited at 'A+' Grade by NAAC
Programmes Accredited by NBA: CSE, ECE
Prof. A. A. Daptardar
Asst. Prof., Dept. of Computer Science & Engg.
Module-4
Content to be covered:
• Normal Forms for Context-Free
Grammars, The Pumping Lemma for
Context-Free Languages, Closure
Properties of Context-Free Languages.
• TEXT BOOK: Sections 7.1, 7.2, 7.3
NORMAL FORMS FOR
CONTEXT FREE GRAMMARS
Eliminating Useless Symbols
Computing the Generating and Reachable Symbols
Eliminating ε-Productions
Eliminating Unit Productions
Chomsky Normal Form
Introduction
7.1.1 Eliminating Useless Symbols
Algorithm
• Stage 1: Obtain the set of variables, and their
productions, that derive strings consisting only of
terminals.
• The algorithm to obtain this set of variables:
• Step 1: (Initialize old_variables, denoted by ov,
to ∅)
ov = ∅
• Step 2: Take all productions of the form A → x where x ∈
T+, i.e. productions whose RHS consists only of
terminals. The non-terminals on the LHS of these
productions are added to new_variables,
denoted by nv.
nv = {A | A → x and x ∈ T+}
• Step 3: Compare ov and nv. As long as
ov and nv are not equal, repeat the
following statements. Otherwise
go to Step 4.
• a. [Copy new variables to old variables]
ov = nv
• b. Add all the elements in ov to nv. Also
add the variables that derive a string
consisting only of terminals and of
non-terminals already in ov, i.e.
nv = ov ∪ {A | A → y and y ∈ (ov ∪ T)*}
• Step 4: After completion of Step 3, nv (or
ov) contains exactly those non-terminals
that derive some string of terminals.
Assign these variables to V1, i.e.
V1 = ov
• Step 5: [Terminate the algorithm]
return V1
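The Stage 1 steps above can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: the representation of a production as a (head, body) pair, the function name generating_variables, and the sample grammar are all assumptions made for illustration. Step 2 is folded into the first pass of the loop, since with ov empty only terminal-only bodies qualify.

```python
def generating_variables(productions, terminals):
    """Return the variables that derive some string of terminals.

    productions: list of (head, body) pairs; body is a tuple of symbols.
    terminals:   set of terminal symbols.
    (Both representations are assumptions for this sketch.)
    """
    ov = None      # old_variables; None forces at least one pass
    nv = set()     # new_variables
    while ov != nv:
        ov = set(nv)  # Step 3a: copy new variables to old variables
        # Step 3b: add every head whose body uses only terminals and
        # variables already in ov.
        for head, body in productions:
            if all(sym in terminals or sym in ov for sym in body):
                nv.add(head)
    return ov      # Step 4: V1 = ov


# Hypothetical grammar: S -> aA | B, A -> a, B -> Bb
sample = [
    ("S", ("a", "A")),
    ("S", ("B",)),
    ("A", ("a",)),
    ("B", ("B", "b")),
]
V1 = generating_variables(sample, {"a", "b"})
print(sorted(V1))
```

In this sample, B never derives a string of terminals, so it is left out of V1 and every production that mentions B can be dropped.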
• Stage 2: Obtain the set of variables and
terminals that are reachable from the start
symbol. Productions that are never used are
useless. This set can be obtained as shown below:
• Given a CFG G = (V, T, P, S), we can find an
equivalent grammar G1 = (V1, T1, P1, S) such that
for each X in (V1 ∪ T1) there exists some α such
that S ⇒* α and X is a symbol in α.
• The algorithm for this is shown below:
V1 = {S}
T1 = ∅
For each A in V1 (including variables added during the loop)
  If A → α then
    Add the variables in α to V1
    Add the terminals in α to T1
  End if
End for
• Using this algorithm, all symbols
(whether variables or terminals) that are
not reachable from the start symbol are
eliminated.
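The Stage 2 loop above can be sketched with an explicit worklist, which makes it clear that variables added to V1 during the loop are themselves processed. This is a sketch under assumptions: the (head, body) production format, the function name reachable_symbols, and the sample grammar are hypothetical and not taken from the slides.

```python
def reachable_symbols(productions, variables, start):
    """Return (V1, T1): the variables and terminals reachable from start.

    productions: list of (head, body) pairs; body is a tuple of symbols.
    variables:   set of all variables of the grammar.
    """
    v1 = {start}        # V1 = {S}
    t1 = set()          # T1 starts empty
    worklist = [start]  # variables whose productions are not yet scanned
    while worklist:
        a = worklist.pop()
        for head, body in productions:
            if head != a:
                continue
            for sym in body:
                if sym in variables:
                    if sym not in v1:
                        v1.add(sym)
                        worklist.append(sym)
                else:
                    t1.add(sym)
    return v1, t1


# Hypothetical grammar: S -> aA, A -> b, C -> c (C is unreachable)
sample = [("S", ("a", "A")), ("A", ("b",)), ("C", ("c",))]
v1, t1 = reachable_symbols(sample, {"S", "A", "C"}, "S")
print(sorted(v1), sorted(t1))
```

Any production whose head is not in V1 can then be removed from the grammar.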
ELIMINATING
ε-PRODUCTIONS
Introduction
Definition
• Let G = (V, T, P, S) be a CFG. A production
in P of the form
A → ε
is called an ε-production or NULL
production.
• After applying the production, the variable
A is erased. For each A in V, if there is a
derivation of the form
A ⇒* ε
• then A is a nullable variable.
• A nullable variable is defined as follows:
– 1. If A → ε is a production in P, then A is a
nullable variable.
– 2. If A → B1B2…Bn is a production in P, and
B1, B2, …, Bn are all nullable variables, then A
is also a nullable variable.
– 3. The variables for which there are
productions of the form shown in rules 1 and
2 are the nullable variables.
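The three rules above amount to a fixed-point computation, which can be sketched as below. The (head, body) production format, the function name nullable_variables, and the sample grammar are assumptions for illustration; an ε-production is represented by an empty body.

```python
def nullable_variables(productions):
    """Return the set of nullable variables.

    productions: list of (head, body) pairs; body is a tuple of symbols,
    and the empty tuple () stands for an epsilon body.
    """
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, body in productions:
            # all() over an empty body is True, which covers rule 1.
            # A body containing a terminal can never satisfy rule 2,
            # because terminals are never added to the nullable set.
            if head not in nullable and all(s in nullable for s in body):
                nullable.add(head)
                changed = True
    return nullable


# Hypothetical grammar: S -> AB, A -> aAA | epsilon, B -> bBB | epsilon
sample = [
    ("S", ("A", "B")),
    ("A", ("a", "A", "A")),
    ("A", ()),
    ("B", ("b", "B", "B")),
    ("B", ()),
]
result = nullable_variables(sample)
print(sorted(result))
```

Here A and B are nullable by rule 1, and S becomes nullable by rule 2 because its body AB consists only of nullable variables.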
ELIMINATING
UNIT PRODUCTIONS
Unit productions
CNF-
CHOMSKY NORMAL FORM
CNF Definition:
Computing the Generating
and Reachable Symbols
Computing the Generating
Symbols
• Example: consider the grammar with
productions S → AB | a and A → b.
• By the basis, a and b are generating. For the
induction, we can use the production A → b to
conclude that A is generating, and we can use
the production S → a to conclude that S is
generating.
• At that point, the induction is finished.
• We cannot use the production S → AB, because
B has not been established to be generating.
• Thus, the set of generating symbols is {a, b, A,
S}.
Computing Reachable Symbols
• Let us consider the inductive algorithm
whereby we find the set of reachable
symbols for the grammar G = (V, T, P, S).
• BASIS: S is surely reachable.
• INDUCTION: Suppose we have
discovered that some variable A is
reachable. Then for all productions with A
in the head, all the symbols of the bodies
of those productions are also reachable.
• By the basis, S is reachable. Since S has
production bodies AB and a, we conclude
that A, B, and a are reachable. B has no
productions, but A has A → b. We
therefore conclude that b is reachable.
Now, no more symbols can be added to
the reachable set, which is {S, A, B, a, b}.
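The two computations can be combined to remove useless symbols from the example grammar S → AB | a, A → b: first drop everything that is not generating, then drop everything that is not reachable in what remains. The sketch below assumes a (head, body) production format and a hypothetical function name eliminate_useless.

```python
def eliminate_useless(productions, terminals, start):
    """Return the productions left after removing useless symbols."""
    # Pass 1: generating symbols (terminals are generating by the basis).
    generating = set(terminals)
    changed = True
    while changed:
        changed = False
        for head, body in productions:
            if head not in generating and all(s in generating for s in body):
                generating.add(head)
                changed = True
    kept = [(h, b) for h, b in productions
            if h in generating and all(s in generating for s in b)]

    # Pass 2: reachable symbols, computed on the reduced grammar.
    reachable = {start}
    changed = True
    while changed:
        changed = False
        for head, body in kept:
            if head in reachable:
                for s in body:
                    if s not in reachable:
                        reachable.add(s)
                        changed = True
    return [(h, b) for h, b in kept if h in reachable]


grammar = [("S", ("A", "B")), ("S", ("a",)), ("A", ("b",))]
reduced = eliminate_useless(grammar, {"a", "b"}, "S")
print(reduced)  # [('S', ('a',))]
```

The order matters. Removing the non-generating B deletes S → AB, after which A is no longer reachable. Running the reachability pass first would keep A, since in the original grammar every symbol is reachable.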
THE PUMPING LEMMA FOR
CONTEXT FREE LANGUAGES
The size of the parse tree
Statement of the Pumping Lemma
Applications of the Pumping Lemma
Introduction
• Now, we shall develop a tool for showing that
certain languages are not context-free.
• The theorem, called the "pumping lemma for
context-free languages," says that in any
sufficiently long string in a CFL, it is possible to
find at most two short, nearby substrings that
we can "pump" in tandem.
• That is, we may repeat both of the strings i
times, for any integer i ≥ 0, and the resulting string
will still be in the language.
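Pumping "in tandem" can be illustrated with a small sketch. The decomposition names u, v, w, x, y are the conventional textbook names and are an assumption here, as are the helper names pump and in_anbn and the choice of the context-free language {a^n b^n | n ≥ 0} as the example.

```python
def pump(u, v, w, x, y, i):
    """Repeat v and x the same number of times i (pumping in tandem)."""
    return u + v * i + w + x * i + y


def in_anbn(s):
    """Membership test for the language { a^n b^n | n >= 0 }."""
    n = len(s) // 2
    return len(s) % 2 == 0 and s == "a" * n + "b" * n


# Decompose z = "aabb" as u="a", v="a", w="", x="b", y="b".
parts = ("a", "a", "", "b", "b")
for i in range(5):
    z = pump(*parts, i)
    print(i, z, in_anbn(z))
```

Every pumped string stays in the language because v and x grow together. Repeating only v, for example "a" + "a" * 2 + "" + "b" + "b", gives "aaabb", which is not of the form a^n b^n.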
The Size of the Parse Trees
• Our first step in deriving a pumping lemma
for CFLs is to examine the shape and size
of parse trees. One of the uses of CNF is
to turn parse trees into binary trees. These
trees have some convenient properties,
one of which we exploit here.
The Statement of the Pumping
Lemma
Applications of Pumping Lemma