0% found this document useful (0 votes)
16 views3 pages

Note

Variability of the subsequent components of mapreduce

Uploaded by

Sundar
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
16 views3 pages

Note

Variability of the subsequent components of mapreduce

Uploaded by

Sundar
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 3

MA513: Formal Languages and Automata Theory

Topic: Properties of Context-free Languages


Lecture Number 28 Date: October 17, 2011

1 Eliminating Unit Productions


Definition: A unit production U in a CFG G = (V, T, R, S) is a production (i.e.,
member of R) of the form U : A → B, where A and B are both variables. The
variable B is said to be A-derivable.
For a variable A ∈ V of the CFG G = (V, T, R, S), the Algorithm 1 find the set of
A-derivable variables.

Algorithm 1 Derivable(G, A)
1: Input: CFG G = (V, T, R, S) and a variable A ∈ V .
2: Output: Set W of A-derivable variables in the grammar G.
3: W ← ∅, W 0 ← ∅
4: for (each production A → B ∈ R) do
5: W = W ∪ {B}
6: end for
7: while (W 0 6= W ) do
8: W0 = W
9: for (each C ∈ W 0 ) do
10: for (each C → B ∈ R such that B 6= A) do
11: W = W ∪ {B}
12: end for
13: end for
14: end while
15: Return (W )

1
• If G = (V, T, R, S) is a CFG with no null production, then we can design a algo-
rithm (see Algorithm 2) to find a CFG G1 = (V, T, R1 , S) having no unit production
such that L(G1 ) = L(G).

Algorithm 2 Elimination of unit Productions(G)


1: Input: CFG G = (V, T, R, S)
2: Output: CFG G1 = (V, T, R1 , S) having no unit production s. t. L(G1 ) = L(G)
3: R1 ← R
4: for (each A ∈ V ) do
5: W = Derivable(G, A) /* Using the Algorithm 1 */
6: for (each B ∈ W ) do
7: for (each non-unit production B → α ∈ R) do
8: if (A → α 6∈ R1 ) then
9: R1 = R1 ∪ {A → α}
10: end if
11: end for
12: end for
13: end for
14: Delete all unit productions from R1
15: Return (G1 = (V, T, R1 , S))

Now, we can summarize the various simplifications on grammar described so far. We


want to convert any CFG G into an equivalent CFG that has no useless symbols,
-productions, or unit productions. Some care must be taken in the order of
application of the constructions. A safe order is:

1. Eliminate -productions.

2. Eliminate unit productions.

3. Eliminate useless symbols.

Theorem: If G is a CFG generating a language that contains at least one string


other than , then there is another CFG G1 such that L(G1 ) = L(G) − {}, and G1
has no -productions, unit productions, or useless symbols.

2 Chomsky Normal Form (CNF)


In this section, we shall show that every nonempty CFL without  has a grammar
G in which all productions are in one of two simple forms, either:

1. A → BC, where each of the A, B, and C are variables, or

2. A → a, where A is a variable and a is a terminal.

2
Further, G has no useless symbols. Such a grammar is said to be Chomsky Normal
Form, or CNF.
To put a grammar in CNF, start with one that satisfies the following restrictions:

1. The grammar has no -productions,

2. The grammar has no unit productions, and

3. The grammar has no useless symbols.

Every production of such a grammar is either of the form A → a, which is already


in a form allowed by CNF, or it has a body 1 of length 2 or more. Our tasks are to:

a) Arrange all that bodies of length 2 or more consists only of variables.

b) Break bodies of length 3 or more into a cascade of productions, each with a


body consisting of two variables.

Construction of (a): For every terminal a that appears in a body of length 2 or


more, create a new variable, say A. This variable has only one production
A → a. Now, we use A in place of a everywhere a appears in body of length
2 or more. At this point, every production has a body that is either a single
terminal or at least two variables and no terminals.

Construction of (b): We break all the productions of the form A → B1 B2 . . . Bk ,


for k ≥ 3 into a group of productions with two variables in each body. We
introduce k − 2 new variables, C1 , C2 , . . . , Ck−2. The original production is
replaced by the following k − 1 productions:
A → B1 C1 , C1 → B2 C2 , C2 → B3 C3 , . . ., Ck−2 → Bk−1 Bk

Theorem: If G is a CFG whose language contains at least one string other than
, then there is a grammar G1 in Chomsky Normal Form, such that L(G1 ) =
L(G) − {}.

1
If A → α is a production, then the part α is said to be body of that production.

You might also like