Homework Four Solution - CSE 355
Please note that there is more than one way to answer most of
these questions. The following only represents a sample solution.
Let R be the set of all regular expressions over {a, b}. Then any element in R must have one of the
seven forms listed in the definition of regular expression given in Definition 3.1. These are precisely
the strings that G can generate, with each production representing one of the generation rules
for regular expressions.
More formally (but not required in your solution), let G = ({S}, {a, b, λ, ∅, +, ∗, (, )}, S, P),
where P is the set of productions given above.
Now, we must show that L(G) = R. Let r ∈ L(G). We will show that r ∈ R by induction on
the number of derivation steps needed to generate r.
– Base Case: Assume r is generated using one derivation step. The only productions in P whose
right side consists of a single terminal generate one of the following strings: r = a, r = b, r = λ
or r = ∅. In each case r is a regular expression and hence r ∈ R.
– S → S∗. In this case S ⇒ S∗ ⇒∗ r1∗, where r1 is a regular expression by the induction
hypothesis. Therefore, r = r1∗ and we get that r ∈ R.
– S → (S). In this case S ⇒ (S) ⇒∗ (r1), where r1 is a regular expression by the
induction hypothesis. Therefore, r = (r1) and we get that r ∈ R.
In each case we get that r ∈ R.
– Therefore, we conclude that if r ∈ L(G) then r ∈ R. Whence, L(G) ⊆ R.
We still have to show that R ⊆ L(G). Assume r ∈ R. We will show that r ∈ L(G) by induction
on the length of r.
– Base Case: Assume |r| = 1. Then r is a primitive regular expression and must have one of
the following forms: r = x, where x ∈ {a, b, λ, ∅}. In each case S ⇒ x will generate r. Thus,
r ∈ L(G).
– Induction Hypothesis: Assume that if |r| < n then S ⇒∗ r, that is r ∈ L(G).
– Induction Step: Assume that |r| = n, with n ≥ 2. Since |r| > 1, r must have one of the
following forms:
– r = r1 + r2 , where r1 and r2 are regular expressions, each with length less than n. By
the induction hypothesis S ⇒∗ r1 and S ⇒∗ r2 . Therefore, S ⇒ S + S ⇒∗ r1 + r2 = r.
Thus, r ∈ L(G).
– r = r1 r2 , where r1 and r2 are regular expressions, each with length less than n. By the
induction hypothesis S ⇒∗ r1 and S ⇒∗ r2 . Therefore, S ⇒ SS ⇒∗ r1 r2 = r. Thus,
r ∈ L(G).
– r = r1∗, where r1 is a regular expression with length less than n. By the induction
hypothesis S ⇒∗ r1. Therefore, S ⇒ S∗ ⇒∗ r1∗ = r. Thus, r ∈ L(G).
– r = (r1), where r1 is a regular expression with length less than n. By the induction
hypothesis S ⇒∗ r1. Therefore, S ⇒ (S) ⇒∗ (r1) = r. Thus, r ∈ L(G).
In each case we get that r ∈ L(G).
– Therefore, we conclude that if r ∈ R then r ∈ L(G). Whence, R ⊆ L(G).
Therefore, we have that L(G) = R and the claim is proved.
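As an informal sanity check on the claim L(G) = R, the productions can be turned into a small
membership test. The Python sketch below assumes that P consists of S → a | b | λ | ∅ | S+S | SS |
S∗ | (S), which is what the cases in the proof suggest; the function name derives is only
illustrative, and the ASCII character * stands in for ∗.

```python
from functools import lru_cache

# Terminals produced directly by S -> a | b | λ | ∅ (assumed productions).
PRIMITIVES = {"a", "b", "λ", "∅"}


@lru_cache(maxsize=None)
def derives(s: str) -> bool:
    """Return True if S =>* s in the assumed grammar G."""
    if s in PRIMITIVES:
        return True
    if len(s) < 2:
        return False
    # S -> S*
    if s.endswith("*") and derives(s[:-1]):
        return True
    # S -> (S)
    if s[0] == "(" and s[-1] == ")" and derives(s[1:-1]):
        return True
    for i in range(1, len(s)):
        left, right = s[:i], s[i:]
        # S -> SS
        if derives(left) and derives(right):
            return True
        # S -> S+S, with the '+' sitting at position i
        if right[0] == "+" and derives(left) and derives(right[1:]):
            return True
    return False
```

For example, derives("(a+b)*") holds, while derives("a+") and derives("*a") do not, matching
which of these strings are regular expressions.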
Removing unit-productions (by Theorem 6.4):
S → abAB|abA|abB|ab,
A → baB|ba,
B → BAa|baB|ba|Ba|Aa|a.
S → CDAB|CDA|CDB|CD,
A → DCB|DC,
B → BAC|DCB|DC|BC|AC|a,
C → a,
D → b.
By step 2 of Theorem 6.6, we introduce variables to shorten the right sides of the productions.
S → EF|EA|EB|CD,
A → GB|DC,
B → BH|GB|DC|BC|AC|a,
E → CD,
F → AB,
G → DC,
H → AC,
C → a,
D → b.
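One way to double-check the result is to confirm mechanically that every production has the
Chomsky normal form shape and to try a few strings with the CYK algorithm. The following Python
sketch encodes the final grammar above; the names CNF, is_cnf and cyk are hypothetical helpers,
not part of the assignment.

```python
# Final grammar from the solution: variable -> list of right-hand sides.
# A right-hand side is either one terminal or a pair of variables.
CNF = {
    "S": ["EF", "EA", "EB", "CD"],
    "A": ["GB", "DC"],
    "B": ["BH", "GB", "DC", "BC", "AC", "a"],
    "E": ["CD"], "F": ["AB"], "G": ["DC"], "H": ["AC"],
    "C": ["a"], "D": ["b"],
}


def is_cnf(grammar):
    """Check that every right side is one terminal or exactly two variables."""
    for rhss in grammar.values():
        for rhs in rhss:
            one_terminal = len(rhs) == 1 and rhs.islower()
            two_variables = len(rhs) == 2 and all(c in grammar for c in rhs)
            if not (one_terminal or two_variables):
                return False
    return True


def cyk(grammar, word, start="S"):
    """CYK membership test; table[i][l] holds the variables deriving word[i:i+l]."""
    n = len(word)
    if n == 0:
        return False  # the grammar has no λ-production
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][1] = {v for v, rhss in grammar.items() if ch in rhss}
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[i][split]
                right = table[i + split][length - split]
                for v, rhss in grammar.items():
                    if any(len(r) == 2 and r[0] in left and r[1] in right
                           for r in rhss):
                        table[i][length].add(v)
    return start in table[0][n]
```

For instance, cyk(CNF, "abba") succeeds through S → EA with E ⇒∗ ab and A ⇒∗ ba, while
cyk(CNF, "ba") fails because every string derived from S begins with ab.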
1. Add new states qs and qf to Q. That is, Q̂ = Q ∪ {qs, qf}.
3. Add a new stack symbol $ to Γ. That is, Γ̂ = Γ ∪ {$}. This will be used to ensure that M̂
does not accept in a nonfinal state just because the stack of M is empty after reading the
input.
4. Create δ̂ as follows:
– Add the transition δ̂(qs, λ, z) = {(q0, z$)}. This places our new stack symbol on the
bottom of the stack.
– Add the transitions δ̂(p, λ, a) = {(qf, λ)}, for all p ∈ F, a ∈ Γ̂. This nondeterministically
handles the case where we are in a final state with no input left but still have elements on the stack.
– Add the transition δ̂(qf, λ, a) = {(qf, λ)}, for all a ∈ Γ̂. This empties the stack in the
new state, thus accepting.
– All other transitions are the same as in δ. That is, δ̂(q, b, a) = δ(q, b, a), for all q ∈ Q,
b ∈ Σ, and a ∈ Γ.
Now we must show that L(M) = N(M̂). If a string is not in L(M), then every state
that M can reach with no input left to read must be a nonfinal state. The string will end in
the same set of states in M̂ (since the only transitions to qf are from final states). However, the
stack must still have the symbol $ on it, since the only way to pop it is from a final state or from qf.
Therefore, the string is also not in N(M̂).
Conversely, if a string is in L(M), then there must be some sequence of moves so that M ends
in a final state with no input remaining to be read. M̂ can reach the same final state, and from
there it can move to qf and empty the stack. Therefore, the string is also in N(M̂).
Thus, L(M) = N(M̂).
More formally (again not required), w ∈ L(M) iff there is a sequence of moves in M,
(q0, w, z) ⊢∗ (p, λ, ab) with p ∈ F, a ∈ Γ ∪ {λ} and b ∈ Γ∗, iff there is a sequence of moves in M̂,
(qs, w, z) ⊢ (q0, w, z$) ⊢∗ (p, λ, ab$) ⊢ (qf, λ, b$) ⊢∗ (qf, λ, λ) (since M̂ has the same moves
as M except for the move from the new start state and the moves out of final states), iff w ∈ N(M̂).
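The construction steps above can be sketched as a function that builds the transition table of
the new machine from that of M. This is only an illustration under stated assumptions: the
dictionary encoding, the function name to_empty_stack and the sample machine are hypothetical;
qs is assumed to become the new start state; and the moves out of final states are added as a
union with any λ-moves M already has there.

```python
def to_empty_stack(delta, states, gamma, finals, q0="q0", z="z",
                   qs="qs", qf="qf", bottom="$"):
    """Return (states, stack alphabet, transitions, start state) of the new machine.

    delta maps (state, input, stack_top) to a set of (state, pushed_string);
    "" stands for λ and pushed strings are written top-of-stack first.
    """
    new_states = set(states) | {qs, qf}      # step 1
    new_gamma = set(gamma) | {bottom}        # step 3
    # Step 4, last bullet: keep every move of the original machine.
    new_delta = {key: set(moves) for key, moves in delta.items()}
    # Step 4, first bullet: place $ underneath the original start symbol.
    new_delta[(qs, "", z)] = {(q0, z + bottom)}
    for a in new_gamma:
        for p in finals:
            # Step 4, second bullet. Assumption: union with the λ-moves that
            # M already has from p, so that M's own moves are preserved.
            new_delta.setdefault((p, "", a), set()).add((qf, ""))
        # Step 4, third bullet: qf pops whatever is left on the stack.
        new_delta[(qf, "", a)] = {(qf, "")}
    # Assumption: qs is the start state of the new machine.
    return new_states, new_gamma, new_delta, qs


# Hypothetical final-state machine for {a^n b^n : n >= 1}, used only as input.
M_DELTA = {
    ("q0", "a", "z"): {("q0", "az")},
    ("q0", "a", "a"): {("q0", "aa")},
    ("q0", "b", "a"): {("q1", "")},
    ("q1", "b", "a"): {("q1", "")},
    ("q1", "", "z"): {("q2", "z")},
}
NEW_STATES, NEW_GAMMA, NEW_DELTA, NEW_START = to_empty_stack(
    M_DELTA, {"q0", "q1", "q2"}, {"a", "z"}, {"q2"})
```

In the sample, the only way to remove $ is through the final state q2 followed by qf, which is
the property the argument above relies on.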
S → aABB|aAA,
A → aBB|a,
B → bBB|A.
The grammar does not have any λ-productions, so we can convert it into Greibach normal form.
Applying Theorem 6.1 to the last production (substituting the right sides of A for A in B → A) gives
S → aABB|aAA,
A → aBB|a,
B → bBB|aBB|a.
Applying the construction in Theorem 7.1, define M = ({q0, q1, qf}, Σ, {S, A, B, z}, δ, q0, z, {qf})
with
δ(q0 , λ, z) = {(q1 , Sz)},
δ(q1 , a, S) = {(q1 , ABB), (q1 , AA)},
δ(q1 , a, A) = {(q1 , BB), (q1 , λ)},
δ(q1 , b, B) = {(q1 , BB)},
δ(q1 , a, B) = {(q1 , BB), (q1 , λ)},
δ(q1 , λ, z) = {(qf , z)}.
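The transitions above follow a fixed pattern: each production X → a u in Greibach normal form
becomes the move (q1, u) ∈ δ(q1, a, X). A short Python sketch of that pattern, with the
hypothetical names GNF and gnf_to_pda and with "" standing for λ, reproduces the table:

```python
from collections import defaultdict

# Grammar in Greibach normal form, as obtained above.
GNF = {
    "S": ["aABB", "aAA"],
    "A": ["aBB", "a"],
    "B": ["bBB", "aBB", "a"],
}


def gnf_to_pda(grammar, start="S"):
    """Build the transition table {(state, input, top): {(state, push)}}."""
    delta = defaultdict(set)
    # Put the start variable on the stack above the bottom marker z.
    delta[("q0", "", "z")].add(("q1", start + "z"))
    for variable, right_sides in grammar.items():
        for rhs in right_sides:
            terminal, rest = rhs[0], rhs[1:]
            # Read the terminal and replace the variable by the remaining variables.
            delta[("q1", terminal, variable)].add(("q1", rest))
    # Move to the final state once only z is left on the stack.
    delta[("q1", "", "z")].add(("qf", "z"))
    return dict(delta)
```

Calling gnf_to_pda(GNF) yields six entries, one per line of the definition of δ.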
S → a|aA|B|C,
A → aB|λ,
B → Aa,
C → cCD,
D → ddd.
Following the algorithm given in the proof of Theorem 6.2, we will first remove productions that
cannot generate any strings. Let V1 = ∅. Since S → a, A → λ and D → ddd are productions, we
add S, A and D to V1. Then, since B → Aa is a production and A ∈ V1, we add B to V1. No further
variables can be added: C is never added because its only production, C → cCD, contains C itself.
Removing the useless productions that involve C yields the grammar
S → a|aA|B,
A → aB|λ,
B → Aa,
D → ddd.
Next we remove the unreachable variables. We start with V2 = {S}, since S is the start
variable and is always reachable. We can reach the variables A and B from S, so they are added to V2.
A and B can only reach each other, so the algorithm terminates. Since D is not in V2, it cannot be
reached by any derivation of this grammar. Therefore we remove all productions with D, yielding
the final grammar free of useless productions and variables
S → a|aA|B,
A → aB|λ,
B → Aa.
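The two passes described above (first the variables that can generate a terminal string, then
the variables reachable from S) can be sketched in Python as follows. The encoding of the
grammar and the function names are illustrative assumptions; uppercase letters are variables,
lowercase letters are terminals, and "" stands for λ.

```python
GRAMMAR = {
    "S": ["a", "aA", "B", "C"],
    "A": ["aB", ""],
    "B": ["Aa"],
    "C": ["cCD"],
    "D": ["ddd"],
}


def generating(grammar):
    """Compute V1: variables that derive at least one terminal string."""
    v1 = set()
    changed = True
    while changed:
        changed = False
        for var, right_sides in grammar.items():
            if var in v1:
                continue
            if any(all(c.islower() or c in v1 for c in rhs) for rhs in right_sides):
                v1.add(var)
                changed = True
    return v1


def reachable(grammar, start="S"):
    """Compute V2: variables reachable from the start variable."""
    v2, todo = {start}, [start]
    while todo:
        var = todo.pop()
        for rhs in grammar.get(var, []):
            for c in rhs:
                if c in grammar and c not in v2:
                    v2.add(c)
                    todo.append(c)
    return v2


def remove_useless(grammar, start="S"):
    """Drop non-generating productions first, then unreachable variables."""
    v1 = generating(grammar)
    step1 = {v: [r for r in rhss if all(c.islower() or c in v1 for c in r)]
             for v, rhss in grammar.items() if v in v1}
    v2 = reachable(step1, start)
    return {v: rhss for v, rhss in step1.items() if v in v2}
```

On this grammar, generating returns {S, A, B, D} and remove_useless returns the three-variable
grammar shown above. The order of the passes matters: D only becomes unreachable after the
productions involving C are gone.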
(d) L = {a^n b^(m+n) c^m : n ≥ 0, m ≥ 1}.
We will give the machine definition. It is recommended you draw this out.
Define M = ({q0, q1, q2, q3, q4}, Σ, {a, b, z}, δ, q0, z, {q4}) with
δ(q0, λ, z) = {(q1, z)},
δ(q0, a, z) = {(q0, az)},
δ(q0, a, a) = {(q0, aa)},
δ(q0, b, a) = {(q1, λ)},
δ(q0, b, z) = {(q1, bz)},
δ(q1, b, a) = {(q1, λ)},
δ(q1, b, z) = {(q2, bz)},
δ(q2, b, b) = {(q2, bb)},
δ(q2, c, b) = {(q3, λ)},
δ(q3, c, b) = {(q3, λ)},
δ(q3, λ, z) = {(q4, λ)}.
Then M accepts L. It does not accept the empty string. For each a at the start of the input (if
any), it pushes an a onto the stack. Then it pops an a for every b seen, if any. Once the stack is
back to z and it sees a b, or if the input begins with a b, it pushes a b onto the stack, and it pushes
one more b for each further b. It then pops one b for each c seen, guaranteeing that
n_a(w) + n_c(w) = n_b(w), that the symbols are in the right order, and that n ≥ 0, m ≥ 1. Finally,
it accepts if the input is done and there are no more bs on the stack.
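In place of a drawing, the machine can also be checked by simulation. The sketch below copies
the transitions exactly as listed, explores all configurations breadth-first, and compares the
result with a direct test of the language on every string over {a, b, c} up to a given length.
The names DELTA, accepts, in_language and mismatches are hypothetical, and Σ = {a, b, c} is
assumed.

```python
from collections import deque
from itertools import product

# Transitions of M as listed above. "" stands for λ and pushed strings are
# written top-of-stack first.
DELTA = {
    ("q0", "", "z"): {("q1", "z")},
    ("q0", "a", "z"): {("q0", "az")},
    ("q0", "a", "a"): {("q0", "aa")},
    ("q0", "b", "a"): {("q1", "")},
    ("q0", "b", "z"): {("q1", "bz")},
    ("q1", "b", "a"): {("q1", "")},
    ("q1", "b", "z"): {("q2", "bz")},
    ("q2", "b", "b"): {("q2", "bb")},
    ("q2", "c", "b"): {("q3", "")},
    ("q3", "c", "b"): {("q3", "")},
    ("q3", "", "z"): {("q4", "")},
}


def accepts(word, delta=DELTA, start="q0", bottom="z", finals=("q4",)):
    """Acceptance by final state, searching all reachable configurations."""
    first = (start, 0, bottom)
    seen, queue = {first}, deque([first])
    while queue:
        state, pos, stack = queue.popleft()
        if pos == len(word) and state in finals:
            return True
        if not stack:
            continue
        top, rest = stack[0], stack[1:]
        moves = [("", pos)]                      # λ-move, no input consumed
        if pos < len(word):
            moves.append((word[pos], pos + 1))   # move that reads one symbol
        for symbol, next_pos in moves:
            for next_state, push in delta.get((state, symbol, top), ()):
                cfg = (next_state, next_pos, push + rest)
                if cfg not in seen:
                    seen.add(cfg)
                    queue.append(cfg)
    return False


def in_language(word):
    """Direct test for a^n b^(m+n) c^m with n >= 0 and m >= 1."""
    n = len(word) - len(word.lstrip("a"))
    m = len(word) - len(word.rstrip("c"))
    return m >= 1 and word == "a" * n + "b" * (n + m) + "c" * m


def mismatches(max_len):
    """Strings up to max_len on which the machine and the direct test disagree."""
    words = ("".join(t) for k in range(max_len + 1)
             for t in product("abc", repeat=k))
    return [w for w in words if accepts(w) != in_language(w)]
```

A bounded search of this kind is evidence rather than a proof, but it catches most slips in a
transition table quickly.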
(f) L = {a^n b^m : n ≤ m ≤ 3n}.
We will give the machine definition. It is recommended you draw this out.
Define M = ({q0, q1, q2}, Σ, {a, b, z}, δ, q0, z, {q2}) with
δ(q0, λ, z) = {(q2, z)},
δ(q0, a, z) = {(q0, az), (q0, aaz), (q0, aaaz)},
δ(q0, a, a) = {(q0, aa), (q0, aaa), (q0, aaaa)},
δ(q0, b, a) = {(q1, λ)},
δ(q1, b, a) = {(q1, λ)},
δ(q1, λ, z) = {(q2, z)}.
Then M accepts L. It accepts the empty string through the λ-transition from q0 to q2. In state q0,
each a read pushes one, two, or three as onto the stack, so after reading a^n the stack can hold
any number of as between n and 3n. State q1 then pops one a for each b, and M moves from q1 to q2
exactly when the stack is back to z, that is, when the number of bs matches a stack height between
n and 3n.
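The key step in this argument is that pushing one, two, or three symbols per a makes every
stack height from n to 3n reachable, and no others. A minimal Python sketch of that counting
claim (the function name reachable_heights is illustrative):

```python
from itertools import product


def reachable_heights(n):
    """Stack heights possible after reading a^n when each a pushes 1, 2 or 3 symbols."""
    return {sum(choice) for choice in product((1, 2, 3), repeat=n)}
```

For every small n tried, reachable_heights(n) equals the set of integers from n to 3n, which is
exactly the range of b-counts the machine should accept.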
(h) L = {w : n_a(w) = 2n_b(w)}.
We will give the machine definition. It is recommended you draw this out.
Define M = ({q0, q1, q2}, Σ, {a, b, z}, δ, q0, z, {q2}) with
δ(q0, λ, z) = {(q2, z)},
δ(q0, c, z) = {(q0, z)},
δ(q0, c, a) = {(q0, a)},
δ(q0, c, b) = {(q0, b)},
δ(q0, a, z) = {(q0, az)},
δ(q0, b, z) = {(q0, bbz)},
δ(q0, a, a) = {(q0, aa)},
δ(q0, b, b) = {(q0, bbb)},
δ(q0, a, b) = {(q0, λ)},
δ(q0, b, a) = {(q0, b), (q1, λ)},
δ(q1, λ, a) = {(q0, λ)}.
Then M accepts L. It accepts the empty string. If it sees a c, it just reads it and does not modify
the stack. If it sees an a while the stack is empty or the top of the stack is a, it pushes an a onto
the stack. If it sees a b while the stack is empty or the top of the stack is b, it pushes two bs onto
the stack. If it sees an a while the top of the stack is b, it pops that b off the stack. If it
encounters a b while the top of the stack is a, it nondeterministically takes one of two paths. When
only one a remains on the stack, it replaces that a with a b, since one a cancels only half of the b.
When at least two as remain, it goes to state q1 and pops an a, then pops a second a and returns to
state q0. Pushing two bs for each input b and one a for each input a, and popping symbols
in the aforementioned manner, ensures that M accepts exactly the strings with n_a(w) = 2n_b(w).
Define M = ({q0, q1}, Σ, {a, b, z}, δ, q0, z, {q1}) with
δ(q0, a, z) = {(q0, az)},
δ(q0, a, a) = {(q0, aa)},
δ(q0, a, b) = {(q0, λ)},
δ(q0, b, z) = {(q0, bz)},
δ(q0, b, a) = {(q0, λ)},
δ(q0, b, b) = {(q0, bb)},
δ(q0, λ, b) = {(q1, λ)},
δ(q0, c, z) = {(q0, z)},
δ(q0, c, a) = {(q0, a)},
δ(q0, c, b) = {(q0, b)}.
Then M accepts L since it follows Example 7.4 to compare the numbers of as and bs,
but now it accepts only if there is a b on top of the stack. The symbol on top of the stack is always
the symbol we have read more of thus far while processing the string. Therefore, if the input is
done and there is a b at the top of the stack, then there are more bs than as. Additionally, if it sees
a c, it just reads it and does not modify the stack.
This satisfies condition 2 but not 1 (conditions in Section 7.2, Context-Free Grammars for Pushdown Automata).
In order to satisfy condition 1, we add two more rules to the CFG and make q2 the only accepting
state (q1 now empties the stack).
The last three transitions are of the form (7.5). So they yield the corresponding productions
(q0 A q1) → a,
(q1 A q1) → λ,
(q1 z q2) → λ.
(q0 z q0) → a(q0 A q0)(q0 z q0) | a(q0 A q1)(q1 z q0) | a(q0 A q2)(q2 z q0),
(q0 z q1) → a(q0 A q0)(q0 z q1) | a(q0 A q1)(q1 z q1) | a(q0 A q2)(q2 z q1),
(q0 z q2) → a(q0 A q0)(q0 z q2) | a(q0 A q1)(q1 z q2) | a(q0 A q2)(q2 z q2),
(q0 A q0) → b(q0 A q0)(q0 A q0) | b(q0 A q1)(q1 A q0) | b(q0 A q2)(q2 A q0),
(q0 A q1) → b(q0 A q0)(q0 A q1) | b(q0 A q1)(q1 A q1) | b(q0 A q2)(q2 A q1),
(q0 A q2) → b(q0 A q0)(q0 A q2) | b(q0 A q1)(q1 A q2) | b(q0 A q2)(q2 A q2).
(q1 z q0), (q1 A q0), (q1 z q1), (q1 A q2), (q2 z q0), (q2 z q1), (q2 z q2), (q2 A q0), (q2 A q1) and
(q2 A q2) do not occur on the left side of any production, so they are useless and can be eliminated.
Also, the production (q0 A q0) → b(q0 A q0)(q0 A q0) has no way of terminating, so it is useless and
we can eliminate it. Then there is no production with (q0 A q0) on the left-hand side, so we eliminate
any production containing it on the right-hand side. We are then left with the following shorter grammar
(q0 A q1) → a,
(q1 A q1) → λ,
(q1 z q2) → λ,
(q0 z q2) → a(q0 A q1)(q1 z q2),
(q0 A q1) → b(q0 A q1)(q1 A q1).
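To see what the shortened grammar generates, its short strings can be enumerated by bounded
leftmost derivation. The Python sketch below assumes (q0 z q2) is the start variable, as in the
standard construction; variables are encoded as tuples and the names P, START and generate are
hypothetical.

```python
from collections import deque

# Shortened grammar from above. A variable (qi X qj) is the tuple (qi, X, qj);
# an empty right side stands for λ.
Q0AQ1, Q1AQ1 = ("q0", "A", "q1"), ("q1", "A", "q1")
Q1ZQ2, START = ("q1", "z", "q2"), ("q0", "z", "q2")
P = {
    Q0AQ1: [["a"], ["b", Q0AQ1, Q1AQ1]],
    Q1AQ1: [[]],
    Q1ZQ2: [[]],
    START: [["a", Q0AQ1, Q1ZQ2]],
}


def generate(max_len):
    """Terminal strings of length <= max_len derivable from START."""
    words, seen = set(), set()
    queue = deque([(START,)])
    while queue:
        form = queue.popleft()
        # Position of the leftmost variable, if any.
        idx = next((i for i, s in enumerate(form) if isinstance(s, tuple)), None)
        if idx is None:
            words.add("".join(form))
            continue
        for rhs in P[form[idx]]:
            new = form[:idx] + tuple(rhs) + form[idx + 1:]
            terminals = sum(1 for s in new if isinstance(s, str))
            # Prune forms that already contain too many terminals.
            if terminals <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return words
```

The strings found up to length 5 are aa, aba, abba and abbba, which suggests that the grammar
generates the strings of the form a b^k a with k ≥ 0.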