Unit 1 - Merged
Text Books:
1. Introduction to Languages and the Theory of Computation – John C.
Martin (MGH) – Chapters 1, 2, 3, 4, 5, 6, 7, 8
2. Discrete Mathematical Structures with Applications to Computer
Science – J. P. Tremblay & R. Manohar (MGH) – Chapter 1
Unit 1: Mathematical Induction, Regular Languages &
Finite Automata
• The Principle of Mathematical Induction
• Recursive definitions
• Definition and types of grammars and languages
• Regular expressions and corresponding regular
languages, examples and applications
• Unions, intersections and complements of regular
languages
• Finite automata: definition and representation
• Non-deterministic FA, NFA with null transitions
• Equivalence of FAs, NFAs and NFAs with null transitions
The Principle of Mathematical Induction
• Mathematical induction is a technique for proving that a statement,
theorem or formula which is thought to be true holds for each and
every natural number n.
• Stated as a general principle that can be used to prove any such
mathematical statement, this technique is called the 'Principle of
Mathematical Induction'.
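As a standard illustration (not taken from the slides), the principle can be applied to the claim that the sum of the first n natural numbers is n(n+1)/2. The basis step checks n = 1, and the inductive step derives the claim for k+1 from the claim for k:

```latex
\textbf{Claim.} For every natural number $n \ge 1$,
$\sum_{i=1}^{n} i = \frac{n(n+1)}{2}$.

\textbf{Basis step} ($n = 1$):
$\sum_{i=1}^{1} i = 1 = \frac{1 \cdot 2}{2}$.

\textbf{Inductive step.} Assume $\sum_{i=1}^{k} i = \frac{k(k+1)}{2}$
for some $k \ge 1$. Then
\[
  \sum_{i=1}^{k+1} i
  = \frac{k(k+1)}{2} + (k+1)
  = \frac{(k+1)(k+2)}{2},
\]
which is the claim for $n = k + 1$.
```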
Proof (that every regular language is accepted by an FA):
• The proof is by (general) induction
following the recursive definition of a regular
language.
• The inductive proof consists of two steps:
– Basis Step
– Inductive Step
Basis Step:
The languages Φ, {Є} and {a}, for
any symbol a in Σ, are each accepted by an FA.
Inductive Step:
• We show that for any languages L1 and
L2, if they are accepted by FAs, then L1.L2, L1 U L2 and
L1* are also accepted by FAs.
• Since any regular language is obtained from Φ, {Є} and {a},
for any symbol a in Σ, by using the union, concatenation and
Kleene star operations, this together with the Basis Step
proves the theorem.
• Suppose that L1 and L2 are accepted by the FAs
M1 = < Q1, Σ, q1,0, δ1, A1 > and
M2 = < Q2, Σ, q2,0, δ2, A2 >, respectively.
We assume that Q1 ∩ Q2 = Φ without loss of
generality, since states can be renamed if necessary.
• Then L1.L2, L1 U L2 and L1* are each
accepted by an FA.
For L1 U L2, the accepting FA is denoted Mu = < Qu, Σ, qu,0, δu, Au >.
Example: Remove the useless symbols from the following grammar:
S → aB | bX
A → Bad | bSX | a
B → aSB | bBX
X → SBd | aBX | ad
Here,
A and X can directly generate terminal strings, since we have the
productions A → a and X → ad. So, A and X are generating
symbols.
Also,
S → bX and X generates a terminal string, so S can also
generate a terminal string. Hence, S is also a generating
symbol.
B cannot produce any terminal string, so it is
non-generating.
Hence, the new grammar after removing
non-generating symbols is:
S → bX
A → bSX | a
X → ad
• Here, A is non-reachable, as there is no derivation
of the form S ⇒* α A β in the grammar. Thus,
eliminating the non-reachable symbols, the
resulting grammar is:
S → bX
X → ad
This is the grammar with only useful symbols.
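The two passes used above (first drop non-generating symbols, then drop non-reachable ones) can be sketched in Python. This is a minimal sketch, not a definitive implementation: the function names and the dictionary representation (each variable mapped to a list of body strings, one character per symbol) are assumptions made for illustration.

```python
def generating_symbols(productions, terminals):
    """Return the variables that can derive a string of terminals."""
    generating = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head in generating:
                continue
            for body in bodies:
                # A body qualifies if every symbol is a terminal
                # or a variable already known to be generating.
                if all(s in terminals or s in generating for s in body):
                    generating.add(head)
                    changed = True
                    break
    return generating


def reachable_symbols(productions, start):
    """Return the variables reachable from the start symbol."""
    reachable = {start}
    stack = [start]
    while stack:
        head = stack.pop()
        for body in productions.get(head, []):
            for s in body:
                if s in productions and s not in reachable:
                    reachable.add(s)
                    stack.append(s)
    return reachable


def remove_useless(productions, terminals, start):
    """Remove non-generating symbols first, then non-reachable ones."""
    gen = generating_symbols(productions, terminals)
    step1 = {
        head: [b for b in bodies
               if all(s in terminals or s in gen for s in b)]
        for head, bodies in productions.items() if head in gen
    }
    reach = reachable_symbols(step1, start)
    return {head: bodies for head, bodies in step1.items() if head in reach}


# The grammar from the example above.
grammar = {
    "S": ["aB", "bX"],
    "A": ["Bad", "bSX", "a"],
    "B": ["aSB", "bBX"],
    "X": ["SBd", "aBX", "ad"],
}
terminals = set("abd")
result = remove_useless(grammar, terminals, "S")
print(result)  # {'S': ['bX'], 'X': ['ad']}
```

The order of the two passes matters: removing non-generating symbols first can make further symbols unreachable, as happens with A here.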
Exercise
1) Remove the useless symbols from the following
grammar:
S → xyZ | XyzZ
X → Xz | xYZ
Y → yYy | XZ
Z → Zy | z
2) Remove the useless symbols from the following grammar:
S → aC | SB
A → bSCa
B → aSB | bBC
C → aBc | ad
2) Eliminating "Null-productions":
A grammar is said to have Є-productions
if there is a production of
the form
A → Є.
Our strategy is to begin by
discovering which variables are
"nullable". A variable A is nullable if A ⇒* Є.
For example, consider the grammar:
S → 0A0 | 1B1 | BB
A→C
B→S|A
C→S|Є
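The nullable variables of this grammar can be found by a simple fixed-point computation: a variable is nullable if some body consists only of nullable variables (the empty body counts). The Python sketch below is illustrative only; the function name and the use of the empty string "" to stand for Є are assumptions.

```python
def nullable_variables(productions):
    """Return the set of nullable variables.

    Bodies are strings with one character per symbol;
    the empty string "" stands for Є.
    """
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head in nullable:
                continue
            # all() over an empty body is True, which handles A -> Є.
            if any(all(s in nullable for s in body) for body in bodies):
                nullable.add(head)
                changed = True
    return nullable


# The grammar shown above.
grammar = {
    "S": ["0A0", "1B1", "BB"],
    "A": ["C"],
    "B": ["S", "A"],
    "C": ["S", ""],
}
print(sorted(nullable_variables(grammar)))  # ['A', 'B', 'C', 'S']
```

Here C is nullable directly (C → Є), then A (via A → C), then B (via B → A), and finally S (via S → BB), so every variable is nullable.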
1) CNF - Chomsky Normal Form:
A context-free grammar is said to be in Chomsky
normal form if every production is of one of
these two types:
A → BC (where B and C are variables)
A → a (where a is a terminal symbol)
Thus a grammar in CNF is one which has
no:
• Є-productions
• Unit productions
• Useless symbols
Algorithm to convert CFG into CNF:
• Step 1: Eliminate Є-productions, unit productions and
useless symbols from the given CFG.
• Step 2: If all the productions are of the form
A → a and A → BC with A, B, C ∈ V and
a ∈ T, we are done.
Otherwise, we have to do two tasks:
1. Arrange that all bodies of length 2 or more
consist only of variables.
2. Break bodies of length 3 or more into a
cascade of productions, each with a body
consisting of two variables.
• The construction for task (1) is as follows:
if a production is of the form
A → X1 X2 … Xm, m ≥ 2, and some
Xi is a terminal a,
then we replace Xi by a new variable Ca
with the production Ca → a.
As a result, all such productions take
the form
A → B1 B2 … Bm, m ≥ 2,
where all the Bi's are non-terminals.
• The construction for task (2) is as follows:
We break each production
A → B1 B2 … Bm, for m ≥ 3, into a group of
productions with two variables in each body.
We introduce m-2 new variables
C1, C2, …, Cm-2.
The original production is replaced by the m-1
productions:
A → B1 C1,
C1 → B2 C2,
…
Cm-2 → Bm-1 Bm
• Finally, all of the productions are
of the form
A → BC or A → a
This is a grammar in CNF. It contains no
Є-productions, so the language it generates
does not include Є.
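Tasks (1) and (2) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the input grammar is assumed to have already passed Step 1, bodies are lists of symbols, and the fresh variable names (C_a for a terminal a, and D1, D2, … for the cascade variables) are hypothetical and assumed not to clash with existing variables. The small grammar S → aSb | ab is chosen only for illustration.

```python
def to_cnf(productions, terminals):
    """Apply tasks (1) and (2) to a grammar that already has no
    Є-productions, unit productions or useless symbols.

    productions: dict mapping a variable to a list of bodies,
    each body being a list of symbols.
    """
    # Task 1: in bodies of length >= 2, replace each terminal a by C_a.
    step1 = {}
    term_var = {}  # terminal -> name of its new variable
    for head, bodies in productions.items():
        step1[head] = []
        for body in bodies:
            if len(body) >= 2:
                new_body = []
                for s in body:
                    if s in terminals:
                        term_var.setdefault(s, "C_" + s)
                        new_body.append(term_var[s])
                    else:
                        new_body.append(s)
                body = new_body
            step1[head].append(list(body))
    for a, var in term_var.items():
        step1[var] = [[a]]

    # Task 2: break bodies of length >= 3 into a cascade of
    # productions with exactly two variables in each body.
    final = {}
    counter = 0
    for head, bodies in step1.items():
        final.setdefault(head, [])
        for body in bodies:
            current = head
            while len(body) > 2:
                counter += 1
                new_var = "D" + str(counter)
                final.setdefault(current, []).append([body[0], new_var])
                current = new_var
                body = body[1:]
            final.setdefault(current, []).append(body)
    return final


# Illustrative grammar (an assumption, not from the slides): S -> aSb | ab
grammar = {"S": [["a", "S", "b"], ["a", "b"]]}
terminals = {"a", "b"}
cnf = to_cnf(grammar, terminals)
for head, bodies in cnf.items():
    print(head, "->", " | ".join(" ".join(b) for b in bodies))
```

For this input the result is S → C_a D1 | C_a C_b, D1 → S C_b, C_a → a, C_b → b, so every production has either two variables or a single terminal in its body.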
• Consider an example: Convert the following CFG to CNF.
S → AAC
A → aAb | Є
C → aC | a
Solution: 1) First, removing Є-productions:
Here, A is a nullable symbol, as A → Є.
So, eliminating the Є-productions,
we have:
S→ AAC | AC | C
A→ aAb | ab
C→ aC | a
2) Removing unit-productions:
Here, the only unit pair we have is (S, C), as S → C.
So, removing the unit production,
we have the CFG:
S → AAC | AC | aC | a
A→ aAb | ab
C→ aC | a
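The two steps of this solution can be reproduced with a short Python sketch. It is illustrative only: the function names are hypothetical, bodies are strings with one character per symbol, and the empty string "" stands for Є. The sketch also assumes the start symbol is not nullable, so removing Є-productions does not change the language.

```python
from itertools import product


def remove_epsilon(productions):
    """Eliminate Є-productions (bodies equal to "")."""
    # Find the nullable variables by fixed-point iteration.
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head not in nullable and any(
                all(s in nullable for s in b) for b in bodies
            ):
                nullable.add(head)
                changed = True
    result = {}
    for head, bodies in productions.items():
        new_bodies = []
        for body in bodies:
            # Each nullable symbol may either be kept or dropped.
            options = [(s, "") if s in nullable else (s,) for s in body]
            for choice in product(*options):
                candidate = "".join(choice)
                if candidate and candidate not in new_bodies:
                    new_bodies.append(candidate)
        result[head] = new_bodies
    return result


def remove_unit(productions):
    """Replace each unit production A -> B by the non-unit bodies of B."""
    variables = set(productions)
    result = {}
    for head in productions:
        # Variables reachable from head using unit productions only.
        unit_reach = [head]
        for v in unit_reach:  # the list grows while we iterate over it
            for body in productions[v]:
                if body in variables and body not in unit_reach:
                    unit_reach.append(body)
        new_bodies = []
        for v in unit_reach:
            for body in productions[v]:
                if body not in variables and body not in new_bodies:
                    new_bodies.append(body)
        result[head] = new_bodies
    return result


grammar = {"S": ["AAC"], "A": ["aAb", ""], "C": ["aC", "a"]}
no_eps = remove_epsilon(grammar)
no_unit = remove_unit(no_eps)
print(no_eps)   # {'S': ['AAC', 'AC', 'C'], 'A': ['aAb', 'ab'], 'C': ['aC', 'a']}
print(no_unit)  # {'S': ['AAC', 'AC', 'aC', 'a'], 'A': ['aAb', 'ab'], 'C': ['aC', 'a']}
```

Both printed grammars agree with the results of steps 1) and 2) above.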