Lectures 1 To 31
Spring 2024
Disclaimer
This is a compiled version of class notes scribed by students registered for CS
208 (Automata Theory and Logic) in Spring 2024. Please note this document
has not received the usual scrutiny that formal publications enjoy. This may
be distributed outside this class only with the permission of the instructor.
Contents
1 Propositional Logic
1.1 Syntax
1.2 Semantics
1.2.1 Important Terminology
1.3 Proof Rules
1.4 Natural Deduction
1.5 Soundness and Completeness of our proof system
1.6 What about Satisfiability?
1.7 Algebraic Laws and Some Redundancy
1.7.1 Distributive Laws
1.7.2 Reduction of bi-implication and implication
1.7.3 DeMorgan's Laws
1.8 Negation Normal Forms
1.9 From DAG to NNF-DAG
1.10 An Efficient Algorithm to convert DAG to NNF-DAG
1.11 Conjunctive Normal Forms
1.12 Satisfiability and Validity Checking
1.13 DAG to Equisatisfiable CNF
1.14 Tseitin Encoding
1.15 Towards Checking Satisfiability of CNF and Horn Clauses
1.16 Counter example for Horn Formula
1.16.1 Example
1.17 Davis Putnam Logemann Loveland (DPLL) Algorithm
1.18 DPLL in action
1.18.1 Example
1.19 Applying DPLL Algorithm to Horn Formulas
1.20 DPLL on Horn Clauses
1.21 Rule of Resolution
1.21.1 Completeness of Resolution for Unsatisfiability of CNFs
4 Regular Expressions
4.1 Introduction
4.2 Formal Definition of a Regular Expression
4.3 Semantics of Regular Language
4.3.1 Atomic Expressions
4.3.2 Union Operation
4.3.3 Concatenation Operation
4.3.4 Example
4.3.5 Order of Precedence
4.3.6 Kleene Star
4.3.7 Example
4.3.8 Further Examples
4.4 Kleene's Theorem
4.4.1 Part 1: L(Reg. Ex.) ⊆ L(NFAs with ϵ edges)
4.4.2 Part 2: L(Reg. Ex.) ⊇ L(NFAs with ϵ edges)
4.4.3 Checking Subsethood of Languages
5 DFA Minimisation
5.1 Minimum States in a DFA
5.2 Indistinguishability
5.3 Equivalence classes of Indistinguishability relation
5.4 Further Analysis
5.4.1 Optimality of Acquired DFA
5.4.2 Uniqueness of Acquired DFA
5.5 From states to words
5.5.1 Language of word
5.5.2 Relation between states of minimal DFA and equivalence classes for ∼L
5.6 Setting up the Parallel
5.6.1 Can |∼L| > |≡|?
5.6.2 Can |∼L| < |≡|?
5.7 Myhill-Nerode Theorem
7 Pushdown Automata
7.1 Pushdown Automata for non-regular Languages
7.2 Description
7.2.1 Conventions
7.2.2 Example
7.2.3 Example
7.3 Acceptance through empty stack
7.3.1 For any automaton A, L(A) is not necessarily equivalent to N(A)
7.3.2 Construction for A2 such that L(A1) = N(A2)
7.4 From Empty Stack PDA to Final State PDA
7.4.1 Construction
7.4.2 Required to show N(P) = L(PF)
7.5 Context Free Languages
7.5.1 DPDA and NPDAs
7.5.2 Build-up and emptying of the stack of a PDA
7.6 Acceptance by PDA
7.6.1 Run on the PDA
7.6.2 Recurrence Relations on Languages
7.7 More on recurrence relations
7.7.1 Smallest and Largest Languages
7.7.2 Context Free Grammar
Chapter 1
Propositional Logic
In this course we look at two views of computation: a state-transition view and a logic-centric view.
In this chapter we begin with the logic-centric view, with a discussion of propositional logic.
Example. Suppose there are five courses C1, . . . , C5, four slots S1, . . . , S4, and five days D1, . . . , D5. We plan to schedule each course in three slots, but we also have the following requirements:
• For every course Ci, the three slots should be on three different days.
• For every day Di of the week, at least one slot should be free.
(Figure: an empty 4 × 5 timetable grid with rows S1–S4 and columns D1–D5.)
Propositional logic is used in many real-world problems like timetable scheduling, train scheduling, airline scheduling, and so on. One can capture a problem as a propositional logic formula. This is called encoding. After encoding the problem, one can use various software tools to systematically reason about the formula and draw conclusions about the problem.
1.1 Syntax
We can think of logic as a language which allows us to very precisely describe problems and then
reason about them. In this language, we will write sentences in a specific way. The symbols used in
propositional logic are given in Table 1.1. Apart from the symbols in the table we also use variables
usually denoted by small letters p, q, r, x, y, z, . . . etc. Here is a short description of propositional
logic symbols:
• Variables: They are usually denoted by lowercase letters (p, q, r, x, y, z, . . . ). Variables can take
only true or false values. We use them to denote propositions.
• Constants: The constants are represented by ⊤ and ⊥. These represent truth values true
and false.
• Operators: ∧ is the conjunction operator (also called and), ∨ is the disjunction operator
(also called or), ¬ is the negation operator (also called not), → is implication, and ↔ is
bi-implication (equivalence).
For the timetable example, we can have propositional variables of the form pijk with i ∈ [5], j ∈ [5]
and k ∈ [4] (Note that [n] = {1, . . . , n}) with pijk representing the proposition ‘course Ci is scheduled
in slot Sk of day Dj ’.
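To make the encoding concrete, here is a small Python sketch; the variable numbering and the choice to encode only the "different days" requirement (as pairwise clauses) are our own, not from the lecture:

from itertools import combinations

COURSES, DAYS, SLOTS = 5, 5, 4

def var(i, j, k):
    # Map p_ijk (1-based i, j, k) to a unique positive integer.
    return ((i - 1) * DAYS + (j - 1)) * SLOTS + k

clauses = []
# No course occupies two different slots on the same day: this is the
# "three different days" requirement split into pairwise constraints,
# not(p_ijk1 and p_ijk2) for k1 != k2.
for i in range(1, COURSES + 1):
    for j in range(1, DAYS + 1):
        for k1, k2 in combinations(range(1, SLOTS + 1), 2):
            clauses.append([-var(i, j, k1), -var(i, j, k2)])

print(len(clauses), "clauses; first:", clauses[0])

The remaining requirements (exactly three slots per course, one free slot per day) are encoded in the same style, as clauses over the same variables.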
1.2 Semantics
Semantics give a meaning to a formula in propositional logic. The semantics is a function that takes
in the truth values of all the variables that appear in a formula and gives the truth value of the
formula. Let 0 represent “false” and 1 represent “true”. The semantics of a formula φ of n variables
is a function
JφK : {0, 1}^n → {0, 1}
(Figure: (a) a formula with nested implications and its two possible parse trees: (b) the parse tree for p1 → (p2 → (p3 → p4)) and (c) the parse tree for (p1 → p2) → (p3 → p4).)
It is often presented in the form of a truth table. Truth tables of the operators can be found in Table 1.2.
Table 1.2: Truth tables of the operators.
(a) Truth table for ¬φ:
φ | ¬φ
0 | 1
1 | 0
(b) Truth table for φ1 ∧ φ2:
φ1 φ2 | φ1 ∧ φ2
0  0  | 0
0  1  | 0
1  0  | 0
1  1  | 1
(c) Truth table for φ1 ∨ φ2:
φ1 φ2 | φ1 ∨ φ2
0  0  | 0
0  1  | 1
1  0  | 1
1  1  | 1
(d) Truth table for φ1 → φ2:
φ1 φ2 | φ1 → φ2
0  0  | 1
0  1  | 1
1  0  | 0
1  1  | 1
(e) Truth table for φ1 ↔ φ2:
φ1 φ2 | φ1 ↔ φ2
0  0  | 1
0  1  | 0
1  0  | 0
1  1  | 1
Remark. Do not confuse 0 and 1 with ⊤ and ⊥: 0 (false) and 1 (true) are meanings, while ⊤ and
⊥ are symbols.
Rules of semantics: the truth value of a compound formula is computed from the truth values of its parts, following the truth tables of the operators.
Truth Table: A truth table in propositional logic enumerates all possible truth values of a logical expression. It lists the combinations of truth values of the individual propositions together with the resulting truth value of the compound statement.
Example. Let us construct a truth table for J(p ∨ s) → (¬q ↔ r)K (see Table 1.3).
p q r s p∨s ¬q ¬q ↔ r (p ∨ s) → (¬q ↔ r)
0 0 0 0 0 1 0 1
0 0 0 1 1 1 0 0
0 0 1 0 0 1 1 1
0 0 1 1 1 1 1 1
0 1 0 0 0 0 1 1
0 1 0 1 1 0 1 1
0 1 1 0 0 0 0 1
0 1 1 1 1 0 0 0
1 0 0 0 1 1 0 0
1 0 0 1 1 1 0 0
1 0 1 0 1 1 1 1
1 0 1 1 1 1 1 1
1 1 0 0 1 0 1 1
1 1 0 1 1 0 1 1
1 1 1 0 1 0 0 0
1 1 1 1 1 0 0 0
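The same table can be reproduced mechanically; here is a small Python sketch (with the formula hard-coded by hand):

from itertools import product

# Evaluate (p or s) -> (not q <-> r) over all assignments,
# reproducing Table 1.3 row by row.
def formula(p, q, r, s):
    lhs = p or s
    rhs = (not q) == r          # biconditional between (not q) and r
    return (not lhs) or rhs     # implication lhs -> rhs

for p, q, r, s in product([0, 1], repeat=4):
    print(p, q, r, s, int(formula(p, q, r, s)))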
1.2.1 Important Terminology
A formula φ is said to be:
• satisfiable or consistent or sat iff JφK = 1 for some assignment of variables. That is, there
is at least one way to assign truth values to the variables that makes the entire formula true.
A formula and its negation may both be sat at the same time (φ and ¬φ may both be sat).
• unsatisfiable or contradiction or unsat iff JφK = 0 for all assignments of variables. That
is, there is no way to assign truth values to the variables that makes the formula true. If a
formula φ is unsat then ¬φ must be sat (it is in fact valid).
• valid or tautology: JφK = 1 for all assignments of variables. That is, the formula is always
true, no matter how the variables are assigned. If a formula φ is valid then ¬φ is unsat.
• φ semantically entails φ1 iff JφK ⪯ Jφ1K for all assignments of variables, where 0 (false) ⪯
1 (true). This is denoted by φ |= φ1. If φ |= φ1, then for every assignment, if φ evaluates to
1 then φ1 will evaluate to 1. Equivalently, φ → φ1 is valid.
• φ is equisatisfiable with φ1 iff either both are sat or both are unsat. Note that semantic
equivalence implies equisatisfiability but not vice-versa.
Term Example
sat p∨q
unsat p ∧ ¬p
valid p ∨ ¬p
semantically entails ¬p |= p → q
semantically equivalent p → q, ¬p ∨ q
equisatisfiable p ∧ q, r ∨ s
1.3 Proof Rules
A proof rule lets us derive an inference from a set of premises; it is written with the premises above a horizontal line and the inference below. Each rule is named after its:
• Connector: the logical operator over which the rule works. We use the subscript i
(for introduction) if the connector and the premises are combined to get the inference. The
subscript e (for elimination) is used when we eliminate the connector present in the premises
to draw the inference.
Example. Look at the following rule:

    φ1 ∧ φ2
    ───────── ∧e1
       φ1
In the rule above, φ1 ∧ φ2 is assumed (it is the premise). Informally, looking at ∧'s truth table, we can
infer that both φ1 and φ2 are true if φ1 ∧ φ2 is true, so φ1 is an inference. Also, in this process we
eliminate (remove) ∧, so we call this and-elimination or ∧e. For clarity we call this rule ∧e1,
since it is φ1 that is kept in the inference even though either of φ1 and φ2 could be kept. If we keep φ2 in the
inference, the rule becomes ∧e2.
Table 1.5 summarises the basic proof rules that we would like to include in our proof system.
Table 1.5: Basic proof rules.
∧:   ∧i: from φ1 and φ2, infer φ1 ∧ φ2.
     ∧e1: from φ1 ∧ φ2, infer φ1.    ∧e2: from φ1 ∧ φ2, infer φ2.
∨:   ∨i1: from φ1, infer φ1 ∨ φ2.    ∨i2: from φ2, infer φ1 ∨ φ2.
     ∨e: from φ1 ∨ φ2, φ1 → φ3 and φ2 → φ3, infer φ3.
→:   →i: from a boxed derivation that assumes φ1 and concludes φ2, infer φ1 → φ2.
     →e: from φ1 and φ1 → φ2, infer φ2.
¬:   ¬i: from a boxed derivation that assumes φ and concludes ⊥, infer ¬φ.
     ¬e: from φ and ¬φ, infer ⊥.
⊥:   ⊥e: from ⊥, infer any φ.
¬¬:  ¬¬e: from ¬¬φ, infer φ.
In the →i rule, the box indicates that we can temporarily assume φ1 and conclude φ2 using no extra
non-trivial information. The →e rule is referred to by its Latin name, modus ponens.
Example 1. We can now use these proof rules along with φ1 ∧ (φ2 ∧ φ3 ) as the premise to conclude
(φ1 ∧ φ2 ) ∧ φ3 .
1. φ1 ∧ (φ2 ∧ φ3)    premise
2. φ1                ∧e1 1
3. φ2 ∧ φ3           ∧e2 1
4. φ2                ∧e1 3
5. φ3                ∧e2 3
6. φ1 ∧ φ2           ∧i 2, 4
7. (φ1 ∧ φ2) ∧ φ3    ∧i 6, 5
When φ can be inferred from premises ϕ1, ϕ2, . . . , ϕn, we write the sequent
ϕ1, ϕ2, . . . , ϕn ⊢ φ.
We can also infer some formulae using no premises, in which case the sequent is ⊢ φ.
Applying these proof rules involves the following general rule:
We can only use a formula φ at a point if it occurs prior to it in the proof and if no box
enclosing that occurrence of φ has been closed already.
Soundness: Σ ⊢ φ implies Σ |= φ
The rules that we have chosen are indeed individually sound, since they ensure that if the premises evaluate to 1 for some assignment, so does the inference; the remaining rules rely on the notions of
contradiction and assumption. Hence, soundness of any proof can be shown by inducting on the
length of the proof.
A complete proof system is one which allows the inference of every valid semantic entailment:
Completeness: Σ |= φ implies Σ ⊢ φ
Let us take an example of semantic entailment: Σ = {p → q, ¬q} |= ¬p.
p q p→q ¬q ¬p
0 0 1 1 1
0 1 1 0 1
1 0 0 1 0
1 1 1 0 0
As we can see, whenever both p → q and ¬q are true, ¬p is true. The question now is: how do we
derive this using proof rules? The idea is to 'mimic' each row of the truth table. This means that
we assume the values for p, q and try to prove that the formulae in Σ imply φ.¹ To prove an
implication, we can use the →i rule. Here is an example of how we can prove our claim for the first
row:
1. ¬p given
2. ¬q given
3. (p → q) ∧ ¬q assumption
4. ¬p 1
5. ((p → q) ∧ ¬q) → ¬p →i 3,4
Similarly, to mimic the second row, we would like to show ¬p, q ⊢ ((p → q) ∧ ¬q) → ¬p. In fact,
for every row we would like to start with the assumptions about the values of each variable, and then
try to prove the property that we want.
This looks promising, but we are not done: we have only proven our formula under all possible assumptions,
not from nothing given. However, the reasoning we are doing looks a lot like case work, which brings
to mind the ∨e rule. In words, this rule states
that if a formula is true under two different assumptions, and the disjunction of the two assumptions is always true,
then our formula is true. So if we can rigorously show that at least one of our row assumptions
always holds, we will be able to clean up our proof using the ∨e rule.
As seen above, we were able to show a proof for the sequent ⊢ φ ∨ ¬φ. If we recursively
apply this property for all the variables we have, we can capture every row of the truth
table. So combining this result, our proofs for each row of the truth table, and the ∨e rule, the
whole proof is constructed as below. The only thing we need now is the ability to construct proofs
for each row given the general valid formula (⋀_{ϕ∈Σ} ϕ) → φ.
¹ Σ semantically entails φ is equivalent to saying that (the conjunction of the formulae in Σ) → φ is valid.
The overall proof has the following shape. From the premises p → q and ¬q, we case-split on p ∨ ¬p; within each case we case-split again on q ∨ ¬q. At each of the four leaves sits the proof of ¬p under the assumptions corresponding to one row of the truth table (rows 1–4). Each inner case-split is discharged by or-elimination, and a final or-elimination yields ¬p.
ϕ1 ↔ ϕ2 ⫤⊨ (ϕ1 → ϕ2) ∧ (ϕ2 → ϕ1)
(Figure: parse tree of a formula in which a subtree, marked in red, occurs twice.)
Since the red part of the tree repeats twice, we can build a DAG (Directed Acyclic Graph) instead of the parse tree.
(Figure: the DAG representation, in which the repeated subtree is shared.)
Idea 1: Let us push the ¬ downwards by applying De Morgan's Laws and see what happens.
Consider the following example and the highlighted ¬.
(Figure: the DAG before and after pushing the highlighted ¬ one level down by De Morgan's Law.)
Now we have an issue at the blue edge. The blue edge wanted the non-negated node, but due
to the above change it now gets the negated node. So this idea alone will not work; we want
to preserve the non-negated nodes as well.
Modification: make two copies of the DAG and push ¬ (i.e., do the negation) in only one of the copies; if
a node wants a non-negated node, it takes that node from the unmodified copy.
(Figure: the two-copy construction applied to the running example; ¬ nodes in the second copy are pushed downwards step by step, so that each node of the copy represents the negation of the corresponding node of the original.)
Figure 1.7: Step 1: make a copy of the DAG and remove all ¬ nodes except the ones which are applied to the basic variables.
(Figure: Step 2: the copied DAG is negated, swapping ∧ and ∨ nodes and negating the leaf literals.)
Figure 1.9: Step 3: remove the ¬ nodes from the first DAG by connecting them to the corresponding nodes in the negated DAG.
(Figure: the resulting structure after Step 3, and the final NNF-DAG once unreachable nodes are cleaned up.)
NOTE: The size of the NNF-DAG obtained using the above algorithm is at most two
times the size of the given DAG. Hence we have an O(N) algorithm for converting any arbitrary
DAG into a semantically equivalent NNF-DAG.
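A minimal Python sketch of this idea: memoising on (node, polarity) is exactly the two-copies trick, so each DAG node is translated at most twice (the tuple representation of formulas is our own):

# A formula node: ('var', name), ('not', child), ('and', l, r), ('or', l, r).
# Reusing the same tuple object for a shared subformula makes this a DAG.

def to_nnf(node, negated=False, memo=None):
    """Push negations to the leaves; each (node, polarity) pair is built once."""
    if memo is None:
        memo = {}
    key = (id(node), negated)
    if key in memo:
        return memo[key]
    kind = node[0]
    if kind == 'var':
        res = ('not', node) if negated else node
    elif kind == 'not':
        res = to_nnf(node[1], not negated, memo)
    else:  # 'and' / 'or': De Morgan swaps the operator under negation
        op = {'and': 'or', 'or': 'and'}[kind] if negated else kind
        res = (op, to_nnf(node[1], negated, memo), to_nnf(node[2], negated, memo))
    memo[key] = res
    return res

shared = ('and', ('var', 'p'), ('not', ('var', 'q')))
phi = ('or', ('not', shared), shared)   # the subterm 'shared' is reused
print(to_nnf(phi))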
(Figure: parse tree for the formula ¬(p ∧ ¬q) ∧ ¬(¬r ∧ s) ∧ t, and the parse tree of its NNF, (¬p ∨ q) ∧ (r ∨ ¬s) ∧ t.)
• Clause: a disjunction of literals such that a literal and its negation are not both in the same clause.
Example: p ∨ q ∨ ¬r is a clause; p ∨ ¬p ∨ ¬r is not allowed.
• Cube: a conjunction of literals such that a literal and its negation are not both in the same cube.
Example: p ∧ q ∧ ¬r is a cube; p ∧ ¬p ∧ ¬r is not allowed.
Given a DAG of a propositional logic formula with only ∨, ∧ and ¬ nodes, can we
efficiently get a DAG representing a semantically equivalent CNF/DNF formula?
Tutorial 2, Question 2:
The parity function can be expressed as (· · · ((x1 ⊕ x2) ⊕ x3) ⊕ · · · ⊕ xn).
(Figure: the parse DAG for x1 ⊕ x2 = (x1 ∧ ¬x2) ∨ (x2 ∧ ¬x1), and the DAG for ϕ ⊕ x3 = (ϕ ∧ ¬x3) ∨ (x3 ∧ ¬ϕ), built on top of the shared node ϕ = x1 ⊕ x2.)
We notice that on adding each xi we add 4 nodes. Hence, the size of the DAG of the parity
function is at most 4n, and we have already shown that the size of the NNF-DAG is at most 2 times the
size of the DAG. So the size of the semantically equivalent NNF-DAG is at most 8n.
Also, in the tutorial question we proved that the DAG size of any semantically equivalent
CNF/DNF formula is at least 2^(n−1).
NOTE: A formula is valid iff its negation is not satisfiable. Therefore, we can convert every
validity problem into a satisfiability problem. Thus, it suffices to worry only about the satisfiability
problem.
1. Introduce new variables t1, t2, t3, . . . , tn, one for each of the nodes. We will get an equisatisfiable
formula ϕ′(p, q, r, t1, t2, t3, . . . , tn) which is in CNF.
2. Write the formula for ϕ′ as a conjunction of subformulas for each node of form given below:
ϕ′ =(t1 ⇐⇒ (p ∧ q)) ∧
(t2 ⇐⇒ (t1 ∨ ¬r)) ∧
(t3 ⇐⇒ (¬t2 )) ∧
(t4 ⇐⇒ (¬p ∧ ¬r)) ∧
(t5 ⇐⇒ (t3 ∨ t4 )) ∧
t5
3. Convert each of the subformulas to CNF. For the first node, t1 ⇐⇒ (p ∧ q) becomes (¬t1 ∨ p) ∧ (¬t1 ∨ q) ∧ (¬p ∨ ¬q ∨ t1).
4. For checking the equisatisfiability of ϕ and ϕ′: think of an assignment which makes ϕ true,
then extend it to ϕ′ by setting each ti to the value of its node.
Say you have a formula Q(p, q, r, . . . ). Using Tseitin encoding, we can introduce auxiliary variables t1, t2, . . .
to make a new formula Q′(p, q, r, . . . , t1, t2, . . . ) which is equisatisfiable with Q.
Q′ is equisatisfiable with Q but not semantically equivalent. The size of Q′ is linear in the size of Q.
Let us take an example to understand this better. Consider the formula (¬((q ∧ p) ∨ ¬r)) ∨ (¬p ∧ ¬r).
(Figure: the DAG of this formula.)
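A minimal Python sketch of the Tseitin encoding applied to this example (the tuple-based formulas and DIMACS-style integer literals are our own choices):

import itertools

fresh = itertools.count(100)  # fresh t-variables; we assume input vars are < 100

def tseitin(node, clauses, cache):
    """Return an integer literal equisatisfiably standing for `node`.

    Formulas are tuples: ('var', n), ('not', f), ('and', f, g), ('or', f, g);
    literals are nonzero ints, negative meaning negated.
    """
    if id(node) in cache:
        return cache[id(node)]
    kind = node[0]
    if kind == 'var':
        lit = node[1]
    elif kind == 'not':
        lit = -tseitin(node[1], clauses, cache)
    else:
        a = tseitin(node[1], clauses, cache)
        b = tseitin(node[2], clauses, cache)
        t = next(fresh)
        if kind == 'and':   # t <-> (a and b) as three clauses
            clauses += [[-t, a], [-t, b], [-a, -b, t]]
        else:               # t <-> (a or b) as three clauses
            clauses += [[-a, t], [-b, t], [-t, a, b]]
        lit = t
    cache[id(node)] = lit
    return lit

# phi = (not((q and p) or not r)) or (not p and not r), with p, q, r = 1, 2, 3
p, q, r = ('var', 1), ('var', 2), ('var', 3)
phi = ('or', ('not', ('or', ('and', q, p), ('not', r))),
             ('and', ('not', p), ('not', r)))
clauses = []
root = tseitin(phi, clauses, {})
clauses.append([root])   # assert the whole formula
print(clauses)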
Consider now the following Horn formula (a CNF in which every clause has at most one unnegated variable):
(¬x1 ∨ ¬x2 ∨ x3) ∧ (¬x4 ∨ x5 ∨ ¬x3) ∧ (¬x1 ∨ ¬x5) ∧ (x5) ∧ (¬x5 ∨ x3) ∧ (¬x5 ∨ ¬x1)
We can convert any Horn clause into an implication: the conjunction of the variables that appeared negated
in the clause goes on the left side of the implication, and the unnegated variable on the
other side. So all the variables in all the implications will be unnegated.
The above formula thus translates as follows.
x1 ∧ x2 =⇒ x3
x4 ∧ x3 =⇒ x5
x1 ∧ x5 =⇒ ⊥
x5 =⇒ x3
⊤ =⇒ x5
Now we try to find a satisfying assignment for the above formula.
From the last clause we get x5 = 1, now the fourth clause is ⊤ =⇒ x3 .
Then from the fourth clause we get that x3 = 1.
Now in the remaining clauses none of the left hand sides are reduced to ⊤.
Hence, we set all remaining variables to 0 to get a satisfying assignment.
(Final step of the marking algorithm:)
if ⊥ is marked then
    return 'unsatisfiable'
else
    return 'satisfiable'
Complexity:
If we have n variables and k clauses, then the solving complexity is O(nk), since in the worst case
each clause is scanned for each variable.
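Here is a minimal Python sketch of this marking procedure on the implication form (the data representation is ours); it marks exactly the variables forced to true and reports unsatisfiability if ⊥ becomes derivable:

def horn_sat(implications):
    """Marking algorithm for Horn formulas.

    Each implication is (body, head): body is a frozenset of variables
    (empty body means TOP), head is a variable or the symbol 'BOT'.
    Returns the set of variables forced true, or None if unsatisfiable.
    """
    marked = set()
    changed = True
    while changed:
        changed = False
        for body, head in implications:
            if body <= marked and head not in marked:
                if head == 'BOT':
                    return None           # a contradiction is derivable
                marked.add(head)
                changed = True
    return marked  # all unmarked variables may be set to 0

# The running example: x1&x2 -> x3, x4&x3 -> x5, x1&x5 -> BOT,
#                      x5 -> x3, TOP -> x5
F = [({'x1', 'x2'}, 'x3'), ({'x4', 'x3'}, 'x5'),
     ({'x1', 'x5'}, 'BOT'), ({'x5'}, 'x3'), (set(), 'x5')]
F = [(frozenset(b), h) for b, h in F]
print(horn_sat(F))   # {'x5', 'x3'}: set these true, the rest false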
1.16.1 Example
We are presented with a example involving conditions that determine when an alarm (a) should
ring. Let’s outline the given conditions:
1. If there is a burglary (b) in the night (n), then the alarm should ring (a).
2. If there is a fire (f ) in the day (d), then the alarm should ring (a).
3. If there is an earthquake (e), it may occur in the day (d) or night (n), and in either case, the
alarm should ring (a).
4. If there is a prank (p) in the night (n), then the alarm should ring (a).
5. Also it is known that a prank (p) does not happen during the day (d) and a burglary (b) does not
take place when there is a fire (f).
Now we want to examine the possible behaviour of this system under the assumption that the alarm
rings during the day. For this we add two more clauses:
⊤ ⇒ a    ⊤ ⇒ d
This directly gives us that a and d have to be true; what about the rest? We can see that setting all the
remaining variables to false is a satisfying assignment for this set of formulae.
Hence none of prank, earthquake, burglary or fire occurred, in which case the alarm should not have rung.
This means that our system of formulae is incomplete.
To rule out such behaviour, we introduce new variables Na (no alarm), Nf (no fire), Nb (no burglary),
Ne (no earthquake), and Np (no prank).
We extend the above set of implications in a natural way using these formulae:
a ∧ Na ⇒ ⊥ b ∧ Nb ⇒ ⊥ f ∧ Nf ⇒ ⊥ e ∧ Ne ⇒ ⊥ p ∧ Np ⇒ ⊥
Nb ∧ Nf ∧ Ne ∧ Np ⇒ Na
• Unit Clause: any clause which has only one literal in it, e.g. · · · ∧ (¬x5) ∧ · · ·
Note: if a formula has a unit clause, then the literal in it has to be set to true.
• Pure Literal: a literal whose negation appears in no clause. Say a propositional
variable x appears only as ¬x in every clause it occurs in, or a variable y appears only as y in every
clause it occurs in.
Note: if there is a pure literal in the formula, it does not hurt any clause to set it to true. All
the clauses in which this literal is present become true immediately.
We will now utilise all the techniques we have learnt to simplify our formula. First we check whether our formula
has a unit clause. If yes, then we assign the literal in that clause the value 1. (Notation: φ[l = 1] is
the formula obtained after setting l = 1 everywhere in the formula.) We also search for pure literals.
If we find a pure literal, then we can simply assign it 1 (or 0 if it always appears in negated
form) and proceed; this cannot harm us (cause future conflicts) by the definition of a pure literal.
If we have neither of these, then we have only one option left: trial and error.
We assign one of the variables in the formula a value which we choose by some heuristic (not
described here). Then we go on with the usual algorithm until we either get the whole formula to
be true or false. At this step we might have to backtrack if the formula turns out to be false. If it
is true, we can terminate the algorithm.
Note: Our algorithm can be as bad as a truth table, since we may end up trying every assignment. But
because we apply the additional steps, after making a decision there is a good chance that we get a
unit clause or a pure literal.
Now as we have done all the prerequisites let us state the algorithm.
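A minimal Python sketch of the procedure just described — unit propagation first, then pure-literal elimination, then a trial decision with backtracking; the clause representation (lists of nonzero integers, negative meaning negated) and all names are our own:

def dpll(clauses, assignment=None):
    """Bare-bones DPLL: returns a satisfying assignment or None."""
    if assignment is None:
        assignment = {}

    def simplify(cls, lit):
        out = []
        for c in cls:
            if lit in c:
                continue                    # clause satisfied, drop it
            out.append([x for x in c if x != -lit])
        return out

    while True:
        if not clauses:
            return assignment               # all clauses satisfied
        if any(c == [] for c in clauses):
            return None                     # empty clause: conflict
        units = [c[0] for c in clauses if len(c) == 1]
        if units:                           # unit propagation
            lit = units[0]
        else:
            lits = {x for c in clauses for x in c}
            pures = [x for x in lits if -x not in lits]
            if pures:                       # pure-literal elimination
                lit = pures[0]
            else:
                break
        assignment[abs(lit)] = lit > 0
        clauses = simplify(clauses, lit)

    # Decision + backtracking: try both values of some variable.
    var = abs(next(iter(clauses[0])))
    for lit in (var, -var):
        res = dpll(simplify(clauses, lit), {**assignment, var: lit > 0})
        if res is not None:
            return res
    return None

print(dpll([[1, -2], [-1, 2]]))   # e.g. {1: True, 2: True}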
Question: Can the formula still be a Horn formula after steps 1 and 2 can no longer be applied, i.e., if
our formula does not have any unit clause or pure literal, can it be Horn? Yes; for example:
(a ∨ ¬b) ∧ (¬a ∨ b)
(Figure: the DPLL decision tree for the example; repeated pure-literal eliminations (PLE), each assigning 1, lead directly to SAT without any decisions.)
The following is the decision tree if we remove step 2 of DPLL (pure-literal elimination). Note the increase in the number of
operations.
(Figure: the decision tree without pure-literal elimination; a mix of decisions (D) and unit propagations (UP) is needed, including a backtrack to the most recent decision node after an UNSAT branch, before reaching SAT.)
The advantage of the Horn method is that after all possible unit propagations are done, it sets all remaining
variables to 0, whereas in DPLL we need to go step by step for each remaining variable.
But the Horn method can only be applied in a special case; moreover, the Horn method only figures
out which variables must be set to true, whereas DPLL can also figure out that a variable needs to
be set to false, via pure-literal elimination.
• Consider the first step in solving for the satisfiability of a given set of Horn clauses in implication
form:
– If the LHS of an implication is true, we set the variable on its RHS to true wherever it occurs.
– Repeating the above step sets to true exactly those variables which must be 1.
This step is equivalent to the first two steps of the DPLL algorithm;
the steps in the two different schemes are essentially doing the same thing.
Now if the given clauses were Horn, we know that setting all the remaining variables to false is
a satisfying assignment. This means that if our DPLL algorithm preferentially assigns 0 at each
decision, the procedure converges to the method for checking the satisfiability of Horn
formulae.
Recall the rule of resolution: from clauses C ∨ p and D ∨ ¬p we may infer the resolvent C ∨ D. However intuitive it may look, this rule is a powerful tool for checking the satisfiability of logical
formulae; we can build from it an algorithm to check the satisfiability of a CNF formula as follows.
Let us first define a formula to be unresolved if there exists a literal whose negation also occurs in the
formula (they cannot be in the same clause, by the definition of a clause). If a formula is
resolved (i.e., not unresolved) then it is satisfiable ('SAT'): assign false to the variables which appear only in
negated form, and true to the other variables.
1. If the formula is resolved, report SAT with the assignment above.
2. As the formula is unresolved, we can apply the resolution rule; this gives us a new clause.
3. If the new clause is the empty clause, we deem the formula UNSAT; otherwise check whether the
formula is now resolved, and if not, repeat from step 1.
Before rationalizing the soundness of the above sequence of steps let us first see an example.
An Example: Consider C = {C1 , C2 , C3 } as given below:
• C1 := ¬p1 ∨ p2 ( p1 =⇒ p2 )
• C2 := p1 ∨ ¬p2 ( p2 =⇒ p1 )
• C3 := ¬p1 ∨ ¬p2 ( p1 ∧ p2 =⇒ ⊥)
Then, a dry run of the above method would look like:
1. Since both p1 and p2 appear in negated and un-negated form, we apply resolution on C1 and
C2 (resolving on p2), which generates C4:

    (¬p1 ∨ p2)    (p1 ∨ ¬p2)
    ──────────────────────── resolution
           (¬p1 ∨ p1)

2. Once again we apply resolution, on C3 and C4 (C4 is not really a clause by definition; one can
choose to drop such tautologies as soon as they are encountered):

    (¬p1 ∨ ¬p2)    (¬p1 ∨ p1)
    ───────────────────────── resolution
           (¬p2 ∨ ¬p1)
3. The formula that we now have is resolved, and thus our formula is satisfiable.
• Choose a variable pi such that both pi and ¬pi appear in the CNF. (If no such variable exists,
the formula is resolved as defined earlier and has a satisfying assignment.) Apply resolution
repeatedly as long as the same pi satisfies this condition.
– If pi vanishes from the CNF, then, calling on our hypothesis, we can raise UNSAT: the
equivalent form that we have got must be unsatisfiable independent of the value of the
vanished variable, as the initial formula was unsatisfiable.
– If pi survives in only one of its negated or un-negated forms, we repeat the procedure.
This time the number of available pairs has reduced by 1, as pi cannot be selected again.
• As the selection step can take place at most n times (a new pair cannot be
generated in the CNF by resolution operations), consider the case with a pair available for
every variable: the procedure must conclude UNSAT within n steps; otherwise, at the end of n
steps we have no pairs left, which yields a satisfying assignment for the formula.
Broadly speaking, what we are showing is that upon repeated resolution of an unsatisfiable CNF, if
the empty clause ( ) has not been encountered, the number of propositional variables must decrease.
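As a sketch, the whole procedure fits in a few lines of Python (representing clauses as sets of integer literals is our choice; the naive saturation below is exponential in general):

def resolve(c1, c2, var):
    """Resolvent of clauses c1, c2 (frozensets of int literals) on var."""
    assert var in c1 and -var in c2
    return (c1 | c2) - {var, -var}

def resolution_sat(clauses):
    """Saturate a CNF under resolution; UNSAT iff the empty clause appears.
    Clauses containing a literal and its negation are tautologies and are
    dropped, as in the notes."""
    cnf = {frozenset(c) for c in clauses}
    while True:
        new = set()
        for c1 in cnf:
            for c2 in cnf:
                for lit in c1:
                    if -lit in c2:
                        r = resolve(c1, c2, lit)
                        if not r:
                            return False           # empty clause: UNSAT
                        if all(-x not in r for x in r):  # skip tautologies
                            new.add(r)
        if new <= cnf:
            return True   # saturated without the empty clause: SAT
        cnf |= new

# C1 = ~p1 v p2, C2 = p1 v ~p2, C3 = ~p1 v ~p2  (encode p1 as 1, p2 as 2)
print(resolution_sat([[-1, 2], [1, -2], [-1, -2]]))   # True: satisfiable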
Chapter 2
DFAs and Regular Languages
2.1 Definitions
• Alphabet: A finite, non-empty set of symbols called characters. We usually represent an
alphabet with Σ. For example Σ = {a, b, c, d}.
• String: A finite sequence of letters from an alphabet. An important thing to note here is
that even though the alphabet may contain just one character, it can form countably infinitely
many strings, each of which is finite. In this course we only deal with finite strings
over a finite alphabet.
• Concatenation Operation (·): We can start from a string, take another string and, as the
name suggests, concatenate them to form another string:
a · b ≠ b · a in general (not commutative)
(a · b) · c = a · (b · c) (associative)
• Identity Element: The algebra of strings under the concatenation operator has the
identity element ε, the empty string:
σ · ε = ε · σ = σ
Note that the empty string remains the same for strings of all alphabets.
• Language: A subset of the set of all finite strings on Σ. This set does not have to be finite even though
the strings are of finite length.
Note that the set of all finite strings on Σ is countably infinite (cardinality |N|), so the number
of languages on Σ is uncountably infinite (cardinality 2^|N|).
• Σ* is defined to be the set of all finite strings on Σ, including ε. Note that Σ* = ⋃_{k≥0} Σ^k,
where Σ^k is the set of all strings on Σ with exactly k letters. We can prove that
Σ* is countably infinite: by representing each string as a unique number
in base (n + 1), where |Σ| = n, we get an injection into the natural numbers.
The solution to this lies in our discussion during the first lecture. I will record just one bit of infor-
mation: whether I have received an even or odd number of 1s till now. Every time I receive a new
bit, I will update this information: if it’s a 0, I won’t do anything, and if it’s a 1, I will change my
answer from even to odd, or vice-versa.
States: These are nodes which contain the relevant summary of what we have seen so far.
In our case we want to know whether there has been an even or odd number of 1s.
(Figure: two states, 'even # 1s' and 'odd # 1s'.)
Where do we start from? When I have seen nothing, there is an even number of 1s.
Now, suppose I receive a 0: I remain in the same state. But if I get a 1, the parity changes.
(Figure: the start arrow enters 'even # 1s'; a 1 moves from 'even # 1s' to 'odd # 1s'.)
Now, if I am in the second state and I get a 1, I will change states; if I get a 0, the parity is unchanged
so I remain in the same state:
(Figure: the automaton with 0 self-loops on both states and 1-edges between them.)
So how do we know whether the string we have seen so far belongs to some language or not? We mark
some states as accepting states, usually represented by double circles; if we end
up in such a state, the string received so far belongs to our language, i.e., is accepted.
(Figure: the complete DFA for 'even number of 1s', with the even state marked accepting.)
(Figure: a three-state DFA with states 0, 1, 2, with 0 self-loops and 1-edges between consecutive states.)
In certain scenarios, expressing a language solely through propositional logic becomes impractical,
particularly when the length of the strings is unknown or variable. For instance, consider the above
example. In this case, the length n of the string is not explicitly provided, making it challenging to
construct a propositional logic expression directly. Propositional logic typically operates on fixed,
predetermined conditions or patterns within strings, which cannot accommodate variable lengths.
However, deterministic finite automata (DFAs) offer a suitable alternative for such situations. DFAs
are well-suited for languages where the structure and properties depend on the characters within the
string rather than on fixed string lengths. By employing states and transitions based on input char-
acters, DFAs can effectively recognize languages with variable-length strings and complex patterns,
making them a more appropriate choice when string length is not predetermined.
We will denote the alphabet by Σ = {a, b} and consider the language
L = {ω ∈ Σ* : na(ω) is divisible by 2 and nb(ω) is divisible by 3},
where na(ω) and nb(ω) are the numbers of a's and b's in ω.
(Figure: a six-state DFA whose states track the pair (na mod 2, nb mod 3); among its states are S3 (1, 2), S4 (1, 1) and S5 (1, 0).)
If we try to cleverly convert the problem into a simpler one, we will observe that nab(w) = nba(w)
holds precisely when the first and last letters of w are the same (be it a or b).
(Figure: a five-state DFA with start state S0 and states S1–S4, accepting exactly the strings whose first and last letters agree.)
Chapter 3
Non-Deterministic Finite Automata (NFA)

In the last few lectures, we covered the formalisation of deterministic finite automata (DFA), where
the transition function outputs a single state for a given input and current state. In this lecture,
we will discuss non-deterministic finite automata (NDFA), where the transition function can instead output
multiple states (a set of states).
(Figure: a two-state DFA over {0, 1} with states q0 (start) and q1, realising the transition table below.)
Q Σ Q’
q0 0 q1
q0 1 q0
q1 0 q1
q1 1 q0
(Figure: an NFA in which q0 loops on 0,1 and also moves to q1 on 0,1, while q1 loops on 1.)
where δ′ is the transition function shown below. As we can see, δ′ is a partial function whose output is a set of states:
Q  Σ | 2^Q
q0 0 | {q0, q1}
q0 1 | {q0, q1}
q1 1 | {q1}
Hence, the NDFA shown above has the language L(A) = Σ∗ \ {ϵ}, i.e., the set of all strings over the
alphabet Σ except the empty string.
For example, the string 011 is accepted by A as it has the following path: q0 −0→ q0 −1→ q0 −1→ q1.
To determine if a string is accepted by an NDFA, we can check if the set of states reachable from
the initial state by reading the string contains any final state.
As {q0 , q1 } contains q1 ∈ F , the string 011 is accepted by A. By construction, there will always be
a set of choices which reach q1 from q0 for the input 011.
We can convert an NDFA to a DFA by considering sets of reachable states as the states of the equivalent DFA:
δ′(G, σ) = ⋃_{q∈G} δ(q, σ)
F′ = {G ∈ P(Q) : G ∩ F ≠ ϕ}
Notice that in our new DFA, the states are labelled by subsets of the states of the NFA. This means
that we have 2^n states in our DFA if the NFA had n. It is left to the reader to verify that the DFA
we have defined satisfies all the requirements of a DFA.
We claim that after the same characters are inputted into both the NFA and DFA, the state of the
DFA is labelled the same as the set of current states of the NFA. This claim is easy to check using
the definition of the δ ′ function.
Now, in an NFA, a string is accepted if any one of the active states at the end of the string is in the
set of accepting states. Clearly, with our interpretation of the DFA, this is equivalent to being in a
state that belongs to the F ′ we have defined.
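A small Python sketch of this subset construction (dictionary representation and names are ours), run on the NFA of this example:

def nfa_to_dfa(delta, starts, finals):
    """Subset construction. `delta` maps (state, symbol) -> set of states;
    returns the DFA transition table over frozensets of NFA states."""
    start = frozenset(starts)
    symbols = {a for (_, a) in delta}
    dfa, todo = {}, [start]
    while todo:
        G = todo.pop()
        if G in dfa:
            continue
        dfa[G] = {}
        for a in symbols:
            # delta'(G, a) = union of delta(q, a) over q in G
            H = frozenset(s for q in G for s in delta.get((q, a), ()))
            dfa[G][a] = H
            todo.append(H)
    accepting = {G for G in dfa if G & set(finals)}
    return dfa, start, accepting

# The NFA from this example:
delta = {('q0', '0'): {'q0', 'q1'}, ('q0', '1'): {'q0', 'q1'},
         ('q1', '1'): {'q1'}}
dfa, start, acc = nfa_to_dfa(delta, {'q0'}, {'q1'})
print(len(dfa), "DFA states; accepting:", acc)   # 2 states, as below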
(Figure: the NFA A from above, repeated.)
The language depicted by this NFA is the set of all strings formed using {0, 1} excluding the empty string (ϵ).
We represent this as:
L(A) = {w ∈ {0, 1}∗ |w is accepted by A}
that is,
L(A) = Σ∗ \{ϵ}
The (extended) transition function table looks as follows:
Q  Σ* | 2^Q
q0 0  | {q0, q1}
q0 00 | {q0, q1}
q0 01 | {q0, q1}
To convert this NFA to a DFA, we need to track the states that can be reached after n choices which
will be a subset of 2Q .
Step 1: from {q0}, reading 0 leads to {q0, q1} and reading 1 leads to {q0, q1}.
Step 2: from {q0, q1}, reading 0 or 1 leads back to {q0, q1}.
Final DFA:
(Figure: start state {q0} with 0,1-edges to {q0, q1}, which is accepting and loops on 0,1.)
Now, to show that the languages represented by the NFA (A) and the DFA (A′) are the same, i.e.,
L(A) = L(A′), we need to show the following:
1. L(A) ⊆ L(A′)
2. L(A′) ⊆ L(A)
In other words, we prove that the equivalent DFA accepts exactly the same language as the original NFA.
3.4.1 Claim
We aim to demonstrate the equivalence of the languages accepted by NFA A and DFA A′ , denoted
as L(A) and L(A′ ) respectively. In other words, we want to show that L(A) = L(A′ ), where A
represents the original NFA and A′ represents the DFA obtained through subset construction from
NFA A. Alternatively, we can establish that L(A) ⊆ L(A′ ) and L(A′ ) ⊆ L(A), which implies
L(A) = L(A′ ).
3.4.2 Proof
We'll prove this claim by showing that for all n ≥ 0 and for every word w ∈ Σ* with |w| = n, NFA
A can reach state q ∈ Q¹ upon reading w if and only if DFA A′ reaches a state S ⊆ Q such that
q ∈ S. In particular, a word reaches a (final) state q of A if and only if it takes A′ to a state S with
q ∈ S, and such an S is a final state of A′. Thus the set of words accepted by NFA A equals the set
of words accepted by DFA A′, implying L(A) = L(A′), thereby validating our claim.
We will demonstrate this by induction on n.
1. Base case: When n = 0, i.e., |w| = 0, we are at the initial state of
the automaton. The initial state of the DFA A′ is the set containing
the initial states of the NFA A, so the definition of the initial state of DFA A′ satisfies the claim.
2. Induction hypothesis: Assume the claim holds for all 0 ≤ n < k for some k > 0.
This means that for every word w with |w| < k there are corresponding paths on reading w:
in NFA A, from q0 to q; and in DFA A′², from the state . . . q0 to the state . . . q.
¹ Q is the set of states in our original NFA A.
² . . . q0 denotes the DFA state (the set) containing the initial states of the original NFA A;
. . . q denotes a subset of Q containing the state q;
. . . q̂ denotes a subset of Q containing the state q̂.
3. Inductive step: consider a word w·a with |w| = k − 1. If NFA A reaches state q̂ upon reading w·a,
then it reaches some state q upon reading w with q̂ ∈ δ(q, a); by the hypothesis, DFA A′ reaches the
state . . . q upon reading w. Since state . . . q of DFA A′ contains q, DFA A′ transitions to state . . . q̂ (a set
containing q̂) when symbol a is encountered, and the converse direction is symmetric.
Thus, we have shown that for all n ≥ 0 and for every word w ∈ Σ* with |w| = n,
NFA A can reach state q ∈ Q on reading word w if and only if DFA A′ reaches a state S ⊆ Q such
that q ∈ S.
(a) As the length of the word increases, the number of choices for state transitions in the
NFA grows exponentially.
(b) If an NFA has N states, the equivalent DFA can have an exponential number of states in
the worst case.
In simpler terms, a compiler is composed of three main components: a lexical analyzer, a parser,
and a code generator. Its primary function is to process a sequence of characters as input. The
lexical analyzer breaks down a sequence of characters into tokens, which are then analyzed by a
parser. To accomplish this, the lexical analyzer employs an NFA to determine whether a particular
state can be reached in the corresponding DFA and traces a path accordingly. When faced with a
long string, finding the final state can be challenging. At each step, there are multiple choices to
explore. However, as the length of the input increases, exhaustively exploring each choice becomes
increasingly difficult. Converting an NFA to a DFA may result in an exponential increase in states.
However, it is important to note that not all states are necessary to reach the final state. This
excessive expansion of states may lead to unnecessary complexity, creating the entire DFA when it is
not actually required.
So, the lexical analyzer determines whether there exists a path in the automaton that leads to an
accepting state for a given word without constructing the equivalent DFA.
Suppose the lexical analyser of a compiler has to check whether a given word is accepted by
the following NFA.
(Figure: an NFA over {a, b} with states q0 (start), q1 and q2 (accepting).)
Question: How can a lexical analyzer determine whether a given word leads to an accepting state
without constructing the equivalent DFA?
Answer:
{q0} −a→ {q0, q1} −b→ {q0, q1, q2} −a→ {q0, q1, q2}
The lexical analyser can track the set of states the NFA could be in after reading each symbol of the
given word, updating the set according to the transitions specified by the given NFA.
• If the final set of states contains at least one accepting state of the original NFA, then there
exists a successful path for the given word to reach an accepting state, indicating acceptance
by the given automaton.
• The time complexity of this process is O(n · k), where n represents the size of the automaton
and k denotes the length of the word. This is significantly lower than the exponential
cost of constructing the full DFA.
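A minimal Python sketch of this on-the-fly simulation; the transition table below is one reading of the example NFA consistent with the trace shown above, so treat its edges as an assumption:

def nfa_accepts(delta, starts, finals, word):
    """Simulate an NFA on `word` by tracking the set of reachable states,
    as a lexical analyser would, in O(n * k) without building the DFA."""
    current = set(starts)
    for a in word:
        current = {s for q in current for s in delta.get((q, a), ())}
    return bool(current & set(finals))

delta = {('q0', 'a'): {'q0', 'q1'}, ('q0', 'b'): {'q0'},
         ('q1', 'b'): {'q1', 'q2'}, ('q2', 'a'): {'q2'}, ('q2', 'b'): {'q2'}}
print(nfa_accepts(delta, {'q0'}, {'q2'}, "aba"))   # True, as in the trace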
(Figure 3.3: an NFA with ϵ-edges among states q0 (start), q1 and q2, over the alphabet {0, 1}.)
ϵ-edges bring non-determinism to the NFA: you can sit at a node and take one of the ϵ-edges
leaving it, jumping to another node without consuming any letter of the input.
Figure 3.3 shows how ϵ-edges connect states of an automaton for free (without consuming
any letter from the input): we can see that 10 ∉ L without the ϵ-edge between q0 and q1, but with the
presence of this ϵ-edge, 10 ∈ L.
An ϵ-edge also allows us to connect two automata. In Figure 3.4, the accepting states of the L1 automaton are
connected to the start states of the L2 automaton, which generates an automaton accepting L1 · L2.
Here · is the concatenation operator. So if ω1 ∈ L1 and ω2 ∈ L2, then ω1 · ω2 ∈ L1 · L2 will be
accepted by this new automaton.
Now, we will try to find an equivalent DFA for this NFA having ϵ-edges. For that, we will first find
an NFA without ϵ-edges which preserves the language of the original NFA; this obtained NFA can then be
converted into a DFA.
Initially, look at the ϵ-edges only and find, for each state, its ϵ-closure: the set of the
states that we can reach from it by taking only ϵ-edges.
From each node we can go, for free, to every node in its ϵ-closure. So wherever non-ϵ edges lead
from the nodes in the ϵ-closure, the same destinations are reachable from the node itself whose
closure it was; all these destination states will be connected to this node in the new NFA.
The starting states and the final states of this new NFA will be the ϵ-closures of the start and final
states respectively of the original NFA.
The NFA without ϵ-edges for the NFA in Figure 3.3 will then look as shown in Figure 3.5.
(Figure 3.5: the equivalent NFA without ϵ-edges, over states q0 (start), q1, q2.)
(Figure 3.6: an ϵ-NFA with states q0 (start), q1, q2, containing ϵ-edges as well as 0/1-edges.)
Focus on the ϵ-edges of the automaton to get the ϵ-closure of each state (recall that the equivalence of NFA and DFA is already proved). For the Figure 3.6 automaton:
ϵ-closure(q0) = {q0, q1}
ϵ-closure(q1) = {q1}
ϵ-closure(q2) = {q0, q1, q2}
1. Non-epsilon transitions on a symbol a ∈ Σ originating from any state q′ within
the epsilon closure of a state q will also be present as non-epsilon transitions of state
q in the equivalent NFA. These transitions lead to the same destination states for q in the new
NFA as they did for the state q′ in the original epsilon-NFA.
2. The states in the epsilon closure of the initial state of the epsilon-NFA could serve as initial
states in the new NFA. However, this is not necessary; we can make an equivalent NFA where the
ϵ-NFA and the new NFA have the same initial state(s).
3. All states of the epsilon-NFA whose epsilon closure includes an accepting state will
also act as final states in the new, equivalent NFA.
For the Figure-3.6 epsilon-NFA, after applying the aforementioned rules, we obtain the following
equivalent NFA:
(Footnote: these rules will become clearer in the next class, when we learn about leading epsilon transitions and trailing epsilon transitions within the states of an ϵ-NFA.)
(Figure: the equivalent NFA without ϵ-edges obtained from the Figure 3.6 automaton by the rules above.)
3.7 Recap
We were trying to convert an NFA with ε edges to an equivalent NFA without ε edges in the previous
lecture. We’ll do that in more detail in this lecture.
• ε-edges are those edges which can be traversed without consuming any character of the
alphabet Σ, i.e., by consuming the empty string. Observe that the string "10" ∈ L with ε-edges,
but without ε-edges "10" ∉ L, where L is the language of the NFA.
(Figure: the ε-NFA under consideration, with states q0 (start), q1 and q2 (accepting).)
1. For each node q in the NFA, find its ε-closure (say S) and keep all its non-ε-edges as they are.
Now, for each node q′ ≠ q in S, mark every non-ε-edge from q′ to some q′′ as an extra edge from q to q′′.
For example, for the node q0, the only distinct node in its ε-closure is
q1, so mark the edges {0, 1} from q1 to q1, {0} from q1 to q0 and {0} from q1 to q2 as extra
edges {0, 1} from q0 to q1, {0} from q0 to q0 and {0} from q0 to q2 respectively (marked
in red).
2. Mark as accepting all those states whose ε-closures contain at least one of the accepting states
of the ε-NFA. Here only q2 has the accepting state q2 in its ε-closure, so only q2 is
marked as accepting.
3. The starting states in the new non-ε-automaton will be the same as the starting states in the
original ε-automaton. (A Python sketch of these three steps follows Figure 3.9 below.)
Figure 3.9: Equivalent non-ε-NFA
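Here is a minimal Python sketch of steps 1–3 above (the dictionary representation is ours; the example edge data is our reading of the figures, so treat it as an assumption):

def eps_closure(eps, q):
    """All states reachable from q via epsilon-edges alone."""
    seen, todo = {q}, [q]
    while todo:
        for r in eps.get(todo.pop(), ()):
            if r not in seen:
                seen.add(r)
                todo.append(r)
    return seen

def remove_eps(states, eps, delta, starts, finals):
    """Steps 1-3: copy over the non-eps edges of every state in a state's
    closure, and mark q accepting if closure(q) meets `finals`."""
    closure = {q: eps_closure(eps, q) for q in states}
    new_delta = {}
    for q in states:
        for r in closure[q]:                      # step 1: borrowed edges
            for (src, a), dst in delta.items():
                if src == r:
                    new_delta.setdefault((q, a), set()).update(dst)
    new_finals = {q for q in states if closure[q] & set(finals)}  # step 2
    return new_delta, set(starts), new_finals     # step 3: same starts

eps = {'q0': {'q1'}, 'q2': {'q0'}}                # our reading of the figure
delta = {('q0', '1'): {'q0'}, ('q1', '0'): {'q0', 'q1', 'q2'},
         ('q1', '1'): {'q1'}, ('q2', '0'): {'q2'}}
nd, ns, nf = remove_eps({'q0', 'q1', 'q2'}, eps, delta, {'q0'}, {'q2'})
print(sorted(nd.items()), nf)   # q0 inherits q1's edges; finals = {'q2'}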
Subsequently, we’ll use "original/ori" for the ε-automaton and "new" for the non-ε-automaton.
• One may ask why the states which are in the ε-closures of the accepting states of the original
automaton are not marked as accepting in the non-ε-NFA. The answer is: every string which
reaches an accepting state a can also reach any of the states in ε-closure(a) by taking ε-edges, and
those strings are indeed in the language of the ε-NFA. But there are also many strings which
can reach the states in ε-closure(a) which are not in the language of the ε-NFA. So if we
marked those states as accepting, we would be changing the language, which is not our intention. For example,
q0 ∈ ε-closure(q2), but if we mark q0 as accepting in the non-ε-NFA, then the string "1" will also
be included in the language of the non-ε-NFA, even though "1" ∉ L(ε-NFA).
Now, what we mean by correctness of the algorithm is that the language of the new non-ε-automaton
should exactly be equal to the language of the original ε-automaton. So we need to prove that
L(new) = L(ori)
Part 1: L(ori) ⊆ L(new). Let w ∈ L(ori).
(a) Case 1: w is empty. Then some starting state s reaches an accepting state via ε-edges alone,
i.e., ε-closure(s) contains an accepting state s′. Now, according to our algorithm (step
2), all accepting states in the new automaton are either accepting states in the original
automaton as well, or are those states which have at least one accepting state of the original
automaton in their ε-closures. So the starting state s is an accepting state in the new
automaton and thus w ∈ L(new).
(b) Case 2: w is non-empty.
It means that we have a path ε^{k0} c1 ε^{k1} c2 . . . ε^{k_{m−1}} cm ε^{km}, with ki ≥ 0 for all i ∈ {0, 1, . . . , m},
from a starting state s0 in the original automaton to an accepting state sm+1, where taking
ε^{ki} from any state x means going to some state y ∈ ε-closure(x). Here, s1 is the state
reached after reading c1, s2 after reading c2, . . . , sm after reading cm, and sm+1 after taking
ε^{km}. Now, according to our algorithm (step 1), for s0 in the new automaton we have
a direct edge {c1} from s0 to s1. Similarly, we will have direct edges from s1 to s2 and so
on. So, finally, we can reach sm from s0 in the new automaton. Now, according to our
algorithm (step 2), sm is an accepting state in the new automaton because ε-closure(sm)
contains the accepting state sm+1 of the original automaton. Thus, w ∈ L(new).
Hence, proved.
3.7.4 Extras
• The language of a node q is defined as the set of all strings w ∈ Σ* such that, upon reading
w character by character, we can reach q from some starting state of the automaton.
• Note that in an ε-automaton we may also take ε-edges between the characters, before the
first character, and after the last character of w to reach q; such strings are also
considered to be in the language of q.
• And the language of an automaton is defined as the union of the languages of all its accepting
nodes. So we have:
L(new) = ⋃_i L(qi) over all accepting states qi of the new non-ε-automaton,
L(ori) = ⋃_j L(qj) over all accepting states qj of the original ε-automaton.
• A lemma: L(qi )original ⊇ L(qi )new holds true where qi is any node of the NFA and L(qi ) is the
language of that node.
• Proof: Going by the same idea as in Correctness proof part 1, we can prove the lemma.
If w ∈ L(qi)new and w is empty, then qi must be one of the starting states in the new
automaton and hence also in the original automaton, so w ∈ L(qi)original, since the starting states are the
same in both automata. If w ∈ L(qi)new and w is non-empty, then we have a path,
with some ε's in between the characters of w, from some starting state s0 to qi in the original
automaton; thus L(qi)original ⊇ L(qi)new is indeed true.
Where the ⊃ sign comes in is the case when the path of w has some trailing ε's. For example,
consider this ε-NFA and its equivalent non-ε-NFA:
(Figure 3.10: an ε-NFA with states q0 (start), q1, q2 (accepting) and an ε-edge from q1 to q2.)
(Figure 3.11: the equivalent non-ε-NFA, in which no edge enters q2.)
L(q2)original is non-empty and, for instance, contains the string "1" via the path "1ε" from
q0 to q2, but L(q2)new is clearly empty, i.e., the null set ϕ. This happened because the path
contained a trailing ε. So, L(q2)new ⊂ L(q2)original.
3.7.6 Examples
• L = {x ∈ Σ* | x = u·v·w, v ∈ Σ*, u, w ∈ Σ+, |u| ≤ 2, u = w}, where Σ = {0, 1, 2, . . . , 9}.
For instance, "0000" ∈ L: either take u = w = 0, v = 00, or take u = w = 00, v = ε.
Also, "1234" ∉ L because we cannot have any satisfiable u, v, w.
Its NFA will be a combination of 110 NFAs of the following form:
(Figure: component NFAs of the form q0 −0→ q1 −ε→ q2 −ε→ q3 −0→ q4, with a Σ-loop on the middle state for the part v, plus primed states q1′, q2′, q3′ for the two-digit cases.)
There is one component for each u = w = 0, 1, . . . , 9 (10 of them) and 100 more for u = w = 00 to 99. We can have
separate NFAs with the different possible q1′, q3′ states: instead of 0, put 1 to 9 and 00 to 99
on the edges from q0 to q1′ and from q3′ to q4, and in this way we can form the whole NFA just
like in figure 6.
(Figure: an NFA q0 −ε→ q1 −x→ q2 −x→ q3 −x→ q4 −ε→ q5 with Σ-loops at both ends, matching the pattern xxx anywhere inside the input.)
• Takeaway task : Think about how KMP implicitly constructs NFA for pattern matching.
Given two DFAs with state sets Q and Q′, we can run them in parallel as a product automaton on Q × Q′,
where δ̂((q, r), a) = (δ(q, a), δ′(r, a)) and F̂ can be defined according to the required operation on
the DFAs.
For example, if Q = {q0, q1} and Q′ = {r0, r1, r2} and F = {q1} and F′ = {r1, r2}, then for
the intersection of these automata F̂ = {(q1, r1), (q1, r2)}, and for the union of the automata
F̂ = {(q1, r0), (q1, r1), (q1, r2), (q0, r1), (q0, r2)}. Similarly, we can define the complement operation
by taking F̂ = Q − F (flipping the accepting states of a single DFA).
With these basic rules in place, we can go on to define more complicated combinations like
(DFA1 ∩ DFA3) ∪ (DFA2 ∩ DFA3) − (DFA1 ∩ DFA2)
We now have another way to tell if a language is a subset of another language. To tell if L1 ⊆ L2,
we have to show that L1 ∩ L2^c = ϕ. The problem thus reduces to showing that in the DFA defined
by DFA1 ∩ DFA2^c there is no path from the start node to any accepting state.
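A minimal Python sketch of the product construction and the reachability check (state names, the `op` labels and the assumption that both DFAs are complete are ours):

def product_dfa(Q1, Q2, d1, d2, F1, F2, symbols, op):
    """Build the product DFA of two complete DFAs.
    op in {'inter', 'union', 'diff'} picks the accepting condition."""
    delta = {((q, r), a): (d1[(q, a)], d2[(r, a)])
             for q in Q1 for r in Q2 for a in symbols}
    test = {'inter': lambda q, r: q in F1 and r in F2,
            'union': lambda q, r: q in F1 or r in F2,
            'diff':  lambda q, r: q in F1 and r not in F2}[op]
    F = {(q, r) for q in Q1 for r in Q2 if test(q, r)}
    return delta, F

def nonempty(delta, start, F):
    """Emptiness check: is any accepting state reachable from start?"""
    seen, todo = {start}, [start]
    while todo:
        s = todo.pop()
        if s in F:
            return True
        for (src, _a), dst in delta.items():
            if src == s and dst not in seen:
                seen.add(dst)
                todo.append(dst)
    return False

# DFA1: even number of 1s; DFA2: all strings. Check L1 subset of L2
# via emptiness of L1 minus L2.
d1 = {('e', '0'): 'e', ('e', '1'): 'o', ('o', '0'): 'o', ('o', '1'): 'e'}
d2 = {('x', '0'): 'x', ('x', '1'): 'x'}
delta, F = product_dfa({'e', 'o'}, {'x'}, d1, d2, {'e'}, {'x'},
                       {'0', '1'}, 'diff')
print(nonempty(delta, ('e', 'x'), F))   # False: so L1 is a subset of L2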
(Figure: a two-state DFA with states q0, q1; a three-state DFA with states q0, q1, q2; and their product automaton with states qij for i ∈ {0, 1}, j ∈ {0, 1, 2}.)
However, for NFAs, just flipping the accepting and non-accepting states won’t give us the comple-
ment.
Thus to take the intersection or union of two NFAs, we can follow a similar approach to DFAs but
for complementation, the NFA must first be converted to a DFA and then complemented to give the
actual complement of the original NFA.
(Figure: an NFA with start state q0 and a-edges to q1 and q2, one of which is accepting, next to the automaton obtained by flipping its accepting states; both accept the string a, so flipping does not give the complement.)
3.10 Substitution
We will start with an example.
Consider two alphabets Σ1 = {a, b} and Σ2 = {0, 1, 2}, a language L1 = a*b* defined on Σ1*, and
languages La = 0*(1+2)* and Lb = 1*(0+2)* defined on Σ2*.
Now we define the set subst(L1, La, Lb) as
subst(L1, La, Lb) = {w ∈ Σ2* | ∃ u = α1α2 . . . αk ∈ L1 such that w ∈ Lα1 Lα2 . . . Lαk}    (3.3)
or equivalently, subst(L1, La, Lb) = ⋃_{u=α1α2...αk ∈ L1} Lα1 Lα2 . . . Lαk.
Intuitively, the substitution operation is to replace each letter by a language.
Diagrammatically, it is represented by replacing each edge with an entire automaton and connecting
the initial and accepting states to the original states by ϵ-edges.
Using this operation, we can easily prove that if L1 and L2 are regular languages then L1 · L2 is also
regular: since L = {a · b} is regular, subst(L, L1, L2) = L1 · L2 is also regular.
Chapter 4
Regular Expressions
4.1 Introduction
In arithmetic, we can use the operations + and × to build up expressions such as (5 + 3) × 4.
Similarly, we can use the regular operations to build up expressions describing languages, which are
called regular expressions. An example is: (0 ∪ 1)0∗ . The value of the arithmetic expression is the
number 32. The value of a regular expression is a language. In this case, the value is the language
consisting of all strings starting with a 0 or a 1 followed by any number of 0s. We get this result
by dissecting the expression into its parts. First, the symbols 0 and 1 are shorthand for the sets
{0} and {1}. So (0 ∪ 1) means ({0} ∪ {1}). The value of this part is the language {0, 1}. The
part 0∗ means {0}∗ , and its value is the language consisting of all strings containing any number of
0s. Second, like the × symbol in algebra, the concatenation symbol ◦ often is implicit in regular
expressions. Thus (0 ∪ 1)0∗ actually is shorthand for (0 ∪ 1) ◦ 0∗ . The concatenation attaches the
strings from the two parts to obtain the value of the entire expression. Regular expressions have an
important role in computer science applications. In applications involving text, users may want to
search for strings that satisfy certain patterns. Regular expressions provide a powerful method for
describing such patterns. Utilities such as awk and grep in UNIX, modern programming languages
such as Perl, and text editors all provide mechanisms for the description of patterns by using regular
expressions.
Say that R is a regular expression if R is
1. a for some a in the alphabet Σ,
2. ε,
3. ∅,
4. (R1 ∪ R2), where R1 and R2 are regular expressions,
5. (R1 ◦ R2), where R1 and R2 are regular expressions, or
6. (R1∗), where R1 is a regular expression.
In items 1 and 2, the regular expressions a and ε represent the languages {a} and {ε}, respectively. In
item 3, the regular expression ∅ represents the empty language. In items 4, 5, and 6, the expressions
represent the languages obtained by taking the union or concatenation of the languages R1 and R2 ,
or the Kleene star of the language R1 , respectively.
4.3.4 Example
Let’s illustrate with an example: (ab) + a. This expression represents the union of the language
containing the string ab and the language containing the string a, resulting in {ab, a}.
4.3.7 Example
For the expression e1 = a + b, the language denoted by [e1] is {a, b}. Therefore, [(a + b)∗] represents
the set of all possible strings of a's and b's, including the empty string: ε, a, b, ab, ba, aa, bb,
and so on.
• [(a∗ · b∗)∗] represents the set of all strings made of any number of blocks of a's followed by
b's — in fact, this is the set of all strings over {a, b}.
• 01 ∪ 10 = {01, 10}
• 0Σ∗ 0 ∪ 1Σ∗ 1 ∪ {0, 1} = {w | w starts and ends with the same symbol}
• (0 ∪ ε)1∗ = 01∗ ∪ 1∗
• 1∗ ∅ = ∅
• ∅∗ = {ε} The star operation puts together any number of strings from the language to get a
string in the result. If the language is empty, the star operation can put together 0 strings,
giving only the empty string.
In previous lectures, we have already proved the equivalence of NFAs with ϵ edges, NFAs without ϵ
edges and DFAs. So in order to prove Kleene’s Theorem, it suffices to show the equivalence between
Regular Expressions and NFAs with ϵ edges.
We prove each direction separately (regular expression to NFA, and NFA to regular expression) and then combine them to get the required result.
We wish to construct an NFA with ϵ edges that accepts the same language as our regular expression.
To do this, we construct the parse tree of the regular expression and use a bottom up approach
starting from the leaves to construct an NFA for each node of the parse tree.
[Parse tree of the regular expression ((0 · 1)∗ + (1 · 0)∗)∗ · ((1 · 1) + (0 · 0))∗]
We start from the leaves, in this case the nodes labelled 0 and 1. Consider the leaf 0. It represents
a language with only one string, i.e., L(0) = {0}. It can be represented by the NFA
[NFAs for the leaves: start → q0 −0→ q1, and start → q0 −1→ q1]
Now we have NFAs for each of the leaf nodes. Let us see how to construct the NFAs for the rest of
the nodes from these.
Suppose we have the NFAs of regular expressions a and b, and we want to obtain the NFA of
a.b. Since . is simply the concatenation operator, the new NFA should accept all strings that
satisfy both the NFAs in sequence. Such an NFA can be constructed by connecting the final state
of a and the starting state of b (and marking them as regular states) by an ϵ edge. This results in
a new NFA whose starting state is the starting state of a and final state is the final state of b. The
language of this NFA is simply a.b.
In our example, given the NFAs of 0 and 1, we want to construct the NFA of 0.1. Following
the above procedure, we obtain the NFA
[start → q0 −0→ q1 −ϵ→ q2 −1→ q3]
From the parse tree, the next node we need to construct an NFA for is (0.1)∗ .
The language accepted by a∗ is simply the set of strings in which strings in the language of a
are concatenated with themselves multiple (possibly 0) times. To obtain the NFA for a∗ , we take
the NFA of a, introduce a new state that is both starting and accepting, and draw ϵ edges from the
original final state to the new state and from the new state to the original starting state. We also
mark the original starting and accepting states as normal states. The correctness of this procedure
can be proved by induction.
[Figures: the NFAs for (0.1)∗ and (1.0)∗, each with a new state that is both starting and accepting, an ϵ-edge into the old start state, and an ϵ-edge from the old final state]
Now the next step in our recursive construction over the parse tree is to obtain the NFA for (0.1)∗
+ (1.0)∗.
The language accepted by a+b is simply the union of the language accepted by a and the lan-
guage accepted by b. So in order for a string to be accepted by the NFA of a+b, it must be
accepted by either the NFA of a or the NFA of b. Hence the required NFA can be obtained by
introducing a new start state, drawing ϵ edges from it to the starting states of the original NFAs,
and marking the original NFA starting states as normal states.
Applying this procedure to (0.1)∗ and (1.0)∗ , we get the NFA of (0.1)∗ + (1.0)∗
[Figure: the NFA for (0.1)∗ + (1.0)∗ — a new start state s with ϵ-edges to the start states of the two component NFAs q0 . . . q4 and r0 . . . r4]
Using the same procedures, we can construct NFAs for all the remaining nodes of the parse tree.
We observe that the size of the NFA is linear in the size of the original regular expression.
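The constructions just described can be written down compactly. Below is a small Python sketch of the bottom-up construction, representing an NFA as a triple (start, accept, edges) with "eps" marking ϵ-edges. Unlike the union construction in the notes, this sketch also merges the two accepting states into a fresh one — a harmless variant — and all names here are our own.

import itertools

counter = itertools.count()            # supply of fresh state names

def symbol(a):
    # NFA for a single letter: start --a--> accept
    s, f = next(counter), next(counter)
    return (s, f, {(s, a, f)})

def concat(n1, n2):
    # NFA for r1.r2: an eps-edge from the accept of n1 to the start of n2
    s1, f1, e1 = n1
    s2, f2, e2 = n2
    return (s1, f2, e1 | e2 | {(f1, "eps", s2)})

def union(n1, n2):
    # NFA for r1 + r2: a new start with eps-edges to both old starts;
    # here we also merge the two accepting states into a fresh one.
    s1, f1, e1 = n1
    s2, f2, e2 = n2
    s, f = next(counter), next(counter)
    new = {(s, "eps", s1), (s, "eps", s2), (f1, "eps", f), (f2, "eps", f)}
    return (s, f, e1 | e2 | new)

def star(n):
    # NFA for r*: one new state that is both start and accept, with an
    # eps-edge into the old start and one back from the old accept.
    s1, f1, e1 = n
    q = next(counter)
    return (q, q, e1 | {(q, "eps", s1), (f1, "eps", q)})

# Built bottom-up from the parse tree: (0.1)* + (1.0)*
nfa = union(star(concat(symbol("0"), symbol("1"))),
            star(concat(symbol("1"), symbol("0"))))
print("edges:", len(nfa[2]))   # linear in the size of the expression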
We now turn to the converse direction: extracting a regular expression from an automaton. Consider the following automaton as a running example.
[Figure: a 5-state automaton over {a, b} with states q0–q4 and edges labelled a and b]
Our strategy to obtain a regular expression is to gradually reduce the 5 state automaton to one
with fewer states, and keep repeating this process until we end up with a regular expression. To
perform this reduction, we relax a condition on NFAs; we now allow edges to be labelled by regular
expressions. An edge labelled by a regular expression simply means a path in which we consume a
string satisfying the regular expression.
Algorithm
Step 1: Create a new starting state and get rid of all the old ones. Connect this new state to
the old starting states using ϵ edges.
[Figure: the automaton after Step 1 — a new start state qs with ϵ-edges to the old start states]
Step 2: Create a new final state and get rid of the old ones. Connect all the old final states to the
new state using ϵ edges.
[Figure: the automaton after Step 2 — a new final state qf with ϵ-edges from the old final states]
Observe that our NFA now has one single initial state with no edges leading back to it, and one
single accepting state with no edges going out of it.
Step 3: Systematically remove all the states except for the initial and accepting states.
Suppose we choose state q2 to be removed first. This state was facilitating some strings’ paths
from the initial to the accepting state. In order to remove q2 , we must first create alternate paths
not involving q2 that allow these strings to reach the final state. To do this, we follow the approach
below,
1. Choose an incoming transition of q2 . In our example, suppose we choose the transition from
q0 to q2 on consuming a.
2. Now consider all the outgoing edges of q2 , say from q2 to qx . Our goal is to facilitate transitions
from q0 to qx without involving q2 . This can be done by introducing direct transitions from q0
to qx and labelling them with the regular expressions formed by concatenating label(q0 − q2 )
and label(q2 − qx ), and then removing the incoming edge.
Applying step 2 to q2:
Observe that q2 has a self-loop labelled a, b, which can be replaced by the regular expression
a + b. This means that while concatenating label(q0 − q2) and label(q2 − qx), we must also
insert the regular expression (a+b)∗ between the two labels.
(a) Outgoing edge: q2 to q0 on consuming b. We add a direct edge labelled a.(a+b)∗.b that
bypasses q2. The remaining outgoing edges of q2 are handled in the same way, producing
new direct edges labelled by concatenations such as a.(a+b)∗.a.
[Figures: the automaton after each of these edge additions, with the new edges labelled a.(a+b)∗.b and a.(a+b)∗.a]
The automata obtained by applying step 2 to each of the above transitions are illustrated on
the next page.
Now that we have added the above four transitions, strings from q0 no longer need q2 to reach
the final state, so we can remove the incoming edge from q0 to q2. We then repeat the same
procedure for all remaining incoming edges of q2. After this, q2 has no incoming edges and
hence cannot be part of any string's path, so it no longer contributes to the NFA and can be
removed along with all its outgoing edges.
[Figure: the automaton after removing the incoming edge from q0 to q2]
We repeat these steps for each remaining non-initial non-accepting state of the NFA, and
finally obtain the required regular expression.
To see how exactly the NFA becomes a regular expression, we illustrate the pre-final step
for an arbitrary NFA.
[Figure: start → q0 −e2→ q1 −e4→ q2, with a self-loop e3 on q1 and an edge labelled e1 that does not involve q1. Eliminating q1 leaves start → q0 −e2.e∗3.e4→ q2, with e1 unchanged.]
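This elimination step can also be expressed in code. The following Python sketch performs state elimination on a generalised NFA whose edges are labelled by regular-expression strings, assuming a unique start state with no incoming edges and a unique accepting state with no outgoing edges (as arranged in Steps 1 and 2); it writes union as + and star as ∗, matching the notes' notation. All names are our own.

def nfa_to_regex(states, edges, start, accept):
    # states : set of intermediate states (not containing start/accept)
    # edges  : dict (p, q) -> regex string labelling the edge p -> q
    # start/accept : unique initial / accepting states, with no
    #                incoming / outgoing edges respectively
    for r in list(states):
        loop = edges.get((r, r))
        mid = f"({loop})*" if loop else ""
        sources = [start] + [s for s in states if s != r]
        targets = [s for s in states if s != r] + [accept]
        for p in sources:
            for q in targets:
                a, b = edges.get((p, r)), edges.get((r, q))
                if a and b:
                    new = f"{a}{mid}{b}"
                    old = edges.get((p, q))
                    edges[(p, q)] = f"({old}+{new})" if old else new
        states.discard(r)
        edges = {k: v for k, v in edges.items() if r not in k}
    return edges.get((start, accept), "∅")

# The generic pre-final step above: eliminating q1 from
# q0 --e2--> q1 --e4--> q2, with self-loop e3 on q1, gives e2(e3)*e4.
print(nfa_to_regex({"q1"},
                   {("q0", "q1"): "e2", ("q1", "q1"): "e3",
                    ("q1", "q2"): "e4"},
                   "q0", "q2"))          # prints e2(e3)*e4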
To check whether two DFAs accept the same language, it suffices to check both inclusions:
1. L(DFA1) ⊆ L(DFA2)
2. L(DFA1) ⊇ L(DFA2)
Together these give L(DFA1) = L(DFA2).
But how do we check if a language L1 is a subset of another language L2 from just their DFAs? We
use the following algorithmic procedure.
We say that L1 ⊆ L2 when every string in L1 is also present in L2. So we can say
L1 ⊆ L2 ⟺ L1 ∩ (Σ∗ \ L2) = ∅
The language Σ∗ \ L2 is just the set of all strings not accepted by L2. To obtain its DFA, all we have
to do is invert the acceptance status of each state (accepting states become normal states and vice
versa). Also, since L2 is a regular language, Σ∗ \ L2 (denoted by Lc2 or L̄2) is also a regular language.
Now given the DFAs of L1 and Lc2 , we wish to construct the DFA of L1 ∩ Lc2 . This new DFA
should accept only those strings that are accepted by both L1 and Lc2 .
We achieve this by running both DFAs simultaneously on the same string, and checking if we
end up in a pair of accepting states. This is analogous to taking the Cartesian product of both
transition functions. Consider the following example:
[Figure: two 2-state DFAs over {0, 1}, with states q1, q2 and q3, q4 respectively, and their product DFA with states (q1, q3), (q2, q3), (q1, q4), (q2, q4)]
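The product construction is easy to implement. Here is a Python sketch that intersects two DFAs given as (states, delta, start, accepting) tuples. The alphabet {0, 1} is a default for brevity, and the example DFAs are our own stand-ins, not necessarily those of the figure above.

from itertools import product

def intersect_dfas(d1, d2, alphabet="01"):
    # Each DFA is (states, delta, start, accepting), where delta maps
    # (state, letter) -> state.  Run both DFAs in lock-step.
    states1, delta1, s1, acc1 = d1
    states2, delta2, s2, acc2 = d2
    delta = {}
    for p, q in product(states1, states2):
        for a in alphabet:
            delta[((p, q), a)] = (delta1[(p, a)], delta2[(q, a)])
    accepting = {(p, q) for p in acc1 for q in acc2}
    return (set(product(states1, states2)), delta, (s1, s2), accepting)

def accepts(dfa, w):
    _, delta, state, accepting = dfa
    for a in w:
        state = delta[(state, a)]
    return state in accepting

# d1 accepts strings ending in 1; d2 accepts strings containing a 0.
d1 = ({"q1", "q2"},
      {("q1", "0"): "q1", ("q1", "1"): "q2",
       ("q2", "0"): "q1", ("q2", "1"): "q2"}, "q1", {"q2"})
d2 = ({"q3", "q4"},
      {("q3", "0"): "q4", ("q3", "1"): "q3",
       ("q4", "0"): "q4", ("q4", "1"): "q4"}, "q3", {"q4"})
prod = intersect_dfas(d1, d2)
print(accepts(prod, "01"), accepts(prod, "11"))   # True False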
Chapter 5
DFA Minimisation
[Figure: a 2-state DFA — start state q0 with a self-loop on 0, an edge q0 −1→ q1, a self-loop on 1 at q1, and an edge q1 −0→ q0; q1 is accepting]
The above DFA has two states q0 and q1 . q0 is reached when the last seen letter was 0 (also at the
start). q1 is reached when the last seen letter was 1 (also, it is an accepting state).
We can even construct a 4-state DFA for the same language. Here is an example:
[Figure: a 4-state DFA with states q1, q2, q3, q4 accepting the same language]
Each state records which letter was seen last (for example, q4: last letter 1).
One can construct many more 4-state DFAs for the same language. Here is another example:
[Figure: another 4-state DFA, with states q1, q2, q3, q4, for the same language]
A natural question that comes to mind is whether we can construct a different 2-state DFA for this
language. One finds, through trial and error, that this is not possible.
Is a one-state DFA possible for this language? Assume that it is. Two cases arise: if that single
state is accepting, then the DFA also accepts ϵ, which is not allowed; if it is not accepting, then the
language is empty. Either way we arrive at a contradiction. Hence a one-state DFA is not possible
for this language.
Therefore, for this language, the minimum number of states in any DFA is 2. We also observe that
the number of such 2-state DFAs is 1. So we will now claim that for every language the number of
minimal DFAs is 1, and try to prove this. Before we prove this, we introduce the notion of
indistinguishability.
5.2 Indistinguishability
Two states qi and qj of a DFA are considered indistinguishable if for all w ∈ Σ∗, when we start at qi,
process w and reach qi′, and start at qj, process w and reach qj′, then either both qi′ ∈ F and qj′ ∈ F,
or both qi′ ∉ F and qj′ ∉ F, where F is the set of all final states of the DFA. So we are basically changing
the start states and checking whether, on every string, we reach the same type of state.
This relation is denoted by ≡. It has the following properties:
• It is reflexive: every state is indistinguishable from itself, since from a given state the DFA
reaches one particular state on seeing w (due to the deterministic nature of a DFA).
• It is also clearly symmetric.
• This relation is also transitive. We can prove this by contradiction. Assume that
(qi ≡ qj) ∧ (qj ≡ qk) but qi ̸≡ qk. Then ∃w such that qi′ ∈ F and qk′ ∉ F, where qi′ and qk′ are
the states we reach from qi and qk respectively on seeing w. From the equivalence of qi and qj
we have qj′ ∈ F, but from the equivalence of qj and qk we have qj′ ∉ F, where qj′ is the state
we reach from qj on seeing w. We have arrived at a contradiction; therefore the
relation is transitive.
• A relation which is reflexive, symmetric and transitive is an equivalence relation. Thus the states
of the DFA on which this relation is defined can be partitioned into equivalence classes.
In the above example (Figure 5.2), q1 and q3 belong to one equivalence class, and q2 and q4
belong to another. Let us now try to construct a 2-state DFA from the 4-state DFA example
(Figure 5.2). We choose one element from each of the two equivalence classes: say q1 from the
first class and q4 from the second.
[Figure: the resulting 2-state DFA — a self-loop 0 on q1, q1 −1→ q4, a self-loop 1 on q4, and q4 −0→ q1]
From q1, if we see a 0, we land at q1 itself. If we see a 1, we would have landed at q2, but since q2
and q4 are equivalent, we replace q2 by q4. Similarly, from q4, if we see a 1, we remain at q4, but
if we see a 0, we would have landed at q3; since q1 and q3 are equivalent, we replace q3 by q1.
Also, q2 and q4 were both accepting earlier; now q4 is accepting.
So, through these equivalence classes, we have minimised our 4-state DFA into a 2-state DFA, and
this 2-state DFA is structurally the same as the previous one (Figure 7.10), again suggesting that
the claim that there is a single minimal DFA for every language might be correct.
Another interesting observation is that an accepting state can never be indistinguishable from a
non-accepting state: take w = ϵ, and the pair of states reached (the states themselves) fails the
definition of the indistinguishability relation. However, an initial state and a non-initial state can
belong to the same equivalence class (for example, q1 and q3 above belonged to the same class).
Now several important questions arise. How can we find the equivalence classes of this relation?
When will we know that we cannot compress our DFA further (by compress, we mean reducing the
number of states of the DFA)? How can we prove our claim that the minimal DFA is unique?
We will answer all these questions subsequently, starting with the easiest one.
Suppose qs and qt are distinguishable via the string w = a1 a2 . . . an: processing w from qs leads
to qs′ and from qt leads to qt′, where exactly one of qs′, qt′ is accepting. Let qs′′ and qt′′ be the
states reached from qs and qt after consuming the first letter a1.
[Figures: the runs of w from qs to qs′ and from qt to qt′, split as a1 followed by a2 . . . an]
From the above figures, one can observe that qs′′ and qt′′ are also distinguishable, with w′ = a2 . . . an
being the string that makes them distinguishable.
Conversely, for all states qi and qj such that qi ̸≡ qj, and for every a ∈ Σ, if a state qs on seeing a
lands at qi and a state qt on seeing a lands at qj, then qs is distinguishable from qt. (We are basically
extending w′ to w = a · w′.)
Therefore, using one distinguishable pair, we have found another. This is the basis of our algorithm.
We initialise our set with all pairs in which one state is accepting and the other is not, and then,
through the above step, we keep increasing the size of this set. (Note that this algorithm is not
exponential, because there are only n(n − 1)/2 pairs possible, and we never need to re-examine an
already marked pair.)
[Table: a triangular table with one cell for each pair of states from q1, . . . , qn; distinguishable pairs are marked with X]
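The marking procedure can be implemented directly. The Python sketch below performs this table-filling algorithm; since the original 4-state figure is lost, the example transition table is our own plausible reconstruction of a 4-state DFA for "strings ending in 1" in which q1 ≡ q3 and q2 ≡ q4, as the notes describe.

from itertools import combinations

def distinguishable_pairs(states, delta, accepting, alphabet):
    # Initialise with all {accepting, non-accepting} pairs, then keep
    # marking: if delta(p, a) and delta(q, a) form a marked pair for
    # some letter a, mark (p, q) too.  Stop when nothing new is marked.
    marked = {frozenset((p, q)) for p, q in combinations(states, 2)
              if (p in accepting) != (q in accepting)}
    changed = True
    while changed:
        changed = False
        for p, q in combinations(states, 2):
            pair = frozenset((p, q))
            if pair in marked:
                continue
            for a in alphabet:
                if frozenset((delta[(p, a)], delta[(q, a)])) in marked:
                    marked.add(pair)
                    changed = True
                    break
    return marked

# A plausible 4-state DFA for "strings ending in 1" (accepting q2, q4)
delta = {("q1", "0"): "q1", ("q1", "1"): "q2",
         ("q2", "0"): "q3", ("q2", "1"): "q4",
         ("q3", "0"): "q1", ("q3", "1"): "q2",
         ("q4", "0"): "q3", ("q4", "1"): "q4"}
marked = distinguishable_pairs({"q1", "q2", "q3", "q4"}, delta,
                               {"q2", "q4"}, "01")
print(frozenset(("q1", "q3")) in marked,   # False: q1 ≡ q3
      frozenset(("q1", "q2")) in marked)   # True: distinguishable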
For correctness, suppose qs and qt are distinguishable via some string w, yet our algorithm has not
marked them.
[Figure: the runs of w from qs to qs′ and from qt to qt′]
Now w can be written as a1 a2 . . . an, where |w| = n. Assume that we reach qs′′ and qt′′ on
seeing a1 from qs and qt respectively. Clearly, if our algorithm has not detected qs and qt as
distinguishable, then it cannot have detected qs′′ and qt′′ as a distinguishable pair either (because
if it had, its very next step would have been to mark qs and qt as distinguishable).
[Figure: the runs split as a1 followed by a2 . . . an, through qs′′ and qt′′]
Proceeding inductively along the string, the states qs′′′ and qt′′′ reached after a1 a2 . . . an−1 would
also remain undetected as distinguishable when the algorithm finishes. But this is impossible: the
algorithm initialises its set with all pairs of {accepting, non-accepting} states, and since qs′′′ and
qt′′′ lead on an to exactly such a pair (qs′, qt′), the first backward step would already have marked
qs′′′ and qt′′′ as distinguishable.
[Figure: the runs split as a1 . . . an−1 followed by an, through qs′′′ and qt′′′]
We have arrived at a contradiction. Therefore, we can safely conclude that our algorithm terminates
and is also correct.
Once our algorithm has detected all pairs of distinguishable states, we can read off the equivalence
classes of the indistinguishability relation. (Then we can choose one representative from each class
and move forward with our proof of the existence of a unique minimal DFA.)
It is worth noting that distinguishability is not an equivalence relation: it is not reflexive, and it is
not transitive, though it is indeed symmetric. Nevertheless, it proves very useful for finding
these equivalence classes.
[Figure: automaton X, the reduced automaton obtained by our method, next to automaton Y, which accepts the same set of languages as X but is not obtained by our method]
Let us say we started with the automaton D, and after termination the resultant automaton is A.
Let SA represent the set of distinct languages of the states of A (the language of a state being the
set of strings accepted when starting from that state). Since every pair of states of A is
distinguishable, |SA| = the number of states of A.
Claim: Any DFA B equivalent to A has at least |SA| states.
Proof: If L ∈ SA, then by definition it is the language of some state of A, say s. Since all the
states of A are reachable, let w be a string through which we can reach s from the starting state of
A. Now run the same string w on the automaton B, and say we reach the state S. Then L is also
the language of S in B, because if some string x were in one of the two languages but not the
other, then w.x would be accepted by one of A, B and not by the other. Thus the language of S
in B is L.
So for every language in SA, we must have at least one state in B. Since each state has exactly one
language, this gives a lower bound, |SA|, on the number of states of B.
A achieves this lower bound on the number of states; hence it is a (the?) minimum-state DFA
representing the same regular language as D.
Now let B be any optimal (minimal) DFA equivalent to A; it cannot be shrunk further. That implies
two things: it has no unreachable states, and no pair of its states is indistinguishable. So the
languages of the states of B are all distinct. Since A and B are both optimal, both must have the
same number of states, all states of both are reachable, and all states of both have unique languages
(unique within a single DFA, not collectively).
Consider any state s of A, take a string w that takes us to it, and run w over B, say reaching the
state S; then s and S have the same language. Since the states of B are pairwise distinguishable,
S does not depend on the choice of w. This defines a function from the states of A to the states of B.
It is injective because the states of A (and of B) are pairwise distinguishable. The function is also
bijective because the numbers of states of A and B are finite and equal, making it both injective
and surjective, thus bijective.
[Figure: two 3-state DFAs, each with states q0, q1, q2 and edges labelled 0 and 1]
Figure 5.9: Two DFAs with corresponding states connected by dotted lines.
Claim: Under this function, the starting state of A is mapped to the starting state of B. This is
because the languages of these two states are precisely the languages of their respective DFAs, which
are the same (given).
Claim: An accepting state of A is mapped to an accepting state of B. Any state that has ϵ in its
language is an accepting state, by the definition of the "language of a state", and conversely an
accepting state has ϵ in its language. So whichever state of B an accepting state of A is mapped to,
that state must have ϵ in its language, making it an accepting state of B.
Claim: Consider two states of A, S1A and S2A, mapped to S1B and S2B respectively. If there is an
α-labelled edge from S1A to S2A, then there must be an α-labelled edge from S1B to S2B as well.
Consider any string w that takes us to S1A; then w.α takes us to S2A. Since S1A is mapped to S1B
and S2A is mapped to S2B, any string that takes us to a state of A takes us to its image when run
on B (why? proved above). Thus w takes us to S1B and w.α takes us to S2B. Because a DFA is
deterministic, there must be an α-labelled edge from S1B to S2B. (Food for thought: is mentioning
determinism important here?)
Isomorphism of labelled graphs:
An isomorphism is a vertex bijection which is both edge-preserving and label-preserving.
Isomorphism of DFAs:
An isomorphism is a vertex bijection which is both edge-preserving and label-preserving, where the
image of the starting state is the starting state and the image of an accepting state is an accepting state.
These three claims show that A and B are the same automaton (up to renaming of states).
This gives us a tool to check whether two different regular expressions represent the same language:
convert them into DFAs, then into minimal DFAs, and check whether the two minimal DFAs are
isomorphic. If they are, the regular expressions are equivalent; otherwise they are not.
(Isomorphism ⟺ equivalence.)
5.5.2 Relation between states of the minimal DFA and equivalence classes for ∼L
If a string w ends up in a state s of the DFA, then the language of that state is equal to [w]∼L.
This is easily proved using the definitions of the "language of a state" and the "language of a word":
if w.x ∈ L, then running w.x on the DFA we first reach state s via w, and x then takes us to some
accepting state, hence x is in the language of s. Conversely, if x is in the language of s, then w.x
ends in an accepting state, because w ends in state s.
A particular equivalence class represents a particular language, and a state represents a particular
language. Since in a minimal DFA all states represent distinct languages, we can define a bijection
from equivalence classes to the states of the minimal DFA, mapping each equivalence class to the
state that represents the same language as (any string in) that class.
If the language L is regular, then a minimised DFA A = (Q, Σ, δ, q0, F) can be defined for the
language L.
For this language L, let w1 and w2 be two words in Σ∗ such that w1 ∼L w2, and let qi and qj be
the states in Q reached after reading w1 and w2 respectively.
Since w1 ∼L w2 , it follows that for all x ∈ Σ∗ , the state reached in DFA A after reading w1 · x and
the state reached in DFA A after reading w2 · x will either both belong to F or both belong to Q \ F .
Otherwise, one word would be accepted by L while the other would not, leading to a contradiction
in the definition of the equivalence relation.
This demonstrates that if two words/strings belong to the same equivalence class with respect to L,
then both strings will end up in the same state q ∈ Q in the minimized DFA.
Till now, we have defined two equivalence relations ∼L over the language L and ≡ over the states of
a DFA characterizing a language L. We aim to define a relation between the number of equivalence
classes of both these relations.
Since they belong to different equivalence classes, we can say ∃ x ∈ Σ∗ s.t. w1 · x ∈ L and
w2 · x ∉ L (or vice versa; it does not matter).
[Figure: start q0 −w1→ q1 −x→ q3, and q0 −w2→ q2 −x→ q4]
• Here, q1 represents the equivalence class of w1 while q2 represents the equivalence class of w2.
• Doing this for every pair of equivalence classes, we find that the states representing the distinct
equivalence classes are all distinguishable from each other leading us to the following relation.
| ∼L | ≤ | ≡ |
i.e., the number of indistinguishability equivalence classes is at least the number of Nerode
equivalence classes.
• We begin with a single start state representing the equivalence class [ϵ].
• Then we pick a letter from the alphabet and look at the transitions from each of the existing states.
If the next word already lies in the equivalence class of one of the existing states, we draw the
arrow representing that transition; otherwise a new state is made, representing the
equivalence class of the newly formed word.
[Figures: from [ϵ], on 0 we reach a new state [0]. For the letter 1: if 1 lay in the class [0], the 1-edge from [ϵ] would return to [0]; here 1 does not lie in the equivalence class of 0, i.e. [0], so a new state [1] is created.]
• By repeatedly applying the previous step, the automaton keeps expanding, as below.
[Figure: start → [ϵ] −0→ [0] −1→ [01] −0→ [010], with a further state [1]]
• However, this construction is guaranteed to converge because we have already proved that
|∼L| ≤ |≡|. The number of equivalence classes of the indistinguishability relation equals the
number of states in the minimal DFA representing the language L. Since that number is finite,
|≡| is finite, and given |∼L| ≤ |≡|, it follows that |∼L| is finite. Consequently, the algorithm
converges. (If the algorithm did not converge, new states would keep on forming, contradicting
the |∼L| ≤ |≡| identity.)
• Finally, all those states whose equivalence classes contain words accepted by the language L
are marked as accepting states.
Hence, the aforementioned algorithm constructs a finite state automaton (DFA) which accepts
exactly the words accepted by the language L. Note that the number of states in the new automaton
is equal to |∼L|.
Now suppose |∼L| < |≡| held. Then our newly constructed automaton would have fewer states than
the minimised DFA for the language L. However, in the last lecture we proved that the DFA
constructed via the ≡ relation is the minimal one and is unique. This is a contradiction, making
our assumption wrong; hence |∼L| = |≡|.
This theorem provides an exact characterization of a regular language, unlike the Pumping Lemma.
While the Pumping Lemma does not guarantee that L is regular if it holds, here, if ∼L has a finite
number of equivalence classes, the language L must be regular. The proof of this theorem can be
found here.
Chapter 6
Pumping Lemma for Regular Languages
If L is a regular language, then there exists a finite DFA with a minimum number of states
(say p) that recognizes it. Let’s consider a string w of length at least p that is accepted by this DFA.
By the Pigeonhole Principle, we can deduce that when the DFA processes the string, it must revisit
at least one state, thereby implying the existence of a loop.
6.1 Example
Consider string w ∈ L, where L is a regular Language. Suppose the first state in DFA which is
revisited again on processing string w is q1 .
[Figure: start q0 −x→ q1, a loop y at q1, and q1 −z→ q3, so that w = xyz]
• If a language L is regular then Pumping Lemma holds true for L but if Pumping Lemma holds
true for Language L then it does not necessarily mean that L is regular.
• But if Pumping Lemma does not hold true for Language L then L is not regular.
• The believer chooses an integer p > 0 and claims this is the number of states in the DFA that
she believes recognises the language.
• The believer then splits w into three parts w = xyz, where |xy| ≤ p and |y| > 0.
[Figure: an oracle for membership in L — given the query "Is w ∈ L?", it answers yes or no]
Given such a membership oracle for a regular language L with pumping length n, we can decide
whether L is finite. Test all strings w with n < |w| ≤ 2n for membership in L. If we find even one
string that returns "yes", then L is infinite (the pumping lemma applies to such a string, and pumping
it up shows that an infinite sequence of strings is in L).
If all strings w with n < |w| ≤ 2n return "no", then we claim that there are no strings in L of length
more than 2n, and thus there are only finitely many strings in L.
Proof: Suppose, for contradiction, that L contains a string of length more than 2n. Applying the
pumping lemma to such a string w = xyz repeatedly, removing the loop y each time, decreases the
length by at least 1 and by at most n, so we eventually obtain a string in L of length between n
and 2n. This contradicts our assumption that there are no strings of that length in L.
Hence L is infinite if it contains at least one string of length at least n + 1.
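This test is straightforward to express in code. The sketch below decides infiniteness given only a membership oracle and the pumping length n; the brute-force enumeration makes it feasible only for tiny n, since it exists purely to mirror the argument above.

from itertools import product

def is_infinite(member, alphabet, n):
    # L is infinite iff it contains some string w with n < |w| <= 2n,
    # where n is the pumping length and `member` is the oracle.
    # Brute force: up to |alphabet|^(2n) queries.
    for length in range(n + 1, 2 * n + 1):
        for letters in product(alphabet, repeat=length):
            if member("".join(letters)):
                return True
    return False

# Toy oracle: L = strings over {0, 1} ending in 1, pumping length 2.
print(is_infinite(lambda w: w.endswith("1"), "01", 2))   # True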
6.5 Example
Consider the language L = {(0 + 1)∗ 0 1^k | k is prime} with alphabet Σ = {0, 1}.
The Pumping Lemma can be applied starting from any position in the string, as long as the length
of the portion considered remains ≥ p (the pumping length). In the case of this language, the
Pumping Lemma can indeed be applied to the substring 1^k, where k is any prime number.
Now we break w = 1^k into w = xyz such that y ̸= ϵ and |xy| ≤ n. Let |y| = m (m > 0); then
|xz| = k − m.
Consider the string a = x y^{k−m} z. Its length is |a| = |xz| + (k − m)|y| = (k − m) + (k − m)m =
(m + 1)(k − m).
As the length of a is not a prime (it has the factors m + 1 and k − m), we have a ∉ L.
Hence the language L is not regular.
Reverse Language
If Language L is regular then its reverse language LR is also regular.
To construct automaton for LR from that of L swap initial and final states and reverse the edges.
Chapter 7
Pushdown Automata
So if we equip the finite state automaton with more structure, such as a stack, it will be able to
accept non-regular languages too.
[Figure: a PDA with states q0, q1, q2 and a transition labelled 0, 0/01; two stack snapshots, Stack 1 and Stack 2, show the stack before and after the transition]
7.2 Description
Pushdown automata (PDA) are essentially NFAs equipped with a stack, which is maintained over
an alphabet that may differ from the input alphabet used for state transitions. PDAs have greater
accepting power than plain NFAs and can accept non-regular languages as well.
7.2.1 Conventions
Just like we described an NFA as (Q, Σ, q0, δ, F), we characterise a PDA as
(Q, Σ, Γ, q0, Z0, δ, F)
where:
• Γ is the set of symbols that are pushed/popped from the stack associated with the PDA
• q0 is the starting state of the PDA. There may be more than one starting state for the PDA.
• Z0 is the starting symbol in the stack. This is so that a symbol can be popped on the first
transition.
• δ is the transition function of the PDA. While the transition function of a plain NFA has the
form δ : Q × (Σ ∪ {ϵ}) → 2^Q, the transition function of a PDA has the form
δ : Q × (Σ ∪ {ϵ}) × Γ → 2^(Q×Γ∗), where Γ∗ is the set of strings over the stack alphabet Γ.
A single transition on a PDA from state q1 to q2 is represented by x, a/ba where x ∈ Σ is the original
symbol and may be ϵ, a is the top element of the stack and is popped off (it cannot be ϵ), and ba is
the string that is pushed onto the stack after a is popped off. The final top element of the stack is
b after this transition takes place.
This places a restriction on PDAs as compared to NFAs, that is, for any transition to occur, the top
element of the stack also has to match the top element specified in the transition.
7.2.2 Example
Let us characterize the non regular language {0n 1n |n ≥ 0} through a PDA with a finite number of
states, as shown in Figure 7.2. We define Γ = {z0 , a} as our stack alphabet with our PDA starting
with only z0 in the stack. a will represent a counter for the number of zeros in our string.
Notice that if a string deviates from the form 0m 1n the automaton goes to trap. For every 0 in the
string a is pushed onto the stack and hence the l(S) = m + 1 where l(S) is the length of stack. For
strings following the given structure, if m = n, then the string ends at an accepting state. If m > n,
the string goes to trap and if m < n, the string ends the run at a non accepting state.
We reiterate that a string is accepted if and only if the run does not halt before the string is completed
and the final state is an accepting state.
[Figure 7.2: the PDA for {0^n 1^n | n ≥ 0}, with states q0, q1, q2, q3 and a trap state; each 0 pushes an a, each 1 pops one, and ϵ-moves on z0 lead to the accepting state]
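The following Python sketch simulates a deterministic realisation of this PDA. Since the figure's exact details are partially lost, the state names q0, q1, q3 and the folding of the final ϵ-move (taken when z0 is on top) into the acceptance check are our assumptions.

def run_pda(w):
    # Deterministic simulation of the PDA for {0^n 1^n | n >= 0}.
    # Stack alphabet: z0 (bottom marker) and a (a counter for the 0s).
    stack, state = ["z0"], "q0"
    for c in w:
        if state == "q0" and c == "0":        # first 0: start counting
            stack.append("a")
            state = "q1"
        elif state == "q1" and c == "0":      # further 0s: push an a
            stack.append("a")
        elif state in ("q1", "q3") and c == "1" and stack[-1] == "a":
            stack.pop()                        # each 1 cancels one a
            state = "q3"
        else:
            return False                       # trap: the run halts
    # The final eps-move to the accepting state (taken when z0 is on
    # top) is folded into this check: accept iff all a's are cancelled.
    return state in ("q0", "q3") and stack == ["z0"]

for w in ["", "01", "0011", "001", "011", "10"]:
    print(repr(w), run_pda(w))
# '' True, '01' True, '0011' True, '001' False, '011' False, '10' False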
7.2.3 Example
Now let us extend our previous automaton to accept the language {0n 1m |n ≥ m ≥ 1} as shown in
Figure 7.3. Herein, we introduce a different notion of string acceptance, that is, if the stack becomes
empty upon completion of the run, we can say that the string is accepted. This is called acceptance
through empty stack.
1. The run should not halt partway: whenever some input remains to be read, there must exist
at least one entry in the transition function δ applicable to one of the possible current states,
the next character, and the corresponding stack top.
2. Upon completion of the run, either the stack should already be empty, or it should be possible
to empty it through a series of ϵ-transitions, for the string to be accepted.
3. There is no constraint on the final state: the string can be accepted through any state q ∈ Q.
This gives a new method of acceptance. We represent the language accepted by this method over
an automaton A as N (A) and the language accepted by our original method as L(A).
However, we do not yet know if the two methods of acceptance are equally powerful. We shall prove
this claim by showing the equivalence of the two methods over PDAs by transforming one into the
other and vice versa.
[Figure 7.3: the PDA extended to accept {0^n 1^m | n ≥ m ≥ 1} by empty stack, with additional states (including q5) and ϵ-transitions ϵ, z0/ϵ and ϵ, a/ϵ that drain the stack]
• In the first figure (Figure 7.2), L(A) = {0^n 1^n | n ≥ 0} but N(A) = ∅, as the stack never becomes empty.
1. If L(A1 ) is the language accepted by a PDA A1 , there exists another PDA A2 such that
L(A1 ) = N (A2 ).
2. If N (A1 ) is the language accepted by a PDA A1 , there exists another PDA A3 such that
N (A1 ) = L(A3 ).
We will show the existence of both of these by giving a construction which when applied to A1 gives
us A2 and A3 respectively.
To show L(A1) = N(A2), we need both acceptance (every string of L(A1) is in N(A2)) and rejection (N(A2) should not accept any string s that is not a part of L(A1)).
We shall assume A1 has only one starting state, and if not we can simply add extra ϵ-transitions to
reach any start state from our original start state.
Before this start state, we append a new start state s0 and have our stack start with x0 ̸∈ Γ ∪ {z0 }
and connect it to start with an ϵ-edge pushing z0 x0 to the stack. This is absolutely identical to the
original stack with the exception of an extra x0 at the bottom of the stack.
We will convert A1 to A2 without modifying the basic structure of A1 such that for any accepting
state in A1 , we add ϵ-transitions to a new final state f ′ , which does not take in any input word and
just pops the stack (x0 or z0 or γ ∈ Γ) without performing any additions onto it.
Acceptance
If a string s is accepted by A1, it reaches some f ∈ F on at least one of its possible runs on A1. If the
stack has at least one element upon reaching f, the string can go to f′ and there the stack can be
emptied through ϵ-transitions from f′ to itself, thus putting s in N(A2) as well.
If the stack became empty when the string s ran on A1 we will now have x0 as the only element left
in the stack (as no transition in the original automaton can pop x0 because x0 ̸∈ Γ ∪ {z0 }). Our
automaton proceeds to f ′ by popping x0 and is accepted as stack is emptied.
Rejection
A run of string s is rejected in L(A1 ) if-
1. Run successfully completes but the string is not in an accepting state
If the run successfully completes but the string is not in an accepting state, then the stack never
becomes empty because x0 will always be present.
7.4.1 Construction
Now, consider the following construction applied to P:
• Add a new start state p0 with an ϵ-transition to q0: on reading X0 on the stack, the machine
moves to q0 with Z0 X0 as the new stack contents (Z0 on top).
[p0 −ϵ, X0/Z0 X0→ q0]
• Introducing a new final state Pf and ϵ-transitions from all states in Q such that on reading
X0 on the stack transition happens to state Pf and an empty stack.
After the construction, the new automaton PF will be (Q ∪ {p0 , pf }, Σ, Γ ∪ {X0 }, δF , p0 , X0 , {pf })
whose acceptance is by final state.
[Figure: PF, with the new start state p0, the edge p0 −ϵ, X0/Z0 X0→ q0, and ϵ, X0/ϵ edges from every state of Q to the new final state pf]
If w is accepted by P, then the stack in P must have been emptied after consuming w, which
implies that the stack in PF now has X0 at its top. Taking the ϵ-transition reading X0 then
leads to pf, which is an accepting state of PF. Hence w is also in L(PF).
Conversely, in the PDA PF, the only transitions that can reach pf are ϵ-transitions reading X0.
Since w is in L(PF), its run must have taken one of these transitions with X0 at the top of the
stack, which implies that the stack in P was emptied by w. Hence w is also accepted by P.
Any language that can be represented using a DPDA can also be represented using an NPDA, i.e.,
DPDA ⊆ NPDA
But there are certain languages which can be represented only by NPDAs and not by DPDAs. Here
is an example:
L = {w w^R | w ∈ (0 + 1)∗}
If we try to represent L using a DPDA by pushing a copy of each input symbol onto the stack, one
has to guess every time whether the input has reached the end of w, so that popping can begin to
check the palindromic nature of the string. Hence non-determinism has to be involved to
represent L.
Hence,
DP DA ⊂ N P DA
In the above figure, if we observe the part of the run where the substrings xA . . . xB and xC . . . xD
are read, the behaviour is independent of the contents of the stack before xA; it depends only on
the state and the top of the stack when the symbol xA is being read.
Hence, a context-free language can be fragmented into finitely many sets (these sets need not be
finite), each identified by a pair (top of stack, state of automaton).
[Figure: a PDA with transitions q0 −0, Z0/X→ q1, a self-loop 0, X/XX on q1, q1 −1, X/ϵ→ q2, and a self-loop 1, X/ϵ on q2. The run of 0011: (q0, Z0) −0→ (q1, X) −0→ (q1, XX) −1→ (q2, X) −1→ (q2, ϵ).]
We can see how the string 0011 starts with the state q0 and Z0 initially on the top of the stack and
ends with an empty stack. Let us define a mathematical notation for the language that the string
0011 belongs to.
Lqi Z0 qj = set of all strings that start at the state qi with Z0 on the top of the stack and end
at the state qj with Z0 popped off the stack and the remaining stack unaltered
Note: The stack may have risen or fallen during the transitions but finally only Z0 is popped
off the stack and the remaining stack is unaltered
Lqi Z0 qj = { w | starting at state qi with Z0 on top of the stack, the PDA can consume w, pop Z0 off, leave the remaining stack unaltered, and end at state qj }
We take the union of the languages that start at q0, empty the stack, and end at any state of the
PDA A: N(A) = ⋃ over q ∈ Q of Lq0 Z0 q.
[The self-loop 0, X/XX at q1]
If we consider a string w whose first letter is 0, then this transition is taken and another X is added
to the stack. Now w would have to pop two X's from the stack and arrive at the state q2 for w to
belong to the language Lq1 Xq2. This can be written as
Lq1 Xq2 ⊇ 0 · Lq1 Xq0 · Lq0 Xq2 ∪ 0 · Lq1 Xq1 · Lq1 Xq2 ∪ 0 · Lq1 Xq2 · Lq2 Xq2
[The transition q1 −1, X/ϵ→ q2]
If the first letter of w is 1, this transition consumes it and pops the X directly, so we can say
Lq1 Xq2 ⊇ 1
We can keep doing this for all transitions from q1 and get a super set recurrence relation for Lq1 Xq2
We will be particularly interested in finding these smallest languages satisfying the relation
NOTE: Σ∗ is always a solution(largest) of the super set recurrence relation since it contains all
strings
Putting these together, L1 ⊇ 1 ∪ 0 · L2 · L3 ∪ 0 · L4 · L1 ∪ 0 · L1 · L5,
where L1 = Lq1 Xq2, L2 = Lq1 Xq0, L3 = Lq0 Xq2, L4 = Lq1 Xq1 and L5 = Lq2 Xq2.
Similarly, we can do the same for each of the languages Li, i ∈ {1, 2, . . . , n²k}. This will form a
context-free grammar.
With proper reductions, we can say that the context-free grammar generated for the language
L = {0^n 1^n | n ≥ 1} is:
S → 0S1 | 01
We observe that the universal language Σ∗ = L((0+1)∗) also satisfies these superset relations, but
the language L = {0^n 1^n | n ≥ 1} is the minimal language that satisfies them.
Chapter 8
Context Free Grammar
Definition: A context-free grammar (CFG) is a formal grammar whose production rules can
be applied to a nonterminal symbol regardless of its context. In particular, in a context-free
grammar, each production rule is of the form V → (V ∪ T )∗ . Where V is set of Non-Terminal
and T is set of Terminals.
Formally, a context-free grammar can be represented as follows -
G = (V, Σ ∪ {ϵ}, P, S)
where
V - is the set of non-terminals. These are symbols that can be replaced or expanded.
Σ ∪ {ϵ} - is the set of terminals. These are symbols that cannot be replaced or expanded further.
P - are the set of production rules
S - is the start symbol (S ∈ V )
Again, a grammar is said to be a context-free grammar only if every production is of the form
A → (V ∪ T)∗, where A ∈ V.
S → A.S | ϵ
A → A1 | 0A1 | 01
Here, S and A are non-terminal symbols representing languages. ϵ is a special symbol representing
an empty string and is in the language S. The symbols 0 and 1 are terminal symbols and are in
the language A. The set of strings that can be generated using a context-free grammar is called a
context-free language. This language of A contains strings that are of the form 01 or A1 or 0A1, by
just substituting all possible strings of A in the form recursively
A→
− 01 | 011 | 0011 | 0111 | 00111 | ...
Thus, Language derived from A can be intutively be told as
0i 1j , where j ≥ i ≥ 1
[A transition q1 −0, X/XY→ q2]
If we denote the languages by L1, L2, . . . , Ln²k, where n is the number of states and k is the number
of stack symbols, then we can write the above equation as a recurrence relation for all the languages.
For example:
L1 → 1 | 0 · L2 · L3 | 0 · L4 · L1 | . . .
L2 → . . .
...
Ln²k → . . .
This set of recurrence relations is called a context-free grammar. Hence, given a PDA, we can write the
language accepted by the PDA by an empty stack as a CFG.
[Parse tree for 01101: the root S expands as S → AS; the A derives 011 via A → A1 and A → 01, and the inner S expands as S → AS with A → 01 and S → ε]
As we can see, the leaves form 01101ε, from left to right, which is exactly the string we wanted.
Hence, 01101 ∈ LS , using the production rules in the grammar.
While drawing such trees, we must ensure that the root is the start symbol, and that at any node we
expand, we use only production rules whose left-hand side is that node's symbol. For
example, to expand a node labelled A, we may only use production rules of the form A → . . . .
For this particular string and grammar, only one parse tree is possible. Let us now consider the
string 00111 ∈ LS .
[Two distinct parse trees for 00111: in one, A ⇒ A1 ⇒ (0A1)1 ⇒ (0 · 01 · 1) · 1; in the other, A ⇒ 0A1 ⇒ 0 · (A1) · 1 ⇒ 0 · (01 · 1) · 1]
As we can see, these are two completely different, and correct parse trees for the string in the same
grammar. Such grammars, where there exist multiple parse trees for some strings, are called am-
biguous grammars. Hence, the grammar we have defined is indeed ambiguous.
Are there CFLs for which every CFG representing it are ambiguous grammars? Yes, there indeed
are such CFLs. Such languages are called inherently ambiguous languages. If there exists even one
grammar for the CFL which is unambiguous, then the language is an unambiguous language as well.
Is LS inherently ambiguous or is it an unambiguous language?
The answer is that it is not inherently ambiguous. The intuition is that, we must force our grammar
to first match up all the zeros in the strings with ones, and following that, add the terminating
ones. Our current grammar puts no such restriction. We could first add zeros and ones to the start
and end, add ones to the end, and then go back to adding zeros and ones to the start and end.
This leads to multiple parse trees. An example of an unambiguous CFG representing LS is
S → C · S | ε
C → A · B
A → 0 · A · 1 | 01
B → 1 · B | ε
As we can see, A does the job of adding zeros and ones to the start and end, B does the job of
creating strings of ones, and C does the job of concatenating A and B.
This is important, because every programming language is specified using CFGs, for which we can
build compilers and interpreters, and it is important for these CFGs to be unambiguous since we do
not want multiple interpretations of a program.
S → AS | ϵ
A → A1 | 0A1 | 01
Where,
Starting Symbol: S
Production Rules: S → AS, S → ϵ, A → A1, A → 0A1, A → 01
Non-terminals: {S, A}
Terminals: {0, 1}
where Σ represents the input alphabet of the PDA which we are going to construct and Γ represents
the stack alphabet.
Note that we have exactly as many stack symbols as the total number of terminals and non-terminals
in our CFG, each symbol corresponding to one terminal or non-terminal.
We will start constructing our PDA by adding some edge(s) to a single node, pushing or popping
some letter from the stack while constructing derivation tree/ Parse tree by using each production
rule.
Step 1:
We initialise our stack by pushing S, since S is our start symbol. We write the stack with the bottom
on the left and the top on the right.
Stack = S
Step 2:
We use the rule S → AS and add the self-loop edge ϵ, S/AS to our single node: S may be replaced
by AS wherever it appears, as defined in our CFG's production rules. We pop S from the stack and
push S, A — so A is on top (here A and S are from Γ).
Stack = S, A
Step 3:
We use the rule A → A1 and add the edge ϵ, A/AX1 (here X1 is the stack symbol corresponding to
the terminal 1): A may be replaced by A1 anywhere while forming a string. We pop A from the
stack and push X1, A — A on top (here A and X1 are from Γ).
Stack = S, X1, A
Step 4:
We use the rule A → 01 and add the edge ϵ, A/X0X1 (here X0 and X1 are the stack symbols
corresponding to the terminals 0 and 1). We pop A from the stack and push X1, X0 — X0 on top
(here X0 and X1 are from Γ).
Stack = S, X1, X1, X0
Step 5:
When a terminal's stack symbol comes on top of the stack, we read that terminal from the input
and pop the symbol off; when a non-terminal is on top, we apply one of the given production rules.
For this pop operation we add the edges 0, X0/ϵ and 1, X1/ϵ to our single node, and we keep popping
terminals until we see a non-terminal.
Stack = S
Step 6:
We now use the rule S → ϵ and add the edge ϵ, S/ϵ to our single node. We pop S from the stack,
which becomes empty again.
Stack = (empty)
[Figure: the single-state PDA with edges ϵ, S/AS; ϵ, A/AX1; ϵ, A/X0X1; 0, X0/ϵ; 1, X1/ϵ; ϵ, S/ϵ]
The stack drives a depth-first (leftmost) traversal of the derivation tree: we first visit all the nodes
on the left and then all the nodes on the right.
To conclude, given any context-free grammar we can, by the simple steps described above, construct
a PDA that recognises exactly the same language as the given CFG.
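The construction can be tested with a short simulation. The Python sketch below simulates the single-state PDA for the grammar above by recursion on the stack contents, trying every production to mimic non-determinism. The MIN_LEN pruning table is our own addition, needed only so that the left-recursive rule A → A1 does not cause infinite regress.

# Grammar from the construction above: S -> AS | eps, A -> A1 | 0A1 | 01
RULES = {"S": ["AS", ""], "A": ["A1", "0A1", "01"]}
# Minimum number of terminals each stack symbol must eventually yield;
# used only to prune hopeless branches so the search terminates.
MIN_LEN = {"S": 0, "A": 2, "0": 1, "1": 1}

def derives(stack, w):
    # Can the single-state PDA, with `stack` (top first, as a string),
    # consume exactly `w` and empty its stack?  Non-determinism is
    # simulated by trying every production rule in turn.
    if sum(MIN_LEN[x] for x in stack) > len(w):
        return False
    if not stack:
        return w == ""
    top, rest = stack[0], stack[1:]
    if top in RULES:                      # an edge eps, top/rhs
        return any(derives(rhs + rest, w) for rhs in RULES[top])
    return w[:1] == top and derives(rest, w[1:])   # edge top, X/eps

for w in ["", "01", "0101", "00111", "10"]:
    print(repr(w), derives("S", w))
# '' True, '01' True, '0101' True, '00111' True, '10' False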
S → ABS | 0A1
A → 1A0 | D | 01 | ϵ
B → 1B | BB0
C → A0 | 01
D → S | 0AD
Where,
Starting Symbol: S
Non-terminals: {S, A, B, C, D}
Terminals: {0, 1, ϵ}
• While "B" depends on itself, and "A" depends on itself and "D", we ultimately find that "D"
relies only on "A", "S", and itself.
• Therefore, knowing the strings accepted by "C" is unnecessary for generating strings from "S".
In other words, even without the production rules for "C", "S" can still generate valid strings
by appropriately utilising "A", "B", and "D".
We can further simplify the grammar by identifying and removing another redundant non-terminal
symbol: "B". Analyzing the production rules for "B", we observe two key characteristics:
• Absence of terminal productions: none of "B"'s production rules generate strings
consisting solely of terminal symbols.
• So the language generated by the non-terminal symbol "B" is empty (it cannot produce any
finite-length string, since it never terminates), and we can safely remove all production rules
that involve "B" from the context-free grammar.
Now that we’ve removed the unnecessary rules, here’s the updated CFG:
S → 0A1
A → 1A0 | D | 01 | ϵ
D → S | 0AD
S → 0A1 | 01
A → 1A0 | D | 01 | 10
D → S | 0AD | 0D
Let’s first remove the rule D → S by doing the changes described above,
S → 0A1 | 01
A → 1A0 | D | S | 01 | 10
D → 0AD | 0AS | 0D | 0S
After this modification, our CFG has two more unit productions, A → S and A → D, so let us
remove them as well.
8.3.4 Reducing further to get at least two non-terminals on the right-hand side or
a single terminal
Just replace the terminals 0 and 1 with their corresponding symbols "X0" and "X1" in the CFG
and add the corresponding new rules.
8.3.5 Reducing further to get exactly two non-terminals on the RHS of any pro-
duction rule which has > 2 non-terminals on its RHS
If a production rule has 3 non-terminals on its right-hand side, we can reduce it to
exactly 2 non-terminals as follows.
S → X0 S1 | X0 X1
S1 → AX1 | DX1 | SX1
A → X1 A1 | X0 X1 | X1 X0
A1 → AX0 | DX0 | SX0
D → X0 D1 | X0 D | X0 S
D1 → AD | AS | DD | DS | SD | SS
X0 → 0
X1 → 1
The CFG written above is said to be in Chomsky normal form (defined below).
Every context-free grammar can be expressed in Chomsky normal form, by applying the rules
described in the previous sections, while accepting exactly the same set of strings as the original
CFG.
Due to the restricted nature of the CNF form of a CFG (at most 2 symbols on any right-hand side),
the derivation tree is a binary tree.
A binary tree of height n has at most 2^n leaves, but here we take |s| > 2^n, so
the height of the derivation tree must be > n. Since we have only n non-terminals in our CFG,
by the pigeonhole principle some non-terminal "N" must repeat at least twice on a root-to-leaf path.
We split up s into 5 parts as shown in the figure (8.1),
s = u.v.w.x.y
Figure 8.2: Examples of some possible derivation trees from figure (8.1)
• Using the property of derivation trees (of a CFG), whatever can be generated from the
first "N" can also be generated by the second "N", and vice versa.
• So after encountering an "N" in our derivation tree, we can choose to generate a new "N", or
terminate after some productions by choosing an "N" which does not produce another "N".
• So, to formally state the pumping lemma for CFLs, we need constraints such as: the
sub-derivation tree of "N" should not repeat any non-terminal, which gives |vwx| ≤ 2^n.
8.4.1 Lemma:
If a language L is context-free, then there exists some integer n ≥ 1 (the number of non-terminals in G)
such that every string s in L that has a length of 2^n or more symbols (i.e. with |s| ≥ 2^n) can be
written as
s = uvwxy,
with substrings u, v, w, x, and y, such that:
1. |vx| ≥ 1,
2. |vwx| ≤ 2^n, and
3. u v^i w x^i y ∈ L for all i ≥ 0.
Example: Consider the language L = {0^n 1^n 2^n | n ≥ 0} over Σ = {0, 1, 2}. We now prove that
L cannot be a CFL.
Suppose it were a CFL having n distinct non-terminals. Define k = 2^n and take the large string
w = 0^{2k} 1^{2k} 2^{2k} ∈ L.
Now, as |vwx| ≤ 2^n = k, the substring v w x lies either completely inside 0^{2k}, 1^{2k} or 2^{2k},
or it straddles 0^{2k} 1^{2k} or 1^{2k} 2^{2k}.
In every case we can pump the string to get a word which is not in L (each case increases the count
of some but not all of the letters), yet the lemma says it should be in L. Therefore L is not a CFL.
8.5.2 Intersection
Context-free languages are not closed under intersection. For example, L1 = {0^n 1^n 2^m | n, m ≥ 0}
and L2 = {0^m 1^n 2^n | n, m ≥ 0} are both CFLs, but L1 ∩ L2 = {0^n 1^n 2^n | n ≥ 0}, which we
just showed is not context-free.
8.5.3 Complement
Suppose CFLs were closed under complementation. Then for any two CFLs L1, L2, the complements
L̄1 and L̄2 would be CFLs. Since CFLs are closed under union, L̄1 ∪ L̄2 would be a CFL, and again
by hypothesis its complement would be a CFL. But the complement of L̄1 ∪ L̄2 is L1 ∩ L2, so CFLs
would be closed under intersection — a contradiction! Thus CFLs are not closed under complement.
As another example, L = {x | x is not of the form ww} is a CFL, but its complement
L̄ = {ww | w ∈ {a, b}∗} is not a CFL. Thus CFLs are not closed under complement.
8.5.4 Substitution
Context-free languages are closed under substitution: replacing every symbol by a context-free
language yields a language that is still context-free.
Given the grammar G: S → 0S0 | 1S1 | ε, and the substitution h: 0 → aba, 1 → bb,
the rules of G′ such that L(G′) = h(L(G)) are:
S → X0 S X0 | X1 S X1 | ε
X0 → aba
X1 → bb
Thus, CFLs are closed under substitution.
Chapter 9
Turing Machine
[Figure: a Turing machine head over an infinite tape containing 0 0 1 b b; the head can move L or R]
Turing machines are a fundamental model of computation that can simulate the logic of any computer
algorithm, regardless of complexity. They are composed of:
1. Tape: The machine has an infinitely long tape divided into cells, each of which can contain a
symbol from a finite alphabet.
2. Symbols: The tape cells contain symbols from a finite set; b is used to represent a 'blank' cell
(somewhat like NULL).
3. Head: The machine has a head that can read and write symbols on the tape and move the
tape left or right one cell at a time.
4. Transition function: The machine uses a transition function that dictates the machine’s
actions based on its current state and the symbol it reads on the tape. Actions include writing
a symbol, moving the tape left or right, and changing the state.
1. Pushing into the stack is equivalent to moving to the right and writing the symbols to be
pushed on the tape.
2. Popping off the stack is equivalent to moving to the left and writing blank symbols to the tape.
The problems which can be solved using a deterministic Turing machine in a number of steps
polynomial in the length of the input are in P, and those which can be solved similarly but using a
non-deterministic Turing machine are in NP.
[Figure: a tape containing 0 1 1 0 1 0 surrounded by blanks, with the head position marked]
[Example: a transition q5 −0/1, R→ q9 reads 0, writes 1 and moves right; a transition q9 −1/0, L→ q5 reads 1, writes 0 and moves left]
For simplicity, let us call the part of the tape to the left of the head α and the part from the head
onwards β, and say the state is qi; then we can represent the configuration as α qi β.
If α0 q0 β0 ⊢∗ αi done βi (where done is a halting state), we have reached the end of the computation.
Note: the '∗' on ⊢ indicates multiple steps.
[A two-state TM: in q0, the transition 0/0, R loops, b/b, L loops, and 1/1, R moves to q1]
Let us look at the computation of the string 0, where the initial configuration is b q0 0.
[Tape: . . . b 0 b . . . , head on the 0]
Now the Turing machine scans the cell '0', writes back '0', moves right (to a blank cell), and stays
in the same state q0. So now the configuration becomes 0 q0 b.
[Tape: . . . b 0 b . . . , head on the blank to the right of the 0]
This time, the Turing machine scans the cell 'b', writes back 'b', moves left (back to the '0'), and
stays in the same state q0, because of which the configuration changes back to b q0 0.
[Tape: . . . b 0 b . . . , head back on the 0]
As we can see, the machine does not halt on this input string.
Now consider the input string 1.
[Tape: . . . b 1 b . . . , head on the 1]
The Turing machine scans the cell '1', writes back '1', moves right, and transitions to state q1.
Hence the configuration switches to 1 q1 b.
[Tape: . . . b 1 b . . . , head on the blank to the right of the 1]
As we know from the transition diagram of the Turing machine, there is no move out of q1. So the
machine halts on the input string 1, whereas it does not halt on 0.
We notice that certain strings cause the Turing machine to halt, while others keep it running forever.
This concept helps us define what it means for a Turing machine to "accept" a string.
In essence, Turing machines provide us with a way to describe different languages.
• We can also talk about the language a TM accepts using the notion of accepting and non-accepting
states. In this case, we accept a string if the TM ever enters an accepting state during its
operation on the string as input.
• Acceptance by halting: a string is accepted if the Turing machine halts its operation on it
in finite time. We will primarily be interested in this kind of acceptance.
• Acceptance by final state: a string is accepted if, during the operation of the Turing
machine on it, a final state is reached at some point in the journey.
It is not very difficult to convert the notion of acceptance by final state to that by halting. Once
you reach a final state, move on to a state whose only job is to finish parsing (similar to emptying
a stack with a PDA). Moreover, we must also ensure that the TM does not halt in a non-accepting
state. First, let us consider when a TM would halt on a non-accepting state:- when the TM has no
outgoing transition for a particular action/input:
[Figure: a TM with states q0, q1, q2 and transitions such as 0/1, L; 1/1, R; 0/0, R; 0/0, L; b/b, R — in which some (state, symbol) pairs have no outgoing transition, so the machine can halt in a non-accepting state]
As a side note, consider the string 0 1 0. Whether the TM described above terminates on this string
depends on where the head starts on the tape. So we adopt a standard for describing the language
of a TM: the input string lies on the tape and the head starts at the leftmost character of the string.
If the TM halts when started from this configuration, then we say that the string lies in the
language of the TM.
[Figure: the same machine modified so that non-accepting halts are removed — extra transitions such as b/b, R and 1/1, R route those cases into looping (trap) behaviour]
So, the process for conversion seems straightforward now: remove the transitions going out of the
final state (in this case, we removed the final state's self-loop), and if some string would halt at
some other state, make that state non-halting by routing it into a trap state.
A slight note: the typical notion of acceptance by final state accepts a string only if, after reading
the entire string, the machine is in an accepting state. We use a minor modification: once a final
state is reached at any point in the journey, the machine reads the rest of the string and stays in
that state. Had this modification been applied to things we studied earlier, such as DFAs, it would
have created a problem, since every string having an accepted string as a prefix would itself be
accepted (there could be other accepted strings too, but at least these would be accepted). In the
case of a TM, this does not cause a problem, due to 2 reasons:
So, in the case of a TM, it is not necessary that strings formed by concatenating something with an
accepting prefix will be accepted.
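To tie the chapter together, here is a small Python sketch of a TM simulator using acceptance by halting. Since halting is undecidable in general, the simulator uses a step bound as a practical cut-off; the transition table encodes the two-state machine from the example above.

def run_tm(delta, start, tape_input, max_steps=1000):
    # Simulate a single-tape TM accepting by halting.  delta maps
    # (state, symbol) -> (new_state, write, move) with move in {L, R};
    # a missing entry means the machine halts.  'b' is the blank.
    tape = {i: c for i, c in enumerate(tape_input)}
    state, head = start, 0
    for _ in range(max_steps):
        symbol = tape.get(head, "b")
        if (state, symbol) not in delta:
            return True                   # machine halts: accepted
        state, write, move = delta[(state, symbol)]
        tape[head] = write
        head += 1 if move == "R" else -1
    return False                          # step budget exhausted: loop

# The two-state machine from the example: loops forever on 0s and
# blanks in q0; after reading a 1 it enters q1, which has no moves.
delta = {("q0", "0"): ("q0", "0", "R"),
         ("q0", "b"): ("q0", "b", "L"),
         ("q0", "1"): ("q1", "1", "R")}
print(run_tm(delta, "q0", "1"))   # True: halts in q1
print(run_tm(delta, "q0", "0"))   # False: bounces forever (cut off)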