
CS 208 : Automata Theory and Logic

Spring 2024

Instructor : Prof. Supratik Chakraborty



Disclaimer
This is a compiled version of class notes scribed by students registered for CS
208 (Automata Theory and Logic) in Spring 2024. Please note this document
has not received the usual scrutiny that formal publications enjoy. This may
be distributed outside this class only with the permission of the instructor.
Contents

1 Propositional Logic
1.1 Syntax
1.2 Semantics
1.2.1 Important Terminology
1.3 Proof Rules
1.4 Natural Deduction
1.5 Soundness and Completeness of our proof system
1.6 What about Satisfiability?
1.7 Algebraic Laws and Some Redundancy
1.7.1 Distributive Laws
1.7.2 Reduction of bi-implication and implication
1.7.3 DeMorgan's Laws
1.8 Negation Normal Forms
1.9 From DAG to NNF-DAG
1.10 An Efficient Algorithm to convert DAG to NNF-DAG
1.11 Conjunctive Normal Forms
1.12 Satisfiability and Validity Checking
1.13 DAG to Equisatisfiable CNF
1.14 Tseitin Encoding
1.15 Towards Checking Satisfiability of CNF and Horn Clauses
1.16 Counterexample for Horn Formulas
1.16.1 Example
1.17 Davis–Putnam–Logemann–Loveland (DPLL) Algorithm
1.18 DPLL in action
1.18.1 Example
1.19 Applying DPLL Algorithm to Horn Formulas
1.20 DPLL on Horn Clauses
1.21 Rule of Resolution
1.21.1 Completeness of Resolution for Unsatisfiability of CNFs

2 DFAs and Regular Languages
2.1 Definitions
2.2 Deterministic Finite Automata
2.3 DFA Design - Example 1
2.4 DFA Design - Example 2

3 Non-Deterministic Finite Automata (NFA)
3.1 NFA Representation
3.1.1 Formalization of Non-Deterministic Finite Automata
3.1.2 NFA into action
3.2 Equal expressiveness of DFA and NFA
3.2.1 Construction of DFA from NFA
3.2.2 Step Wise Conversion from NFA to DFA
3.2.3 Proof of Equivalence
3.3 Reflective Insights
3.4 Proof of Equivalence – Correctness
3.4.1 Claim
3.4.2 Proof
3.5 Compiler Special Case – Lexical Analyzer
3.6 NFA with ϵ-edges
3.6.1 Equivalence with DFA
3.7 Recap
3.7.1 Converting ε-NFA to non-ε-NFA
3.7.2 Correctness of Algorithm
3.7.3 Intuition of the algorithm
3.7.4 Extras
3.7.5 Significance of ε-edges
3.7.6 Examples
3.8 Equivalence in Finite Automata
3.9 DFA definition
3.9.1 Combinations of DFAs
3.9.2 Combinations of NFAs
3.9.3 Closure Properties
3.10 Substitution
3.10.1 Infinite Languages

4 Regular Expressions
4.1 Introduction
4.2 Formal Definition of a Regular Expression
4.3 Semantics of Regular Languages
4.3.1 Atomic Expressions
4.3.2 Union Operation
4.3.3 Concatenation Operation
4.3.4 Example
4.3.5 Order of Precedence
4.3.6 Kleene Star
4.3.7 Example
4.3.8 Further Examples
4.4 Kleene's Theorem
4.4.1 Part 1: L(Reg. Ex.) ⊆ L(NFAs with ϵ-edges)
4.4.2 Part 2: L(Reg. Ex.) ⊇ L(NFAs with ϵ-edges)
4.4.3 Checking Subsethood of Languages

5 DFA Minimisation
5.1 Minimum States in a DFA
5.2 Indistinguishability
5.3 Equivalence classes of Indistinguishability relation
5.4 Further Analysis
5.4.1 Optimality of Acquired DFA
5.4.2 Uniqueness of Acquired DFA
5.5 From states to words
5.5.1 Language of word
5.5.2 Relation between states of minimal DFA and equivalence classes for ∼L
5.6 Setting up the Parallel
5.6.1 Can |∼L| > |≡|?
5.6.2 Can |∼L| < |≡|?
5.7 Myhill-Nerode Theorem

6 Pumping Lemma for Regular Languages
6.1 Example
6.2 Formal Statement of Pumping Lemma
6.3 Pumping Lemma as Adversarial Game
6.4 If a language L is regular and the number of states in a DFA for L is n, is L infinite?
6.5 Example

7 Pushdown Automata
7.1 Pushdown Automata for non-regular Languages
7.2 Description
7.2.1 Conventions
7.2.2 Example
7.2.3 Example
7.3 Acceptance through empty stack
7.3.1 For any automaton A, L(A) is not necessarily equivalent to N(A)
7.3.2 Construction for A2 such that L(A1) = N(A2)
7.4 From Empty Stack PDA to Final State PDA
7.4.1 Construction
7.4.2 Required to show N(P) = L(PF)
7.5 Context Free Languages
7.5.1 DPDA and NPDAs
7.5.2 Build-up and emptying of the stack of a PDA
7.6 Acceptance by PDA
7.6.1 Run on the PDA
7.6.2 Recurrence Relations on Languages
7.7 More on recurrence relations
7.7.1 Smallest and Largest Languages
7.7.2 Context Free Grammar

8 Context Free Grammar
8.1 Context-free Grammar
8.1.1 PDA to CFG
8.2 Parse Trees and Ambiguous Grammars
8.2.1 CFG to PDA conversion
8.3 Cleaning Context-free Grammar
8.3.1 Eliminating Useless Symbols
8.3.2 Eliminating ϵ-Productions
8.3.3 Eliminating Unit Productions
8.3.4 Reducing further to get at least two non-terminals or a single terminal on the right-hand side
8.3.5 Reducing further to get exactly two non-terminals on the RHS of any production rule which has more than two non-terminals on its RHS
8.3.6 Chomsky Normal Form (CNF)
8.4 Pumping Lemma for Context-free Languages
8.4.1 Lemma
8.5 Closure Properties of CFL
8.5.1 Union & Concatenation
8.5.2 Intersection
8.5.3 Complement
8.5.4 Substitution

9 Turing Machine
9.1 Configuration of the Turing Machine
9.2 Language described by a TM
9.2.1 Interconversion between acceptance by final state and by halting
Chapter 1

Propositional Logic

In this course we look at two views of computation: a state-transition view and a logic-centric view. In this chapter we begin with the logic-centric view, with a discussion of propositional logic.
Example. Suppose there are five courses C1, . . . , C5, four slots S1, . . . , S4, and five days D1, . . . , D5. We plan to schedule each course in three slots, but we also have the following requirements:

• For every course Ci, the three slots should be on three different days.

• Every course Ci should be scheduled in at most one of S1, . . . , S4.

• For every day Dj of the week, at least one slot should be free.

[The original notes show an empty grid here, with days D1–D5 as columns and slots S1–S4 as rows.]

Propositional logic is used in many real-world problems like timetable scheduling, train scheduling, airline scheduling, and so on. One can capture a problem as a propositional logic formula. This is called encoding. After encoding the problem, one can use various software tools to systematically reason about the formula and draw conclusions about the problem.

1.1 Syntax
We can think of logic as a language which allows us to describe problems very precisely and then reason about them. In this language, we write sentences in a specific way. The symbols used in propositional logic are given in Table 1.1. Apart from the symbols in the table, we also use variables, usually denoted by lowercase letters p, q, r, x, y, z, etc. Here is a short description of the propositional logic symbols:

• Variables: They are usually denoted by lowercase letters (p, q, r, x, y, z, etc.). Variables can take only the values true or false. We use them to denote propositions.

• Constants: The constants are represented by ⊤ and ⊥. These represent truth values true
and false.


• Operators: ∧ is the conjunction operator (also called and), ∨ is the disjunction operator
(also called or), ¬ is the negation operator (also called not), → is implication, and ↔ is
bi-implication (equivalence).

Name Symbol Read as


true ⊤ top
false ⊥ bot
negation ¬ not
conjunction ∧ and
disjunction ∨ or
implication → implies
equivalence ↔ if and only if
open parenthesis (
close parenthesis )

Table 1.1: Logical connectives.

For the timetable example, we can have propositional variables of the form pijk with i ∈ [5], j ∈ [5]
and k ∈ [4] (Note that [n] = {1, . . . , n}) with pijk representing the proposition ‘course Ci is scheduled
in slot Sk of day Dj ’.

Rules for formulating a formula:

• Every variable constitutes a formula.

• The constants ⊤ and ⊥ are formulae.

• If φ is a formula, so are ¬φ and (φ).

• If φ1 and φ2 are formulas, so are φ1 ∧ φ2 , φ1 ∨ φ2 , φ1 → φ2 , and φ1 ↔ φ2 .

Propositional formulae as strings and trees:


Formulae can be expressed as strings over the alphabet Vars ∪ {⊤, ⊥, ¬, ∧, ∨, →, ↔, (, )}, where Vars is the set of symbols for variables. Not all words formed using this alphabet qualify as propositional formulae: a string constitutes a well-formed formula (wff) if it can be constructed by following the rules above. Examples: (p1 ∨ ¬q2) ∧ (¬p2 → (q1 ↔ ¬p1)) and p1 → (p2 → (p3 → p4)).
Well-formed formulas can be represented using trees. Consider the formula p1 → (p2 → (p3 → p4)). This can be represented using the parse tree in Figure 1.1a. Notice that while strings require parentheses for disambiguation, trees do not, as can be seen in Figures 1.1b and 1.1c.

1.2 Semantics
Semantics gives meaning to a formula in propositional logic. The semantics is a function that takes the truth values of all the variables that appear in a formula and gives the truth value of the formula. Let 0 represent "false" and 1 represent "true". The semantics of a formula φ over n variables is a function

⟦φ⟧ : {0, 1}^n → {0, 1}
[Figure 1.1 is not reproduced here: (a) shows the tree for p1 → (p2 → (p3 → p4)) drawn with explicit parenthesis nodes; (b) shows the parse tree for p1 → (p2 → (p3 → p4)); (c) shows the parse tree for (p1 → p2) → (p3 → p4).]

Figure 1.1: Parse trees obviate the need for parentheses.

It is often presented in the form of a truth table. The truth tables of the operators are given in Table 1.2.

φ    ¬φ
0    1
1    0

φ1   φ2   φ1 ∧ φ2   φ1 ∨ φ2   φ1 → φ2   φ1 ↔ φ2
0    0       0         0         1         1
0    1       0         1         1         0
1    0       0         1         0         0
1    1       1         1         1         1

Table 1.2: Truth tables of the operators.

Remark. Do not confuse 0 and 1 with ⊤ and ⊥: 0 (false) and 1 (true) are meanings, while ⊤ and
⊥ are symbols.
Rules of semantics:

• ⟦¬φ⟧ = 1 iff ⟦φ⟧ = 0.

• ⟦φ1 ∧ φ2⟧ = 1 iff ⟦φ1⟧ = ⟦φ2⟧ = 1.

• ⟦φ1 ∨ φ2⟧ = 1 iff at least one of ⟦φ1⟧ or ⟦φ2⟧ evaluates to 1.

• ⟦φ1 → φ2⟧ = 1 iff ⟦φ1⟧ = 0 or ⟦φ2⟧ = 1.

• ⟦φ1 ↔ φ2⟧ = 1 iff both ⟦φ1 → φ2⟧ = 1 and ⟦φ2 → φ1⟧ = 1.

Truth Table: A truth table in propositional logic enumerates all possible truth values of a logical expression: it lists every combination of truth values of the individual propositions together with the resulting truth value of the compound statement.
Example. Let us construct a truth table for ⟦(p ∨ s) → (¬q ↔ r)⟧ (see Table 1.3).

p q r s p∨s ¬q ¬q ↔ r (p ∨ s) → (¬q ↔ r)
0 0 0 0 0 1 0 1
0 0 0 1 1 1 0 0
0 0 1 0 0 1 1 1
0 0 1 1 1 1 1 1
0 1 0 0 0 0 1 1
0 1 0 1 1 0 1 1
0 1 1 0 0 0 0 1
0 1 1 1 1 0 0 0
1 0 0 0 1 1 0 0
1 0 0 1 1 1 0 0
1 0 1 0 1 1 1 1
1 0 1 1 1 1 1 1
1 1 0 0 1 0 1 1
1 1 0 1 1 0 1 1
1 1 1 0 1 0 0 0
1 1 1 1 1 0 0 0

Table 1.3: Truth table of (p ∨ s) → (¬q ↔ r).

1.2.1 Important Terminology


A formula φ is said to be

• satisfiable (also consistent, or sat) iff ⟦φ⟧ = 1 for some assignment of the variables. That is, there is at least one way to assign truth values to the variables that makes the entire formula true. A formula and its negation may both be sat at the same time (φ and ¬φ may both be sat).

• unsatisfiable (a contradiction, or unsat) iff ⟦φ⟧ = 0 for all assignments of the variables. That is, there is no way to assign truth values to the variables that makes the formula true. If a formula φ is unsat then ¬φ must be sat (it is in fact valid).

• valid (a tautology) iff ⟦φ⟧ = 1 for all assignments of the variables. That is, the formula is always true, no matter how the variables are assigned. If a formula φ is valid then ¬φ is unsat.

• semantically entailing φ1 iff ⟦φ⟧ ⪯ ⟦φ1⟧ for all assignments of the variables, where 0 (false) ⪯ 1 (true). This is denoted by φ |= φ1. If φ |= φ1, then for every assignment under which φ evaluates to 1, φ1 also evaluates to 1. Equivalently, φ → φ1 is valid.

• semantically equivalent to φ1 iff φ |= φ1 and φ1 |= φ. In other words, φ and φ1 have identical truth tables. Equivalently, φ ↔ φ1 is valid.

• equisatisfiable with φ1 iff either both are sat or both are unsat. Note that semantic equivalence implies equisatisfiability, but not vice versa.

Term Example
sat p∨q
unsat p ∧ ¬p
valid p ∨ ¬p
semantically entails ¬p |= p → q
semantically equivalent p → q, ¬p ∨ q
equisatisfiable p ∧ q, r ∨ s

Table 1.4: Some examples for the definitions.

Example. Consider the formulas φ1 : p → (q → r), φ2 : (p ∧ q) → r and φ3 : (q ∧ ¬r) → ¬p. The three formulas φ1, φ2 and φ3 are semantically equivalent. One way to check this is to construct the truth table.
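For small formulas this check is easy to mechanise. The following Python sketch (ours, not part of the original notes) enumerates all assignments and confirms that the three formulas agree everywhere:

from itertools import product

implies = lambda a, b: (not a) or b   # truth table of ->

phi1 = lambda p, q, r: implies(p, implies(q, r))    # p -> (q -> r)
phi2 = lambda p, q, r: implies(p and q, r)          # (p ∧ q) -> r
phi3 = lambda p, q, r: implies(q and not r, not p)  # (q ∧ ¬r) -> ¬p

# Semantic equivalence = identical truth values under every assignment.
assert all(phi1(*v) == phi2(*v) == phi3(*v)
           for v in product([False, True], repeat=3))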
On drawing the truth table for the above example, one realises that it is laborious. Indeed, for a formula with n variables, the truth table has 2^n entries! So truth tables do not work for large formulas. We need a more systematic way to reason about formulae. That leads us to proof rules...
But before that, let us get closure on the example at the beginning of the chapter. Let p_ijk represent the proposition 'course Ci is scheduled in slot Sk of day Dj'. We can encode the constraints using the encoding strategy used in Tutorial 1, Problem 3, i.e., by introducing extra variables that bound partial sums (the sum of the first few variables is at most j). Using this we can encode the constraints as cardinality constraints of the form

Σ_{k=1}^{4} p_ijk ≤ 1,  Σ_{j=1}^{5} p_ijk ≤ 1,  Σ_{i=1}^{5} p_ijk ≤ 1,  Σ_{j=1}^{5} Σ_{k=1}^{4} p_ijk ≤ 3  and  ¬(Σ_{j=1}^{5} Σ_{k=1}^{4} p_ijk ≤ 2),

with the free indices ranging over all courses, days and slots as appropriate.

1.3 Proof Rules


After encoding a problem into a propositional formula, we would like to reason about the formula. Some of the properties of a formula that we are usually interested in are whether it is sat, unsat or valid. We have already seen that truth tables do not scale well for large formulae, and it is not humanly possible to reason about large formulae modelling real-world systems. We need to delegate the task to computers. Hence, we need systematic rules that a computer can use to reason about formulae. These are called proof rules.
The overall idea is to convert a formula to a normal form (a standard form that makes reasoning easier; more about this later in the chapter) and then use proof rules to check sat, etc.

Rules are represented as


Premises
────────── Connector_{i/e}
Inference
• Premise: A premise is a formula that is assumed or is known to be true.

• Inference: The conclusion that is drawn from the premise(s).

• Connector: It is the logical operator over which the rule works. We use the subscript i
(for introduction) if the connector and the premises are combined to get the inference. The
subscript e (for elimination) is used when we eliminate the connector present in the premises
to draw inference.
Example. Look at the following rule
φ1 ∧ φ2
──────── ∧e1
φ1
In the rule above, φ1 ∧ φ2 is assumed (it is the premise). Informally, looking at ∧'s truth table, we can infer that both φ1 and φ2 are true whenever φ1 ∧ φ2 is true, so φ1 is a valid inference. In this process we eliminate (remove) the ∧, so we call this rule and-elimination, or ∧e. For clarity we call this particular rule ∧e1, since φ1 is the conjunct kept in the inference; if we keep φ2 instead, the rule is called ∧e2.
Table 1.5 summarises the basic proof rules that we would like to include in our proof system.

Connector ∧ — Introduction (∧i): from φ1 and φ2, infer φ1 ∧ φ2. Elimination (∧e1, ∧e2): from φ1 ∧ φ2, infer φ1 (respectively φ2).

Connector ∨ — Introduction (∨i1, ∨i2): from φ1 (respectively φ2), infer φ1 ∨ φ2. Elimination (∨e): from φ1 ∨ φ2, φ1 → φ3 and φ2 → φ3, infer φ3.

Connector → — Introduction (→i): if, assuming φ1 inside a box, we can derive φ2, then infer φ1 → φ2. Elimination (→e): from φ1 and φ1 → φ2, infer φ2.

Connector ¬ — Introduction (¬i): if, assuming φ inside a box, we can derive ⊥, then infer ¬φ. Elimination (¬e): from φ and ¬φ, infer ⊥.

Connector ⊥ — Elimination (⊥e): from ⊥, infer any φ.

Connector ¬¬ — Elimination (¬¬e): from ¬¬φ, infer φ.

Table 1.5: Proof rules.

In the →i rule, the box indicates that we may temporarily assume φ1 and conclude φ2 using no extra non-trivial information. The →e rule is often referred to by its Latin name, modus ponens.

Example 1. We can now use these proof rules along with φ1 ∧ (φ2 ∧ φ3 ) as the premise to conclude
(φ1 ∧ φ2 ) ∧ φ3 .
1. φ1 ∧ (φ2 ∧ φ3)    premise
2. φ1                ∧e1 1
3. φ2 ∧ φ3           ∧e2 1
4. φ2                ∧e1 3
5. φ3                ∧e2 3
6. φ1 ∧ φ2           ∧i 2,4
7. (φ1 ∧ φ2) ∧ φ3    ∧i 6,5

1.4 Natural Deduction


If we can begin with some formulas ϕ1, ϕ2, . . . , ϕn as our premises and then conclude φ by applying the proof rules established above, we say that ϕ1, ϕ2, . . . , ϕn syntactically entail φ, which is denoted by the following expression, also called a sequent:

ϕ1 , ϕ2 , . . . , ϕn ⊢ φ.

We can also infer some formula using no premises, in which case the sequent is ⊢ φ.
Applying these proof rules involves the following general rule:

We can only use a formula φ at a point if it occurs prior to it in the proof and if no box
enclosing that occurrence of φ has been closed already.

Example. Consider the following proof of the sequent ⊢ p ∨ ¬p:


1. ¬(p ∨ ¬p) assumption
2. p assumption
3. p ∨ ¬p ∨i1 2
4. ⊥ ¬e 3,1
5. ¬p ¬i 2–4
6. p ∨ ¬p ∨i2 5
7. ⊥ ¬e 6,1
8. ¬¬(p ∨ ¬p) ¬i 1–7
9. p ∨ ¬p ¬¬e 8
Example. Proof for p ⊢ ¬¬p which is ¬¬i , a derived rule
1. p premise
2. ¬p assumption
3. ⊥ ¬e 1, 2
4. ¬¬p ¬i 2–3

Example. A useful derived rule is modus tollens which is p → q, ¬q ⊢ ¬p:


1. p→q premise
2. ¬q premise
3. p assumption
4. q →e 3,1
5. ⊥ ¬e 4,2
6. ¬p ¬i 3–5
Example. ¬p ∧ ¬q ⊢ ¬(p ∨ q):
1. ¬p ∧ ¬q premise
2. p∨q assumption
3. p assumption
4. ¬p ∧e1 1
5. ⊥ ¬e 3,4
6. p→⊥ →i 3–5
7. q assumption
8. ¬q ∧e2 1
9. ⊥ ¬e 7,8
10. q→⊥ →i 7–9
11. ⊥ ∨e 2,6,10
12. ¬(p ∨ q) ¬i 2–11

1.5 Soundness and Completeness of our proof system


A proof system is said to be sound if everything that can be derived using it matches the semantics.

Soundness: Σ ⊢ φ implies Σ |= φ
The rules that we have chosen are indeed individually sound since they ensure that if for some
assignment the premises evaluate to 1, so does the inference. Otherwise they rely on the notion of
contradiction and assumption. Hence, soundness for any proof can be shown by inducting on the
length of the proof.
A complete proof system is one which allows the inference of every valid semantic entailment:
Completeness: Σ |= φ implies Σ ⊢ φ
Let us take an example of semantic entailment: Σ = {p → q, ¬q} |= ¬p.

p q p→q ¬q ¬p
0 0 1 1 1
0 1 1 0 1
1 0 0 1 0
1 1 1 0 0

As we can see, whenever both p → q and ¬q are true, ¬p is true. The question now is: how do we derive this using proof rules? The idea is to 'mimic' each row of the truth table. That is, we assume the values for p, q and try to prove that the formulae in Σ imply φ.¹ To prove an implication, we can use the →i rule. Here is how we can prove our claim for the first row:
1. ¬p given
2. ¬q given
3. (p → q) ∧ ¬q assumption
4. ¬p 1
5. ((p → q) ∧ ¬q) → ¬p →i 3,4
Similarly to mimic the second row, we would like to show ¬p, q ⊢ ((p → q) ∧ ¬q) → ¬p. Actually
for every row, we’d like to start with the assumptions about the values of each variable, and then
try to prove the property that we want.

Figure 1.2: Mimicking all 4 rows of the truth table

This looks promising, but we are not done: we have only proven our formula under every possible assumption, not from nothing. Note, however, that the reasoning we are doing looks a lot like case analysis, which suggests the ∨e rule. In words, this rule states that if a formula is true under two different assumptions, and the disjunction of those assumptions is always true, then our formula is true. So if we can rigorously show that at least one of our row assumptions always holds, we can stitch the row proofs together using the ∨e rule.

But as seen above, we were able to prove the sequent ⊢ φ ∨ ¬φ. If we apply this recursively for all the variables we have, we can capture every row of the truth table. Combining this result, our proofs for each row of the truth table, and the ∨e rule, the whole proof is constructed as shown below. The only thing we need now is the ability to construct proofs for each row given the generally valid formula (⋀_{ϕ∈Σ} ϕ) → φ.

This can be done using structural induction to prove the following:


Let φ be a formula using the propositional variables p1 , p2 , . . . , pn . For any assignment to these
variables define p̂i = pi if pi is set to 1 and p̂i = ¬pi otherwise, then:
p̂1 , p̂2 , . . . , p̂n ⊢ φ is provable if φ evaluates to 1 for the assignment
p̂1 , p̂2 , . . . , p̂n ⊢ ¬φ is provable if φ evaluates to 0 for the assignment.

¹ Σ semantically entails φ is equivalent to saying that the conjunction of the formulae in Σ implies φ, i.e., (⋀_{ϕ∈Σ} ϕ) → φ is valid.
[Figure 1.2, redrawn as text: from the premises p → q and ¬q, first case-split on p ∨ ¬p; within each branch, case-split on q ∨ ¬q. The four leaves are the proofs assuming rows 1–4 of the truth table, each concluding ¬p. Two ∨-eliminations on q ∨ ¬q, followed by one ∨-elimination on p ∨ ¬p, combine them into a single proof of ¬p.]

1.6 What about Satisfiability?


Using natural deduction, we can only talk about formulas that are contradictions or valid. But there are formulas that are neither, i.e., they are satisfiable for some assignments of variables but not all. For example, for propositional variables p and q,

⊬ p ∧ q

But clearly, p ∧ q is satisfiable when both p and q are true. Natural deduction can only claim statements like

⊢ ¬p → ¬(p ∧ q)
⊢ p → (q → (p ∧ q))

An important link between the two situations is:

A formula ϕ is valid iff ¬ϕ is not satisfiable.

1.7 Algebraic Laws and Some Redundancy


1.7.1 Distributive Laws
Here are some identities that help convert complex formulas into the required forms discussed later. These equivalences are easily derived using the natural deduction proof rules discussed above.

ϕ1 ∧ (ϕ2 ∨ ϕ3) ⊣⊢ (ϕ1 ∧ ϕ2) ∨ (ϕ1 ∧ ϕ3)

ϕ1 ∨ (ϕ2 ∧ ϕ3) ⊣⊢ (ϕ1 ∨ ϕ2) ∧ (ϕ1 ∨ ϕ3)

1.7.2 Reduction of bi-implication and implication


We also see that bi-implication and implication can be reduced to ∨, ∧ and ¬, and are therefore redundant in our alphabet.

ϕ1 ↔ ϕ2 ⊣⊢ (ϕ1 → ϕ2) ∧ (ϕ2 → ϕ1)

ϕ1 → ϕ2 ⊣⊢ (¬ϕ1) ∨ ϕ2

1.7.3 DeMorgan’s Laws


Similar to the distributive laws, the following laws (again easily provable via natural deduction) help reduce any formula to a suitable form (discussed in the next section).

¬(ϕ1 ∧ ϕ2) ⊣⊢ (¬ϕ1 ∨ ¬ϕ2)

¬(ϕ1 ∨ ϕ2) ⊣⊢ (¬ϕ1 ∧ ¬ϕ2)

1.8 Negation Normal Forms


In mathematical logic, a formula is in negation normal form (NNF) if the negation operator (¬) is applied only to variables and the only other allowed Boolean operators are conjunction (∧) and disjunction (∨). One can convert any formula to NNF by repeatedly applying DeMorgan's laws to any subformula that has a ¬ in front, until only the variables carry the ¬ operator.
An NNF formula may be represented using its parse tree, which has no negation nodes except just above the leaves. Consider (p ∨ ¬r) ∧ (¬q ∨ (r ∧ (p ∨ ¬r))):
Parse tree representation of the above example:

[Parse tree not reproduced: an ∧ root whose children are the subtrees for p ∨ ¬r and for ¬q ∨ (r ∧ (p ∨ ¬r)).]

Since the subtree for p ∨ ¬r (shown in red in the original figure) repeats twice, we can build a DAG (Directed Acyclic Graph) instead of the parse tree.

DAG representation:

[DAG not reproduced: the same structure with a single shared node for p ∨ ¬r, which now has two parents.]

1.9 From DAG to NNF-DAG


Given the DAG of a propositional logic formula with only ∨, ∧ and ¬ nodes, can we efficiently get a DAG representing a semantically equivalent NNF formula?

Idea 1: Let us push the ¬ downwards by applying DeMorgan's laws and see what happens. Consider the following example and the highlighted ¬.
[Two DAG drawings, not reproduced: pushing the highlighted ¬ down across the marked (red) edge turns the ∨ node below it into an ∧ node and negates its children.]

Now we have an issue with the blue edge: it wanted the non-negated node, but after the change above it receives the negated node. So this idea alone won't work; we want to preserve the non-negated nodes as well.

Modification: Make two copies of the DAG and push ¬ into only one of the copies; whenever a node wants the non-negated version of a node, it takes it from the unmodified copy.
[Figures 1.3–1.6, not reproduced, show the modified idea on the example:
Figure 1.3 (Step 1): make two copies of the DAG.
Figure 1.4 (Step 2): push the ¬ downwards in one copy, redirecting edges that need non-negated nodes to the other copy.
Figure 1.5 (Step 3): continue until negations sit only on the leaves.
Figure 1.6 (Step 4): remove all redundant nodes.]



1.10 An Efficient Algorithm to convert DAG to NNF-DAG

[Figures 1.7–1.12, not reproduced, illustrate the algorithm on the running example:
Figure 1.7 (Step 1): make a copy of the DAG and remove all ¬ nodes except the ones applied to the basic variables.
Figure 1.8 (Step 2): negate the entire DAG obtained in Step 1 (∧ and ∨ nodes are swapped and the leaf literals complemented).
Figure 1.9 (Step 3): remove the ¬ nodes from the first DAG by connecting their parents to the corresponding nodes in the negated DAG.
Figure 1.10 (Step 4): remove all the redundant nodes.
Figure 1.11: the resulting NNF-DAG. Figure 1.12: the NNF-DAG, rearranged.]


NOTE: The size of the NNF-DAG obtained using the above algorithm is at most two times the size of the given DAG. Hence we have an O(N) algorithm for converting any arbitrary DAG to a semantically equivalent NNF-DAG.
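The same idea is easy to express on a formula tree; the following Python sketch (ours, not from the notes; the DAG version additionally shares repeated nodes) pushes negations down using DeMorgan's laws on formulas written as nested tuples:

# Formulas as nested tuples: ('and', a, b), ('or', a, b), ('not', a),
# or a variable name such as 'p'. Negations are pushed to the leaves.
def to_nnf(f, negated=False):
    if isinstance(f, str):                     # a variable
        return ('not', f) if negated else f
    op = f[0]
    if op == 'not':                            # flip polarity and recurse
        return to_nnf(f[1], not negated)
    if not negated:
        return (op, to_nnf(f[1]), to_nnf(f[2]))
    dual = 'or' if op == 'and' else 'and'      # DeMorgan's laws
    return (dual, to_nnf(f[1], True), to_nnf(f[2], True))

# ¬((p ∨ ¬r) ∧ q) becomes ('or', ('and', ('not', 'p'), 'r'), ('not', 'q'))
print(to_nnf(('not', ('and', ('or', 'p', ('not', 'r')), 'q'))))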

1.11 Conjunctive Normal Forms


A formula is in conjunctive normal form (or clausal normal form) if it is a conjunction of one or more clauses, where a clause is a disjunction of literals.
Examples:

• Parse tree for a formula in CNF: (P ∨ Q) ∧ (R ∨ S) ∧ T. [Tree not reproduced.]

• Parse tree for the formula ¬(p ∧ ¬q) ∧ ¬(¬r ∧ s) ∧ t in NNF: (¬p ∨ q) ∧ (r ∨ ¬s) ∧ t. [Tree not reproduced.]

Some Important Terms

• LITERAL:

– A variable or its complement.
– Examples: p, ¬p, r, ¬q

• CLAUSE:

– A clause is a disjunction of literals such that a literal and its negation are not both in the same clause.
– Example: p ∨ q ∨ (¬r). Not allowed: p ∨ (¬p) ∨ (¬r).

• CUBE:

– A cube is a conjunction of literals such that a literal and its negation are not both in the same cube.
– Example: p ∧ q ∧ (¬r). Not allowed: p ∧ (¬p) ∧ (¬r).

• CONJUNCTIVE NORMAL FORM (CNF):

– A propositional formula is said to be in conjunctive normal form (CNF) if it is a conjunction of clauses.
– "Product of sums."
– Example: (p ∨ q ∨ (¬r)) ∧ (q ∨ r ∨ (¬s) ∨ t)

• DISJUNCTIVE NORMAL FORM (DNF):

– A formula is said to be in disjunctive normal form (DNF) if it is a disjunction of cubes.
– "Sum of products."
– Example: (p ∧ q ∧ (¬r)) ∨ (q ∧ r ∧ (¬s) ∧ t)

Given the DAG of a propositional logic formula with only ∨, ∧ and ¬ nodes, can we efficiently get a DAG representing a semantically equivalent CNF/DNF formula?

Tutorial 2, Question 2:
The parity function can be expressed as (· · · ((x1 ⊕ x2) ⊕ x3) ⊕ · · · ) ⊕ xn.

[Parse trees not reproduced: x1 ⊕ x2 is drawn with one layer of ∧/∨ nodes over the literals x1, ¬x1, x2, ¬x2. Writing ϕ = x1 ⊕ x2, the parse tree for (x1 ⊕ x2) ⊕ x3 adds one more such layer over ϕ, ¬ϕ, x3, ¬x3. Substituting the tree for ϕ back in and sharing the repeated subformula gives the DAG representation.]

We notice that each added xi contributes 4 nodes. Hence, the size of the DAG of the parity function is at most 4n. We have already shown that the size of the NNF-DAG is at most 2 times the size of the DAG, so the size of the semantically equivalent NNF-DAG is at most 8n.

Also, in the tutorial question we proved that the DAG size of any semantically equivalent CNF/DNF formula is at least 2^{n−1}.

NNF → CNF/DNF: exponential growth in the size of the DAG.

1.12 Satisfiability and Validity Checking


It is easy to check the validity of a CNF formula: check, for every clause, whether it contains some pair p, ¬p. If there is a clause which does not contain both p and ¬p for any variable p, then the formula is not valid, because we can always assign variables in a way that makes this clause false.
Example: if we have the clause p ∨ q ∨ (¬r), we can make it 0 with the assignment p = 0, q = 0 and r = 1.
If both a variable and its negation are present in a clause, then since p ∨ ¬p is valid and 1 ∨ ϕ = 1 for any ϕ, the clause is always true.
Hence we have an O(N) algorithm to check the validity of a CNF formula, but the price we pay is the conversion to CNF (which is exponential).
It is hard to check the validity of a DNF formula, because to show non-validity we would have to find an assignment which falsifies all the cubes.

What is meant by satisfiability? Given a formula, is there an assignment which makes the formula true?

It is easy to check the satisfiability of a DNF formula: check, for every cube, whether it contains some pair p, ¬p. If there is a cube which does not contain both p and ¬p for any variable p, then the formula is satisfiable, because we can always assign variables in a way that makes this cube true, and hence the entire formula true. Overall, we just need a satisfying assignment for any one cube.
It is hard to check satisfiability using CNF: we would have to find an assignment which simultaneously satisfies all the clauses.

NOTE: A formula is valid iff its negation is not satisfiable. Therefore, we can convert every validity problem into a satisfiability problem. Thus, it suffices to worry only about the satisfiability problem.
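The linear-time validity check for CNF described above is a few lines of Python (our sketch; a literal is a signed variable id, as in the DIMACS convention):

# (p ∨ q ∨ ¬r) with p, q, r numbered 1, 2, 3 becomes [1, 2, -3].
def cnf_is_valid(clauses):
    # Valid iff every clause contains some variable and its negation.
    return all(any(-lit in c for lit in c)
               for c in (set(clause) for clause in clauses))

assert not cnf_is_valid([[1, 2, -3]])       # falsified by p = q = 0, r = 1
assert cnf_is_valid([[1, -1, 2], [3, -3]])  # every clause has a p, ¬p pair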

1.13 DAG to Equisatisfiable CNF


Claim: Given any formula, we can efficiently get an equisatisfiable formula in CNF of linear size.
Proof sketch: Consider the DAG of ϕ(p, q, r) = ¬((p ∧ q) ∨ ¬r) ∨ (¬p ∧ ¬r). [The DAG figure is not reproduced; its internal nodes correspond to t1, . . . , t5 below.]

1. Introduce new variables t1 , t2 , t3 , .... , tn for each of the nodes. We will get an equisatisfiable
formula ϕ′ (p, q, r, t1 , t2 , t3 , .... , tn ) which is in CNF.

2. Write the formula for ϕ′ as a conjunction of subformulas for each node of form given below:

ϕ′ =(t1 ⇐⇒ (p ∧ q)) ∧
(t2 ⇐⇒ (t1 ∨ ¬r)) ∧
(t3 ⇐⇒ (¬t2 )) ∧
(t4 ⇐⇒ (¬p ∧ ¬r)) ∧
(t5 ⇐⇒ (t3 ∨ t4 )) ∧
t5

3. Convert each of the subformulas to CNF. For the first node this is shown below:

(t1 ⇐⇒ (p ∧ q)) = (¬t1 ∨ (p ∧ q)) ∧ (¬(p ∧ q) ∨ t1)
                = (¬t1 ∨ p) ∧ (¬t1 ∨ q) ∧ (¬p ∨ ¬q ∨ t1)

4. For checking the equisatisfiability of ϕ and ϕ′: take an assignment which makes ϕ true, extend it by evaluating each ti on its node; the extended assignment satisfies ϕ′. Conversely, any satisfying assignment of ϕ′, restricted to p, q, r, satisfies ϕ.

1.14 Tseitin Encoding


The Tseitin encoding technique is commonly employed in the context of Boolean satisfiability (SAT)
problems. SAT solvers are tools designed to determine the satisfiability of a given logical formula,
i.e., whether there exists an assignment of truth values to the variables that makes the entire formula
true.
The basic idea behind Tseitin encoding is to introduce additional auxiliary variables to represent
complex subformulas or logical connectives within the original formula. By doing this, the formula
can be transformed into an equivalent CNF representation.
Say you have a formula Q(p, q, r, . . . ). Using Tseitin encoding, we can introduce auxiliary variables t1, t2, . . . to make a new formula Q′(p, q, r, . . . , t1, t2, . . . ) which is equisatisfiable with Q, but not semantically equivalent to it. The size of Q′ is linear in the size of Q.

Let us take an example to understand this better. Consider the formula (¬((q ∧ p) ∨ ¬r)) ∨ (¬p ∧ ¬r). [Its DAG is not reproduced; its five internal nodes correspond to t1, . . . , t5 below.]

We define a new equisatisfiable formula with auxiliary variables t1 , t2 , t3 , t4 and t5 as follows:

(t1 ⇐⇒ p ∧ q) ∧ (t2 ⇐⇒ t1 ∨ ¬r) ∧ (t3 ⇐⇒ ¬t2 ) ∧ (t4 ⇐⇒ ¬p ∧ ¬r) ∧ (t5 ⇐⇒ t3 ∨ t4 ) ∧ (t5 )
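The clause-level translation of each definition ti ⇔ (· ∘ ·) can be generated mechanically. Here is our own small Python sketch of it (not the notes' code), with literals again as signed variable ids:

# Tseitin clauses for the three gate types used above.
def gate_and(t, a, b):   # t <-> (a ∧ b)
    return [[-t, a], [-t, b], [-a, -b, t]]

def gate_or(t, a, b):    # t <-> (a ∨ b)
    return [[-a, t], [-b, t], [-t, a, b]]

def gate_not(t, a):      # t <-> ¬a
    return [[-t, -a], [a, t]]

# Variables p, q, r = 1, 2, 3 and t1..t5 = 4..8 for the formula above.
p, q, r, t1, t2, t3, t4, t5 = range(1, 9)
cnf = (gate_and(t1, q, p) + gate_or(t2, t1, -r) + gate_not(t3, t2)
       + gate_and(t4, -p, -r) + gate_or(t5, t3, t4) + [[t5]])
print(cnf)  # a CNF equisatisfiable with the original formula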

1.15 Towards Checking Satisfiability of CNF and Horn Clauses


A Horn clause is a disjunctive clause (a disjunction of literals) with at most one positive literal.
A Horn formula is a conjunction of Horn clauses, for example:

(¬x1 ∨ ¬x2 ∨ x3 ) ∧ (¬x4 ∨ x5 ∨ ¬x3 ) ∧ (¬x1 ∨ ¬x5 ) ∧ (x5 ) ∧ (¬x5 ∨ x3 ) ∧ (¬x5 ∨ ¬x1 )

Now we can convert any Horn clause into an implication: the variables that appeared negated in the clause form a conjunction on the left side of the implication, and the (single) unnegated variable goes on the right side. Thus all variables in all the implications are unnegated.
So the formula above can be translated as follows.

x1 ∧ x2 =⇒ x3
x4 ∧ x3 =⇒ x5
x1 ∧ x5 =⇒ ⊥
x5 =⇒ x3
⊤ =⇒ x5
Now we try to find a satisfying assignment for the above formula.
From the last clause we get x5 = 1, now the fourth clause is ⊤ =⇒ x3 .
Then from the fourth clause we get that x3 = 1.
Now in the remaining clauses none of the left hand sides are reduced to ⊤.
Hence, we set all remaining variables to 0 to get a satisfying assignment.
Algorithm 1: HORN Algorithm

1 Function HORN(ϕ):
2     foreach occurrence of ⊤ in ϕ do
3         mark the occurrence
4     while there is a conjunct P1 ∧ P2 ∧ · · · ∧ Pk → P′ in ϕ such that all the Pj are marked but P′ is not do
5         mark P′
6     if ⊥ is marked then
7         return 'unsatisfiable'
8     else
9         return 'satisfiable'

Complexity:
If we have n variables and k clauses, the complexity is O(nk): in the worst case, each pass over the k clauses marks only one new variable.
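A direct Python rendering of the marking algorithm (our sketch; each Horn clause is a pair (body, head), with head None standing for ⊥):

# (x5) becomes (set(), 'x5'); x1 ∧ x5 ⇒ ⊥ becomes ({'x1', 'x5'}, None).
def horn_sat(clauses):
    marked = set()
    changed = True
    while changed:
        changed = False
        for body, head in clauses:
            if body <= marked and head is None:
                return None               # LHS true and head is ⊥: unsat
            if body <= marked and head is not None and head not in marked:
                marked.add(head)          # LHS fully marked: mark the head
                changed = True
    return marked                         # these variables are 1, the rest 0

clauses = [({'x1', 'x2'}, 'x3'), ({'x4', 'x3'}, 'x5'), ({'x1', 'x5'}, None),
           (set(), 'x5'), ({'x5'}, 'x3')]
print(horn_sat(clauses))                  # {'x5', 'x3'}: satisfiable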

1.16 Counterexample for Horn Formulas


In our previous lectures, we delved into the Horn Formula, a valuable tool for assessing the satisfia-
bility of logical formulas.

1.16.1 Example
We are presented with a example involving conditions that determine when an alarm (a) should
ring. Let’s outline the given conditions:

1. If there is a burglary (b) in the night (n), then the alarm should ring (a).

2. If there is a fire (f ) in the day (d), then the alarm should ring (a).

3. If there is an earthquake (e), it may occur in the day (d) or night (n), and in either case, the
alarm should ring (a).

4. If there is a prank (p) in the night (n), then the alarm should ring (a).

5. Also, it is known that a prank (p) does not happen during the day (d), and a burglary (b) does not take place when there is a fire (f).

Let us write down these implications


b ∧ n ⇒ a    f ∧ d ⇒ a    e ∧ d ⇒ a
e ∧ n ⇒ a    p ∧ n ⇒ a    d ∧ n ⇒ ⊥
b ∧ f ⇒ ⊥    p ∧ d ⇒ ⊥

Now we want to examine the possible behaviour of this system under the assumption that the alarm rings during the day. For this we add two more clauses:

⊤ ⇒ a    ⊤ ⇒ d

This directly gives us that a and d have to be true; what about the rest? We can see that setting all the remaining variables to false is a satisfying assignment for this set of formulae.
Hence we have none of prank, earthquake, burglary or fire, in which case the alarm should not ring. This means that our system of formulae is incomplete.
To fix this, we introduce new variables Na (no alarm), Nf (no fire), Nb (no burglary), Ne (no earthquake), and Np (no prank).
We extend the above set of implications in a natural way using these variables:

a ∧ Na ⇒ ⊥ b ∧ Nb ⇒ ⊥ f ∧ Nf ⇒ ⊥ e ∧ Ne ⇒ ⊥ p ∧ Np ⇒ ⊥

Nb ∧ Nf ∧ Ne ∧ Np ⇒ Na

All the implications will hold true for the values b = p = e = f = Nb = Ne = Nf = Np = 0.

Here we are getting b = Nb, which should not be possible; hence we need the additional constraint Nb ⇔ ¬b. But on careful examination we see that this cannot be represented as a Horn clause (the clause b ∨ Nb has two positive literals). Therefore, it becomes necessary to devise an alternative algorithm for evaluating satisfiability.

1.17 Davis–Putnam–Logemann–Loveland (DPLL) Algorithm


This works for the more general case of CNF formulas which need not be Horn formulas. Let us first discuss the techniques and terms required for our algorithm.

• Partial Assignment (PA) : It is any assignment of some of the propositional variables.


Ex. P A = {x1 = 1, x2 = 0} ; P A = {}, etc.

• Unit Clause : It is any clause which only has one literal in it. Ex. .. ∧ (¬x5 ) ∧ ..
Note: If any Formula has a unit clause then the literal in it has to be set to true.

• Pure Literal: A literal whose negation does not appear in any clause. For example, a propositional variable x that appears only as ¬x in every clause it occurs in (so ¬x is pure), or a variable y that appears only as y.
Note: If there is a pure literal in the formula, it does not hurt any clause to set it to true: all the clauses in which this literal is present immediately become true.

We will now use the techniques we have learnt to simplify our formula. First we check whether the formula has a unit clause. If yes, we assign the literal in that clause to be 1. (Note: φ[l = 1] is the formula obtained after setting l = 1 everywhere in the formula.) We also search for pure literals. If we find a pure literal we can simply assign its variable 1 (or 0, if it always appears in negated form) and proceed; this cannot harm us (cause future conflicts), by the definition of a pure literal. If we have none of these, then the only option left at the moment is trial and error.

We assign one of the variables in the formula a value chosen in some way (not described here). Then we continue with the usual algorithm until the whole formula becomes either true or false. We may have to backtrack if the formula turns out to be false; if it is true, we can terminate the algorithm.

Note: Our algorithm can be as bad as a truth table, since in the worst case we try every assignment. But because we apply the additional simplification steps, after making a decision there is a good chance that we obtain a unit clause or a pure literal.

Now that we have covered all the prerequisites, let us state the algorithm.

Algorithm 2: SAT(φ, PA)

// Base cases for the recursion
if φ = ⊤ then
    return (sat, PA)
else if φ = ⊥ then
    return (unsat, PA)
else if some Ci ∈ φ is a unit clause with literal l then
    // This step is called Unit Propagation
    return SAT(φ[l = 1], PA ∪ {l = 1})   // recurse on the simplified formula φ[l = 1]
else if some literal l in φ is pure then
    // This step is called Pure Literal Elimination
    return SAT(φ[l = 1], PA ∪ {l = 1})
else
    // This step is called the Decision Step
    x ← choose_a_var(φ)
    v ← choose_a_value({0, 1})
    if SAT(φ[x = v], PA ∪ {x = v}).status = sat then
        return (sat, PA ∪ {x = v})
    else if SAT(φ[x = 1 − v], PA ∪ {x = 1 − v}).status = sat then
        return (sat, PA ∪ {x = 1 − v})
    else
        return (unsat, PA)
    end if
end if

Question: Can the formula still be a Horn formula once steps 1 and 2 can no longer be applied, i.e., when the formula has no unit clause and no pure literal?

Answer: Yes. Here is an example:

(a ∨ ¬b) ∧ (¬a ∨ b)
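The recursion is short enough to prototype directly; here is our compact Python sketch (clauses as lists of signed integers, no clever heuristics):

def assign(clauses, l):
    # Set literal l to true: drop satisfied clauses, shrink the others.
    return [[m for m in c if m != -l] for c in clauses if l not in c]

def dpll(clauses, pa=frozenset()):
    if not clauses:
        return pa                         # formula reduced to ⊤: sat
    if any(not c for c in clauses):
        return None                       # an empty clause: ⊥, backtrack
    unit = next((c[0] for c in clauses if len(c) == 1), None)
    if unit is not None:                  # unit propagation
        return dpll(assign(clauses, unit), pa | {unit})
    lits = {l for c in clauses for l in c}
    pure = next((l for l in lits if -l not in lits), None)
    if pure is not None:                  # pure literal elimination
        return dpll(assign(clauses, pure), pa | {pure})
    x = next(iter(lits))                  # decision step, with backtracking
    return dpll(assign(clauses, x), pa | {x}) or dpll(assign(clauses, -x), pa | {-x})

# The example above, (a ∨ ¬b) ∧ (¬a ∨ b), with a, b numbered 1, 2:
print(dpll([[1, -2], [-1, 2]]))           # e.g. frozenset({1, 2})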

1.18 DPLL in action


1.18.1 Example
Consider the following clauses, for which we have to determine, using the DPLL algorithm, whether all of them can be satisfied by some variable assignment:
C1 : (¬P1 ∨ P2)   C2 : (¬P1 ∨ P3 ∨ P5)   C3 : (¬P2 ∨ P4)   C4 : (¬P3 ∨ P4)
C5 : (P1 ∨ P5 ∨ ¬P2)   C6 : (P2 ∨ P3)   C7 : (P1 ∨ P3 ∨ P7)   C8 : (P6 ∨ ¬P5)
Let us build two possible decision trees for these clauses.
PLE - pure literal elimination; UP - unit propagation; D - decision.

[Figure 1.13: DPLL (not reproduced). Both decision trees consist solely of pure-literal eliminations: setting P4, P6, P7, P3, P5 and P2 to 1, in two different orders, satisfies all the clauses and reaches Sat without a single decision or backtrack.]


The following is the decision tree if we remove step 2 of DPLL (pure-literal elimination). Note the increase in the number of operations.

[Figure 1.14: DPLL without pure-literal elimination (not reproduced). The search now has to make decisions (on P6, P7 and P1) interleaved with unit propagations; one branch of the decision on P1 ends in Unsat and forces a backtrack to the most recent decision node before Sat is finally reached.]

1.19 Applying DPLL Algorithm to Horn Formulas


Let us apply the DPLL algorithm to a Horn formula.
If there are no variables on the LHS of an implication, it becomes a unit clause, i.e., ⊤ → xi, which is equivalent to (xi).
Horn's method can be viewed in terms of the DPLL algorithm as follows:

• Apply unit propagation until it can no longer be applied.

• After that, set all remaining variables to 0.

The advantage of Horn's method is that after all possible unit propagations are done, it sets all remaining variables to 0 at once, whereas in DPLL we proceed step by step for each remaining variable.

But Horn's method can only be applied in a special case. Moreover, Horn's method only figures out which variables to set to true, as opposed to DPLL, which can also figure out whether a variable needs to be set to true or false via pure literal elimination.

1.20 DPLL on Horn Clauses


We shall quickly investigate what happens when we feed Horn clauses to the DPLL (Davis–Putnam–Logemann–Loveland) algorithm.
Consider the following:

• The first step in solving for the satisfiability of a given set of Horn clauses in implication form is:

– If the LHS of an implication is true, we set the literal on the RHS of the implication to be true in all its occurrences.
– Repeating the above step sets all the essential variables (those which must be 1) to true.

This step is equivalent to the first two steps of the DPLL algorithm:

– Satisfy unit-literal clauses by assignment.
– Recompute the formula after the above assignment.
– Repeat this procedure until no unit clause is left.

The corresponding steps in the two schemes are essentially doing the same thing. Now, if the given clauses are Horn, we know that setting all the remaining variables to false is a satisfying assignment. This means that if our DPLL algorithm preferentially assigns 0 at each decision, the procedure converges to the method for checking the satisfiability of Horn formulae.

1.21 Rule of Resolution


This is yet another powerful rule for inference. Let us first jot down the rule here:

(a1 ∨ a2 ∨ · · · ∨ an ∨ x)    (b1 ∨ b2 ∨ · · · ∨ bm ∨ ¬x)
──────────────────────────────────────────── resolution
(a1 ∨ a2 ∨ · · · ∨ an ∨ b1 ∨ b2 ∨ · · · ∨ bm)

Intuitive as it may look, this rule is a powerful tool for checking the satisfiability of logical formulae. We can build an algorithm to check the satisfiability of a CNF formula as follows. Let us first define a formula to be unresolved if there exist a literal and its negation somewhere in the formula (they cannot be in the same clause, by the definition of a clause). If a formula is resolved (i.e., not unresolved) then it is satisfiable ('SAT'): assign false to the variables which appear only in negated form, and true to the other variables.
Let C be the set of clauses for a given CNF.

1. If C contains tautologies, we can drop them; if C becomes empty upon dropping the tautologies, we mark the given CNF SAT. (By definition, clauses by themselves cannot be tautologies.)

2. While the formula is unresolved, we can apply the resolution rule; this gives us a new clause.

3. If the new clause is the empty clause, we deem the formula UNSAT; otherwise check whether the formula is resolved, and if not, repeat from step 1.
Before rationalizing the soundness of the above sequence of steps let us first see an example.
An Example: Consider C = {C1 , C2 , C3 } as given below:
• C1 := ¬p1 ∨ p2 ( p1 =⇒ p2 )

• C2 := p1 ∨ ¬p2 ( p2 =⇒ p1 )

• C3 := ¬p1 ∨ ¬p2 ( p1 ∧ p2 =⇒ ⊥)
Then, a dry run of the above method would look like:
1. Since both p1 and p2 appear in negated and un-negated form, we apply resolution on C1 and
C2 , which generates C4 , as follows:
(¬p1 ∨ p2)    (p1 ∨ ¬p2)
──────────────────────── resolution
C4 := (¬p1 ∨ p1)

2. Once again we apply resolution, on C3 and C4 (C4 is not really a clause by definition; one can choose to drop such tautologies as soon as they are encountered):

(¬p1 ∨ ¬p2)    (¬p1 ∨ p1)
──────────────────────── resolution
(¬p2 ∨ ¬p1)

3. The resulting set of clauses is now resolved, and thus our formula is satisfiable.
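For intuition, the procedure can be prototyped by closing the clause set under resolvents until either the empty clause appears or nothing new can be derived (our Python sketch; exponential in general):

from itertools import combinations

def resolve(c1, c2):
    # Yield all resolvents of two clauses (frozensets of signed ints).
    for lit in c1:
        if -lit in c2:
            yield frozenset((c1 - {lit}) | (c2 - {-lit}))

def unsat_by_resolution(clauses):
    clauses = {frozenset(c) for c in clauses}
    while True:
        new = {r for a, b in combinations(clauses, 2) for r in resolve(a, b)
               if not any(-l in r for l in r)}   # drop tautological resolvents
        if frozenset() in new:
            return True                          # derived ( ): UNSAT
        if new <= clauses:
            return False                         # closure reached without ( )
        clauses |= new

print(unsat_by_resolution([[1], [-1]]))                   # True: (p1) ∧ (¬p1)
print(unsat_by_resolution([[-1, 2], [1, -2], [-1, -2]]))  # False: the example above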

1.21.1 Completeness of Resolution for Unsatisfiability of CNFs


As claimed above, given any unsatisfiable CNF, repeated resolution eventually yields the empty clause, at which point we raise UNSAT. We now prove this claim.
We employ mathematical induction on the number of propositional variables in the CNF. Let p1, p2, . . . , pn be our propositional variables.
Base case (n = 1): An unsatisfiable CNF over a single variable must contain the clauses (p1) and (¬p1), which upon resolution give us ( ), the empty clause; hence we raise UNSAT.
Inductive hypothesis: Assume that the claim holds for all CNFs over at most n − 1 variables; we now show that it holds for n variables as follows.
• Remove tautologies from C, the set of all clauses.

• Choose a variable pi such that pi and ¬pi both appear in the CNF. (If no such variable exists, the formula is resolved as defined earlier and has a satisfying assignment.) Apply resolution repeatedly as long as the same pi satisfies this condition.

• If the CNF now contains ( ), we raise UNSAT. Otherwise:

– If pi vanishes from the CNF, then by the inductive hypothesis we can raise UNSAT: the equivalent formula we are left with is over fewer variables, and it must be unsatisfiable independently of the value of the vanished variable, since the initial formula was unsatisfiable.
– If pi survives in only one of its negated or un-negated forms, we repeat the procedure. This time the number of available pairs has reduced by 1, as pi cannot be selected again.

• The selection step can take place at most n times (a new pair, as in the second bullet, cannot be generated by resolution operations). In the worst case, with a pair available for every variable, the procedure must conclude UNSAT within n steps; otherwise, at the end, no pairs remain, which guarantees a satisfying assignment for the formula.

Broadly speaking, what we are showing is that upon repeated resolution of an unsatisfiable CNF, if
( ) has not been encountered, the number of propositional variables must decrease.
Chapter 2

DFAs and Regular Languages

Consider a formula ϕ(x1, x2, . . . , xn) over the propositional variables {x1, x2, . . . , xn}. We defined the set L ⊆ {0, 1}^n, the language defined by the formula, as the set of strings which form a satisfying assignment for ϕ.
Basically, using propositional logic we were able to represent a large set of finite-length strings having some property in a compact form. This leads us to a question: what about strings of arbitrary length having some property? How do we formulate them?
The answer to this is automata: a way to represent sets of arbitrary-length strings in a compact form.

2.1 Definitions
• Alphabet: A finite, non-empty set of symbols called characters. We usually represent an
alphabet with Σ. For example Σ = {a, b, c, d}.
• String: A finite sequence of letters from an alphabet. An important thing to note here is that even though the alphabet may contain just one character, it can form a countably infinite number of strings, each of which is finite. In this course we deal only with finite strings over a finite alphabet.

• Concatenation Operation (·): We can take a string, take another string and, as the name suggests, concatenate them to form another string:

a · b ≠ b · a (not commutative)
(a · b) · c = a · (b · c) (associative)

• Identity Element: The algebra of strings under the concatenation operator has an identity element, the empty string ε:

σ · ε = ε · σ = σ

Note that the empty string is the same for all alphabets.

• Language: A subset of the set of all finite strings over Σ. This set does not have to be finite even though each string has finite length.
Note that the set of all finite strings over Σ is countably infinite (cardinality |ℕ|), so the number of languages over Σ is uncountably infinite (cardinality 2^|ℕ|).


• Σ* is defined to be the set of all finite strings over Σ, including ε. Note that Σ* = ⋃_{k≥0} Σ^k, where Σ^k is the set of all strings over Σ with exactly k letters. We can prove that Σ* is countably infinite: by representing each string as a unique number in base (n + 1), where |Σ| = n, we get an injection into the natural numbers.

2.2 Deterministic Finite Automata


Generalization of the parity function: Let's go back to the question we asked first. Suppose we are given a string of arbitrary length and don't know the length of the string. This cannot be done with propositional logic, so we need a new formalism to represent sets of strings of any length. We want to develop a mechanism where we are given the bits of the string one by one, and I don't know when it will stop. So I must be ready with the answer each time a new bit arrives.

The solution to this lies in our discussion during the first lecture. I will record just one bit of information: whether I have received an even or odd number of 1s till now. Every time I receive a new bit, I will update this information: if it's a 0, I won't do anything, and if it's a 1, I will change my answer from even to odd, or vice-versa.

States: These are nodes which contain relevant summary of what we have seen so far

In our case we want to know whether there were even or odd number of 1s.

even # 1s odd # 1s

Where do we start from ? When I have seen nothing there are even number of 1s.

start even # 1s odd # 1s

Now, suppose I receive a 0, I would remain in the same state, but if I get a 1 , the parity changes.

1
start even # 1s odd # 1s

Now, if I am in the second state and I get a 1, I will change states; if I get a 0, the parity is unchanged so I remain in the same state:

1
start even # 1s odd # 1s

0 0

So when do we know whether the string we have seen till now belongs to some language or not? We know that by marking some states as accepting states, usually represented by double circles: if we end up in such a state, the string received till now belongs to our language, i.e., it is accepted.

1
start even # 1s odd # 1s

0 0

Such a formalism with finite states is known as Finite Automata.


Further, if for every string in Σ∗ there exists a unique path we follow in the automaton, such automata are known as Deterministic Finite Automata (DFA).
Example
Σ = {0, 1}, let L = {w | w ∈ Σ∗ , the number of 0's is 0 mod 3 or 2 mod 3}

1 1

0 0
start 0 1 2
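As a quick illustration of how such an automaton consumes input one letter at a time, here is a minimal sketch in Python (the dictionary encoding of the transition function is our own convention for illustration, not part of the formal definition):

```python
def dfa_accepts(delta, start, finals, word):
    """Run a DFA on `word`; delta maps (state, symbol) -> state."""
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state in finals

# The automaton above: states 0, 1, 2 hold the number of 0's mod 3;
# a 0 increments the count, a 1 leaves it unchanged.
delta = {(s, '1'): s for s in (0, 1, 2)}
delta.update({(s, '0'): (s + 1) % 3 for s in (0, 1, 2)})
print(dfa_accepts(delta, 0, {0, 2}, "01001"))  # three 0's: 0 mod 3, so True
```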

2.3 DFA Design - Example 1


Draw a Deterministic Finite Automaton (DFA) for L := {w ∈ {a, b}∗ : 2 divides na (w) and 3 divides nb (w)}.
Here na (w) stands for the number of a’s in w, and nb (w) stands for the number of b’s in w. For
example, na (abbaab) = nb (abbaab) = 3. Therefore, ababbaa ∈ L but aabababa ∈ / L.

In certain scenarios, expressing a language solely through propositional logic becomes impractical,
particularly when the length of the strings is unknown or variable. For instance, consider above
example. In this case, the length n of the string is not explicitly provided, making it challenging to
construct a propositional logic expression directly. Propositional logic typically operates on fixed,
predetermined conditions or patterns within strings, which cannot accommodate variable lengths.
However, deterministic finite automata (DFAs) offer a suitable alternative for such situations. DFAs

are well-suited for languages where the structure and properties depend on the characters within the
string rather than on fixed string lengths. By employing states and transitions based on input char-
acters, DFAs can effectively recognize languages with variable-length strings and complex patterns,
making them a more appropriate choice when string length is not predetermined.
We will denote the alphabet by Σ:
Σ = {a, b}
L = {ω ∈ Σ∗ : na (ω) is divisible by 2 and nb (ω) is divisible by 3}

start S0 (0, 0) S1 (0, 1) S2 (0, 2)


b b

a a a a a a

b b
S5 (1, 0) S4 (1, 1) S3 (1, 2)

Figure 2.1: DFA for above task

S0 : na (w)%2 = 0 and nb (w)%3 = 0


S1 : na (w)%2 = 0 and nb (w)%3 = 1
S2 : na (w)%2 = 0 and nb (w)%3 = 2
S3 : na (w)%2 = 1 and nb (w)%3 = 2
S4 : na (w)%2 = 1 and nb (w)%3 = 1
S5 : na (w)%2 = 1 and nb (w)%3 = 0

2.4 DFA Design - Example 2


We will look at another example now.
Σ = {a, b}
L = {w ∈ Σ∗ | nab (w) = nba (w)}

w = abaabab (nab (w) = 3 & nba (w) = 2)

If we try to cleverly convert the problem into a simpler one, we will observe that nab (w) = nba (w) holds exactly when the first and last letters of w are the same (be it a or b).

So the above problem simplifies to:

L = {w ∈ Σ∗ | the first and last letters of w are the same}

Forming an automaton for this task can be done as:

a
b
S1 S3 b
a a

start S0

b
b
S2 S4 a
a
b

Figure 2.2: DFA: first and last letters are the same


Chapter 3

Non-Deterministic Finite Automata (NFA)

In the last few lectures, we covered the formalization of deterministic finite automata (DFA) where
the transition function outputs a single state for a given input and current state. In this lecture,
we will discuss non-deterministic finite automata (NDFA) where the transition function can output
multiple states (or a set of states) instead.

3.1 NFA Representation


As discussed earlier, a DFA is a 5-tuple (Q, Σ, q0 , δ, F ) where:

• Q is a finite set of all states

• Σ is the alphabet, a finite set of input symbols

• q0 ∈ Q is the initial state

• δ : Q × Σ → Q is the transition function

• F ⊆ Q is the set of final/accepting states

For example, consider the following automaton:

1 0
0

start q0 q1

It can be represented as:


DFA ( {q0 , q1 }, {0, 1}, q0 , δ, {q1 } )

where δ is the transition function:


Q Σ Q’
q0 0 q1
q0 1 q0
q1 0 q1
q1 1 q0

3.1.1 Formalization of Non-Deterministic Finite Automata


However, in NDFA,

• q0 ⊆ Q is the set of initial states

• δ : Q × Σ → 2Q is the transition function

Hence, consider the following non-deterministic finite automaton (A):

0, 1 1

0, 1
start q0 q1

It can be represented as:

NDFA ( {q0 , q1 }, {0, 1}, {q0 }, δ ′ , {q1 } )




where δ ′ is the transition function:

Q Σ 2Q
q0 0 {q0 , q1 }
q0 1 {q0 , q1 }
q1 1 {q1 }

As we can see, δ ′ is a partial function whose output is a set of states instead of a single state.

3.1.2 NFA into action


Due to its transition function, an NDFA offers choices of paths at some states for a given input string. Any string for which there exists a path from an initial state to a final state is considered to be accepted by the NDFA.

Hence, the NDFA shown above has the language L(A) = Σ∗ \ {ϵ}, i.e., the set of all strings over the
alphabet Σ except the empty string.
For example, the string 011 is accepted by A as it has the following path: q0 −0→ q0 −1→ q0 −1→ q1 .

To determine if a string is accepted by an NDFA, we can check if the set of states reachable from
the initial state by reading the string contains any final state.

Here we have:



Initial State    String Read    Reachable States


q0 0 {q0 , q1 }
q0 01 {q0 , q1 }
q0 011 {q0 , q1 }

As {q0 , q1 } contains q1 ∈ F , the string 011 is accepted by A. By construction, there will always be
a set of choices which reach q1 from q0 for the input 011.

3.2 Equal expressiveness of DFA or NFA


We claim that regular languages, i.e., the set of languages which can be defined by DFAs, is exactly the set of languages which can be defined by NFAs. In other words, for every NFA there exists a DFA that accepts the same language, and vice versa. Even though an NDFA gives choices at some states, it can still be represented by some equivalent DFA. This implies that NDFAs have no more expressive power than DFAs in terms of string acceptance. However, the NDFA representation can be much more succinct than the DFA one.

We can convert a NDFA to a DFA by considering the set of reachable states as states of the equiv-
alent DFA.

3.2.1 Construction of DFA from NFA


For an NFA (Q, Σ, Q0 , δ, F ), construct a DFA (P(Q), Σ, Q0 , δ ′ , F ′ ) with:

δ ′ (G, σ) = ⋃q∈G δ(q, σ)

F ′ = {G : G ∈ P(Q), G ∩ F ̸= ϕ}
Notice that in our new DFA, the states are labelled by subsets of the states of the NFA. This means that we have 2^n states in our DFA if the NFA had n. It is left to the reader to verify that the DFA we have defined satisfies all the requirements of a DFA.
We claim that after the same characters are inputted into both the NFA and DFA, the state of the
DFA is labelled the same as the set of current states of the NFA. This claim is easy to check using
the definition of the δ ′ function.
Now, in an NFA, a string is accepted if any one of the active states at the end of the string is in the
set of accepting states. Clearly, with our interpretation of the DFA, this is equivalent to being in a
state that belongs to the F ′ we have defined.
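Here is a minimal sketch of this subset construction (an illustration under our own encoding assumptions: the NFA's transition function is a dictionary from (state, symbol) to a set of states, and only the reachable subset-states are built):

```python
from itertools import chain

def nfa_to_dfa(alphabet, delta, starts, finals):
    """Subset construction: the DFA's states are the reachable
    subsets of the NFA's state set."""
    start = frozenset(starts)
    dfa_delta, seen, worklist = {}, {start}, [start]
    while worklist:
        S = worklist.pop()
        for a in alphabet:
            # union of the NFA successors of every state in S
            T = frozenset(chain.from_iterable(delta.get((q, a), ()) for q in S))
            dfa_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                worklist.append(T)
    dfa_finals = {S for S in seen if S & set(finals)}
    return seen, dfa_delta, start, dfa_finals
```

Running this on the NFA A above (δ ′ (q0 , 0) = δ ′ (q0 , 1) = {q0 , q1 } and δ ′ (q1 , 1) = {q1 }) yields exactly the two reachable DFA states {q0 } and {q0 , q1 }, matching the step-wise conversion in the next subsection.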

3.2.2 Step Wise Conversion from NFA to DFA


Now we’ll see how to convert a NFA to a DFA through an example.
We call the following NFA ‘A’

0,1 1
0,1

start q0 q1

The language depicted by this NFA is the set of all strings over {0, 1} excluding the empty string (ϵ).
We represent this as:
L(A) = {w ∈ {0, 1}∗ |w is accepted by A}
that is,
L(A) = Σ∗ \{ϵ}
The set of reachable states after reading a few example strings looks as follows:
Q    String    2^Q
q0    0    {q0 , q1 }
q0    00    {q0 , q1 }
q0    01    {q0 , q1 }
To convert this NFA to a DFA, we need to track the set of states that can be reached after each input, which is a subset of Q, i.e., an element of 2^Q.

Step 1

{q0 , q1 }
0
{q0 }
1
{q0 , q1 }

Step 2
0

{q0 , q1 }

Final DFA
0,1
0,1

start
{q0 } {q0 , q1 }

Call this DFA ‘A′ ’.


One property we've extensively used in the conversion is:
If S ⊆ Q, then δ(S, 0) = ⋃q∈S δ(q, 0)

Now, to show that the languages represented by the NFA (A) and the DFA (A′ ) are the same, i.e., L(A) = L(A′ ), we need to show the following:

1. L(A) ⊆ L(A′ )

2. L(A′ ) ⊆ L(A)

3.2.3 Proof of Equivalence


To show that the languages represented by the NFA A and the DFA A′ are the same, i.e., L(A) = L(A′ ), we need to show the following:

• L(A) ⊆ L(A′ )

• L(A′ ) ⊆ L(A)

Together, these show that the equivalent DFA A′ accepts exactly the same language as the original NFA A, i.e., L(A) = L(A′ ).

3.3 Reflective Insights


Even with all the extra "powers" and behaviours NFAs can have on top of those of DFAs, they
are equally expressive. This shows us that the limitation in expressiveness is not in the behaviour
of state transitions but in the finiteness of states. However, due to the exponential blowup in the
number of states, it is often more human-readable to express certain automata/languages through
NFAs, making it a convenient representation tool.

3.4 Proof of Equivalence – Correctness


In the last lecture, we discussed the conversion of Nondeterministic Finite Automata (NFA) to its
equivalent Deterministic Finite Automata (DFA)

3.4.1 Claim
We aim to demonstrate the equivalence of the languages accepted by NFA A and DFA A′ , denoted
as L(A) and L(A′ ) respectively. In other words, we want to show that L(A) = L(A′ ), where A
represents the original NFA and A′ represents the DFA obtained through subset construction from
NFA A. Alternatively, we can establish that L(A) ⊆ L(A′ ) and L(A′ ) ⊆ L(A), which implies
L(A) = L(A′ ).

3.4.2 Proof
We’ll prove this claim by showing that for all n ≥ 0 and for every word w ∈ Σ∗ with |w| = n, NFA
A can reach state q ∈ Q 1 upon reading w if and only if DFA A′ reaches state S ⊆ Q such that
q ∈ S.
Given the condition that for all n ≥ 0, and for every word w, where w ∈ Σ∗ such that |w| = n, we
aim to demonstrate the equivalence between the NFA A’s ability to reach state q ∈ Q upon reading
word w and the DFA A′ ’s capability to reach state S ⊆ Q, where q ∈ S. In other words, we want to
show that words in the language recognized by NFA A reach a certain state (say, a final state q), if
and only if words in the language recognized by DFA A′ reach state S (where S ⊆ Q, q ∈ S, and
S is a final state in the equivalent DFA A′ ).
Thus, we can establish that the set of words accepted by NFA A is equivalent to the set of words
accepted by DFA A′ , implying L(A) = L(A′ ), thereby validating our claim.
We will demonstrate this by induction on n.

1. Base case: When n = 0, i.e., |w| = 0, we are at the initial state of the automaton. The initial state of the DFA A′ is the set consisting of the initial states of the NFA A. So, the definition of the initial state of DFA A′ satisfies the claim.

2. Induction hypothesis: Assume the claim holds for all 0 ≤ n < k for some k > 0.

3. Inductive step: We’ll show that the claim holds for n = k.


Since |w| = k, let w = w′ .a (w′ concatenated with a), where w ∈ Σ∗ , w′ ∈ Σ∗ , a ∈ Σ, and |w′ | = k − 1.

By hypothesis , there exists the following path in both automata :


w′

This basically means that there exists some path in NFA and equivalent DFA which is as
follows :

In NFA A :
q0 q

In DFA A’ 2 :

. . . q0 ...q

Now, if a symbol a, a ∈ Σ comes, word w is formed :


w′ a

w
1
Q is the set of states in our original NFA A.
2
. . . q0 ⇒ set containing initial states of the original NFA A
...q ⇒ ⊆ Q containing the state q, where Q is the set of states of the original NFA A
. . . q̂ ⇒ ⊆ Q containing the state q̂, where Q is the set of states of the original NFA A

NFA A will reach some state q̂


a
q0 q q̂

DFA A′ will reach some state . . . q̂ (. . . q̂ ⊆ Q and contains q̂)


a
. . . q0 ...q . . . q̂

This is because state . . . q of DFA A′ contains q, which transitions DFA A′ to state . . . q̂ (a set
containing q̂) when symbol a is encountered.
Thus, we have shown that for all n ≥ 0, and for every word w ∈ Σ∗ with |w| = n, NFA A can reach state q ∈ Q on reading word w if and only if DFA A′ reaches a state S ⊆ Q such that q ∈ S.

Hence the condition, and therefore the claim, is proved.


Remarks:

(a) As the length of the word increases, the number of choices for state transitions in the
NFA grows exponentially.
(b) If an NFA has N states, the equivalent DFA can have an exponential number of states in
the worst case.

3.5 Compiler Special Case – Lexical Analyzer

INPUT Lexical Tokens Machine Code OUTPUT


Parser
Source Code Analyser Generator

Figure 3.1: A Simple Compiler

In simpler terms, a compiler is composed of three main components: a lexical analyzer, a parser, and a code generator. Its primary function is to process a sequence of characters as input. The lexical analyzer breaks down the sequence of characters into tokens, which are then analyzed by the parser. To accomplish this, the lexical analyzer employs an NFA: it must determine whether an accepting state can be reached on the given input, tracing a path accordingly. When faced with a long string, finding the final state can be challenging. At each step, there are multiple choices to

explore. However, as the length of the input increases, exhaustively exploring each choice becomes increasingly difficult. Converting an NFA to a DFA may result in an exponential increase in states; yet it's important to note that not all of those states are needed for any particular input, so building the entire DFA introduces unnecessary complexity when it's not actually required.
So, the lexical analyzer determines whether there exists a path in the automaton that leads to an accepting state for a given word without constructing the equivalent DFA.
Suppose the lexical analyser of a compiler has to check whether a GIVEN WORD is accepted by the following NFA.

a b b a
a

start q0 q1 q2

b a

Figure 3.2: Lexical analyser Automaton

Question: How can a lexical analyzer determine whether a given word leads to an accepting state
without constructing the equivalent DFA?
Answer:

{q0 } −a→ {q0 , q1 } −b→ {q0 , q1 , q2 } −a→ {q0 , q1 , q2 } −b→ {q0 , q1 , q2 } −a→ {q0 , q1 , q2 }

The lexical analyser can track the set of states the NFA could be in after reading each symbol of the GIVEN WORD, updating this set based on the transitions specified by the GIVEN NFA (a code sketch of this idea appears after the footnotes below).

• If the final set of states contains at least one accepting state of the original NFA, then there
exists a successful path for the given word to reach an accepting state, indicating acceptance
by the given automaton.

• The time complexity for this process is O(n · k), where n represents the size of the automaton
and k denotes the length of the word. This complexity is significantly lower than the usual
exponential time complexity observed in similar processes.3

• Here the GIVEN WORD is accepted by the GIVEN NFA4


3
Size of automaton = No. of states in Automaton + No. of Transition Arrows in Automaton
4
q2 is contained in the last set
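As referenced above, the following is a minimal sketch of this O(n · k) on-the-fly membership test (the encoding is our own illustration: the transition function is a dictionary from (state, symbol) to a set of states):

```python
def nfa_accepts(delta, starts, finals, word):
    """Track the set of reachable NFA states symbol by symbol,
    instead of constructing the (possibly exponential) DFA."""
    current = set(starts)
    for symbol in word:
        current = set().union(*(delta.get((q, symbol), set()) for q in current))
        if not current:      # no live states left: reject early
            return False
    return bool(current & set(finals))

# The NFA A from Section 3.1: a word is accepted iff the final set contains q1.
delta = {('q0', '0'): {'q0', 'q1'}, ('q0', '1'): {'q0', 'q1'}, ('q1', '1'): {'q1'}}
print(nfa_accepts(delta, {'q0'}, {'q1'}, "011"))  # True
```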

3.6 NFA with ϵ−edges


A variation of the NFA is the NFA with ϵ-edges; adding ϵ-edges may expand the language by making words accepted that were otherwise unaccepted.

q1 ϵ

0
1
0
ϵ
1

start q0
ϵ
q2

Figure 3.3: Automaton with Epsilon Edges

ϵ−edges bring non-determinism to the NFA, as you can sit on a node and take one of the ϵ−edges
possible from that node to jump to another node without consuming any letter of the input.
Figure 3.3 shows how ϵ-edges are used to connect states of an automaton for free (without consuming any letter from the input). We can see that 10 ∉ L without the ϵ-edge between q0 and q1 , but with the presence of this ϵ-edge, 10 ∈ L.
ϵ−edge also allows us to connect two automatons. In Figure 3.4, the accepting states of L1 are
connected to the start states of L2 automaton, which generates an automaton for accepting L1 · L2 .
Here · is the concatenation operator. So if ω1 ∈ L1 and ω2 ∈ L2 , then ω1 · ω2 ∈ L1 · L2 will be
accepted by this new automaton.

Figure 3.4: L1 accepting automaton and L2 accepting automaton connected

Now, we will try to find an equivalent DFA for this NFA with ϵ-edges. For that, we will first find an NFA without ϵ-edges which preserves the language of the original NFA; this NFA can then be converted into a DFA.
Initially, just look at the ϵ−edges only and find for each state its ϵ−closure, which is the set of the
states that we can reach from it by taking only ϵ−edges.
From each node, we can go for free to every node present in its ϵ-closure. So wherever non-ϵ edges lead from the nodes in that ϵ-closure, the original node can reach as well. All these destination states will therefore be connected directly to the original node in the new NFA.
The starting states of this new NFA will be the starting states of the original NFA, and its final states will be all states whose ϵ-closures contain a final state of the original NFA.
So NFA without ϵ−edges for the NFA in Figure 3.3 will look like as shown in Figure 3.5

0,1

start q1

0 0, 1
0
0, 1
0, 1 0
0, 1
0
start q0 q2

Figure 3.5: Automaton without Epsilon Edges

3.6.1 Equivalence with DFA


• If we can show that ϵ-NFA have an equivalent NFA, then the equivalence of ϵ-NFA and DFA
is proved because every NFA has an equivalent DFA5 .

1 ϵ
ϵ,0,1

start q0 q1

ϵ 0

q2

Figure 3.6: A Simple ϵ-NFA

Focus on ϵ-Edges of the Automaton to get the Epsilon Closure of the states in an Automaton
For Figure-3.6 Automaton:
5
equivalence of NFA and DFA is already proved

ϵ-closure(q0 ) = {q0 , q1 }
ϵ-closure(q1 ) = {q1 }
ϵ-closure(q2 ) = {q0 , q1 , q2 }

Rules for converting ϵ-NFA to equivalent NFA6 :

1. Non-epsilon transitions on a symbol a ∈ Σ, originating from any state q ′ within the epsilon closure of a state q, will also be present as non-epsilon transitions of state q in the equivalent NFA. These transitions lead to the same destination states for q in the new NFA as they did for the state q ′ in the original epsilon-NFA.

2. The states in the epsilon closure of the initial state in the epsilon-NFA can serve as initial states in the new NFA. However, this is not necessary; we can make an equivalent NFA where the ϵ-NFA and the new NFA have the same initial state(s).

3. All states within the epsilon-NFA that include the accepting state in their epsilon closure will
also act as final states in the new/equivalent NFA.

For the Figure-3.6 epsilon-NFA, after applying the aforementioned rules, we obtain the following
equivalent NFA:
6
These rules will become clearer in the next class, when we learn about leading epsilon transitions and trailing epsilon transitions within the states of an ϵ-NFA.

0,1
0
start q0 start q1 0,1

0,1

0 0,1

0,1 0

q2

Figure 3.7: Equivalent NFA for Figure-3.6

3.7 Recap
We were trying to convert an NFA with ε edges to an equivalent NFA without ε edges in the previous
lecture. We’ll do that in more detail in this lecture.

3.7.1 Converting ε-NFA to non-ε-NFA


• ε-closure of a node is defined as set of those nodes which can be reached from that node by
traversing over ε-edges, i.e. without consuming any character from the alphabet Σ. Also, the
node itself is trivially a part of its ε-closure.

• ε-edges are those edges which can be traversed without consuming any character from the alphabet Σ, i.e., by consuming the empty string. Observe that the string "10" ∈ L with ε-edges, but without ε-edges, "10" ∉ L, where L is the language of the NFA.

1 ε 0,1

start q0 q1 ε

ε 0

q2

Figure 3.8: A Simple ε-NFA with alphabet Σ = {0, 1}

For example in this NFA,


ε-closure(q0 ) = {q0 , q1 }
ε-closure(q1 ) = {q1 }
ε-closure(q2 ) = {q0 , q1 , q2 }
Now, for the algorithm to convert an NFA with ε-edges to an equivalent NFA without ε-edges, over the same alphabet Σ = {0, 1} and with the same states, apply the following three steps (a code sketch of the whole procedure follows the list):

1. For each node q in NFA, find its ε-closure (say S) and mark all its non-ε-edges as they are.
Now, for each node q ′ ̸= q in S, mark all non-ε-edges starting from q ′ going to q ′′ as extra edges
starting from q going to q ′′ . For eg., for the node q0 , the only distinct node in its ε-closure is
q1 so mark the edges {0, 1} from q1 to q1 , {0} from q1 to q0 and {0} from q1 to q2 as extra
edges {0, 1} going from q0 to q1 , {0} from q0 to q0 and {0} from q0 to q2 respectively (marked
in red).

2. Mark all those states as accepting whose ε-closures contain at least one of the accepting states of the ε-NFA. For eg., here only q2 has the accepting state q2 in its ε-closure, so only q2 is marked as accepting.

3. Starting states in the new non-ε-automaton will be the same as the starting states in the
original ε-automaton.
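As promised, here is a minimal sketch of the whole procedure in code (under our own encoding assumptions: eps_edges maps a state to its ε-successors, and delta holds only the non-ε transitions):

```python
def epsilon_closure(eps_edges, state):
    """States reachable from `state` via ε-edges alone; the state
    itself is trivially in its own ε-closure."""
    closure, stack = {state}, [state]
    while stack:
        for nxt in eps_edges.get(stack.pop(), ()):
            if nxt not in closure:
                closure.add(nxt)
                stack.append(nxt)
    return closure

def remove_epsilon(states, alphabet, delta, eps_edges, starts, finals):
    """Equivalent ε-free NFA, following steps 1-3 above."""
    new_delta = {}
    for q in states:
        for a in alphabet:
            # Step 1: inherit the non-ε edges of every state in ε-closure(q).
            new_delta[(q, a)] = set().union(
                *(delta.get((p, a), set()) for p in epsilon_closure(eps_edges, q)))
    # Step 2: q accepts iff its ε-closure meets the old accepting set.
    new_finals = {q for q in states if epsilon_closure(eps_edges, q) & set(finals)}
    # Step 3: the starting states are unchanged.
    return new_delta, set(starts), new_finals
```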

1 0,1 0,1

start q0 q1

0 0

0,1
0,1
0
0

q2

0
Figure 3.9: Equivalent non-ε-NFA

Subsequently, we’ll use "original/ori" for the ε-automaton and "new" for the non-ε-automaton.

• One may ask why the states in the ε-closures of the accepting states of the original automaton are not marked as accepting in the non-ε-NFA. The answer: every string which reaches an accepting state a can also reach any of the states in ε-closure(a) by taking ε-edges, and those strings are indeed in the language of the ε-NFA. But there are also many strings which can reach the states in ε-closure(a) and are not in the language of the ε-NFA. So, if we marked those states as accepting, we would change the language, which is not our intention. For eg., q0 ∈ ε-closure(q2 ), but if we mark q0 as accepting in the non-ε-NFA, then the string "1" will be included in the language of the non-ε-NFA even though "1" ∉ L(ε-NFA).

3.7.2 Correctness of Algorithm


• If we say w ∈ L(ori) it means that either w as it is ∈ L(ori) or w with some ε’s inserted
∈ L(ori). On the other hand, if we say w ∈ L(new) then it means that w as it is ∈ L(new).

Now, what we mean by correctness of the algorithm is that the language of the new non-ε-automaton
should exactly be equal to the language of the original ε-automaton. So we need to prove that

L(new) = L(ori)

1. Let’s prove L(new) ⊆ L(ori)


Let’s take any arbitrary string w = c1 c2 . . . cm ∈ L(new)

(a) Case 1: w is empty


It means at least one of the starting states s in the new automaton is an accepting state. Now, according to our algorithm (step 2), all accepting states in the new automaton are either accepting states in the original automaton as well, or are states which have at least one accepting state of the original automaton in their ε-closures. So, ε-closure(s) definitely contains an accepting state of the original automaton. Hence, we have a path ε^k for some k ≥ 0 from s to an accepting state in the original automaton, and thus w ∈ L(ori).
(b) Case 2: w is non-empty
It means that we have a sequence of jumps from a starting state s0 in the new automaton to s1 upon reading c1 , then from s1 to s2 upon reading c2 , and so on till the state sm upon reading cm , such that sm is an accepting state. Now, according to our algorithm (step 1), for s0 in the original automaton either we have a direct edge {c1 } from s0 to s1 , or we have some s′0 ∈ ε-closure(s0 ) which has the edge {c1 } from s′0 to s1 . So, effectively, we can reach s1 from s0 in the original automaton as well, by following a path ε^k c1 for some k ≥ 0. Similarly, we can have a path ε^{k1} c2 ε^{k2} c3 . . . with ki ≥ 0 till sm in the original automaton. Now, according to our algorithm (step 2), either sm is an accepting state in the original automaton as well, or there lies some accepting state in ε-closure(sm ). So, finally, we can take a sequence ε^{k′} for some k ′ ≥ 0 to reach an accepting state from sm in the original automaton, and thus w ∈ L(ori).
Hence, proved.

2. Let’s prove L(ori) ⊆ L(new)


Let’s take any arbitrary string w = c1 c2 . . . cm ∈ L(ori)

(a) Case 1: w is empty


It means that there exists at least one starting state s in the original automaton such that ε-closure(s) contains an accepting state s′ . Now, according to our algorithm (step 2), all accepting states in the new automaton are either accepting states in the original automaton as well, or are states which have at least one accepting state of the original automaton in their ε-closures. So, the starting state s is an accepting state in the new automaton, and thus w ∈ L(new).
(b) Case 2: w is non-empty
It means that we have a path ε^{k0} c1 ε^{k1} c2 . . . ε^{km−1} cm ε^{km} with ki ≥ 0 for all i ∈ {0, 1, 2 . . . , m}, from a starting state s0 in the original automaton to an accepting state sm+1 , where taking ε^{ki} from any state x means going to some state y ∈ ε-closure(x). Here, s1 is the state reached after reading c1 , s2 after reading c2 , . . . , sm after reading cm , and sm+1 after taking ε^{km} . Now, according to our algorithm (step 1), for s0 in the new automaton we have a direct edge {c1 } from s0 to s1 . Similarly, we'll have direct edges from s1 to s2 and so on. So, finally, we can reach sm from s0 in the new automaton. Now, according to our algorithm (step 2), sm is an accepting state in the new automaton because ε-closure(sm ) contains the accepting state sm+1 of the original automaton. Thus, w ∈ L(new).
Hence, proved.

Thus, we’ve shown that L(new) = L(ori)

3.7.3 Intuition of the algorithm


• It should be clear from the correctness proof itself, where we constructed paths in the new and original automata using step 1 of the algorithm. Also, if the path in the original automaton had trailing ε's, we were not able to reach the same state in the new automaton; but we had to make the languages of the two automata exactly the same, so we concluded step 2 of the algorithm.

3.7.4 Extras
• Language of a node q is defined as the set of all those strings w ∈ Σ∗ such that upon reading
w character by character we can reach q from any of the starting states of the automaton.

• Note that in ε-automaton we could also take ε-edges in between the characters, before the
first character as well as after the last character of w and reach q so such strings would also
be considered in the language of q.

• And the language of an automaton is defined as the union of the languages of all its accepting nodes. So, we have

L(new) = ⋃i L(qi ) over all accepting states qi of the new non-ε-automaton

L(ori) = ⋃j L(qj ) over all accepting states qj of the original ε-automaton

• A lemma: L(qi )original ⊇ L(qi )new holds true where qi is any node of the NFA and L(qi ) is the
language of that node.

• Proof: Going by the same idea as in part 1 of the correctness proof, we can prove the lemma. If w ∈ L(qi )new and w is empty, then qi must be one of the starting states in the new automaton and also in the original automaton, so w ∈ L(qi )original , since the starting states are the same in both automata. If w ∈ L(qi )new and w is non-empty, then we can have a path with some ε's in between the characters of w from some starting state s0 to qi in the original automaton, and thus L(qi )original ⊇ L(qi )new is indeed true.
Where the ⊃ sign comes in is the case when the path of w has some trailing ε's. For eg., consider this ε-NFA and its equivalent non-ε-NFA,

0,1 0

q0 1 q1 ε q2
start
Figure 3.10: ε-NFA

0,1 0

q0 1 q1 q2
start
Figure 3.11: Equivalent Non-ε-NFA

L(q2 )original is non-empty and, for instance, contains the string "1" via the path "1ε" from q0 to q2 , but L(q2 )new is clearly empty, i.e., the null set ϕ. This happened because the path contained a trailing ε. So, L(q2 )new ⊂ L(q2 )original .

3.7.5 Significance of ε-edges


ε-edges allow us to jump from one part of the NFA to another part of the NFA without consuming
any additional character. Thus, if we have to do some operations sequentially we can make our
life simple by using ε-edges. For eg., suppose we have to check that w = u.v, where u contains an even number of 1's and v contains a number of 1's that is 1 mod 3. We can accomplish the above task
by constructing two NFA’s wherein the first NFA will accept all the satisfying u’s and the second
NFA will accept all the satisfying v’s. Now, we’ll just join the accepting states of first NFA to the
starting states of the second NFA using ε-edges and convert the accepting states of first NFA & the
starting states of the second NFA into normal intermediate states. Thus, we will get a single NFA
with starting states same as the starting states of the first NFA and accepting states same as the
accepting states of the second NFA. Here, ε-edges allowed us to capture the non-determinism of the
breaking point between u & v.
* Any automaton containing ε-edges cannot be a DFA because ε-edges bring in uncertainty as we
could choose to stay in that state or take the ε-edge.

3.7.6 Examples
• L = {x ∈ Σ∗ | x = u.v.w, v ∈ Σ∗ , u, w ∈ Σ+ , |u| ≤ 2, u = w} where Σ = {0, 1, 2 . . . 9}
For instance, "0000" ∈ L because we may take either u = w = 0, v = 00, or u = w = 00, v = ε.
Also, "1234" ∉ L because we cannot have any satisfying u, v, w.
Its NFA will be a combination of 110 NFAs of the following form:

0, 1, . . . 9

q0 0 q1 ε q2 ε q3 0 q4
start

Figure 3.12: ε-NFA for u = w = 0

0, 1, . . . 9

q0 0 q1 ε q2 ε q3 0 q4
start

1 1
0, 1, . . . 9

ε ε
q1′ q2′ q3′

Figure 3.13: ε-NFA for u = w = 0, 1

One for each u = w = 0, 1, 2 . . . 9, so 10 here, and 100 more for u = w = 00 to 99. We can have separate NFAs with different possible q1′ , q3′ states: instead of 0, put 1 to 9 and 00 to 99 on the edges from q0 to q1′ and from q3′ to q4 . In this way we can form the whole NFA, just as in Figure 3.13.

• L = .∗ xxx.∗ where Σ = {a, b, . . . z}


Here, we want to locate "xxx" in the text so the following NFA captures this language:-

Σ Σ

q0 ε q1 x q2 x q3 x q4 ε q5
start

Figure 3.14: ε-NFA for searching "xxx"

• Takeaway task : Think about how KMP implicitly constructs NFA for pattern matching.

3.8 Equivalence in Finite Automata


In our exploration of finite automata, we’ve encountered deterministic finite automata (DFA), non-
deterministic finite automata without epsilon transitions (NFA), and non-deterministic finite au-
tomata with epsilon transitions (ε-NFA). Surprisingly, despite their apparent differences, these three
models are fundamentally equivalent in terms of computational power.
DFA, characterized by their deterministic nature, are particularly useful in scenarios where deter-
minism is crucial, such as in hardware design where predictability is paramount.
ε-NFA, on the other hand, introduce a high level of abstraction by allowing transitions without
consuming input symbols, which can simplify the representation of certain languages and aid in
conceptual clarity.
NFA without epsilon transitions also provide a similarly high level of abstraction, allowing for flexi-
bility in modeling complex systems and languages.

3.9 DFA definition


Any DFA can be completely represented by the tuple DFA = (Q, Σ, q0 , δ, F ) where,
• Q is the set of all states
• Σ is the alphabet
• q0 is the starting state
• δ : (Q × Σ) → Q is the transition function
• F is the set of final or accepting states

3.9.1 Combinations of DFAs


We can represent the combination of two DFAs defined on the same alphabet Σ, DFA1 = (Q, Σ, q0 , δ, F )
and DFA2 = (Q′ , Σ, q0′ , δ ′ , F ′ ) by another DFA,
 
DFA3 = ( Q × Q′ , Σ, (q0 , q0′ ), δ̂, F̂ )    (3.1)


where δ̂ ((q, r), a) = (δ(q, a), δ ′ (r, a)) and F̂ can be defined according to the required operation on
the DFAs.

For example, if Q = {q0 , q1 } and Q′ = {r0 , r1 , r2 } and F = {q1 } and F ′ = {r1 , r2 } then for
the intersection of these Automata, F̂ = {(q1 , r1 ), (q1 , r2 )} and for the union of the Automata,
F̂ = {(q1 , r0 ), (q1 , r1 ), (q1 , r2 ), (q0 , r1 ), (q0 , r2 )}. Similarly we can define the complement operation
by taking F̂ = Q − F .

With these basic rules in place, we can go on to define more complicated combinations like
(DF A1 ∩ DF A3 ) ∪ (DF A2 ∩ DF A3 ) − (DF A1 ∩ DF A2 )

We now have another way to tell if a language is a subset of another language. To tell if L1 ⊆ L2 , we have to show that L1 ∩ Lc2 = ϕ. The problem thus reduces to showing that in the DFA defined by DFA1 ∩ DFAc2 there is no path from the start node to any accepting state.
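Here is a minimal sketch of the product construction of equation (3.1) (under our own encoding assumptions: each DFA is a triple (delta, start, finals), with delta a total dictionary from (state, symbol) to state):

```python
def product_dfa(d1, d2, alphabet, mode="intersection"):
    """Run two DFAs in lockstep; the product's states are pairs."""
    (delta1, q01, f1), (delta2, q02, f2) = d1, d2
    start = (q01, q02)
    delta, seen, worklist = {}, {start}, [start]
    while worklist:
        p, q = worklist.pop()
        for a in alphabet:
            nxt = (delta1[(p, a)], delta2[(q, a)])
            delta[((p, q), a)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                worklist.append(nxt)
    if mode == "intersection":
        finals = {(p, q) for (p, q) in seen if p in f1 and q in f2}
    else:  # union
        finals = {(p, q) for (p, q) in seen if p in f1 or q in f2}
    return delta, start, finals
```

Only pairs reachable from the start state are constructed, which in practice can be fewer than |Q| · |Q′ |.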

0 0

1
start q0 q1
1

Figure 3.15: Two-State Automaton


0

q1
1 0 1
0
start q0 q2
1

Figure 3.16: Three-State Automaton


0

0
start q00 q10 q20
0
1
1 1 1
1 1
q01 0 q11 q21
0

Figure 3.17: Intersection of the two automata

3.9.2 Combinations of NFAs


For two NFAs, the union and intersection is defined in a similar way with the only difference being
the transition function since it now returns a set of states instead of a single state.

δ̂ ((q, r), a) = δ(q, a) × δ ′ (r, a) (3.2)

However, for NFAs, just flipping the accepting and non-accepting states won’t give us the comple-
ment.
Thus to take the intersection or union of two NFAs, we can follow a similar approach to DFAs but
for complementation, the NFA must first be converted to a DFA and then complemented to give the
actual complement of the original NFA.

3.9.3 Closure Properties


Given that L1 , L2 are two regular languages over the alphabet Σ = {a, b}, the complement L̄1 , the union L1 ∪ L2 , and the intersection L1 ∩ L2 are also regular languages (because their corresponding DFAs can be constructed by combining the original DFAs).

q1 q1
a a

start q0 start q0

a a
q2 q2

Figure 3.18: These NFAs are not complements of each other

3.10 Substitution
We will start with an example.

Consider two alphabets Σ1 = {a, b} and Σ2 = {0, 1, 2}, and languages L1 = a∗ b defined on Σ∗1 , and La = 0∗ (1 + 2)∗ 1∗ and Lb = 1∗ (0 + 2)∗ defined on Σ∗2 .
Now we define a set subst(L1 , La , Lb ) as

subst(L1 , La , Lb ) = {w ∈ Σ∗2 | ∃ u = α1 α2 . . . αk ∈ L1 such that w ∈ Lα1 Lα2 . . . Lαk }    (3.3)

Or equivalently, subst(L1 , La , Lb ) = ⋃u=α1 α2 ...αk ∈L1 Lα1 Lα2 . . . Lαk
Intuitively, the substitution operation is to replace each letter by a language.
Diagrammatically, it is represented as replacing each edge by an entire automaton and connecting
the initial and accepting states to the original states by ϵ edges.

Using this operation, we can easily prove that if L1 and L2 are regular languages then L1 · L2 is also regular: since L = {a · b} is regular, subst(L, L1 , L2 ) = L1 · L2 will also be regular.

3.10.1 Infinite Languages


If we have a DFA with n states and there exists some string of length greater than n which is accepted by the DFA, then by the pigeon-hole principle there must exist some state q in the DFA such that the accepting run visits q twice, i.e., there is a cycle through q. Suppose the accepted string is u · v · w, where v is the part that starts and ends at the same state q; then the strings u · w, u · v^2 · w, and in general u · v^k · w for every k ≥ 0 are also accepted by the DFA. Hence the language is infinite.
Chapter 4

Regular Expressions

4.1 Introduction
In arithmetic, we can use the operations + and × to build up expressions such as (5 + 3) × 4.
Similarly, we can use the regular operations to build up expressions describing languages, which are
called regular expressions. An example is: (0 ∪ 1)0∗ . The value of the arithmetic expression is the
number 32. The value of a regular expression is a language. In this case, the value is the language
consisting of all strings starting with a 0 or a 1 followed by any number of 0s. We get this result
by dissecting the expression into its parts. First, the symbols 0 and 1 are shorthand for the sets
{0} and {1}. So (0 ∪ 1) means ({0} ∪ {1}). The value of this part is the language {0, 1}. The
part 0∗ means {0}∗ , and its value is the language consisting of all strings containing any number of
0s. Second, like the × symbol in algebra, the concatenation symbol ◦ often is implicit in regular
expressions. Thus (0 ∪ 1)0∗ actually is shorthand for (0 ∪ 1) ◦ 0∗ . The concatenation attaches the
strings from the two parts to obtain the value of the entire expression. Regular expressions have an
important role in computer science applications. In applications involving text, users may want to
search for strings that satisfy certain patterns. Regular expressions provide a powerful method for
describing such patterns. Utilities such as awk and grep in UNIX, modern programming languages
such as Perl, and text editors all provide mechanisms for the description of patterns by using regular
expressions.

4.2 Formal Definition of a Regular Expression


We say that R is a regular expression if R is:

1. a for some a in the alphabet Σ,

2. ε,

3. ∅,

4. (R1 ∪ R2 ), where R1 and R2 are regular expressions,

5. (R1 · R2 ), where R1 and R2 are regular expressions, or

6. (R1∗ ), where R1 is a regular expression.


In items 1 and 2, the regular expressions a and ε represent the languages {a} and {ε}, respectively. In
item 3, the regular expression ∅ represents the empty language. In items 4, 5, and 6, the expressions
represent the languages obtained by taking the union or concatenation of the languages R1 and R2 ,
or the Kleene star of the language R1 , respectively.

4.3 Semantics of Regular Language


The semantics of regular language involves understanding the structure and meaning of expressions
within the language.

4.3.1 Atomic Expressions


Consider an atomic expression [a], where a represents a single letter. In this context, [a] denotes the
language consisting of only the string a. However, it’s important to note that the a within [a] is not
considered a letter of the alphabet; rather, it signifies a language comprising a single letter string.

4.3.2 Union Operation


For expressions e1 and e2 , [e1 + e2 ] represents the union of the languages denoted by [e1 ] and [e2 ].
In simpler terms, [e1 + e2 ] encompasses all strings that belong to either [e1 ] or [e2 ].

4.3.3 Concatenation Operation


When considering e1 concatenated with e2 , denoted as [e1 .e2 ], it signifies the concatenation of
languages represented by [e1 ] and [e2 ]. This operation results in a language consisting of all possible
combinations of strings where the first part belongs to [e1 ] and the second part belongs to [e2 ].

4.3.4 Example
Let’s illustrate with an example: (ab) + a. This expression represents the union of the language
containing the string ab and the language containing the string a, resulting in {ab, a}.

4.3.5 Order of Precedence


In the semantics of regular language, the order of precedence for operations is as follows: ∗ (Kleene
star) > · (concatenation) > + (union).

4.3.6 Kleene Star


The Kleene star operation [e∗1 ] denotes the union of zero or more concatenations of [e1 ]. In other
words, [e∗1 ] encompasses all possible strings that can be formed by concatenating any number of
strings from [e1 ].

4.3.7 Example
For an expression e1 = a + b, the language denoted by [e1 ] is {a, b}. Therefore, [(a + b)∗ ] represents
the set of all possible strings comprising as and bs, including the empty string ε, a, b, ab, ba, aa, bb,
and so on.

4.3.8 Further Examples


• [a∗ + b∗ ] = {u ∈ Σ∗ | u = a^n or u = b^n for some n ≥ 0}

• [a∗ · b∗ ] = {u ∈ Σ∗ | u = a^n b^m for some n ≥ 0 and m ≥ 0}

• [(a∗ · b∗ )∗ ] represents the set of all strings containing any number of occurrences of strings composed of a's followed by b's.

• 0∗ 10∗ = {w | w contains a single 1}

• Σ∗ 1Σ∗ = {w | w has at least one 1}

• Σ∗ 001Σ∗ = {w | w contains the string 001 as a substring}

• 1∗ (01+ )∗ = {w | every 0 in w is followed by at least one 1}

• (ΣΣ)∗ = {w | w is a string of even length}

• (ΣΣΣ)∗ = {w | the length of w is a multiple of 3}

• 01 ∪ 10 = {01, 10}

• 0Σ∗ 0 ∪ 1Σ∗ 1 ∪ {0, 1} = {w | w starts and ends with the same symbol}

• (0 ∪ ε)1∗ = 01∗ ∪ 1∗

• (0 ∪ ε)(1 ∪ ε) = {ε, 0, 1, 01}

• 1∗ ∅ = ∅

• ∅∗ = {ε} The star operation puts together any number of strings from the language to get a
string in the result. If the language is empty, the star operation can put together 0 strings,
giving only the empty string.
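To sanity-check identities like the last few experimentally, one can enumerate all short strings and compare which ones two expressions match. The quick illustration below uses Python's re module, whose | plays the role of ∪ here; this is a testing aid, not a proof.

```python
import re
from itertools import product

def language_up_to(pattern, alphabet, max_len):
    """All strings over `alphabet` of length <= max_len that the
    regular expression matches in full."""
    words = ['']
    for n in range(1, max_len + 1):
        words += [''.join(p) for p in product(alphabet, repeat=n)]
    return {w for w in words if re.fullmatch(pattern, w)}

# (0 ∪ ε)1* = 01* ∪ 1*, with ε expressed as an optional 0:
assert language_up_to('0?1*', '01', 5) == language_up_to('01*|1*', '01', 5)
```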

4.4 Kleene’s Theorem


In the last lecture, we introduced regular expressions and saw how they can be used to represent
languages. In this lecture, we explore one of the most fundamental theorems of Automata Theory,
Kleene’s Theorem.

Kleene’s Theorem: In terms of expressive power,

Regular Expressions ≡ NFAs with ϵ edges ≡ NFAs without ϵ edges ≡ DFAs

In previous lectures, we have already proved the equivalence of NFAs with ϵ edges, NFAs without ϵ
edges and DFAs. So in order to prove Kleene’s Theorem, it suffices to show the equivalence between
Regular Expressions and NFAs with ϵ edges.

To prove: Regular Expressions ≡ NFAs with ϵ edges


i.e., L(Reg. Ex.) = L(NFAs with ϵ edges)

The proof can be divided into two subparts

1. L(Reg. Ex.) ⊆ L(NFAs with ϵ edges)

2. L(Reg. Ex.) ⊇ L(NFAs with ϵ edges)

We prove each of these parts separately and then combine them to get the required result.

4.4.1 Part 1: L(Reg. Ex.) ⊆ L(NFAs with ϵ edges)


Let us consider the following example.

Suppose our alphabet is Σ = {0, 1}


and our regular expression is ((0.1)∗ +(1.0)∗ )∗ .(1.1+0.0)∗

We wish to construct an NFA with ϵ edges that accepts the same language as our regular expression.

To do this, we construct the parse tree of the regular expression and use a bottom up approach
starting from the leaves to construct an NFA for each node of the parse tree.

The parse tree of the given regular expression is as follows.

[Parse tree of the expression: the root is · ; its left child is a ∗ node over the + of (0.1)∗ and (1.0)∗ , and its right child is a ∗ node over the + of 1.1 and 0.0.]

We start from the leaves, in this case the nodes labelled 0 and 1. Consider the leaf 0. It represents
a language with only one string, i.e., L(0) = {0}. It can be represented by the NFA

0
start q0 q1

Figure 4.1: NFA that accepts only {0}

Similarly leaf 1 can be represented by the NFA

1
start q0 q1

Figure 4.2: NFA that accepts only {1}

Now we have NFAs for each of the leaf nodes. Let us see how to construct the NFAs for the rest of
the nodes from these.

Suppose we have the NFAs of regular expressions a and b, and we want to obtain the NFA of
a.b. Since . is simply the concatenation operator, the new NFA should accept all strings that
satisfy both the NFAs in sequence. Such an NFA can be constructed by connecting the final state
of a and the starting state of b (and marking them as regular states) by an ϵ edge. This results in

a new NFA whose starting state is the starting state of a and final state is the final state of b. The
language of this NFA is simply a.b.

In our example, given the NFAs of 0 and 1, we want to construct the NFA of 0.1. Following the above procedure, we obtain the NFA

0 ϵ 1
start q0 q1 q2 q3

Figure 4.3: NFA for 0.1

From the parse tree, the next node we need to construct an NFA for is (0.1)∗ .

The language accepted by a∗ is simply the set of strings in which strings in the language of a
are concatenated with themselves multiple (possibly 0) times. To obtain the NFA for a∗ , we take
the NFA of a, introduce a new state that is both starting and accepting, and draw ϵ edges from the
original final state to the new state and from the new state to the original starting state. We also
mark the original starting and accepting states as normal states. The correctness of this procedure
can be proved by induction.

Applying this to 0.1, we obtain the NFA for (0.1)∗

ϵ 0 ϵ 1
start q0 q1 q2 q3 q4

Figure 4.4: NFA for (0.1)∗

Similarly we can construct the NFA for (1.0)∗

ϵ 1 ϵ 0
start q0 q1 q2 q3 q4

Figure 4.5: NFA for (1.0)∗

Now the next step in our recursive construction of parse tree nodes is to obtain the NFA for (0.1)∗ + (1.0)∗ .

The language accepted by a+b is simply the union of the language accepted by a and the lan-
guage accepted by b. So in order for a string to be accepted by the NFA of a+b, it must be

accepted by either the NFA of a or the NFA of b. Hence the required NFA can be obtained by
introducing a new start state, drawing ϵ edges from it to the starting states of the original NFAs,
and marking the original NFA starting states as normal states.

Applying this procedure to (0.1)∗ and (1.0)∗ , we get the NFA of (0.1)∗ + (1.0)∗

ϵ ϵ

q0 ϵ q1 0 q2 ϵ q3 1 q4 r0 ϵ r1 1 r2 ϵ r3 0 r4
ϵ

s
ϵ
start

Figure 4.6: NFA for (1.0)∗ +(0.1)∗

Using the same procedures, we can construct NFAs for all the remaining nodes of the parse tree.
We observe that the size of the NFA is linear in the size of the original regular expression.
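A minimal sketch of this bottom-up construction in code (our own encoding: an NFA fragment is a triple (start, accept, edges), where edges are (src, label, dst) and label None stands for ϵ; the union variant below additionally merges the two accepting states into a fresh one, which accepts the same language as the construction described above):

```python
import itertools

fresh = itertools.count()  # supply of fresh state names

def atom(symbol):
    """Fragment accepting exactly the one-letter string `symbol`."""
    s, f = next(fresh), next(fresh)
    return (s, f, [(s, symbol, f)])

def concat(n1, n2):
    (s1, f1, e1), (s2, f2, e2) = n1, n2
    # ϵ-edge from the accepting state of n1 to the start state of n2
    return (s1, f2, e1 + e2 + [(f1, None, s2)])

def union(n1, n2):
    (s1, f1, e1), (s2, f2, e2) = n1, n2
    s, f = next(fresh), next(fresh)
    return (s, f, e1 + e2 + [(s, None, s1), (s, None, s2),
                             (f1, None, f), (f2, None, f)])

def star(n):
    (s1, f1, e1) = n
    q = next(fresh)  # new state that is both starting and accepting
    return (q, q, e1 + [(q, None, s1), (f1, None, q)])

# e.g. the fragment for (0.1)* as in Figure 4.4:
frag = star(concat(atom('0'), atom('1')))
```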

4.4.2 Part 2: L(Reg. Ex.) ⊇ L(NFAs with ϵ edges)


Once again we consider an example NFA with ϵ edges and convert it to a regular expression. Now
we have Σ = {a, b} and the following NFA

start q0 q1
b
a a

a
q2
b
b
b
a a,b

start q3 q4 b

Our strategy to obtain a regular expression is to gradually reduce the 5 state automaton to one
with fewer states, and keep repeating this process until we end up with a regular expression. To
perform this reduction, we relax a condition on NFAs; we now allow edges to be labelled by regular
expressions. An edge labelled by a regular expression simply means a path in which we consume a
string satisfying the regular expression.

Algorithm

Step 1: Create a new starting state and get rid of all the old ones. Connect this new state to
the old starting states using ϵ edges.

q0 q1

ϵ b
a a

start qs a
q2
b
b
ϵ b
a a,b

q3 q4 b

Step 2: Create a new final state and get rid of the old ones. Connect all the old final states to the
new state using ϵ edges.

q0 q1

ϵ b ϵ
a a

start qs a qf
q2
b
b
ϵ b ϵ
a a,b

q3
a
q4 b

Observe that our NFA now has one single initial state with no edges leading back to it, and one
single accepting state with no edges going out of it.

Step 3: Systematically remove all the states except for the inital and accepting states.

Suppose we choose state q2 to be removed first. This state was facilitating some strings’ paths
from the initial to the accepting state. In order to remove q2 , we must first create alternate paths
not involving q2 that allow these strings to reach the final state. To do this, we follow the approach
below,

1. Choose an incoming transition of q2 . In our example, suppose we choose the transition from
q0 to q2 on consuming a.
2. Now consider all the outgoing edges of q2 , say from q2 to qx . Our goal is to facilitate transitions
from q0 to qx without involving q2 . This can be done by introducing direct transitions from q0
to qx and labelling them with the regular expressions formed by concatenating label(q0 − q2 )
and label(q2 − qx ), and then removing the incoming edge.

Applying step 2 to q2 ,

Observe that q2 has a self loop labelled a,b, which can be replaced by regular expression
a+b. This means that while concatenating label(q0 − q2 ) and label(q2 − qx ), we must also
account for the regular expression (a+b)∗ in between the labels.
(a) Outgoing edge: q2 to q0 on consuming b

a.(a+b)∗ .b

q0 q1
b
ϵ a a
ϵ
q2
a
start qs b
b b qf
ϵ a,b
a ϵ

q3
a
q4 b

Figure 4.7: (a)

(b) Outgoing edge: q2 to q1 on consuming a

a.(a+b)∗ .b a.(a+b)∗ .a

q0 q1
b
ϵ a a
ϵ
q2
a
start qs b
b b qf
ϵ a,b
a ϵ

q3
a
q4 b

Figure 4.8: (b)



(c) Outgoing edge: q2 to q3 on consuming a

a.(a+b)∗ .b a.(a+b)∗ .a

q0 q1
b
ϵ a a
ϵ
b q2
a
start qs a.(a+b)∗ .a
b b qf
ϵ a,b
a ϵ

q3
a
q4 b

Figure 4.9: (c)

(d) Outgoing edge: q2 to q4 on consuming a

a.(a+b)∗ .b
a.(a+b)∗ .b a.(a+b)∗ .a

q0 q1
b
ϵ a a
ϵ
b q2
a
start qs a.(a+b)∗ .a
b b qf
ϵ a,b
a ϵ

q3
a
q4 b

Figure 4.10: (d)

The NFAs corresponding to applying step 2 to each of the above transitions are illustrated in Figures 4.7-4.10 above.
Now that we have added the above four transitions, strings from q0 do not need q2 to reach
the final state. So we can remove the incoming edge from q0 to q2 (see Figure 4.11). Now we
repeat the same procedure for all incoming edges of q2 . After doing this, q2 has no incoming
edges and hence cannot be part of the path of any string. So it does not contribute to our
NFA and can be removed along with all its outgoing edges.

a.(a+b)∗ .b
a.(a+b)∗ .b a.(a+b)∗ .a

q0 q1
b
ϵ a
ϵ
b q2
a
start qs a.(a+b)∗ .a
b b qf
ϵ a,b
a ϵ

q3
a
q4 b

Figure 4.11: NFA after removing the incoming edge

We repeat these steps for each remaining non-initial non-accepting state of the NFA, and
finally obtain the required regular expression.

To see how exactly the NFA becomes a regular expression, we illustrate the pre-final step
for an arbitrary NFA.

e1

e2 e4
start q0 q1 q2

e3

Eliminating the intermediate state, we get

e1

e2 .e3∗ .e4
start q0 q2

Combining the two transitions,


So we obtain the regular expression e1 + e2 .e3∗ .e4 .

e1 + e2 .e3∗ .e4


start q0 q2

4.4.3 Checking Subsethood of Languages


We return to the example from the previous lecture, proving the equivalence of the regular expres-
sions (a∗ b∗ )∗ and (a+b)∗ . Now we can leverage Kleene’s Theorem and convert both expressions
into their equivalent DFAs, and then check if both DFAs accept the same language or not. To check
this, we can check each of the following conditions separately,

1. L(DFA1 ) ⊆ L(DFA2 )

2. L(DFA1 ) ⊇ L(DFA2 )

and then combine their results to get

L(DFA1 ) = L(DFA2 )

But how do we check if a language L1 is a subset of another language L2 from just their DFAs? We
use the following algorithmic procedure.

We say that L1 ⊆ L2 when every string in L1 is also present in L2 . So we can say

L1 ⊆ L2 ≡ L1 ∩ (Σ∗ \ L2 ) = ϕ

The language Σ∗\L2 is just the set of all strings not accepted by L2 . So to obtain its DFA, all we have
to do is invert the acceptance status of each state (accepting states become normal states and vice
versa). Also since L2 is a regular language, Σ∗\L2 (denoted by Lc2 or L¯2 ) is also a regular language.

Now given the DFAs of L1 and Lc2 , we wish to construct the DFA of L1 ∩ Lc2 . This new DFA
should accept only those strings that are accepted by both L1 and Lc2 .

We achieve this by running both DFAs simultaneously on the same string, and checking if we
end up in a pair of accepting states. This is analogous to taking the Cartesian product of both
transition functions. Consider the following example:

1 1
0 0,1
start q1 q2 start q3 q4
0 0,1

Then the DFA representing their intersection is given by


Since the initial condition we wanted to satisfy was L1 ∩ Lc2 = ϕ, we need a way of checking this from
the DFA. If the language of a DFA is ϕ, this means the DFA does not accept any string, meaning
there should be no path from an initial to an accepting state.

0 q2 , q3
11
0
q1 , q4 q2 , q4
1
00
q1 , q3
1

start

This completes the systematic approach to determine subsethood.
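To complete the decision procedure in code, here is a minimal sketch of the final emptiness check (a plain reachability search over the product DFA, with the same encoding assumptions as the earlier sketches):

```python
from collections import deque

def language_is_empty(delta, start, finals):
    """A DFA's language is empty iff no accepting state is
    reachable from the start state."""
    queue, seen = deque([start]), {start}
    while queue:
        q = queue.popleft()
        if q in finals:
            return False
        for (src, _sym), dst in delta.items():
            if src == q and dst not in seen:
                seen.add(dst)
                queue.append(dst)
    return True
```

Checking L1 ⊆ L2 then amounts to calling language_is_empty on the product of DFA1 with the complement of DFA2 .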


Chapter 5

DFA Minimisation

5.1 Minimum States in a DFA


So far, we have dealt with DFAs, NFAs without ϵ, NFAs with ϵ, and regular expressions. We have also seen that all of them are equivalent and inter-convertible and represent regular languages.
Consider the language which consists of all strings that end with a 1. The regular expression for this is (0 + 1)∗ 1. Here is a 2-state DFA for the same language:

0 1
1
start q0 q1
0

Figure 5.1: 2-state DFA for the language

The above DFA has two states q0 and q1 . q0 is reached when the last seen letter was 0 (also at the
start). q1 is reached when the last seen letter was 1 (also, it is an accepting state).
We can even construct a 4-state DFA for the same language. Here is an example:

0 1 0 1
1
1 0
start q1 q2 q3 q4
0

Figure 5.2: 4-state DFA for the language

The above DFA has four states q1 , q2 , q3 , q4 .


• q1 : Last letter 0 and no 1s so far

• q2 : Last letter 1 and no 101 seen so far

• q3 : Last letter 0 and more than one 1s seen so far


• q4 : Last letter 1
One can construct many more 4-state DFAs for the same language. Below is another example.

0 1 0
1
1 0
start q1 q2 q3 q4
0

Figure 5.3: Another 4-state DFA for the same language

A natural question that comes to mind is whether we can construct a different 2-state DFA for this language. One finds through trial and error that this is not possible.
Is a one-state DFA possible for this language? Let us assume that it is. Two cases arise. If that state is accepting, then the DFA also accepts ϵ, which is not possible. If that state is not accepting, then the language is empty. Either way, we have arrived at a contradiction. Hence, a one-state DFA is not possible for this language.
Therefore, for this language, the minimum number of states in any DFA is 2. We also observe that the number of such 2-state DFAs is 1. So, we will now claim that for every language the number of minimal DFAs is 1 and try to prove this. Before we prove this, we will introduce the notion of indistinguishability.

5.2 Indistinguishability
Two states qi and qj of a DFA are considered indistinguishable if for every w ∈ Σ∗ , when we start with qi , process w and reach qi′ , and start with qj , process w and reach qj′ , then either both qi′ ∈ F and qj′ ∈ F , or both qi′ ∉ F and qj′ ∉ F , where F is the set of all final states of the DFA. So, we are basically changing the start states and checking whether, through every string, we reach the same type of state or not.
This relation is denoted by ≡. It has the following properties:
• It is reflexive. It is clear to see that every state is indistinguishable to itself, as it will reach
a particular state on seeing w. (Due to the nature of a DFA)
• Also, it is clear to see that this relation is symmetric.
• This relation is also transitive. We can prove this by contradiction. Let us assume that (qi ≡ qj ) ∧ (qj ≡ qk ) but qi ̸≡ qk . Then there exists w such that qi′ ∈ F and qk′ ∉ F , where qi′ and qk′ are the states we reach from qi and qk respectively on seeing w. From the equivalence of qi and qj , we have qj′ ∈ F , but from the equivalence of qj and qk , we have qj′ ∉ F , where qj′ is the state that we reach from qj on seeing w. Hence, we have arrived at a contradiction. Therefore, this relation is transitive.
• A relation which is reflexive, symmetric and transitive is an equivalence relation. Thus the states of the DFA on which this relation is defined can be partitioned into equivalence classes.

In the above example (Figure 5.2), q1 and q3 belong to one equivalence class, and q2 and q4
belong to another. Let us now try to construct a 2-state DFA from this 4-state DFA
(Figure 5.2). We choose one element from each of the equivalence
classes, say q1 from the first class and q4 from the second class.

Figure 5.4: 2-state DFA constructed from the 4-state DFA — q1 loops on 0 and goes to q4 on 1; q4 loops on 1 and goes back to q1 on 0; q4 is accepting.

From q1 , if we see a 0, we land at q1 itself. If we see a 1, we would have landed at q2 , but since q2
and q4 are equivalent, we replace q2 by q4 . Similarly, from q4 , if we see a 1, we remain at q4 , but
if we see a 0, we would have landed at q3 , but since q1 and q3 are equivalent, we replace q3 by q1 .
Also, q2 and q4 were both accepting earlier; now q4 is the accepting state.
So, through these equivalence classes, we have minimized our 4-state DFA into a 2-state DFA, and
this 2-state DFA is structurally the same as the earlier one (Figure 5.1), again suggesting that
the claim that there is a single minimal DFA for every language might be correct.
Another interesting observation is that an accepting state can never be indistinguishable from a
non-accepting state: take w = ϵ, and the states reached from such a pair (the states themselves)
violate the definition of the indistinguishability relation. However, an initial state and a
non-initial state can belong to the same equivalence class. (For example, above, q1 and q3 belonged
to the same class.)
Now, several important questions arise. How can we find the equivalence classes of this relation?
When will we know that we cannot compress our DFA further (by compress, we mean
reducing the number of states of the DFA)? How can we prove our claim that the minimal DFA is
unique?
We will try to answer all these questions subsequently. Let us start with the easiest one.

5.3 Equivalence classes of Indistinguishability relation


Now, we will develop an algorithm to find the equivalence classes of this relation.
Firstly, recall our previous observation that an accepting state cannot be indistinguishable
from a non-accepting state. That is, qi ̸≡ qj for all qi ∈ F and qj ∈ (Q \ F), where Q is the
set of all states of the DFA.
Suppose we find two states qs and qt which are distinguishable. Then there exists a string w such
that, starting from qs, reading w ends in an accepting state qs′, while starting from qt, reading w
ends in a non-accepting state qt′.

Now, w can be decomposed into a1 a2 . . . an , where |w| = n.



Reading a1 takes qs to some state qs′′, and the rest a2 . . . an takes qs′′ to qs′; similarly, a1 takes qt to qt′′, and a2 . . . an takes qt′′ to qt′.

From the above runs, one can observe that qs′′ and qt′′ are also distinguishable, with w′ = a2 . . . an
being the string that distinguishes them.
Thus, more generally: if qi ̸≡ qj and there is a letter a ∈ Σ such that qs on seeing a lands at qi and
qt on seeing a lands at qj, then qs is distinguishable from qt, where qs and qt are two states of the
DFA. (We are basically prepending a to the distinguishing string: w = a · w′.)
Therefore, from one distinguishable pair, we have found another. This is the basis of our algorithm.
We initialize our set with all pairs where one state is accepting and the other is non-accepting. Then,
through the above step, we keep growing this set. (Note that this algorithm is not exponential:
there are only C(n, 2) = n(n−1)/2 pairs possible, and we do not need to re-check an already marked pair.)


We can record the marked pairs in a triangular table indexed by the states q1 , q2 , . . . , qn , placing a cross (X) in the cell of a pair once it is found to be distinguishable.

We stop the process when no more crosses can be inserted.
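The marking procedure can be written out directly. Below is a minimal Python sketch (the names and the dictionary encoding are our own, not from the lecture): the DFA is given as a transition table delta[state][letter], and the function returns exactly the crossed-out pairs of the table.

from itertools import combinations

def find_distinguishable_pairs(states, alphabet, delta, accepting):
    """Table-filling: return the set of all distinguishable pairs of states.
    states: list of state names; delta[state][letter] -> state."""
    # Initialization: every {accepting, non-accepting} pair is distinguishable.
    marked = {frozenset((p, q)) for p, q in combinations(states, 2)
              if (p in accepting) != (q in accepting)}
    changed = True
    while changed:  # stop when no more crosses can be inserted
        changed = False
        for p, q in combinations(states, 2):
            if frozenset((p, q)) in marked:
                continue
            # If some letter a takes (p, q) to an already-marked pair,
            # then (p, q) is distinguishable too (extend w' to w = a.w').
            if any(frozenset((delta[p][a], delta[q][a])) in marked
                   for a in alphabet):
                marked.add(frozenset((p, q)))
                changed = True
    return marked

On the 4-state DFA of Figure 5.2, this marks every {accepting, non-accepting} pair and leaves exactly {q1, q3} and {q2, q4} unmarked, matching the equivalence classes found above.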


It is now natural to ask whether our algorithm is correct, i.e., can we still find a pair of
distinguishable states which is not detected even after our algorithm finishes?
Proof: Let us assume that there are two states qs and qt which are distinguishable but not recognized
by our algorithm. By definition, there exists a string w such that qs leads to an accepting state on
seeing w while qt reaches a non-accepting state.


Now w can be written as a1 a2 . . . an , where |w| = n. Assume we reach qs′′ and qt′′ on seeing a1
from qs and qt respectively. Clearly, if our algorithm has not detected qs and qt as distinguishable,
then it cannot have detected qs′′ and qt′′ as a distinguishable pair either (because if it had, its very
next step would have been to mark qs and qt as distinguishable).


Proceeding inductively along w, let qs′′′ and qt′′′ be the states reached from qs and qt after reading
a1 a2 . . . an−1 . By the same argument, qs′′′ and qt′′′ would also remain undetected when our algorithm
finishes. But this is impossible: reading the last letter an takes qs′′′ to an accepting state and qt′′′ to
a non-accepting state, and our algorithm initializes its set with all {accepting, non-accepting} pairs
and then moves backwards, so it would have marked qs′′′ and qt′′′ as distinguishable in the first step itself.


We have arrived at a contradiction. Therefore, we can safely conclude that our algorithm terminates
and is also correct.
Once our algorithm has detected all pairs of distinguishable states, we can read off the equivalence
classes of the indistinguishability relation. (Then we can choose one representative from each class
and move forward with our proof of existence of a unique minimal DFA.)
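Continuing the earlier sketch, once the distinguishable pairs are known, each equivalence class can be collapsed onto one representative, exactly as was done by hand for Figure 5.2. This is a sketch only; it assumes unreachable states have already been removed and reuses find_distinguishable_pairs from above.

def minimize(states, alphabet, delta, accepting, start):
    """Merge indistinguishable states, keeping one representative per class."""
    marked = find_distinguishable_pairs(states, alphabet, delta, accepting)

    def indistinguishable(p, q):
        return p == q or frozenset((p, q)) not in marked

    states = sorted(states)  # fix an order so each class has a canonical rep
    rep = {q: next(p for p in states if indistinguishable(p, q)) for q in states}
    new_states = set(rep.values())
    # Transitions respect the classes, so redirect every edge to representatives.
    new_delta = {p: {a: rep[delta[p][a]] for a in alphabet} for p in new_states}
    new_accepting = {p for p in new_states if p in accepting}
    return new_states, new_delta, rep[start], new_accepting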
It is worth noting that distinguishability itself is not an equivalence relation: it is not reflexive
(no state is distinguishable from itself) and it is not transitive, though it is indeed symmetric.
Nevertheless, it proved very useful for finding these equivalence classes.

5.4 Further Analysis


We have developed an algorithm which can shrink the size of a given DFA. But is that the best we
can do? Can the size of the DFA be reduced further? Are there many DFAs which are minimal, or
just one unique one? The following sections will try to answer these questions.
Note: We assume that there are no redundant states in the final DFA obtained, i.e., states which
cannot be reached from the start state by any string. If there is one such state, we can always
remove it to get an equivalent DFA.

5.4.1 Optimality of Acquired DFA


Let’s say we have an automaton X, which is the reduced automaton obtained by picking one state
from each equivalence class of the set of states, and Y is an automaton obtained by some
other method for the same language.


We wish to show that Y has at least as many states as X.


Let’s define a new term, the "language of a state". Given a DFA A and a state s, we define L_s^A as the
set of all finite strings which end in an accepting state when we start from s. Readers should
be able to see that when we say two states are distinguishable, we mean that their languages are
different; similarly, saying that two states are indistinguishable asserts that their languages are
the same.

Say we started with the automaton D, and after termination the resultant automaton was A. Let S_A
denote the set of distinct languages of the states of A. Since every pair of distinct states of A is
distinguishable, |S_A| equals the number of states of A.
Claim: Given any DFA B, equivalent to A,

∀L ∈ S_A ∃S (L_S^B ≡ L), where S represents a state of B.

Proof: If L ∈ S_A , then by definition it must be the language of some state of A, say s. Since all the
states of A are reachable, let w be one string through which we can reach s from the start state of A.
Now run the same string w on the automaton B, and say we reach the state S. Then L ≡ L_s^A ≡ L_S^B,
because if some string x were present in the former but not in the latter, then the string w.x would be
accepted by A and not by B (and vice-versa). Thus L_S^B ≡ L.
So for every language in S_A , we must have at least one state in B. Since each state has exactly one
language, this gives us a lower bound of |S_A| on the number of states of B. A achieves this lower
bound, hence it is a minimum-state DFA representing the same regular language as D.

5.4.2 Uniqueness of Acquired DFA


So far we have discussed the optimality of the automaton A, which is the output of our algorithm.
We defined the important notion of the "language of a state" and proved a very important result,
which can be summarized as follows: given two DFAs A and B that both represent the same
regular language, for any string w, the states s_A and s_B reached in A and B by reading w have
the same language, i.e., L_{s_A}^A ≡ L_{s_B}^B. Now we will try to answer whether there are
multiple DFAs which are optimal, or just one.


Let’s consider two automatons: our DFA A, which has been proved to be optimal, and another
DFA B, which is equivalent to A and is also optimal (given). We will show that B has to be
isomorphic to A.

Since B is an optimal DFA, it cannot be shrunk further. That implies two things: it has no redundant
states, and no pair of its states is indistinguishable. Hence the languages of the states of B are
pairwise distinct. Since A and B are both optimal, both have the same number of states, all states
of both are reachable, and all states of both have unique languages (unique within a single DFA,
not collectively across both).
Consider any state s of A, take a string w which takes us to it, and run w on B; say we reach the state S.
Then s and S have the same language. Since the states of B are pairwise distinguishable, S does not depend on the choice of w.

This can be used to define an injection from the states of A to the states of B. It is injective
because the states of A (and of B) are pairwise distinguishable. Since the numbers of states of A
and B are finite and equal, the function is also surjective, and hence a bijection.

Figure 5.9: Two DFAs with corresponding states connected by dotted lines.

Claim: Under this function, the starting state of A is mapped to the starting state of B.
This is because the languages of these two states are exactly the languages of their respective DFAs,
which are the same (given).
Claim: An accepting state of A is mapped to an accepting state of B. By the definition of the
"language of a state", any state which has ϵ in its language is an accepting state, and an accepting
state has ϵ in its language. So whatever state of B an accepting state of A is mapped to, that state
must have ϵ in its language, making it an accepting state of B.
Claim: Consider two states S1^A and S2^A of A, mapped to S1^B and S2^B respectively. If there is
an α-labelled edge from S1^A to S2^A, then there must also be an α-labelled edge from S1^B to S2^B.
Consider any string w that takes us to S1^A; then w.α takes us to S2^A. Since S1^A is mapped to
S1^B and S2^A is mapped to S2^B, any string that takes us to a state in A takes us to its image
when run on B (why? proved above). Thus w takes us to S1^B and w.α takes us to S2^B. Because a
DFA is deterministic, there must be an α-labelled edge from S1^B to S2^B. (Food for thought: is
mentioning determinism important here?)
Isomorphism of labelled graphs:
An isomorphism is a vertex bijection which is both edge-preserving and label-preserving.
Isomorphism of DFAs:
An isomorphism is a vertex bijection which is both edge-preserving and label-preserving, where the
image of the starting state is the starting state and the image of an accepting state is an accepting state.
All three claims above show that A and B are the same automaton (up to isomorphism).

This tool can be used to check whether two different regular expressions represent the same language:
convert them into DFAs, then into minimal DFAs, and check whether the two minimal DFAs are
isomorphic. If they are isomorphic, the regular expressions are equivalent; otherwise they are not.
(Isomorphism ⇐⇒ equivalence.)

5.5 From states to words


Till now we were talking about languages of states and equivalence of two states. Now we will extend
this notion for words.

5.5.1 Language of word


Consider a language L. We define the language of a word w, written [w], as the set of all strings x such that w.x ∈ L.
Now we define a relation ∼L : if the languages of two words w1 and w2 are the same for L, then we say
that w1 is related to w2. This relation is reflexive, symmetric and transitive, since it is defined by
equality of sets: a set equals itself; if set 1 equals set 2 then set 2 equals set 1; and if set 1 equals
set 2 and set 2 equals set 3, then set 1 equals set 3.
Thus we have defined an equivalence relation over words, whose equivalence classes depend on
the language L.

5.5.2 Relation between states of the minimal DFA and equivalence classes of ∼L
If a string w ends up in a state s of the DFA, then the language of the state s is equal to [w].
This can easily be proved using the definitions of the "language of a state" and the "language of a
word": if a string w.x ∈ L, then running w.x on the DFA first reaches the state s via w, and then x
takes us to some accepting state; hence x ∈ L_s^{DFA}. Conversely, if x ∈ L_s^{DFA}, then w.x ends in an
accepting state because w ends in state s; hence w.x ∈ L.
A particular equivalence class represents a particular language, and a state represents a particular
language. Since in a minimal DFA all states represent distinct languages, we can define a bijection
from the equivalence classes to the states of the minimal DFA, where an equivalence class is mapped
to the state that represents the same language as (any string in) that class.

5.6 Setting up the Parallel


We defined the Nerode equivalence relation on a language L over the alphabet Σ. This relation
states that ∀w1 , w2 ∈ Σ∗ , w1 ∼L w2 iff ∀x ∈ Σ∗ , (w1 · x ∈ L ⇐⇒ w2 · x ∈ L)

If the language L is regular, then a minimized DFA A = (Q, Σ, δ, q0, F ) can also be defined for the
language L.

For this language L, w1 and w2 are two words in Σ∗ such that w1 ∼L w2 . Let qi and qj be two
states ∈ Q such that

qi → state reached in A after reading w1


qj → state reached in A after reading w2

Since w1 ∼L w2 , it follows that for every x ∈ Σ∗ , the state reached in DFA A after reading w1 · x
and the state reached after reading w2 · x either both belong to F or both belong to Q \ F .
Otherwise, one of w1 · x and w2 · x would be in L while the other would not, contradicting
the definition of the equivalence relation.

Therefore, by the definition of indistinguishability, states qi and qj can be concluded to be
indistinguishable. However, in a minimized DFA, all pairs of distinct states are distinguishable.
Consequently, the only states of A that can be indistinguishable are states compared with themselves,
so qi and qj must be the same state.

This demonstrates that if two words/strings belong to the same equivalence class with respect to L,
then both strings will end up in the same state q ∈ Q in the minimized DFA.

Till now, we have defined two equivalence relations: ∼L over words with respect to the language L,
and ≡ over the states of a DFA characterizing L. We now aim to relate the numbers of equivalence
classes of these two relations.

5.6.1 Can | ∼L | > | ≡ | ?


NO. We will in fact prove below that | ∼L | ≤ | ≡ |

Let there be strings w1 ,w2 s.t.

w1 ∈ ith equivalence class of ∼L


w2 ∈ j th equivalence class of ∼L
where i ̸= j

Since they belong to different equivalence classes, we can say ∃ x ∈ Σ∗ s.t. w1 · x ∈ L and w2 · x ∉ L
(or vice-versa; it does not matter).

Reading w1 from the start state q0 leads to some state q1 , and then x leads to q3 ; reading w2 from q0 leads to q2 , and then x leads to q4 ; exactly one of q3 , q4 is accepting.

• Here, q1 represents the equivalence class of w1 while q2 represents the equivalence class of w2.

• By the definition of indistinguishability, q1 and q2 are distinguishable states, as there exists a
string (namely x) which leads them to a pair of states of which exactly one is accepting.

• Doing this for every pair of equivalence classes, we find that the states representing the distinct
equivalence classes are all pairwise distinguishable, leading us to the following relation:
| ∼L | ≤ | ≡ |
i.e., the number of indistinguishability equivalence classes is at least the number of Nerode
equivalence classes.

5.6.2 Can | ∼L | < | ≡ | ?


In order to inspect this, we are going to construct a DFA using the relation ∼L ⊆ Σ∗ × Σ∗ , which
will then accept the language L. Here, let Σ = {0, 1}.
• A state denoting [ϵ] is created and marked as the start state.

• Then we pick a letter from the alphabet and consider the transitions out of each existing state.
If the extended word lies in the equivalence class of one of the existing states, we simply draw the
arrow representing that transition; otherwise, a new state is created representing the equivalence
class of the newly formed word. For instance, from [ϵ] the letter 0 leads to a state [0]; if 1 were
to lie in [0], the 1-transition from [ϵ] would also go to [0], but here 1 does not lie in the
equivalence class of 0, so a new state [1] is created.
• By repeatedly applying the previous point, the automaton keeps expanding, producing states such
as [ϵ], [0], [1], [01], [010], and so on.

• However, this construction is guaranteed to converge because we have already proved that
| ∼L | ≤ | ≡ |. The number of equivalence classes of the indistinguishability relation equals
the number of states in the minimal DFA representing the language L; since that number is finite,
| ≡ | is finite, and given | ∼L | ≤ | ≡ |, it follows that | ∼L | is finite as well. Consequently, the
construction converges. (If it did not converge, new states would keep forming, contradicting the
identity | ∼L | ≤ | ≡ |.)

• Finally, all those states whose equivalence classes consist of words belonging to the language L
are marked as accepting states.

Hence, the aforementioned algorithm allows us to construct a finite-state automaton (DFA) which
accepts exactly the words of the language L. Note that the number of states in the new automaton
equals | ∼L |.
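A sketch of this construction in code, assuming we are handed two hypothetical oracles: in_lang(w) deciding w ∈ L, and same_class(w1, w2) deciding w1 ∼L w2. For a regular L both are computable, and the loop below terminates precisely when ∼L has finitely many classes.

def nerode_dfa(alphabet, same_class, in_lang):
    """Build a DFA whose states are representative words of the ~L classes."""
    reps = [""]          # epsilon represents the start state's class
    delta = {}
    i = 0
    while i < len(reps):
        w = reps[i]
        for a in alphabet:
            wa = w + a
            # Reuse an existing state if wa lies in a known class...
            for r in reps:
                if same_class(wa, r):
                    delta[(w, a)] = r
                    break
            else:
                # ...otherwise create a new state for the class [wa].
                reps.append(wa)
                delta[(w, a)] = wa
        i += 1
    accepting = {r for r in reps if in_lang(r)}
    return reps, delta, "", accepting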
Now, let us assume that | ∼L | < | ≡ | holds. Then our newly constructed automaton would be a DFA
for L with fewer states than the minimal DFA (which has | ≡ | states). However, in the last lecture
we proved that the DFA constructed via the ≡ relation is the minimal one, and that it is unique.
This leads to a contradiction, making our assumption wrong.

Thus we have proved that

| ∼L | = | ≡ |
Note: The Nerode equivalence relation, unlike the indistinguishability relation, can be defined for
any language L ⊆ Σ∗ , irrespective of whether the language is regular or not.

5.7 Myhill-Nerode Theorem


L is regular if and only if ∼L has a finite number of equivalence classes.

This theorem provides an exact characterization of regular languages, unlike the Pumping Lemma:
while the Pumping Lemma holding for L does not guarantee that L is regular, if ∼L has a finite
number of equivalence classes then L must be regular. The proof follows from the constructions
discussed above.
Chapter 6

Pumping Lemma for Regular Languages

If L is a regular language, then there exists a finite DFA with a minimum number of states
(say p) that recognizes it. Let’s consider a string w of length at least p that is accepted by this DFA.
By the Pigeonhole Principle, we can deduce that when the DFA processes the string, it must revisit
at least one state, thereby implying the existence of a loop.

6.1 Example
Consider a string w ∈ L, where L is a regular language. Suppose the first state of the DFA which is
revisited while processing w is q1 . Write w = x.y.z, where x takes the DFA from the start state q0
to q1 , y is the (non-empty) loop that brings the DFA back to q1 , and z takes q1 to an accepting
state q3 .

Hence for any i ≥ 0, x.y^i.z ∈ L


The length of the substring xy cannot exceed the size of the DFA. Hence,
|xy| ≤ n
where n is the number of states of the DFA.
Also |y| ≥ 1, as y corresponds to going around the loop at least once.

6.2 Formal Statement of Pumping Lemma


For any regular language L, the following holds:
∃ p > 0 ∀w ((w ∈ L ∧ |w| ≥ p) ⇒ ∃ x, y, z (w = x.y.z ∧ |xy| ≤ p ∧ |y| ≥ 1 ∧ ∀n ≥ 0, x.y^n.z ∈ L))
Negation:
∀ p > 0 ∃ w ((w ∈ L ∧ |w| ≥ p) ∧ ∀ x, y, z ((w = x.y.z ∧ |xy| ≤ p ∧ |y| ≥ 1) ⇒ ∃ n ≥ 0, x.y^n.z ∉ L))
If this formula (the negation of the lemma’s conclusion) holds, then L is not a regular language.
If the above formula holds true then L is not a regular language.


• If a language L is regular then the Pumping Lemma holds for L; but if the Pumping Lemma holds
for a language L, it does not necessarily mean that L is regular.

• However, if the Pumping Lemma does not hold for a language L, then L is not regular.

6.3 Pumping Lemma as Adversarial Game


The game between an adversary (who wants to show that the language L is not regular) and a believer
(who believes that the language is regular) proceeds as follows:

• The believer chooses an integer p > 0 and claims this is the count of states in the DFA that
she believes recognizes the language.

• Adversary chooses a string w ∈ L such that |w| ≥ p.

• Believer then splits w into three parts w = xyz, where |xy| ≤ p and |y| > 0

• Adversary now chooses an integer n ≥ 0 such that x.y^n.z ∉ L, thereby winning the game. If the
adversary cannot do this, they lose.

6.4 If a language L is regular and the number of states in the DFA for L is n, is L infinite?
Given the DFA for the language L, if there is a path from the initial state to a state lying on a
cycle, and from that state a path to an accepting state, then L is infinite, because the DFA can
loop within this cycle any number of times, generating an infinite number of accepted strings.
Suppose instead that only a black box is provided, which can determine whether a given string w is in L or not.

An oracle for membership in L: given a string w, it answers ‘yes’ or ‘no’ to the question w ∈ L.

Test all strings w with n < |w| ≤ 2n for membership in L. If we find even one string for which the
oracle answers ‘yes’, then L is infinite (the pumping lemma applies to such a string, and it can be
pumped up to show that an infinite sequence of strings is in L). If all strings w with n < |w| ≤ 2n
return ‘no’, then we claim that there are no strings in L of length more than 2n, and thus there are
only finitely many strings in L.
Proof: Suppose, for contradiction, that some string of length greater than 2n is in L. Applying the
pumping lemma to it repeatedly — removing the loop y each time decreases the length by at least 1
and by at most n — we eventually obtain a string of length between n and 2n which also belongs to L.
This contradicts our finding that no string of such length is in L.
Hence L is infinite if and only if it contains at least one string of length at least n + 1.
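As a sketch, here is that test in code, with a hypothetical membership oracle in_lang(w) and n the number of states of the DFA. It is exponential in n (it enumerates all strings of the relevant lengths), but it mirrors the argument exactly.

from itertools import product

def is_infinite(in_lang, alphabet, n):
    """L is infinite iff it contains some string of length in (n, 2n]."""
    for length in range(n + 1, 2 * n + 1):
        for letters in product(alphabet, repeat=length):
            if in_lang("".join(letters)):
                return True   # such a witness can be pumped up forever
    return False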

6.5 Example
Consider the language L = {(0 + 1)^* 0 1^k | k is prime} with alphabet Σ = {0, 1}.
The application of the Pumping Lemma is independent of where in the string we start, as long as the
pumped portion has length ≥ p (the pumping length). In the case of this language, the Pumping Lemma
can indeed be applied to the suffix 1^k , where k is any prime number.
Now we can break w = 1^k into w = xyz such that y ̸= ϵ and |xy| ≤ n. Let |y| = m (m > 0); then
|xz| = k − m.
Consider the string a = x.y^{k−m}.z. Then

|a| = |xz| + (k − m)|y| = (k − m)(1 + m)

As the length of a is not prime — it has the factors m + 1 and k − m, both greater than 1 since
m ≥ 1 and, for a prime k chosen larger than n + 1, k − m ≥ k − n ≥ 2 — we have a ∉ L.
Hence the language L is not regular.
Reverse Language
If a language L is regular, then its reverse language L^R is also regular.
To construct an automaton for L^R from one for L, swap the initial and final states and reverse all
the edges (in general this yields an NFA, since there may be several final states).
Chapter 7

Pushdown Automata

7.1 Pushdown Automata for non-regular Languages


Thus far, the finite automata we have studied cannot be used to represent non-regular languages.
One such example of a non-regular language is the language of balanced parentheses; let us call it L.

Proving why L is not a regular language: It is known that the intersection of two regular
languages is regular. So if L were regular, then L ∩ (^* )^* — i.e., the set of balanced strings of
the form (^n )^n — would also be regular. But applying the Pumping Lemma shows that
{(^n )^n | n ≥ 0} is not regular. It can thus be concluded that the language of balanced
parentheses is not a regular language either.

So if we equip the finite-state automaton with more structure, such as a stack, it will be able to
accept some non-regular languages too.

Figure 7.1: An automaton equipped with stacks — a finite-state skeleton whose transitions (e.g. 0, 0/01) read an input letter, pop the top stack symbol, and push a string in its place.


7.2 Description
PushDown Automata (PDA) are essentially NFAs equipped with a stack, maintained over an alphabet
that may differ from the alphabet used for state transitions. PDAs have greater accepting power than
regular NFAs and can accept non-regular languages as well.

7.2.1 Conventions
Just like we described an NFA as (Q, Σ, q0 , δ, F ), we describe a PDA as characterized by

(Q, Σ, Γ, q0 , Z0 , δ, F )

where-

• Q is the set of all states in the PDA

• Σ is the alphabet accepted by the PDA

• Γ is the set of symbols that are pushed/popped from the stack associated with the PDA

• q0 is the starting state of the PDA. There may be more than one starting state for the PDA.

• Z0 is the starting symbol in the stack. This is so that a symbol can be popped on the first
transition.

• δ is the transition function for the PDA. While the transition function for a regular NFA was
of the form δ : Q × (Σ ∪ {ϵ}) → 2^Q , the transition function for PDAs is characterized by
δ : Q × (Σ ∪ {ϵ}) × Γ → 2^{Q × Γ*} , where Γ* is the set of strings over the stack alphabet Γ.

• F is the set of final states of the PDA

A single transition of a PDA from state q1 to q2 is represented by x, a/ba, where x ∈ Σ is the input
symbol read (it may be ϵ), a is the top element of the stack and is popped off (it cannot be ϵ), and ba
is the string pushed onto the stack after a is popped off; the top element of the stack is b after this
transition takes place.
This places a restriction on PDAs as compared to NFAs: for any transition to occur, the top element
of the stack also has to match the top element specified in the transition.

7.2.2 Example
Let us characterize the non-regular language {0^n 1^n | n ≥ 0} through a PDA with a finite number of
states, as shown in Figure 7.2. We define Γ = {z0 , a} as our stack alphabet, with the PDA starting
with only z0 on the stack; a serves as a counter for the number of zeros in our string.
Notice that if a string deviates from the form 0^m 1^n , the automaton goes to the trap state. For every
0 in the string, an a is pushed onto the stack, hence l(S) = m + 1, where l(S) is the length of the
stack. For strings following the given structure: if m = n, the string ends at an accepting state; if
m > n, the string goes to trap; and if m < n, the string ends its run at a non-accepting state.
We reiterate that a string is accepted if and only if the run does not halt before the string is
completed and the final state is an accepting state.

Figure 7.2: Automaton accepting {0^n 1^n | n ≥ 0}

7.2.3 Example
Now let us extend our previous automaton to accept the language {0^n 1^m | n ≥ m ≥ 1}, as shown in
Figure 7.3. Herein we introduce a different notion of string acceptance: if the stack becomes empty
upon completion of the run, we say that the string is accepted. This is called acceptance through
empty stack.

7.3 Acceptance through empty stack


We introduce a new way of characterizing string acceptance in PushDown Automata wherein there
is no need to specify accepting states explicitly. Here, a string is accepted if it satisfies the
following constraints on at least one possible run of the PDA:

1. The run should not halt partway through the string: whenever input symbols remain, there must
exist at least one entry in the transition function δ corresponding to a possible current state,
the next character, and a possible stack top.

2. Upon completion of the run, the stack should either be empty, or be emptiable through a series of
ϵ-transitions, for the string to be accepted.

3. There are no constraints on the final state: the string can be accepted through any state q ∈ Q.

This gives a new method of acceptance. We represent the language accepted by this method on an
automaton A as N (A), and the language accepted by our original method as L(A).
However, we do not yet know whether the two methods of acceptance are equally powerful. We shall
prove this claim by showing the equivalence of the two methods over PDAs, transforming one kind
into the other and vice versa.

Figure 7.3: Automaton accepting {0^n 1^m | n ≥ m ≥ 0}

7.3.1 For any automaton A, L(A) is not necessarily equivalent to N (A)

We must observe a subtlety: if a language L is accepted through PDAs by both methods, the
automata that accept it need not be the same. This is easily shown through the previous two
examples.

• In Figure 7.2, L(A) = {0^n 1^n | n ≥ 0} but N (A) = ∅, as the stack never becomes empty.

• In Figure 7.3, N (A) = {0^n 1^m | n ≥ m ≥ 1} but L(A) = ∅, as there are no accepting states.

So we want to prove that:

1. If L(A1 ) is the language accepted by a PDA A1 by final state, there exists another PDA A2 such
that L(A1 ) = N (A2 ).

2. If N (A1 ) is the language accepted by a PDA A1 by empty stack, there exists another PDA A3 such
that N (A1 ) = L(A3 ).

We will show the existence of both by giving constructions which, when applied to A1 , give us A2
and A3 respectively.

7.3.2 Construction for A2 such that L(A1 ) = N (A2 )


We need to modify A1 into A2 without making any assumptions about its structure except that it is
a PDA, such that two conditions hold for the equivalence of the languages:

1. Rejection — N (A2 ) should not accept any string s that is not a part of L(A1 ).

2. Acceptance — if a string s is a part of L(A1 ), it must also be a part of N (A2 ).

We shall assume A1 has only one starting state; if not, we can simply add extra ϵ-transitions to
reach each original start state from a single new start state.

Before this start state, we prepend a new start state s0 and have our stack start with x0 ∉ Γ ∪ {z0 };
we connect s0 to the original start state with an ϵ-edge pushing z0 x0 onto the stack. This is
absolutely identical to the original stack except for an extra x0 at the bottom.
We then convert A1 to A2 without modifying the basic structure of A1 : from every accepting state of
A1 we add ϵ-transitions to a new final state f′, which does not consume any input and just pops the
stack (x0 or z0 or γ ∈ Γ) without pushing anything onto it.

Acceptance
If string s is accepted by A1 , it will reach some f ∈ F on at least one of its possible runs on A1 .
If the stack has at least one element upon reaching f , the run can go to f′, where the stack gets
emptied through ϵ-transitions from f′ to itself, thus putting s in N (A2 ) as well.
If the stack became empty when the string s ran on A1 , then in A2 we now have x0 as the only element
left on the stack (as no transition of the original automaton can pop x0 , because x0 ∉ Γ ∪ {z0 }).
Our automaton proceeds to f′ by popping x0 , and s is accepted as the stack is emptied.

Hence, acceptance is proved, as any string in L(A1 ) is also present in N (A2 ).

Rejection
A run of string s is rejected by A1 if:
1. the run completes successfully but does not end in an accepting state;

2. the run does not complete successfully because of an empty stack; or

3. the run does not complete successfully because of no valid transitions.


We show that our modifications correctly reject all such cases.
If the run does not complete on A1 due to an empty stack, then upon running on A2 , the stack will
only have x0 , and as x0 ∉ Γ ∪ {z0 }, there is no transition of the original automaton with x0 on top,
leading to the run halting and the string being rejected.
If the run does not complete on A1 due to no valid transition, there will not be any valid transition
in A2 either, because the stack is the same except for the extra x0 at the bottom.

If the run completes successfully but the string does not end in an accepting state, then the stack
never becomes empty, because x0 is always present (it is only popped on the way to f′).

Hence, rejection is also proved, as no string rejected by A1 is accepted by N (A2 ).

7.4 From Empty Stack PDA to Final State PDA


We will now see how to construct a PDA with final-state acceptance from a PDA with empty-stack
acceptance. Consider the PDA PN = (Q, Σ, Γ, δN , q0 , Z0 ), and let N (PN ) be the set of strings
that the PDA accepts by empty stack.

Figure 7.4: PN — a PDA with empty-stack acceptance and start state q0

7.4.1 Construction
Now consider a construction on PN that involves:

• addition of a new stack symbol X0 ∉ Γ as the bottom of the stack;

• addition of a new start state p0 , and an ϵ-transition from p0 (on reading X0 on the stack) to q0
that leaves Z0 on top of X0 — i.e., an edge p0 → q0 labelled ϵ, X0 /Z0 X0 ;

• introduction of a new final state pf , and ϵ-transitions from all states in Q such that, on reading
X0 on the stack, the transition happens to state pf with an empty stack.

After the construction, the new automaton is PF = (Q ∪ {p0 , pf }, Σ, Γ ∪ {X0 }, δF , p0 , X0 , {pf }),
whose acceptance is by final state.

Figure 7.5: PF (after the construction) — p0 leads to q0 via ϵ, X0 /Z0 X0 , and every state of Q has an ϵ, X0 /ϵ edge to pf

7.4.2 Required to show: N (PN ) = L(PF )

• First, let us show that if a string w is accepted by PN , then it is also present in L(PF ).

If w is accepted by PN , then the stack in PN must have been emptied after consuming w, which
implies that the stack in PF now has X0 at its top. Now, taking the ϵ-transition on reading X0
will lead us to pf , which is an accepting state of PF . Hence w is also present in L(PF ).

• Now, let us show that if w is in L(PF ), then it is also present in N (PN ).

In the PDA PF , the only transitions that can be used to reach the state pf are ϵ-transitions
on reading X0 . Since w is present in L(PF ), its run must have taken one of these transitions and
read X0 at the top of the stack, which implies that the stack in PN must have been emptied by
w. Hence, w is also accepted by PN .

Finally, one can say that

Acceptance of PDAs using final states ≡ acceptance of PDAs using empty stack,
i.e., for every PDA A whose acceptance is by final states there exists a PDA A′ whose acceptance is
by empty stack, and vice-versa.

7.5 Context Free Languages


Languages accepted by PDAs are called Context-Free Languages. They form a strict superset of the
regular languages.

7.5.1 DPDA and NPDAs


A PDA is considered deterministic if, from a given state, upon reading a given input symbol with a
specific symbol on top of the stack, there exists at most one possible transition; a PDA is considered
non-deterministic if several transitions are possible for some such combination.

Every language that can be represented using a DPDA can also be represented using an NPDA, i.e.,
DPDA ⊆ NPDA
But there are certain languages which can be represented only by NPDAs and not by DPDAs. Here is
an example:
L = {w.w^R | w ∈ (0 + 1)^*}
If we try to represent L using a DPDA by pushing a copy of each input symbol onto the stack, one has
to guess every time whether the input has reached the end of the string w, so that popping can
begin to check the palindrome nature of the string. Hence non-determinism has to be involved to
represent L, and therefore
DPDA ⊂ NPDA (strictly)

7.5.2 Build-up and emptying of the stack of a PDA

Figure 7.6: Behaviour of the stack as the input is processed

In the figure, if we observe the parts of the run where the substrings x_A . . . x_B and x_C . . . x_D
are read, the behaviour there is independent of the content of the stack before x_A ; it depends only
on the state and the top of the stack when the symbol x_A is read.
Hence, a context-free language can be fragmented into a finite number of sets (these sets need not
themselves be finite), each defined by a pair (top of stack, state of automaton).

7.6 Acceptance by PDA


Let us begin by taking the example of the following PDA, with Γ = {Z0 , X} as the stack alphabet
and Z0 as the initial stack symbol.
The language accepted by this automaton by empty stack is N (A) = {0^n 1^n | n ≥ 1}.

Figure 7.7: PDA A — transitions: q0 goes to q1 on 0, Z0 /X; q1 loops on 0, X/XX; q1 goes to q2 on 1, X/ϵ; q2 loops on 1, X/ϵ
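Before tracing a run by hand, here is a small simulator for this PDA under empty-stack acceptance. This is a sketch: the encoding of transitions is our own, and it ignores ϵ-moves, which PDA A does not use.

def accepts_by_empty_stack(delta, start, z0, word):
    """delta: (state, letter, stack_top) -> set of (next_state, push_string),
    where push strings are written top-first, as in the x, a/ba notation."""
    configs = {(start, (z0,))}               # (state, stack), top of stack first
    for letter in word:
        next_configs = set()
        for state, stack in configs:
            if not stack:
                continue                     # run halted on an empty stack
            for nstate, push in delta.get((state, letter, stack[0]), ()):
                next_configs.add((nstate, tuple(push) + stack[1:]))
        configs = next_configs
    return any(stack == () for _, stack in configs)

# PDA A of Figure 7.7, with N(A) = {0^n 1^n | n >= 1}:
delta = {
    ("q0", "0", "Z0"): {("q1", "X")},
    ("q1", "0", "X"):  {("q1", "XX")},
    ("q1", "1", "X"):  {("q2", "")},
    ("q2", "1", "X"):  {("q2", "")},
}
assert accepts_by_empty_stack(delta, "q0", "Z0", "0011")        # the run traced below
assert not accepts_by_empty_stack(delta, "q0", "Z0", "0101")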

7.6.1 Run on the PDA


Let us have a look at the run of the string 0011 on this PDA

(q0 , Z0) —0→ (q1 , X) —0→ (q1 , XX) —1→ (q2 , X) —1→ (q2 , ϵ)

Figure 7.8: Run on 0011, showing the state and stack contents after each input symbol

We can see how the run on 0011 starts in state q0 with Z0 initially on top of the stack and ends with
an empty stack. Let us define a mathematical notation for the languages such runs give rise to.

L_{qi Z0 qj} = the set of all strings w such that, starting at state qi with Z0 on top of the stack,
processing w ends at state qj with Z0 popped off the stack and the remaining stack unaltered.

Note: the stack may have risen or fallen during the transitions, but finally only Z0 is popped
off the stack and the remaining stack is unaltered.

For PDA A we can say:

• 011 ∈ L_{q1 X q2}
• 1 ∈ L_{q2 X q2}
• 0011 ∈ L_{q0 Z0 q2}
• and many more ...
Now let us express, in terms of these languages, the language N (A) of PDA A — strings that start at
state q0 with an initial stack containing Z0 and empty it:
N (A) = L_{q0 Z0 q0} ∪ L_{q0 Z0 q1} ∪ L_{q0 Z0 q2}

We took the union of the languages that start at q0 , empty the stack, and end at any of the states
present in PDA A.

7.6.2 Recurrence Relations on Languages


Let us take a look at the individual transitions of PDA A, starting with the self-loop 0, X/XX on q1 (Figure 7.9: self loop on q1 ).

Now what can we say about the language L_{q1 X q2} ?

If we consider a string w whose first letter is 0, then this transition is taken and we have added
another X to the stack. The rest of w now has to pop two X’s from the stack and arrive at state q2
for w to belong to L_{q1 X q2} . This can be written as

L_{q1 X q2} ⊇ 0.L_{q1 X q0}.L_{q0 X q2} ∪ 0.L_{q1 X q1}.L_{q1 X q2} ∪ 0.L_{q1 X q2}.L_{q2 X q2}

Now if we consider the transition 1, X/ϵ from q1 to q2 (Figure 7.10), we can say
L_{q1 X q2} ⊇ 1
We can keep doing this for all transitions out of q1 and obtain a superset recurrence relation for L_{q1 X q2} .

7.7 More on recurrence relations


For a general PDA with n states and k stack symbols, n^2 k languages of the form L_{qi Sj qk}
can be defined, and by considering the individual transitions, superset recurrence relations can be
obtained for them. For example:
L_1 ⊇ 1 ∪ 0.L_2.L_3 ∪ . . .
⋮
L_{n^2 k} ⊇ 0.1 ∪ 0.L_1.L_3 ∪ . . .


7.7.1 Smallest and Largest Languages


Consider the following relation:
L_1 ⊇ 0.1 ∪ 0.L_1.1
Notice that Σ* and {0^n 1^n | n ≥ 1} are both languages satisfying this relation, but Σ* is the
largest such language while {0^n 1^n | n ≥ 1} is the smallest (it can be proven smallest by
induction: 01 must belong to L_1 , and if 0^i 1^i belongs to L_1 then so does 0^{i+1} 1^{i+1} ).

We will be particularly interested in finding these smallest languages satisfying the relations.

NOTE: Σ* is always a (largest) solution of the superset recurrence relations, since it contains all
strings.

7.7.2 Context Free Grammar


The above relations imply that wherever we see a word of the language L_{q1 X q2} , we can replace it
with any word from the languages on the RHS. Thus we can write:

L_1 → 0L_2 L_3 | 0L_4 L_1 | 0L_1 L_5 | 1

where L_1 = L_{q1 X q2} , L_2 = L_{q1 X q0} , L_3 = L_{q0 X q2} , L_4 = L_{q1 X q1} and L_5 = L_{q2 X q2} .
Similarly, we can do the same for the languages L_i for all i ∈ {1, 2, . . . , n^2 k}. This forms a
context-free grammar.

With proper reductions, we can say that the context-free grammar generated for the language
L = {0^n 1^n | n ≥ 1} is:

S → 0S1 | 01
We observe that the universal language Σ* = L((0 + 1)^*) also satisfies the corresponding superset
relations, but the language L = {0^n 1^n | n ≥ 1} is the minimal language that satisfies them.
Chapter 8

Context Free Grammar

8.1 Context-free Grammar

Definition: A context-free grammar (CFG) is a formal grammar whose production rules can
be applied to a nonterminal symbol regardless of its context. In particular, in a context-free
grammar, each production rule is of the form A → α, where A ∈ V and α ∈ (V ∪ T )∗ ; here V is the
set of non-terminals and T is the set of terminals.
Formally, a context-free grammar can be represented as follows:
G = (V, Σ ∪ {ϵ}, P, S)
where
V is the set of non-terminals — symbols that can be replaced or expanded;
Σ ∪ {ϵ} is the set of terminals — symbols that cannot be replaced or expanded further;
P is the set of production rules;
S is the start symbol (S ∈ V ).

Again, a grammar is a context-free grammar only if every production has a single non-terminal on its
left-hand side, i.e., is of the form A → α with A ∈ V and α ∈ (V ∪ T )∗.

Let’s consider a context-free grammar defined as follows:

S → A.S | ϵ
A → A1 | 0A1 | 01
Here, S and A are non-terminal symbols representing languages; ϵ is a special symbol representing
the empty string, which is in the language of S; and the symbols 0 and 1 are terminal symbols
belonging to the language of A. The set of strings that can be generated using a context-free grammar
is called a context-free language. The language of A contains the strings obtainable from the forms
01, A1 and 0A1 by recursively substituting all possible strings of A:
A ⇒ 01 | 011 | 0011 | 0111 | 00111 | ...
Thus, the language derived from A can intuitively be described as
0^i 1^j , where j ≥ i ≥ 1


8.1.1 PDA to CFG


In the previous lecture, we talked about the language accepted by a Pushdown Automaton (PDA)
starting from a state qi with a symbol X on top of the stack and reaching a state qj , with the
stack remaining the same except that the top element, X, has been popped off.
Such a language is denoted by L_{qi X qj} , where qi is the start state, X is the top element of the
stack, and qj is the final state. Given certain transitions in a PDA, we can also come up with
recurrence relations for such languages. For example, suppose the transition q1 → q2 labelled
0, X/XY is given to us and we want to find L_{q1 X q2} .

We can say that, for all states qi in the automaton,

L_{q1 X q2} ⊇ 0 · L_{q2 X qi} · L_{qi Y q2}

If we denote the languages by L_1 , L_2 , . . . , L_{n^2 k} , where n is the number of states and k is
the number of stack symbols, then we can write such a recurrence relation for every language, e.g.:
L_1 → 1 | 0 · L_2 · L_3 | 0 · L_4 · L_1 | . . .
L_2 → . . .
⋮
L_{n^2 k} → . . .
This set of recurrence relations is called a context-free grammar. Hence, given a PDA, we can write
the language accepted by the PDA by empty stack as a CFG.

8.2 Parse Trees and Ambiguous Grammars


Suppose we have the string 01101 and we want to check whether it lies in L_S .
First, we apply the rule S → A · S, since the rule S → ε is not useful yet. Now we recursively
examine A · S. For this S, S → ε is not useful either, since 01101 does not lie in L_A . In this
manner, we recursively break up our string until we reach terminals, checking whether the sub-parts
can be made useful. In this manner, we can create a parse/derivation tree for a string, given a
grammar. The leaves in these trees are our terminals, as shown in the diagram below.

[Parse tree for 01101: the root S expands by S → A·S; the left A derives 011 via A → A1 with the inner A → 01; the right S expands by S → A·S with A → 01 and S → ε.]

As we can see, the leaves read 01101ε from left to right, which is exactly the string we wanted.
Hence, 01101 ∈ L_S , using the production rules in the grammar.
While drawing the trees, we must ensure that the root is the start symbol, and at any node where we
split the tree, we use a production rule with that node’s symbol on the left-hand side. For
example, to split the tree at an A, we must only use production rules of the form A → . . . .

For this particular string and grammar, only one parse tree is possible. Let us now consider the
string 00111 ∈ L_S .

[Two distinct parse trees for 00111: in both, S → A·S with S → ε; in one, A → A1 with the inner A → 0A1 → 0(01)1, while in the other, A → 0A1 with the inner A → A1 → (01)1.]

As we can see, these are two completely different, and correct, parse trees for the same string in
the same grammar. Grammars in which some string has multiple parse trees are called ambiguous
grammars. Hence, the grammar we have defined is indeed ambiguous.

Are there CFLs for which every CFG representing them is ambiguous? Yes, there indeed are such CFLs;
such languages are called inherently ambiguous languages. If there exists even one unambiguous
grammar for the CFL, then the language is an unambiguous language.
Is L_S inherently ambiguous, or is it an unambiguous language?

The answer is that it is not inherently ambiguous. The intuition is that we must force our grammar
to first match up all the zeros in the string with ones and, following that, add the terminating
ones. Our current grammar puts no such restriction: it could add matched zeros and ones at the start
and end, then add ones at the end, and then go back to adding matched zeros and ones — leading to
multiple parse trees. An example of an unambiguous CFG representing L_S is
S → C · S | ε
C → A · B
A → 0 · A · 1 | 01
B → 1 · B | ε
As we can see, A does the job of adding matched zeros and ones at the start and end, B does the job
of creating runs of ones, and C does the job of concatenating A and B.
This is important because every programming language is specified using CFGs, for which we build
compilers and interpreters, and it is important for these CFGs to be unambiguous, since we do not
want multiple interpretations of a program.

8.2.1 CFG to PDA conversion


Consider the Context-free grammar defined as follows

S → AS | ϵ
A → A1 | 0A1 | 01

Where,
Starting Symbol: S
Production Rules: S → AS, S → ϵ, A → A1, A → 0A1, A → 01
Non-terminals: {S, A}
Terminals: {0, 1}

Now consider the following two alphabets:

Σ = {0, 1}, Γ = {S, A, X0 , X1 }

where Σ is the input alphabet of the PDA we are going to construct and Γ is the alphabet of the
stack. Note that Γ has exactly as many symbols as the total number of terminals and non-terminals
in our CFG, one symbol corresponding to each terminal or non-terminal.

We will construct our PDA as a single state, adding self-loop edges that push or pop letters on the
stack — one edge per production rule — thereby mimicking the construction of the derivation/parse tree.

Step 1:
We initialize our stack by pushing S, since S is our starting symbol. (In what follows, the left end
of the written stack is its bottom and the right end is its top.)
Stack = S

Step 2:
We use the rule S → AS and add the self-loop edge ϵ, S/AS to our single state, since we may replace
S by AS at any point, as defined in our CFG’s production rules. We pop S from the stack and push S
and then A, so A is on top (here A and S are from Γ).
Stack = S, A

Step 3:
We use the rule A → A1 and add the edge ϵ, A/AX1 (here X1 is just the stack symbol corresponding to
the terminal 1), since we may replace A by A1 anywhere while forming a string. We pop A from the
stack and push X1 and then A (here A and X1 are from Γ).
Stack = S, X1 , A

Step 4:
We use the rule A → 01 and add the edge ϵ, A/X0 X1 (here X0 and X1 are just the stack symbols
corresponding to the terminals 0 and 1). We pop A from the stack and push X1 and then X0 (here X0
and X1 are from Γ).
Stack = S, X1 , X1 , X0

Step 5:
Whenever the stack symbol of a terminal comes to the top of the stack, we read that terminal from
the input and pop it off; when a non-terminal is on top, we apply one of the given production rules.
For these pop operations, we now add the edges 0, X0 /ϵ and 1, X1 /ϵ to our single state, so we pop
terminals until we see a non-terminal.
Stack = S

Step 6:
We now use the rule S → ϵ and add the edge ϵ, S/ϵ to our single state. We pop S from the stack and
push nothing, so the stack becomes empty again.
Stack = (empty)

This use of the stack amounts to a DFS of the derivation tree: we first visit all the nodes on the
left and then all the nodes on the right.
To conclude, given any arbitrary context-free grammar, we can easily construct, by the simple steps
described above, a PDA which recognizes (by empty stack) exactly the same language as the given CFG.
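This construction can be phrased generically. The sketch below (our own encoding, with the stack symbols for terminals identified with the terminals themselves) emits the transition relation of the single-state PDA built in Steps 1–6, for any CFG given as a dictionary:

def cfg_to_pda(productions, terminals):
    """productions: e.g. {"S": ["AS", ""], "A": ["A1", "0A1", "01"]}.
    Returns delta mapping (input_letter_or_eps, stack_top) -> set of strings
    to push (the leftmost symbol of a pushed string ends up on top)."""
    delta = {}
    # One eps-edge per production: pop the non-terminal, push its RHS.
    for nt, rhss in productions.items():
        delta.setdefault(("", nt), set()).update(rhss)
    # One edge per terminal: read it from the input and pop its stack symbol.
    for t in terminals:
        delta.setdefault((t, t), set()).add("")
    return delta

delta = cfg_to_pda({"S": ["AS", ""], "A": ["A1", "0A1", "01"]}, {"0", "1"})
# e.g. delta[("", "S")] == {"AS", ""} and delta[("0", "0")] == {""}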

8.3 Cleaning Context-free Grammar


Consider the Context-free grammar defined as follows

S → ABS | 0A1
A → 1A0 | D | 01 | ϵ
B → 1B | BB0
C → A0 | 01
D → S | 0AD
where the starting symbol is S, the non-terminals are {S, A, B, C, D}, and the terminals are {0, 1}
(ϵ denotes the empty string).

8.3.1 Eliminating Useless Symbols


We can eliminate the production rules for the non-terminal symbol "C" from our context-free grammar
(CFG). Here’s the reasoning:
• The starting symbol "S" can generate strings using only the symbols "A," "B," and itself.

• "B" depends on itself and "A"; "A" depends on itself and "D"; and "D" relies on "A," "S," and
itself. No symbol reachable from "S" ever produces "C."

• Therefore, knowing the strings derived by "C" is unnecessary for generating strings from "S."
In other words, even without the production rules for "C," "S" can still generate all its valid
strings by appropriately utilizing "A," "B," and "D."
We can further simplify the grammar by identifying and removing another useless non-terminal
symbol: "B". Analyzing the production rules for "B", we observe two key characteristics:

• Absence of terminal productions: none of "B"’s production rules generate strings consisting
solely of terminal symbols.

• No introduction of useful non-terminals: "B"’s rules do not introduce any other non-terminal
symbol that could eventually lead to terminal strings through subsequent productions in the
parse tree.

So the language generated by the non-terminal symbol "B" is empty (it cannot produce any
finite-length string, as its derivations never terminate), and we can safely remove all production
rules that involve "B" from the context-free grammar.

Now that we’ve removed the unnecessary rules, here’s the updated CFG:

S → 0A1
A → 1A0 | D | 01 | ϵ
D → S | 0AD

8.3.2 Eliminating ϵ-Productions


To remove an ϵ-production such as A → ϵ, look at every rule in which A appears on the right-hand
side, and add a copy of that rule with that occurrence of A replaced by ϵ.
After these changes, we no longer need the production rule A → ϵ in our CFG.
Updated CFG:

S → 0A1 | 01
A → 1A0 | D | 01 | 10
D → S | 0AD | 0D
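A sketch of this step in code, simplified to match the example: it only considers direct X → ϵ rules (not ϵ derivable through chains of rules), and it assumes single-character symbol names.

def erase_variants(rhs, nullable):
    """All strings obtained from rhs by deleting any subset of nullable symbols."""
    if not rhs:
        return {""}
    rest = erase_variants(rhs[1:], nullable)
    variants = {rhs[0] + r for r in rest}
    if rhs[0] in nullable:
        variants |= rest   # also the variant where this occurrence is erased
    return variants

def remove_epsilon_rules(productions):
    nullable = {nt for nt, rhss in productions.items() if "" in rhss}
    return {nt: {v for rhs in rhss for v in erase_variants(rhs, nullable)} - {""}
            for nt, rhss in productions.items()}

g = {"S": {"0A1"}, "A": {"1A0", "D", "01", ""}, "D": {"S", "0AD"}}
# remove_epsilon_rules(g) gives S -> 0A1 | 01, A -> 1A0 | D | 01 | 10,
# and D -> S | 0AD | 0D, matching the updated CFG above.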

8.3.3 Eliminating Unit Productions


We can do a similar thing here. Say we have a unit production rule X → Y , where X and Y are
non-terminals. Then, wherever X appears on the right-hand side of a rule, we can add a copy of that
rule with X replaced by Y , after which the rule X → Y itself can be dropped.

Let’s first remove the rule D → S by making the changes described above:

S → 0A1 | 01
A → 1A0 | D | S | 01 | 10
D → 0AD | 0AS | 0D | 0S
As we can see, after this modification our CFG has two more unit production rules, A → S and
A → D, so let’s remove them as well:

S → 0A1 | 0D1 | 0S1 | 01


A → 1A0 | 1D0 | 1S0 | 01 | 10
D → 0AD | 0AS | 0DD | 0DS | 0SD | 0SS | 0D | 0S
After doing this, every rule’s right-hand side either contains at least one non-terminal symbol or
consists only of terminals, and no unit productions remain.

8.3.4 Reducing further to get at least two non-terminals on the right-hand side or a single terminal
Just replace the terminals 0 and 1 with their corresponding symbols "X0 " and "X1 " in the CFG,
and add the corresponding new rules X0 → 0 and X1 → 1.

S → X0 AX1 | X0 DX1 | X0 SX1 | X0 X1


A → X1 AX0 | X1 DX0 | X1 SX0 | X0 X1 | X1 X0
D → X0 AD | X0 AS | X0 DD | X0 DS | X0 SD | X0 SS | X0 D | X0 S
X0 → 0
X1 → 1

8.3.5 Reducing further to get exactly two non-terminals on the RHS of any production rule which has more than two non-terminals on its RHS
Say a production rule has 3 non-terminals on its right-hand side; then we can reduce it to exactly 2
non-terminals as follows:

S → X0 AX1 can be reduced to S → X0 S1 and S1 → AX1

Note that the CFG after this modification accepts exactly the same strings as it did initially:
in the parse tree, S → X0 AX1 just takes one more step, first expanding by S → X0 S1 and then by
S1 → AX1 , while the final string generated remains unchanged.
So it turns out that for any number of non-terminal symbols we can do this inductively to get
exactly 2 non-terminals on the RHS of every such production rule.
Updated CFG:

S → X0 S1 | X0 X1
S1 → AX1 | DX1 | SX1
A → X1 A1 | X0 X1 | X1 X0
A1 → AX0 | DX0 | SX0
D → X0 D1 | X0 D | X0 S
D1 → AD | AS | DD | DS | SD | SS
X0 → 0
X1 → 1
The CFG written above is said to be in it’s Chomsky normal form (defined below).

8.3.6 Chomsky Normal form (CNF)


A context-free grammar is said to be in its Chomsky normal form if it does not contain any useless
symbols, no ϵ-productions (X → ϵ), no unit productions (X → Y ), and all production rules
are either of the form A → BC, where A, B, C are non-terminals, or of the form D → u, where D is
a non-terminal and u is a terminal.

Every context-free grammar can be expressed in its Chomsky normal form by applying the
transformations described above, and the resulting grammar accepts the same set of strings as the
original CFG.

8.4 Pumping Lemma for Context-free Languages


Let G be the Chomsky normal form of a context-free grammar having n non-terminals. Consider the
derivation tree of a word s of length greater than 2^n , as sketched in Figure 8.1 below.

Due to the restricted nature of the CNF form (at most 2 symbols on any RHS), the derivation tree is
a binary tree. A binary tree of height n can have at most 2^n leaves, but here we are taking
|s| > 2^n , so the height of the derivation tree must be greater than n. Since we have only n
non-terminals in our CFG, by the pigeonhole principle there must exist some non-terminal "N" which
repeats at least twice along a root-to-leaf path.
We split s into 5 parts as shown in Figure 8.1:

s = u.v.w.x.y

Figure 8.1: Derivation tree of a word of length > 2^n

Figure 8.2: Examples of some possible derivation trees from Figure 8.1

• Using the properties of derivation trees of a CFG, we can say that whatever can be generated from
the first "N" can also be generated by the second "N", and vice-versa.

• So after encountering an "N" in our derivation tree, we can choose to generate a new "N", or to
terminate after some productions by choosing the copy of "N" which doesn’t produce another "N".

• To formally state the pumping lemma for CFLs, we add the constraint that the sub-derivation tree
of "N" should not have any repeated non-terminal; this gives |vwx| ≤ 2^n .

• Also, as "N" is a non-terminal, |w| ≥ 1.

8.4.1 Lemma:
If a language L is context-free, then there exists some integer n ≥ 1 (the number of non-terminals
in G) such that every string s in L of length 2^n or more (i.e. with |s| ≥ 2^n ) can be written as
s = uvwxy,
with substrings u, v, w, x, and y, such that:

1. |w| ≥ 1,

2. |vwx| ≤ 2^n , and

3. u.v^i.w.x^i.y ∈ L for all i ≥ 0.

Example: Consider the language L = {0^n 1^n 2^n | n ≥ 0} over Σ = {0, 1, 2}. We prove that L cannot
be a CFL.

Suppose it were a CFL having n distinct non-terminals. Define k = 2^n and take the very long string
w = 0^{2k} 1^{2k} 2^{2k} ∈ L.
Now, as |vwx| ≤ 2^n = k, the portion v.w.x either lies completely inside 0^{2k} , 1^{2k} or 2^{2k} ,
or straddles 0^{2k} 1^{2k} or 1^{2k} 2^{2k} .

In every case we can pump the string to get a word which is not in L (each case increases the count
of some but not all of the letters), even though the lemma says it should be in L. Therefore L is
not a CFL.

8.5 Closure Properties of CFL


In this section we will look at closure properties of Context Free Languages.

8.5.1 Union & Concatenation


Context Free Languages are closed under Union.

Let L1, L2 be two context-free languages, with grammars whose start symbols are S1 and S2
respectively (rename non-terminals if necessary so that the two grammars share none).

To generate L1 ∪ L2, we add a new start symbol S with the production S → S1 | S2.

Hence, L1 ∪ L2 is a context-free language.

Similarly, to generate L1.L2, we add a new start symbol S with the production S → S1 S2.

Hence, L1.L2 is a context-free language.
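
A minimal Python sketch of these two constructions, reusing the grammar encoding from the
earlier CNF checker (and assuming the non-terminals of the two grammars have already been
renamed apart):

    def union_grammar(g1, s1, g2, s2):
        """Grammar for L1 ∪ L2: a fresh start symbol S with S -> S1 | S2.
        Assumes g1 and g2 use disjoint non-terminal names (rename first)."""
        g = {**g1, **g2}
        g["S"] = [(s1,), (s2,)]
        return g, "S"

    def concat_grammar(g1, s1, g2, s2):
        """Grammar for L1.L2: a fresh start symbol S with S -> S1 S2."""
        g = {**g1, **g2}
        g["S"] = [(s1, s2)]
        return g, "S"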

8.5.2 Intersection
Context Free Languages are not closed under intersection.

Let's see this with an example.

Consider the languages L1 = {0^n 1^n 2^k | n, k ≥ 0} and L2 = {0^k 1^n 2^n | n, k ≥ 0}.

Both L1 and L2 are context-free languages.

L1 ∩ L2 = {0^n 1^n 2^n | n ≥ 0}, which we saw above is not a context-free language.
Hence, context-free languages are not closed under intersection.

8.5.3 Complement
Suppose CFLs were closed under complementation. Then for any two CFLs L1, L2, the complements
L̄1 and L̄2 are CFLs. Since CFLs are closed under union, L̄1 ∪ L̄2 is a CFL.
Then, again by the hypothesis, the complement of L̄1 ∪ L̄2 is a CFL; by De Morgan's law this
complement is L1 ∩ L2, so CFLs would be closed under intersection, which is a contradiction!
Thus CFLs are not closed under complementation.
As another example, L = {x | x is not of the form ww} is a CFL,
but its complement L̄ = {ww | w ∈ {a, b}*} is not a CFL. Thus CFLs are not closed under complementation.

8.5.4 Substitution
Context-free languages are closed under substitution: replacing each terminal symbol with a
context-free language yields a language that is still a CFL. (In the example below each terminal
is mapped to a single string, i.e., the substitution is a homomorphism.)
Given the grammar G: S → 0S0 | 1S1 | ε, and the substitution h: 0 → aba and 1 → bb,
the rules of G′ such that L(G′) = h(L(G)) are:

S → X0 S X0 | X1 S X1 | ε

X0 → aba
X1 → bb
Thus, CFLs are closed under substitution.
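
A minimal Python sketch of this construction for the homomorphism case shown above (a general
substitution would splice in the start symbol of each terminal's grammar instead of a fixed
string; the encoding is the same one used in the earlier sketches):

    def substitute(grammar, h):
        """Apply a homomorphism h: terminal -> string to a grammar: each
        terminal t in h is replaced by a fresh non-terminal Xt, with Xt -> h(t)."""
        g = {}
        for lhs, rhss in grammar.items():
            g[lhs] = [tuple("X" + sym if sym in h else sym for sym in rhs)
                      for rhs in rhss]
        for t, word in h.items():
            g["X" + t] = [tuple(word)]   # e.g. X0 -> a b a
        return g

    g = {"S": [("0", "S", "0"), ("1", "S", "1"), ()]}   # () encodes S -> ε
    print(substitute(g, {"0": "aba", "1": "bb"}))
    # {'S': [('X0','S','X0'), ('X1','S','X1'), ()],
    #  'X0': [('a','b','a')], 'X1': [('b','b')]}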
Chapter 9

Turing Machine

Figure 9.1: Turing Machine (an infinite tape of cells holding symbols such as 0, 0, 1, b, b,
with a read/write head that can move left (L) or right (R))

Turing machines are a fundamental model of computation that can simulate the logic of any computer
algorithm, regardless of complexity. They are composed of:

1. Tape: The machine has an infinitely long tape divided into cells, each of which can contain a
symbol from a finite alphabet.

2. Symbols: The tape cells contain symbols from a finite set. b is used to represent a ‘blank’
cell (kind of like NULL).

3. Head: The machine has a head that can read and write symbols on the tape and move left or
right along the tape one cell at a time.

4. Transition function: The machine uses a transition function that dictates the machine’s
actions based on its current state and the symbol it reads on the tape. Actions include writing
a symbol, moving the tape left or right, and changing the state.

• 0/1, L: Read 0, replace with 1, Move to the left.


• 0/1, S: Read 0, replace with 1, stay there.
• 0/1, R: Read 0, replace with 1, Move to the right.
• −/0, S: no matter what is present at the head, replace it with 0, stay there.
This action is similar to pushing onto a stack.
• 0/b, L: here b is the blank symbol kept in every cell initially (meaning the tape cell is free).
This action is similar to popping from a stack.


We can mimic a stack by the following interconversion of moves:

1. Pushing onto the stack is equivalent to moving to the right and writing the symbol to be
pushed on the tape.

2. Popping off the stack is equivalent to moving to the left and writing blank symbols to the tape.

The problems which can be solved using a deterministic Turing machine in a number of steps
polynomial in the length of the input are in P, and those which can be solved similarly but
using a non-deterministic Turing machine are in NP.

9.1 Configuration of the Turing Machine


At any point while the Turing machine is running, we need to know which cell the tape head is on
and which state the Turing machine is in. Suppose the finite string we gave as input is
X1 X2 ... Xn, the tape head is currently on Xi, and the Turing machine is in state ‘q’. Then we
split the input string into two parts: the left part runs from the first symbol of the string up
to the symbol just before the tape head, and the right part runs from the symbol under the tape
head to the last symbol.
The configuration is represented as X1 X2 ... Xi−1 q Xi Xi+1 ... Xn.

Figure 9.2: Turing Machine Tape (··· X1 X2 ··· Xi Xi+1 ··· Xn ···, head on Xi)
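
This representation is easy to compute mechanically. Below is a tiny Python helper (our own
illustration, not from the lecture; it assumes the tape is given as a list of symbols and the
head as an index into it), which we will reuse in the example that follows:

    def configuration(tape, head, state):
        """Render a TM configuration α q β: α is the tape content strictly to
        the left of the head, and β starts at the cell under the head."""
        return "".join(tape[:head]) + " " + state + " " + "".join(tape[head:])

    # String 011010b0, head on the third symbol, state q5 (the example below):
    print(configuration(list("011010b0"), 2, "q5"))   # prints: 01 q5 1010b0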

Let us understand more through an example:

Suppose we have been given the string 011010b0 (where b is blank), the tape head is currently on
the ‘1’ in the third position, and the Turing machine is in state q5. Therefore, the configuration
is 01 q5 1010b0.

Figure 9.3: Example 1 - Turing Machine Tape (··· b 0 1 1 0 1 0 b 0 b ···, head on the third
symbol of the input)

Note: Here the ‘b’ before the first 0 and the ‘b’ after the last 0 represent the blanks the tape
is initialized with; they are not part of the input string.

Now let us see what happens:

• The Turing Machine scans the cell ‘1’ (the cell under the tape head)


• The machine replaces the cell’s content with 0 according to the transition 1/0, L

Figure 9.4: Example - Turing Machine State Transition (q5 moves to q9 on 1/0, L; q9 moves back
to q5 on 0/1, R)

• The machine transitions to state q9


• The tape head is shifted to the left
• Therefore, our updated configuration is: 0 q9 10010b0 (note that the scanned ‘1’ has been
overwritten with ‘0’).

For simplicity, let us call the part of the string to the left of the head α and the part from
the head onwards β, and say the state is qi; then we can represent the configuration as α qi β.

If the configuration changes from α0 q0 β0 to α1 q1 β1 and then to α2 q2 β2, this is depicted as

α0 q0 β0 ⊢ α1 q1 β1 ⊢ α2 q2 β2

If α0 q0 β0 ⊢* αi done βi, we have reached the end of the computation.
Note: ‘*’ on ⊢ indicates multiple steps.

Let us look at another example to understand halting:

Figure 9.5: Example 2 - Turing Machine State Transition Diagram (start state q0 loops on 0/0, R
and on b/b, L; on 1/1, R it moves to state q1, which has no outgoing transitions)

Let us look at the computation of the string 0, where the initial configuration is b q0 0.

Figure 9.6: Example 2 - Turing Machine Tape (1) (··· b 0 b b ···, head on the 0)

Now the Turing machine scans the cell ‘0’, replaces it with ‘0’, moves right (to a blank cell),
and stays in the same state q0. So now the configuration becomes 0 q0 b.
CHAPTER 9. TURING MACHINE 115

Figure 9.7: Example 2 - Turing Machine Tape (2) (··· b 0 b b ···, head on the blank to the right
of the 0)

This time, the Turing machine scans the cell ‘b’, replaces it with ‘b’, moves left (back to ‘0’),
and stays in the same state q0, so the configuration changes back to b q0 0.

Figure 9.8: Example 2 - Turing Machine Tape (3) (··· b 0 b b ···, head back on the 0)

As we can see, the machine oscillates between these two configurations forever and does not halt
on this input string.

Now, consider the input string 1, where the initial configuration is b q0 1.

Figure 9.9: Example 3 - Turing Machine Tape (1) (··· b 1 b b ···, head on the 1)

The Turing machine scans the cell ‘1’, replaces it with ‘1’, moves right, and transitions to
state q1. Hence the configuration switches to 1 q1 b.

Figure 9.10: Example 3 - Turing Machine Tape (2) (··· b 1 b b ···, head on the blank to the
right of the 1)

As we can see from the transition diagram of the Turing machine, there is no outgoing transition
from q1, so we can't move to any other state. Hence the machine halts on the input string 1,
whereas it does not halt on 0.
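
This behavior is easy to reproduce with a small simulator. The following Python sketch (our own
encoding of the machine from figure 9.5; the step cap is a practical stand-in for detecting the
oscillation in this finite example) runs the machine on both inputs:

    BLANK = "b"

    def run_tm(delta, start, input_str, max_steps=50):
        """Simulate a TM. delta maps (state, symbol) -> (write, move, state'),
        where move is 'L', 'R' or 'S'. The machine halts when no transition is
        defined for the current (state, symbol). Returns the halting
        configuration string, or None if max_steps is exceeded."""
        tape = dict(enumerate(input_str))   # sparse tape; blanks elsewhere
        head, state = 0, start
        for _ in range(max_steps):
            sym = tape.get(head, BLANK)
            if (state, sym) not in delta:   # no outgoing move: halt here
                keys = set(tape) | {head}
                lo, hi = min(keys), max(keys)
                cells = "".join(tape.get(i, BLANK) for i in range(lo, hi + 1))
                return cells[:head - lo] + " " + state + " " + cells[head - lo:]
            write, move, state = delta[(state, sym)]
            tape[head] = write
            head += {"L": -1, "R": 1, "S": 0}[move]
        return None  # step budget exhausted: the machine appears to loop

    # The machine of figure 9.5: q0 loops on 0/0,R and b/b,L; 1/1,R goes to q1.
    delta = {
        ("q0", "0"): ("0", "R", "q0"),
        ("q0", BLANK): (BLANK, "L", "q0"),
        ("q0", "1"): ("1", "R", "q1"),
    }
    print(run_tm(delta, "q0", "0"))  # None: oscillates forever, never halts
    print(run_tm(delta, "q0", "1"))  # '1 q1 b': halts, since q1 has no moves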

We can see that a Turing machine halts on certain strings while running forever on others. This
concept helps us define what it means for a Turing Machine to "accept" a string.
In essence, Turing machines provide us with a way to describe different languages.

9.2 Language described by a TM


• As described above, we can use a TM to describe a set of strings based on termination. This
forms the language of the TM.1

• We can also talk about the language a TM accepts using the notion of accepting and non-accepting
states. In this case, we accept a string if the TM ever enters an accepting state during its
operation on the string as input.

• However, we will primarily be interested in acceptance by halting.

9.2.1 Interconversion between acceptance by final state and by halting:


To summarize, the following are the two notions of acceptance of a string by a Turing Machine (in
both cases, the input is the standard string input described above):

• Acceptance by Halting: A string is accepted if the Turing Machine halts its operation on it
in finite time. We will primarily be interested in this kind of acceptance.

• Acceptance by Final State: A string is accepted if, during the operation of the Turing
Machine on it, a final state is reached at some point in the journey.

It is not very difficult to convert the notion of acceptance by final state to that by halting. Once
you reach a final state, move on to a state whose only job is to finish reading the input (similar
to emptying a stack with a PDA). Moreover, we must also ensure that the TM does not halt in a
non-accepting state. First, let us consider when a TM would halt in a non-accepting state: when
the TM has no outgoing transition for a particular (state, symbol) combination:

(Transition diagram: a TM with start state q0 and states q1, q2, in which some (state, symbol)
pairs have no outgoing transition, so the machine can halt in a non-accepting state.)
1
As a side note, consider the string 0 1 0. Whether the TM described above terminates on this
string as input depends on where the head starts. So we will adopt a standard convention to
describe the language of a TM: the input string is written on the tape and the head is at the
leftmost character of the string. If the TM halts taking this configuration as the starting
configuration, then we say that the string lies in the language of the TM.

(Transition diagram: the same TM after conversion, with the outgoing transitions of the final
state removed and a trap state added; every move that previously had no target now goes to the
trap state, which loops forever on every symbol.)

So, the process for conversion seems straightforward now: just remove the transitions going out of
the final state (in this case, we removed the self-loop on the final state), and if some string
halts on reaching some other state, make that state non-halting by redirecting its missing
transitions to a trap state.

A slight note: The typical notion of acceptance by final state involves accepting a string only
if, after reading the entire string, you are in an accepting state. However, we make a minor
modification: if you reach a final state at some point in the journey, read the rest of the string
while staying in that state. If this modification had been applied to the machines we studied
previously, such as DFAs, it would have created a problem, since it would mean that every string
having an accepted string as a prefix would itself be accepted (there can be other accepted
strings too, but at least these would be). In the case of a TM, this does not cause a problem,
due to 2 reasons:

• A TM can move left as well as right on the tape.

• A TM not only parses the string but also overwrites it.

So, in the case of a TM, it is not necessary that strings formed by concatenating something with
an accepted prefix will themselves be accepted.
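
As a rough illustration of the conversion described above, here is a hypothetical Python sketch
reusing the transition-table encoding of the earlier simulator (delta maps (state, symbol) to
(write, move, next state); state names and the trap construction are our own):

    def final_state_to_halting(delta, finals, alphabet):
        """Turn acceptance by final state into acceptance by halting:
        (1) drop every transition out of a final state, so reaching one halts;
        (2) route every remaining undefined (state, symbol) pair to a trap
            state that loops forever, so the TM never halts elsewhere."""
        states = ({q for (q, _) in delta} |
                  {q2 for (_, _, q2) in delta.values()})
        new_delta = {(q, a): t for (q, a), t in delta.items()
                     if q not in finals}
        for q in states - set(finals):
            for a in alphabet:
                new_delta.setdefault((q, a), (a, "R", "trap"))
        for a in alphabet:
            new_delta[("trap", a)] = (a, "R", "trap")   # loop forever
        return new_delta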
