
CSE 273: Theory of Computation

A theoretical branch of computer science


• Study of abstract machines and the computational problems that can be solved using these machines.
• The abstract machines are called automata.
• The field began when a team of biologists, psychologists, mathematicians, engineers, and a few computer scientists tried to model the human thought process in a machine.

Goals:
 To formally understand what can (and cannot) be computed.

References
 "Introduction to Automata Theory, Languages, and Computation" by John E. Hopcroft, Rajeev Motwani, and Jeffrey D. Ullman.
 "Theory of Computer Science: Automata, Languages and Computation" by K.L.P. Mishra and N. Chandrasekaran.

MMK@CSE
Notable early contributions on Computability
 Turing Machine - Alan Turing
• Provided a simple yet powerful model to define what it means for a function to be
computable.
• Established the theoretical underpinnings for modern computing and algorithms.

 Lambda Calculus - Alonzo Church


• Introduced a mathematical framework for expressing computation via functions.
• Influenced the development of programming languages like Lisp, Haskell, and Scala.
• Played a crucial role in exploring the theoretical limits of what can be computed.

Notable early contributions on Computability (cont.)
 Church-Turing Thesis
• Hypothesizes that any function effectively calculable by an algorithm can be computed by a
Turing Machine or expressed in Lambda Calculus.
• Unifies various models of computation (recursive functions, Lambda Calculus, Turing Machines).
• Widely accepted despite not being formally provable.
• Underpins much of modern computational theory, capturing the essence of computation.

 Undecidable / Uncomputable Problems (e.g., the Halting Problem)

• Alan Turing's introduction of the Halting Problem was a pivotal moment in theoretical CS.
• By proving that the Halting Problem is undecidable and, consequently, uncomputable, Turing demonstrated that there are clear limits to what can be achieved through algorithms and computation.
• It illustrated that some problems cannot be solved by any algorithmic means, no matter how powerful our computational models become.
Notable early contributions on Automata
 Two neurophysiologists, Warren McCulloch and Walter Pitts, were the first to present a
description of finite automata in 1943.
 Stephen Kleene introduced regular expressions and the concept of regular sets, which
are crucial in the theory of finite automata.
• His work on these topics laid the groundwork for text processing and compiler
construction.
 Later, in 1955, G.H. Mealy and E.F. Moore generalized the theory to design much more powerful machines.

Applications and objectives
 Basis of many Applications

• Compilers and interpreters


• Text editors and processors
• Search engines
• System verification components

 Study the limits of computations

• What kinds of problems can be solved with a computer?


• What kinds of problems can be solved efficiently?
Important Terminologies
1. Alphabets
2. Strings
3. Languages
4. Problems

 An alphabet is a finite set of symbols. Usually, we use Σ to represent an alphabet.
Examples:
 o Σ = {0, 1}, the set of binary digits.
 o Σ = {a, b, ..., z}, the set of all lower-case letters.
 o Σ = {(, )}, the set of open and close parentheses.

 A string is a finite sequence of symbols from an alphabet.
Examples:
 o 0011 and 11 are strings from Σ = {0, 1}.
 o abacy and cca are strings from Σ = {a, b, ..., z}.
 o (()()) and () are strings from Σ = {(, )}.
Basics of Strings
 A string is a finite sequence of symbols from an alphabet.
 o 0011 and 11 are strings from Σ = {0, 1}.
 o abacy and cca are strings from Σ = {a, b, ..., z}.
 o (()()) and () are strings from Σ = {(, )}.

[highlighted examples of the prefixes, suffixes, and substrings of the string aaabc omitted]

A proper prefix or proper suffix of a string is non-empty and not equal to the string itself.
Languages and Problems

In automata theory, a problem is to decide whether a given string is a member of some particular language.

This formulation is general enough to capture the difficulty levels of all computing problems.
Finite Automata or Finite State Machine
We will study 3 types of Finite Automata
• Deterministic Finite Automata (DFA)
• Non-deterministic Finite Automata (NFA)
• Finite Automata with ε-transitions (ε-NFA)

 There are some states and transitions (edges) between the states.
 An edge label defines the move from one state to another.

Deterministic Finite Automata

A DFA is a 5-tuple <Q, Σ, δ, q₀, F>:

• Q is a finite set of states
• Σ is a finite input alphabet
• δ is the transition function mapping Q×Σ to Q
• q₀ ∈ Q is the initial state (only one)
• F ⊆ Q is a set of final states (zero or more)

Note that there is exactly one transition for each input symbol from each state.

[transition diagram and transition table omitted]
Example: a DFA for Σ = {0, 1} that accepts strings with an odd number of 0's and any number of 1's.

[transition diagram and transition table omitted]
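The 5-tuple definition above translates directly into code. A minimal sketch in Python (the state names q0/q1 and the dictionary encoding are our own choices, not from the slides), using the machine just described that accepts strings with an odd number of 0's:

```python
# A minimal DFA simulator, directly mirroring the 5-tuple definition.
# The example machine accepts strings over {0, 1} with an odd number of 0's.

def run_dfa(delta, start, finals, w):
    """Return True iff the DFA accepts string w."""
    q = start
    for a in w:
        q = delta[(q, a)]   # exactly one transition per (state, symbol)
    return q in finals

# delta: Q x Sigma -> Q; q0 means "even 0's so far", q1 means "odd 0's so far"
delta = {("q0", "0"): "q1", ("q0", "1"): "q0",
         ("q1", "0"): "q0", ("q1", "1"): "q1"}

print(run_dfa(delta, "q0", {"q1"}, "10100"))  # three 0's -> True
print(run_dfa(delta, "q0", {"q1"}, "1001"))   # two 0's  -> False
```

Because δ is a total function on Q×Σ, the simulation is a single deterministic pass over the input.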
Constructions of DFA
L = {w : w has both an even number of 0’s and an even number of 1’s}

[DFA diagram omitted]
Constructions of DFA
• Design a DFA that accepts strings of 0's and 1's ending with 011.

[DFA diagram omitted]
Constructions of DFA
• Design a DFA that accepts strings of a's and b’s having exactly one a.

[DFA diagram omitted]

 A dead state in a DFA is generally defined as a non-accepting state from which no sequence of inputs can lead to an accepting state. There are two schools of thought:
1. This state may have outgoing transitions that either loop back to itself or lead to other non-accepting states, but they do not lead to any accepting state.
2. This state must not have any outgoing transition apart from a self-loop.
Constructions of DFA
Give a DFA for Σ = {0, 1} that accepts any string with 001 as a substring.
Constructions of DFA
Construct a DFA for Σ = {0, 1} that accepts any string with 001 as a subsequence.
Practice DFA Constructions
 Design a DFA for each of the following:

• To accept strings of a’s and b’s that contain the substring aba
• To accept strings of a’s and b’s that start with baba
• To accept strings of a’s and b’s that end with abba
• To accept strings of a’s and b’s that contain exactly two b’s
Extended transition function for DFA
 The DFA defines a language
• the set of all strings that result in a sequence of state transitions from the initial state to a final state

 Extended Transition Function
• denoted by δ̂
• δ̂ takes a state q and a string ω, and returns a state p
 o p is the state where the automaton ends up when starting in state q and processing the sequence of inputs ω
Extended transition function for DFA
Basis
δ̂(q, ε) = q
Induction
Suppose ω = xa, i.e., a is the last symbol of ω and x is the string consisting of all but the last symbol.

Then δ̂(q, ω) = δ(δ̂(q, x), a)

• To compute δ̂(q, ω), first compute δ̂(q, x), the state that the automaton is in after processing all but the last symbol of ω.
• Suppose this state is p, i.e., δ̂(q, x) = p.
• Then δ̂(q, ω) is what we get by making a transition from state p on input a, the last symbol of ω.
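The inductive definition of δ̂ can be sketched as a recursive function. The two-state parity DFA used below is a hypothetical stand-in, since the slides' DFA diagram is not reproduced here:

```python
# delta_hat follows the inductive definition exactly:
#   delta_hat(q, eps) = q
#   delta_hat(q, xa)  = delta(delta_hat(q, x), a)

def delta_hat(delta, q, w):
    if w == "":               # basis
        return q
    x, a = w[:-1], w[-1]      # induction: split off the last symbol
    return delta[(delta_hat(delta, q, x), a)]

# hypothetical DFA: q0 = even number of 0's seen, q1 = odd number of 0's seen
delta = {("q0", "0"): "q1", ("q0", "1"): "q0",
         ("q1", "0"): "q0", ("q1", "1"): "q1"}

w = "110101"
for i in range(len(w) + 1):   # every prefix, in increasing size
    print(w[:i] or "eps", "->", delta_hat(delta, "q0", w[:i]))
```

This mirrors the prefix-by-prefix check described on the next slide: each prefix's state is computed from the previous prefix's state and one more symbol.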
Extended transition function for DFA
The check involves computing δ̂(q₀, ω) for each prefix ω of 110101, starting at ϵ and going in increasing size.

[worked example omitted]
Non-deterministic Finite Automata (NFA)
An NFA is a 5-tuple <Q, Σ, δ, q₀, F>:
• Q is a finite set of states
• Σ is a finite input alphabet
• δ is the transition function mapping Q×Σ to a subset of Q
• q₀ ∈ Q is the initial state (only one)
• F ⊆ Q is a set of final states (zero or more)

The difference between a DFA and an NFA is in the type of value δ returns.
Non-deterministic Finite Automata (NFA)

Design an NFA that accepts all binary strings that end with 01.

[NFA diagram omitted]
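NFA simulation can be sketched by tracking the set of all states the machine could be in. The transition table below for the ends-in-01 NFA is a plausible reconstruction (state names q0/q1/q2 are our own), since the slide's diagram is not shown here:

```python
# delta maps (state, symbol) to a SET of states; acceptance means some
# sequence of choices ends in a final state.

def run_nfa(delta, start, finals, w):
    current = {start}
    for a in w:
        current = {p for q in current for p in delta.get((q, a), set())}
    return bool(current & finals)

delta = {("q0", "0"): {"q0", "q1"},  # guess: this 0 starts the final "01"
         ("q0", "1"): {"q0"},
         ("q1", "1"): {"q2"}}        # q2 is accepting, no outgoing moves

print(run_nfa(delta, "q0", {"q2"}, "1101"))  # ends in 01 -> True
print(run_nfa(delta, "q0", {"q2"}, "0110"))  # -> False
```

Missing entries in `delta` simply contribute no successor states, which models the NFA's freedom to "die" on a wrong guess.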
Extended Transition Function for NFA

The concept of ϵ-closure
ϵ-closure(T) = T ∪ all NFA states reachable from any state in T using only ϵ-transitions.

1. ϵ-closure(D) = {D, A}
2. ϵ-closure(A) = {A}
3. ϵ-closure(A, B, E) = {A, B, E}
4. ϵ-closure(C, E) = {A, C, D, E}

5. ϵ-closure(δ(A, a)) = {C, D, A}
6. ϵ-closure(δ(A, b)) = {B}
7. ϵ-closure(δ(B, a)) = { }
8. ϵ-closure(δ(B, b)) = {E, D, A}
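The closure computation can be sketched as a depth-first search. The ε-edges below (D → A and C → D) are reconstructed from worked answers 1–4; the full NFA is not reproduced on these slides, so treat them as an assumption:

```python
# epsilon-closure(T): start from T and keep following epsilon-edges
# until no new state is added.

def eps_closure(eps, states):
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for p in eps.get(q, []):
            if p not in closure:
                closure.add(p)
                stack.append(p)
    return closure

eps = {"D": ["A"], "C": ["D"]}   # assumed epsilon-edges

print(sorted(eps_closure(eps, {"D"})))        # ['A', 'D']
print(sorted(eps_closure(eps, {"C", "E"})))   # ['A', 'C', 'D', 'E']
```

Note the closure of a set is the union of the closures of its members, which is why the function takes a set rather than a single state.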
Subset Construction Algorithm: From ε-NFA or NFA to DFA

[worked examples omitted]
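The subset construction can be sketched as a worklist algorithm whose DFA states are frozensets of NFA states. This version handles a plain NFA; for an ε-NFA one would additionally apply ε-closure to the start set and after each move. The example NFA is the assumed ends-in-01 machine from earlier:

```python
# Build a DFA from an NFA by treating each reachable SET of NFA states
# as one DFA state.

def subset_construction(delta, sigma, start, finals):
    start_set = frozenset([start])
    dfa_delta, seen, todo = {}, {start_set}, [start_set]
    while todo:
        S = todo.pop()
        for a in sigma:
            T = frozenset(p for q in S for p in delta.get((q, a), set()))
            dfa_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    dfa_finals = {S for S in seen if S & finals}   # contains an NFA final state
    return dfa_delta, start_set, dfa_finals

delta = {("q0", "0"): {"q0", "q1"}, ("q0", "1"): {"q0"}, ("q1", "1"): {"q2"}}
dfa_delta, s0, F = subset_construction(delta, "01", "q0", {"q2"})
print(len({S for (S, a) in dfa_delta}))  # 3 reachable DFA states
```

Only reachable subsets are ever constructed, so the resulting DFA is usually far smaller than the worst-case 2^|Q| states.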
Table Filling Method: DFA Minimization and Equivalence Testing

[transition diagram of the original 8-state DFA (states A–H over inputs 0 and 1) and the resulting minimized DFA omitted]
Table Filling Method: Equivalence Testing

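The table-filling (marking) procedure can be sketched as a fixpoint over state pairs: mark a pair when exactly one state is accepting, then keep marking pairs whose successors on some symbol are already marked. The 3-state DFA below is our own toy example, since the slides' 8-state machine is not fully reproduced here:

```python
from itertools import combinations

def distinguishable(states, sigma, delta, finals):
    # basis: (p, q) distinguishable when exactly one of them is accepting
    marked = {frozenset(pr) for pr in combinations(states, 2)
              if (pr[0] in finals) != (pr[1] in finals)}
    changed = True
    while changed:                        # propagate until no change
        changed = False
        for p, q in combinations(states, 2):
            pair = frozenset((p, q))
            if pair in marked:
                continue
            for a in sigma:
                succ = frozenset((delta[(p, a)], delta[(q, a)]))
                if len(succ) == 2 and succ in marked:
                    marked.add(pair)
                    changed = True
                    break
    return marked

# hypothetical 3-state DFA in which B and C are equivalent
delta = {("A", "0"): "B", ("A", "1"): "A",
         ("B", "0"): "A", ("B", "1"): "B",
         ("C", "0"): "A", ("C", "1"): "C"}
marked = distinguishable(["A", "B", "C"], "01", delta, {"B", "C"})
print(frozenset(("B", "C")) in marked)  # False: B and C can be merged
print(frozenset(("A", "B")) in marked)  # True
```

Unmarked pairs at the end are equivalent states; merging them yields the minimized DFA, and running the same procedure on the union of two DFAs tests their equivalence.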
Finite Automata with Outputs: Mealy and Moore Machines

A Moore machine is a six-tuple (Q, Σ, Δ, δ, λ, q0), where


• Q is a finite set of states
• Σ is the input alphabet
• Δ is the output alphabet
• δ is the transition function δ: Q × Σ → Q
• λ is the output function λ: Q → Δ
• q0 ∈ Q is the initial state.
A Mealy machine is a six-tuple (Q, Σ, Δ, δ, λ, q0), the same as a Moore machine except that the output function λ maps Q × Σ to Δ: the output depends on the current state and the current input symbol.

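A sketch of running a Moore machine, where the output function λ depends only on the state. The example machine is our own, not from the slides: it emits, after each input bit, the parity of 1's read so far. A Mealy version would instead attach outputs to transitions, producing one output per input symbol (and none for the start state):

```python
# Moore machine: output function lam maps states to output symbols.

def run_moore(delta, lam, q0, w):
    q, out = q0, [lam[q0]]          # Moore machines emit on the start state too
    for a in w:
        q = delta[(q, a)]
        out.append(lam[q])
    return "".join(out)

# hypothetical machine: state = parity of 1's seen so far
delta = {("even", "0"): "even", ("even", "1"): "odd",
         ("odd", "0"): "odd",  ("odd", "1"): "even"}
lam = {"even": "0", "odd": "1"}

print(run_moore(delta, lam, "even", "1011"))  # "01101"
```

For an input of length n, a Moore machine produces n + 1 outputs while a Mealy machine produces n, which is exactly the bookkeeping difference that the transformation slides below must account for.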
Moore Machine

Mealy Machine

Transforming a Mealy Machine into a Moore Machine

Transforming a Moore Machine into a Mealy Machine

[worked examples omitted]
Regular Language, Regular Grammar
Regular Expressions (RE)
 Regular expressions are useful for representing certain sets of strings in an algebraic fashion.
 They describe the languages accepted by finite state automata.

We give a formal recursive definition of regular expressions over Σ as follows:

1. Any terminal symbol a (an element of Σ), ϵ and ∅ are regular expressions.
2. The union of two regular expressions R1 and R2, written as R1 + R2, is also a regular expression.
3. The concatenation of two regular expressions R1 and R2, written as R1.R2 (or R1R2), is also a RE.
4. The iteration (or closure) of a regular expression R, written as R*, is also a RE.
5. If R is a regular expression, then (R) is also a regular expression.
6. The REs over Σ are those obtained recursively by the application of the rules 1–5 once or several times.

Regular Expression Examples
• L1 = the set of all strings of 0's and 1's ending in 00.
   (0+1)*00
• L2 = the set of all strings of 0's and 1's beginning with 0 and ending with 1.
   0(0+1)*1
• L3 = {ϵ, 11, 1111, 111111, ...}.
   (11)*
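The three answers can be checked with Python's re module. Note the syntax shift from textbook notation: union "+" becomes "|", and (0+1)* becomes the character class [01]*:

```python
# Each tuple: (pattern, strings that should match, strings that should not).
import re

tests = [
    (r"[01]*00", ["100", "00", "0100"], ["0", "101", ""]),    # L1: ends in 00
    (r"0[01]*1", ["01", "0101", "011"], ["0", "10", "00"]),   # L2: starts 0, ends 1
    (r"(11)*",   ["", "11", "1111"],    ["1", "111", "110"]), # L3: even block of 1's
]
for pattern, accept, reject in tests:
    assert all(re.fullmatch(pattern, w) for w in accept)
    assert not any(re.fullmatch(pattern, w) for w in reject)
print("all regular-expression checks passed")
```

`re.fullmatch` is used rather than `re.search` so the pattern must describe the entire string, matching the formal-language reading of a regular expression.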
Identities of Regular Expressions
 Two regular expressions P and Q are equivalent (we write P = Q) if P and Q represent
the same set of strings.

Identities of Regular Expressions (cont.)

Construct Finite Automaton equivalent to the Regular Expression

Arden’s Theorem
If P and Q are two regular expressions over Σ, and P does not contain ϵ, then the equation R = Q + RP has a unique solution: R = QP*.   (1)

Proof: follow K.L.P. Mishra et al.’s book for details.

Application of Arden’s Theorem: Convert FA to RE

[worked examples omitted]
Pumping Lemma
 The Pumping Lemma is a fundamental concept in the theory of formal languages, particularly useful for
demonstrating that certain languages are not regular.
o However, the converse—that a language satisfying the conditions of the Pumping Lemma is necessarily
regular—is not true.

 Such a structural insight reveals why some languages, due to their complexity, cannot be captured by
regular expressions or finite automata.

 This lemma offers a systematic approach to demonstrate the non-regularity of languages by exploiting the
inherent "repetitive" structure required of regular languages.

 A finite automaton has a limited number of states, so it cannot create a new state for every new character; it would run out of states.
 Instead, it has to reuse states, meaning it goes into a loop. This looping behaviour is what the pumping lemma is all about.

Pumping Lemma
 If a language L is regular, then there exists an integer p (called the pumping length) such that any string s in L with length at least p can be divided into three parts, s = xyz, satisfying the following conditions:
  1. |y| > 0
  2. |xy| ≤ p
  3. xyⁱz ∈ L for every i ≥ 0

 By repeating y any number of times (including zero), the resulting strings xz, xyz, xyyz, xyyyz, etc., also belong to L. By changing i, we explore the language’s ability to handle repetitions or omissions of y.

 The length of xy must be at most p, which limits how s can be split into these substrings.
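The lemma's use in non-regularity proofs can be made concrete. For L = {0ⁿ1ⁿ} and s = 0ᵖ1ᵖ, every split s = xyz with |xy| ≤ p and |y| > 0 places y inside the leading 0's, so pumping (here with i = 2) breaks the 0/1 balance and leaves the language. A small exhaustive check (p = 7 is an arbitrary choice):

```python
# Membership test for L = {0^n 1^n}.
def in_L(w):
    n = w.count("0")
    return w == "0" * n + "1" * n

p = 7
s = "0" * p + "1" * p
for xy_len in range(1, p + 1):          # condition 2: |xy| <= p
    for y_len in range(1, xy_len + 1):  # condition 1: |y| > 0
        x = s[:xy_len - y_len]
        y = s[xy_len - y_len:xy_len]    # y consists only of 0's
        z = s[xy_len:]
        assert not in_L(x + y * 2 + z)  # pumping up adds 0's but no 1's
print("no valid split survives pumping, so L cannot be regular")
```

Since no split satisfies all three conditions, L fails the lemma's conclusion and therefore cannot be regular.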
Pumping Lemma
For a string s in L with |s| ≥ p, write s = xyz.

Using the Pumping Lemma to show a language is not regular:

[worked examples omitted]
Pumping Lemma
 If a language L is regular, then there exists an integer p (called the pumping length) such that any string s in
L with length at least p can be divided into three parts, s=xyz, satisfying the following conditions:

 The lemma essentially says that every long enough string in a regular language can be "pumped" or
repeated in a certain segment (denoted as y) without leaving the language.
 When processing a long string, a finite automaton must enter at least one state more than once (due to the
pigeonhole principle), creating a loop that can be repeated.

Context Free Grammars and Languages

G = (V, T, P, S)
• V – a set of variables, e.g. {S, A, B, C, D, E}
• T – a set of terminals, e.g. {a, b, c}
• P – a set of production rules, of the form A → α, where A ∈ V, α ∈ (V∪T)*
• S is a special variable called the start symbol

A Context Free Grammar (CFG) example
G = (V, T, P, S)
• V – a set of variables, e.g. {S, A, B, C, D, E}
• T – a set of terminals, e.g. {a, b, c}
• P – a set of production rules, of the form A → α, where A ∈ V, α ∈ (V∪T)*
• S is a special variable called the start symbol

Two ways to represent a context free grammar for palindromes over 0’s and 1’s:

P → ϵ
P → 0
P → 1          OR      P → ϵ | 0 | 1 | 0P0 | 1P1
P → 0P0
P → 1P1
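The palindrome grammar above maps directly onto a recursive membership test, one clause per production:

```python
# derives(w) is True iff P =>* w under P -> eps | 0 | 1 | 0P0 | 1P1.

def derives(w):
    if w in ("", "0", "1"):                     # P -> eps | 0 | 1
        return True
    if len(w) >= 2 and w[0] == w[-1] and w[0] in "01":
        return derives(w[1:-1])                 # P -> 0P0 | 1P1
    return False

print(derives("0110"))   # True
print(derives("10101"))  # True
print(derives("10"))     # False
```

The recursion peels one matching pair of outer symbols per step, so a successful run reconstructs exactly one derivation of w from P.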
Parse tree and Derivation in CFGs
How is a string ω generated by a grammar G?

Grammar: S → AS | ε;  A → A1 | 0A1 | 01

Is S ⇒* 0110011 ?

Leftmost derivation:
S ⇒ AS ⇒ A1S ⇒ 011S ⇒ 011AS ⇒ 0110A1S ⇒ 0110011S ⇒ 0110011

Rightmost derivation:
S ⇒ AS ⇒ AAS ⇒ AA ⇒ A0A1 ⇒ A0011 ⇒ A10011 ⇒ 0110011

[parse (derivation) tree omitted]
Parse tree and Derivation in CFGs

Grammar 1:
P → Bb | aSa | a
A → a | aSa
B → aBaC | b
C → aSa

Is P ⇒* aabaaaaba ?

Grammar 2:
A → Bac | bTC | ba
T → aB | TB | ε
B → aTB | bBC | b
C → TBc | aBC | ac

Is A ⇒* babbcac ?
Ambiguity in CFGs
 Each parse tree has one unique leftmost derivation and one unique rightmost derivation.
 A grammar is considered ambiguous if there exists at least one string that can be
generated by the grammar in more than one way (i.e., has more than one distinct parse tree).

o This implies that the string has more than one leftmost derivation or more than one rightmost
derivation.
o Ambiguity in grammars is a critical issue because it can lead to confusion in parsing and
interpreting the strings generated by the grammar, especially in compiler design and natural
language processing.
Consider the following grammar G:
S → AS | a | b
A → SS | ab

[two distinct parse trees for the string abb omitted]
Ambiguity in CFGs (cont.)
 Each parse tree has one unique leftmost derivation and one unique rightmost derivation.
 A grammar is considered ambiguous if there exists at least one string that can be
generated by the grammar in more than one way (i.e., has more than one distinct parse tree).

Consider the following grammar:


E → E + E | E * E | (E) | x | y | z

Is the grammar ambiguous?

Ambiguity in CFGs (cont.)
 Each parse tree has one unique leftmost derivation and one unique rightmost derivation.
 A grammar is considered ambiguous if there exists at least one string that can be
generated by the grammar in more than one way (i.e., has more than one distinct parse tree).

Consider the following grammar:


E → E + E | E * E | (E) | x | y | z

Is the grammar ambiguous?

The string x + y + z has two different leftmost derivations (and hence two parse trees):

E ⇒ E + E ⇒ x + E ⇒ x + E + E ⇒ x + y + E ⇒ x + y + z
E ⇒ E + E ⇒ E + E + E ⇒ x + E + E ⇒ x + y + E ⇒ x + y + z

So the grammar is ambiguous.
Construction of Reduced Grammar
Simplification of Context Free Grammars (CFGs)

 Remove useless symbols


• Generating Variables
• Reachable symbols (variables and terminals)

 Remove ε-productions, e.g. A → ε

 Remove unit-productions, e.g. A → B
Construction of Reduced Grammar
Remove non-generating variables and keep productive variables

S → AB
A → a
B → b | D
E → c

• The set W1 = {A, B, E} includes the symbols that have productions with a terminal string on the right-hand side (RHS): A → a; B → b; E → c
• W2 = {S, A, B, E}; S is included as it can generate through A and B
• W3 = {S, A, B, E} = W2 //the stopping condition has been met

Now remove all non-generating variables (here D) and construct the revised grammar:

S → AB
A → a
B → b
E → c
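The W1/W2/W3 fixpoint above can be sketched as code; productions are encoded as (head, body) pairs, with terminals in lower case:

```python
# A variable is generating if some production's body consists only of
# terminals and already-generating variables. Iterate to a fixpoint.

def generating(productions):
    W = set()
    changed = True
    while changed:
        changed = False
        for head, body in productions:
            if head not in W and all(s in W or s.islower() for s in body):
                W.add(head)
                changed = True
    return W

prods = [("S", "AB"), ("A", "a"), ("B", "b"), ("B", "D"), ("E", "c")]
print(sorted(generating(prods)))  # ['A', 'B', 'E', 'S']; D is non-generating
```

D never enters the set because its only occurrence is on a right-hand side, matching the slide's conclusion that D must be removed.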
Construction of Reduced Grammar
Remove non-reachable symbols (original CFG: S → AB; A → a; B → b | D; E → c)

S → AB
A → a
B → b
E → c

• W1 = {S} //start with the start symbol S
• W2 = {S, A, B} //from S → AB, we can reach A and B
• W2 = {S, A, B} ∪ {a, b} //from A and B, terminals a and b can be reached
• W3 = {S, A, B} ∪ {a, b} = W2 //the stopping condition has been met

Now remove all non-reachable symbols (here E and c) and construct the revised grammar:

S → AB
A → a
B → b

 For this example, 40% of the original CFG has been removed without compromising the solution quality.
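The reachability computation is the dual fixpoint: start from S and add every symbol that appears in the body of a reachable variable's production:

```python
# A symbol is reachable if it occurs in the body of some production whose
# head is already reachable; the start symbol seeds the set.

def reachable(productions, start):
    R, changed = {start}, True
    while changed:
        changed = False
        for head, body in productions:
            if head in R:
                for s in body:
                    if s not in R:
                        R.add(s)
                        changed = True
    return R

prods = [("S", "AB"), ("A", "a"), ("B", "b"), ("E", "c")]
print(sorted(reachable(prods, "S")))  # ['A', 'B', 'S', 'a', 'b']; E and c dropped
```

E stays out of the set because no reachable production mentions it, matching the slide's result.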
Construction of Reduced Grammar
Remove useless symbols:

S → aAa
A → Sb | bCC | DaA
C → abb | DD
E → aC
D → aDA

Reduced grammar (D is non-generating since every D-production contains D; E is then non-reachable):

S → aAa
A → Sb | bCC
C → abb
Construction of Reduced Grammar
Remove useless symbols:

S → AB | CA
B → BC | AB
A → a
C → aB | b

Reduced grammar (B is non-generating since every B-production contains B; removing it also removes S → AB and C → aB):

S → CA
A → a
C → b
Construction of Reduced Grammar
Remove useless symbols:

A → AB | CA
B → DC | AB | ab
C → a
D → aB | b

None of the productions are useful: the start symbol A is non-generating (every A-production contains A itself), so the entire grammar is removed.
Construction of Reduced Grammar
Remove Null Productions

S → aS | AB
A → ϵ
B → ϵ
D → b

Construction of nullable variables:
W1 = {A, B}
W2 = {S, A, B}
W3 = {S, A, B} = W2 //the stopping condition has been met

Final CFG after removing null productions:
S → aS | a | AB | A | B
D → b
Construction of Reduced Grammar
Remove Unit Productions

S → AB
A → a
B → C | b
C → D
D → E
E → ab | aBa

W(S) = {S}
W(A) = {A}
W(B) = {B, C, D, E}
W(C) = {C, D, E}
W(D) = {D, E}
W(E) = {E}

Final CFG after removing unit productions:
S → AB
A → a
B → b | ab | aBa
C → ab | aBa
D → ab | aBa
E → ab | aBa
Construction of Reduced Grammar
Remove Unit Productions

S → aS | a | AB | A | B
D → b

W(S) = {S, A, B}
W(A) = {A}
W(B) = {B}
W(D) = {D}

Final CFG after removing unit productions:
S → aS | a | AB
D → b

What happens if we remove useless symbols as well?
Chomsky Normal Form - CNF

Chomsky Normal Form (CNF) is a way of organizing and simplifying the production rules of a context-free
grammar (CFG) to assist in various computational processes.

A CFG is in Chomsky Normal Form if all of its production rules satisfy one of the following conditions:

 A non-terminal produces exactly two non-terminals, A → BC, where A, B, and C are non-terminal symbols.

 A non-terminal produces exactly one terminal symbol, A → 0, where 0 is a terminal symbol.

 Optionally, a rule that allows the start symbol to produce the empty string, S → ε, if necessary for deriving ε from the grammar.

Chomsky Normal Form - CNF
Converting a general CFG to Chomsky Normal Form involves several steps:

1. Remove ε-productions, except for the start symbol
2. Remove unit-productions, e.g. A → B
3. Remove useless symbols (optional)
   I. Generating Variables
   II. Reachable symbols (variables and terminals)
4. Finally, convert the resultant CFG to CNF

Chomsky Normal Form - CNF
Reduce the following grammar to its equivalent CNF:
S → aAD
A → aB | bAB
B → b
D → d

Since there is no null or unit production in the original grammar, we can directly start converting it to CNF.

Introduce P → a, so S → aAD becomes S → PAD; then split the body: S → PQ with Q → AD.
A → aB becomes A → PB.
A → bAB becomes A → BAB (reusing B → b), then A → BR with R → AB.

CNF:
S → PQ
Q → AD
A → PB | BR
R → AB
P → a
B → b
D → d
Chomsky Normal Form – CNF (cont.)
Reduce the following grammar to its equivalent CNF:
S → aAbB
A → aA | a
B → bB | b
D → b

Since there is no null or unit production in the original grammar, we can directly start converting it to CNF.

Introduce P → a and reuse D → b, so S → aAbB becomes S → PADB; then split the body: S → MN with M → PA and N → DB.
A → aA becomes A → PA.
B → bB becomes B → DB.

CNF:
S → MN
M → PA
N → DB
A → PA | a
B → DB | b
P → a
D → b
The CYK algorithm – a dynamic programming approach for parsing

• The CYK algorithm (Cocke–Younger–Kasami algorithm) is a prominent parsing algorithm for context-free grammars, particularly useful when the grammar is presented in CNF.
• The CYK algorithm is used to decide whether a given string belongs to the language generated by a grammar.
• The algorithm uses dynamic programming to build a table (often a triangular array) that represents possible substrings of the input string and their corresponding derivations according to the grammar rules.

CNF grammar G:
• S  AB | BC
• A  BA | a
• B  CC | b
• C  AB | a

w is baaba

Question: Is baaba in L(G)?
The CYK algorithm – a dynamic programming approach for parsing

CNF grammar G:  S  AB | BC;  A  BA | a;  B  CC | b;  C  AB | a

The CYK table for w = baaba (row for length l holds the variables deriving each substring of that length):

length 5:  {S, A, C}
length 4:  Ø         {S, A, C}
length 3:  Ø         {B}       {B}
length 2:  {S, A}    {B}       {S, C}    {S, A}
length 1:  {B}       {A, C}    {A, C}    {B}       {A, C}
            b          a         a         b          a

S appears in the top cell, so baaba ∈ L(G).
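The table above can be reproduced with a short CYK implementation. The split of the CNF productions into terminal and binary rules is our own encoding:

```python
# cyk(w, term, binary, start): True iff start =>* w under the CNF grammar.
# term:   list of (A, a) for rules A -> a
# binary: list of (A, (B, C)) for rules A -> BC

def cyk(w, term, binary, start):
    n = len(w)
    # table[i][l] = set of variables deriving w[i : i + l + 1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, a in enumerate(w):
        table[i][0] = {A for A, t in term if t == a}
    for l in range(1, n):                 # substring length minus 1
        for i in range(n - l):
            for k in range(l):            # split point
                for A, (B, C) in binary:
                    if B in table[i][k] and C in table[i + k + 1][l - k - 1]:
                        table[i][l].add(A)
    return start in table[0][n - 1]

term = [("A", "a"), ("B", "b"), ("C", "a")]
binary = [("S", ("A", "B")), ("S", ("B", "C")), ("A", ("B", "A")),
          ("B", ("C", "C")), ("C", ("A", "B"))]
print(cyk("baaba", term, binary, "S"))  # True, matching the worked table
```

The triple loop gives the familiar O(n³·|G|) running time, and the bottom row of `table` is exactly the length-1 row of the slide's triangle.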
Chomsky hierarchy: Chomsky classification of grammars

Grammar Type | Grammar Accepted           | Language Accepted               | Automaton
Type 0       | Unrestricted Grammar       | Recursively Enumerable Language | Turing Machine
Type 1       | Context Sensitive Grammar  | Context Sensitive Language      | Linear-bounded Automaton
Type 2       | Context Free Grammar       | Context Free Language           | Push Down Automaton
Type 3       | Regular Grammar            | Regular Language                | Finite Automaton
Push Down Automata
A PDA is a 7-tuple (Q, Σ, Γ, δ, q0, Z0, F)
• Q is a finite set of states
• Σ is a finite set of input symbols
• Γ is a finite stack alphabet
• δ is the transition function that governs the behaviour of the automaton

Formally, δ takes as argument a triple δ(q, a, X), where
 i. q is a state in Q
 ii. a is either an input symbol in Σ or a = ε, the empty string, which is assumed not to be an input symbol.
 iii. X is a stack symbol, that is, a member of Γ.
 o The output of δ is a finite set of pairs (p, γ), where p is the new state, and γ is the string of stack symbols that replaces X at the top of the stack.
 o For instance, if γ = ε, then the stack is popped; if γ = X, then the stack is unchanged; and if γ = YZ, then X is replaced by Z, and Y is pushed onto the stack.

• q0 ∈ Q is the initial state (only one)
• Z0 ∈ Γ is the initial stack symbol; the PDA’s stack initially consists of one instance of this symbol and nothing else.
• F ⊆ Q is a set of final states (zero or more)
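A sketch of a PDA simulator that explores configurations nondeterministically and accepts by final state. The example transition table (states q0/q1/qf, stack symbols Z and 0) is our own construction for {0ⁿ1ⁿ : n ≥ 1}, not taken from the slides:

```python
# delta maps (state, input-or-"", top-of-stack) to a list of
# (new-state, push-string) pairs, as in the 7-tuple definition.

def run_pda(delta, q0, Z0, finals, w):
    # configuration: (state, remaining input, stack as string with top at left)
    todo, seen = [(q0, w, Z0)], set()
    while todo:
        q, rest, stack = todo.pop()
        if (q, rest, stack) in seen:
            continue                      # guard against epsilon-loops
        seen.add((q, rest, stack))
        if not rest and q in finals:
            return True                   # input consumed in a final state
        if stack:
            X, moves = stack[0], []
            if rest:                      # consume one input symbol
                moves += [(p, g, rest[1:]) for p, g in delta.get((q, rest[0], X), [])]
            moves += [(p, g, rest) for p, g in delta.get((q, "", X), [])]
            for p, gamma, r in moves:     # gamma replaces X on top of the stack
                todo.append((p, r, gamma + stack[1:]))
    return False

delta = {
    ("q0", "0", "Z"): [("q0", "0Z")],   # push a 0 for each 0 read
    ("q0", "0", "0"): [("q0", "00")],
    ("q0", "1", "0"): [("q1", "")],     # start matching 1's: pop one 0
    ("q1", "1", "0"): [("q1", "")],
    ("q1", "", "Z"): [("qf", "Z")],     # all 0's matched: move to final state
}
print(run_pda(delta, "q0", "Z", {"qf"}, "0011"))  # True
print(run_pda(delta, "q0", "Z", {"qf"}, "001"))   # False
```

The push-string convention matches the definition above: γ = "" pops, γ = X leaves the stack unchanged, and a two-symbol γ pushes with its first symbol on top.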
A Push Down Automaton (PDA)

[examples omitted]
The derivation in a PDA

Deterministic PDA

Example: the language L of strings with more 0’s than 1’s.

• Deterministic Push Down Automata (DPDA) follows deterministic rules


for its transitions.
• a DPDA ensures that for any combination of current state, input
symbol, and stack symbol, there is at most one valid transition.

To be a DPDA, the following two conditions must be met:

1. For any state q ∈ Q, any input symbol s ∈ Σ ∪ {ε}, and any stack symbol t ∈ Γ, the set δ(q, s, t) has at most one element.

2. For any state q ∈ Q and any stack symbol t ∈ Γ, if δ(q, ε, t) is not empty, then δ(q, s, t) = ∅ for each s ∈ Σ.

PDA Accepted by Final State
Let P = (Q, Σ, Γ, δ, q0, Z0, F) be a PDA. Then L(P), the language accepted by P by final state, is

L(P) = { w | (q0, w, Z0) ⊢* (q, ε, α) }

• for some state q in F and any stack string α.
• That is, starting in the initial ID with w waiting on the input, P consumes w from the input and enters an accepting state.
• The contents of the stack at that time are irrelevant.

PDA Accepted by Empty Stack
Let P = (Q, Σ, Γ, δ, q0, Z0, F) be a PDA. We define

N(P) = { w | (q0, w, Z0) ⊢* (q, ε, ε) }

• for any state q. That is, N(P) is the set of inputs w that P can consume and at the same time empty its stack.
PDA: Conversion from Empty Stack to Final State

PDA: Conversion from Final State to Empty Stack

Summary of Transformation

From Context Free Grammar to PDA
I → a | b | Ia | Ib | I0 | I1
E → I | E * E | E + E | (E)
• The set of terminals for the PDA is {a, b, 0, 1, (, ), +, ∗}.
• These eight symbols and the symbols I and E form the stack alphabet.
• a) δ(q, ϵ, I) = {(q, a), (q, b), (q, Ia), (q, Ib), (q, I0), (q, I1)}
• b) δ(q, ϵ, E) = {(q, I), (q, E + E), (q, E * E), (q, (E))}

• c) For each terminal symbol, pop it from the stack when it matches the current input:
   δ(q, a, a) = {(q, ϵ)}     δ(q, b, b) = {(q, ϵ)}
   δ(q, 0, 0) = {(q, ϵ)}     δ(q, 1, 1) = {(q, ϵ)}
   δ(q, (, () = {(q, ϵ)}     δ(q, ), )) = {(q, ϵ)}
   δ(q, +, +) = {(q, ϵ)}     δ(q, *, *) = {(q, ϵ)}
Turing Machine and Undecidability
• The Turing machine, conceptualized by Alan Turing in the 1930s, is a theoretical
computational device that models the behaviour of a general-purpose
computer.
• Turing machines are used to study the limits and capabilities of computation
systems, and to prove that certain problems are undecidable.

• An undecidable problem is a problem that cannot be solved by a Turing


machine.
• One example of an undecidable problem is the Halting Problem, proposed by Turing in 1936, who proved it undecidable.

• The halting problem asks whether there exists a program that can determine, for any
given program and input, whether that program will eventually halt or continue running
indefinitely.

Turing Machine and Undecidability (cont.)
• The existence of undecidable problems has important implications for the
theory of computation.
• It means that there are some problems that cannot be solved by computers, no
matter how powerful they are.
• This has led to the development of new techniques for solving problems that
are not undecidable, such as approximation algorithms and heuristics.

Nondeterministic Polynomial (NP)
• In computational complexity theory, NP refers to the class of decision
problems for which a "yes" instance can be verified in polynomial time.
• In other words, if someone claims to have a solution to an NP problem, it can
be verified efficiently. However, finding a solution itself may not be
computationally efficient.

Nondeterministic Polynomial (NP)
• The relationship between NP and undecidable problems lies in the concept of the "P versus NP" problem.
• This problem asks whether every problem for which a solution can be verified
in polynomial time (NP) can also be solved in polynomial time (P).
In other words, is NP equal to P?
• If P = NP, it would mean that any problem with an efficient verification
algorithm also has an efficient solution algorithm.

• However, if P ≠ NP, it implies that there are problems for which no efficient
solution algorithm exists, even though a solution can be verified efficiently.
• This would indicate a fundamental gap between the ability to verify solutions
and the ability to find solutions.
Relationship between Undecidable and NP
• Undecidable problems and NP are connected in the sense that undecidable
problems generally fall outside the jurisdiction of NP.

• Undecidable problems, such as the halting problem, are beyond the scope of
computation, regardless of whether efficient verification algorithms exist.

• NP problems, on the other hand, are within the reach of computation, but
finding efficient solutions remains an open question for many of them.

Turing Machine
A Turing Machine can be defined by 7-tuple (Q, Σ, Γ, δ, q0, B, F)

• Q is a finite set of states
• Σ is a finite set of input symbols
• Γ is the complete set of tape symbols; Σ is always a subset of Γ
• δ is the transition function that governs the behaviour of the automaton.
Formally, the arguments of δ(q, X) are a state q and a tape symbol X; the value
of δ(q, X), when defined, is a triple (p, Y, D), where:
i. p is the next state, in Q
ii. Y is the symbol, in Γ, written in the cell being scanned, replacing whatever symbol was there.
iii. D is the direction, either L or R, standing for left or right, respectively, and telling us the
direction in which the head moves.
• q0 ∈ Q is the start state, a member of Q, in which the finite control is found initially.
• B ∈ Γ is the blank symbol; it is in Γ but not in Σ.
• F ⊆ Q is the set of final states (zero or more)

Turing Machine (Q, Σ, Γ, δ, q0, B, F)

Sequence of instantaneous descriptions (IDs) on input 0011:

q00011 ⊢ Xq1011 ⊢ X0q111 ⊢ Xq20Y1 ⊢ q2X0Y1 ⊢

Xq00Y1 ⊢ XXq1Y1 ⊢ XXYq11 ⊢ XXq2YY ⊢ Xq2XYY ⊢

XXq0YY ⊢ XXYq3Y ⊢ XXYYq3B ⊢ XXYYq4B
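The machine producing this ID sequence is the textbook TM for { 0ⁿ1ⁿ | n ≥ 1 }. A minimal simulator of the 7-tuple definition, sketched below; the helper name `run_tm`, the dictionary encoding of δ, and the step budget are implementation choices of the sketch:

```python
# Simulate a TM given delta as a dict: (state, symbol) -> (state, symbol, direction).
def run_tm(delta, word, q0='q0', blank='B', finals=('q4',), max_steps=10_000):
    tape = dict(enumerate(word))            # sparse tape: position -> symbol
    state, head = q0, 0
    for _ in range(max_steps):
        if state in finals:                 # reached a final state: accept
            return True
        key = (state, tape.get(head, blank))
        if key not in delta:                # no move defined: halt and reject
            return False
        state, tape[head], direction = delta[key]
        head += 1 if direction == 'R' else -1
    return False                            # step budget exhausted

# Transition table of the { 0^n 1^n } machine traced above.
delta = {
    ('q0', '0'): ('q1', 'X', 'R'), ('q0', 'Y'): ('q3', 'Y', 'R'),
    ('q1', '0'): ('q1', '0', 'R'), ('q1', '1'): ('q2', 'Y', 'L'),
    ('q1', 'Y'): ('q1', 'Y', 'R'),
    ('q2', '0'): ('q2', '0', 'L'), ('q2', 'X'): ('q0', 'X', 'R'),
    ('q2', 'Y'): ('q2', 'Y', 'L'),
    ('q3', 'Y'): ('q3', 'Y', 'R'), ('q3', 'B'): ('q4', 'B', 'R'),
}
print(run_tm(delta, '0011'))    # True: follows exactly the ID sequence above
print(run_tm(delta, '0010'))    # False
```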

Turing Machine TM = (Q, Σ, Γ, δ, q0, B, F)

Final States vs Halting States in Turing Machine
Halting State:
• A state in a Turing Machine where computation stops.
• Once a halting state is reached, no further transitions occur, and the
machine halts. This includes both:

• Accepting State (qaccept): a special type of halting state in which the input is
accepted by the machine.
• Rejecting State (qreject): a halting state in which the machine rejects the input.

Turing Machine Example TM = (Q, Σ, Γ, δ, q0, B, F)

Construct a Turing machine that accepts strings over alphabet {0, 1} where the last symbol is 1.
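One possible construction, sketched: the machine scans right, remembering in its state whether the most recently read symbol was 1, and accepts when it reaches the blank in that state. The state names and the tiny simulator below are choices of this sketch, not the only solution.

```python
# q0: last symbol seen was not 1; q1: last symbol seen was 1;
# q2: accepting state, reached when blank B is read in q1.
delta_last1 = {
    ('q0', '0'): ('q0', '0', 'R'), ('q0', '1'): ('q1', '1', 'R'),
    ('q1', '0'): ('q0', '0', 'R'), ('q1', '1'): ('q1', '1', 'R'),
    ('q1', 'B'): ('q2', 'B', 'R'),   # blank right after a 1: accept
}

def accepts_last1(w, max_steps=10_000):
    tape, state, head = dict(enumerate(w)), 'q0', 0
    for _ in range(max_steps):
        if state == 'q2':
            return True
        key = (state, tape.get(head, 'B'))
        if key not in delta_last1:   # undefined move (e.g. blank in q0): reject
            return False
        state, tape[head], move = delta_last1[key]
        head += 1 if move == 'R' else -1
    return False

print(accepts_last1('0101'))   # True: last symbol is 1
print(accepts_last1('0110'))   # False: last symbol is 0
```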

Strength of Turing Machine
The Halting Problem
You’re given a description of a Turing Machine M and some input w. You
want to determine whether M halts when run on w.
• H= {⟨M,w⟩ ∣ M halts on input w}.

• The Halting Problem H is undecidable, meaning no Turing Machine can
• always halt on all inputs ⟨M,w⟩, and
• reliably say “yes” if M halts on w and “no” if M does not halt on w.

• In other words, a Turing Machine cannot always solve the Halting Problem by giving a correct
yes/no answer in finite time on every input.

Strength of Turing Machine
Semi-Decidability (Recognizability) or Recursively Enumerable
This means there is a Turing Machine R that “recognizes” membership in H.

If M does halt on w, then R will eventually halt and accept.

If M does not halt on w, then R either runs forever (never halts) or otherwise
never accepts.

Key difference:
• A semi-decider (recognizer) does not have to reject in finite time if ⟨M,w⟩ is not in the language;
it’s allowed to run forever.
• But whenever the input is in the language, it must eventually say “yes, I accept.”

How do we build such a Recognizer
Take an input pair ⟨M,w⟩ and put it on the tape
Start running M on w
- if M halts, R halts, and accepts
- if M never halts, R never halts, i.e. if M keeps running forever, then so does R.
Since R accepts exactly those pairs ⟨M,w⟩ for which M halts on w, R recognizes the Halting
Problem.
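The construction above can be sketched in Python by modeling "programs" as generators, where each `yield` is one computation step and returning normally means halting. This modeling and the function names are assumptions of the sketch.

```python
def recognizer(program, w):
    """R: accept (return True) iff `program` halts on input w.
    If `program` runs forever, this loop never finishes: exactly the
    semi-deciding behaviour described above."""
    for _step in program(w):
        pass                       # consume one step of M's computation
    return True                    # the loop ended, so M halted: accept

def halting_program(w):            # a program that halts after len(w) steps
    for _ in w:
        yield

def looping_program(w):            # a program that never halts
    while True:
        yield

print(recognizer(halting_program, "abc"))   # True: accepted in finite time
# recognizer(looping_program, "abc") would never return: semi-decision only.
```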

Undecidable: No Turing Machine can guarantee a yes/no answer for every input and always halt in
finite time.
Semi-decidable (recursively enumerable): There is a Turing Machine that will halt and accept
exactly the inputs that are in the language. For inputs not in the language, that machine might run
forever (never halting).
Turing Machines can partially solve it in the sense that they can recognize which inputs belong to
the language H; they just cannot also reliably recognize those that don't (they might get stuck
running forever in the latter case).
How do we build such a Recognizer
• The Halting Problem is semi-decidable because we can detect halting (a "yes" instance) but
cannot reliably determine non-halting (a "no" instance).
• Semi-decidability often involves an algorithm that provides a partial solution: It can confirm
halting behaviour but cannot conclusively identify non-halting behaviour.

Another undecidable problem
Why is Program Equivalence Undecidable?

• The Program Equivalence Problem is undecidable because it is as hard (or harder) than the
Halting Problem, which is already proven to be undecidable.
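The reduction can be sketched as follows: from any machine M and input w, construct two programs whose equivalence encodes whether M halts on w. The constructor names below are illustrative choices of the sketch.

```python
# Sketch of the reduction from the Halting Problem to Program Equivalence.
def make_p1(M, w):
    """Build a program that runs M on w (possibly forever), then returns 0."""
    def p1():
        M(w)          # may never return
        return 0
    return p1

def p2():
    return 0          # always returns 0 immediately

# p1 behaves identically to p2 exactly when M halts on w. A decider for
# program equivalence would therefore decide the Halting Problem, which is
# impossible; hence program equivalence is undecidable.
p1 = make_p1(lambda w: None, "some input")   # here M trivially halts
print(p1() == p2())                          # True: equivalent on this M, w
```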
