BCS503 Notes 1
Theory of Computation
Some Notes
Alphabet
An alphabet is a finite non-empty set of symbols. It is
usually denoted by the Greek capital letter Σ. Typical
examples of alphabets are the binary alphabet {0, 1} and
the set of lower case English letters {a, b, c, ..., z}.
String
A string is a finite sequence of symbols from an alphabet.
It is usually denoted by w. The length of a string w,
denoted |w|, is the number of symbols in the sequence
w. For example |abc| = 3 and |01001| = 5. We also have
the empty string containing no symbol (or 0 symbols),
which is denoted by the Greek letter ε (epsilon). |ε| = 0.
Power of an Alphabet
If n ≥ 0 is an integer, Σ^n represents the set of all strings
of length n over the alphabet Σ. Note that Σ^0 = {ε}.
The set of all strings over Σ, written Σ*, is
defined as Σ^0 ∪ Σ^1 ∪ Σ^2 ∪ Σ^3 ∪ Σ^4 ..., where ∪ is the
set union operator.
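As a small illustration, the sets Σ^n and a finite part of Σ* can be
enumerated in Python. This is only a sketch: the function names power
and star_up_to are chosen here for illustration, and since Σ* is
infinite, only strings up to a given length are produced.

```python
from itertools import product

def power(sigma, n):
    """Return Σ^n: the set of all strings of length n over alphabet sigma."""
    return {"".join(p) for p in product(sorted(sigma), repeat=n)}

def star_up_to(sigma, max_len):
    """Return the strings of Σ* whose length is at most max_len."""
    result = set()
    for n in range(max_len + 1):
        result |= power(sigma, n)   # Σ^0 ∪ Σ^1 ∪ ... ∪ Σ^max_len
    return result

binary = {"0", "1"}
print(power(binary, 0))             # the empty string is the only member
print(sorted(power(binary, 2)))
print(len(star_up_to(binary, 3)))   # 1 + 2 + 4 + 8 strings
```

Here the empty string ε is represented by the Python string "".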
Language
A language over an alphabet is a set of strings over that
alphabet. That is, a language L over Σ is a subset of Σ*,
written L ⊆ Σ*. Note that L can be either finite or infinite.
An example of a finite language over {0, 1} is {01, 11,
0011, 000011} and an example of an infinite language
over {0, 1} is L = {w | w has the same number of 0’s as
1’s}.
Finite Automata
A finite automaton is an automaton or machine which
can exist in only one of a finite number of states at any
given time.
Deterministic Finite Automata
A deterministic finite automaton or DFA has a finite set
of states Q, an alphabet Σ, a state transition function δ,
an initial or start state q0, and a set of final states F ⊆ Q.
In other words, a DFA A = (Q, Σ, δ, q0, F) is a
5-tuple. The transition function δ can be represented by
a transition diagram or a transition table. For every state
and input symbol, there is exactly one next state:
δ : Q × Σ → Q
δ can be extended to a string w in Σ*. Let δ^ be the
extended transition function, defined for all q in Q by:
1. δ^(q, ε) = q
2. δ^(q, xa) = δ(δ^(q, x), a) for x in Σ* and a in Σ.
The language of A is L(A) = {w | δ^(q0, w) is in F}.
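The extended transition function can be sketched in Python as a loop
over the symbols of w, which computes the same value as the recursive
definition. The DFA EVEN_ZEROS below is a hypothetical example (it is
not taken from these notes) that accepts binary strings with an even
number of 0's; the transition table is stored as a dictionary.

```python
def extended_delta(delta, q, w):
    """Compute δ^(q, w) by applying δ once per symbol of w, left to right."""
    for a in w:
        q = delta[(q, a)]   # δ^(q, xa) = δ(δ^(q, x), a)
    return q                # for w = ε the loop does nothing: δ^(q, ε) = q

def accepts(dfa, w):
    """Return True if w is in L(A), that is, δ^(q0, w) is in F."""
    Q, sigma, delta, q0, F = dfa
    return extended_delta(delta, q0, w) in F

# Hypothetical DFA (Q, Σ, δ, q0, F): even number of 0's
EVEN_ZEROS = (
    {"even", "odd"},
    {"0", "1"},
    {("even", "0"): "odd", ("even", "1"): "even",
     ("odd", "0"): "even", ("odd", "1"): "odd"},
    "even",
    {"even"},
)

for w in ["", "0", "1001", "000"]:
    print(repr(w), accepts(EVEN_ZEROS, w))
```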
Definition of ECLOSE
Basis: For any state q, ECLOSE(q) contains q.
Induction: If p is in ECLOSE(q) and there is an ε-
transition from state p to state r, then r is also in
ECLOSE(q).
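The basis and induction steps translate directly into a simple search.
The sketch below assumes the ε-transitions are given as a dictionary
eps mapping each state to the set of states reachable by one
ε-transition; this representation and the sample automaton EPS are
assumptions made for illustration.

```python
def eclose(eps, q):
    """Return ECLOSE(q) for the ε-transition relation eps."""
    closure = {q}                 # Basis: ECLOSE(q) contains q
    stack = [q]
    while stack:
        p = stack.pop()
        for r in eps.get(p, ()):  # Induction: p in ECLOSE(q), p --ε--> r
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure

# Hypothetical ε-transitions: q3 -> q0 -> q1 -> q2
EPS = {"q0": {"q1"}, "q1": {"q2"}, "q3": {"q0"}}
print(sorted(eclose(EPS, "q0")))
```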
Regular Expressions
Basis:
1. The empty set ∅ is a regular expression representing
the language ∅ = {}.
2. The empty string ε is a regular expression
representing the language {ε}.
3. If a is in the alphabet, a is a regular expression
representing {a}.
Induction:
Let E and F be regular expressions standing for L(E) and
L(F) respectively. Then:
1. E+F represents L(E) union L(F)
2. E.F or simply EF represents L(E).L(F)
3. E* represents (L(E))*
4. (E) represents the same language as E. Parentheses
are used only to group operators and to override the
usual order of precedence.
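The three operators can be tried out on small languages represented as
Python sets of strings. This is a sketch under stated assumptions:
the function names are chosen for illustration, and because L(E)* is
usually infinite, star only returns the strings up to a given length.

```python
EMPTY = set()        # the language of the regular expression ∅
EPSILON = {""}       # the language of the regular expression ε

def union(L1, L2):
    """L(E+F) = L(E) ∪ L(F)."""
    return L1 | L2

def concat(L1, L2):
    """L(EF) = {xy | x in L(E), y in L(F)}."""
    return {x + y for x in L1 for y in L2}

def star(L, max_len):
    """Strings of L* whose length is at most max_len."""
    result = {""}
    frontier = {""}
    while frontier:
        # extend the newest strings by one more string from L
        frontier = {x + y for x in frontier for y in L
                    if len(x + y) <= max_len} - result
        result |= frontier
    return result

print(sorted(concat({"0"}, union({"1"}, {"11"}))))   # language of 0(1+11)
print(sorted(star({"01"}, 4)))                       # part of (01)*
```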
Proving Languages not to be regular
We use the pumping lemma for regular languages
(Theorem 4.1 from the textbook) to show that some
languages are not regular.
Example 1
L = {0^n1^n | n ≥ 0}. Suppose, if possible, that L is regular.
Let n be the constant given by the pumping lemma (n is the
pumping length). Consider w = 0^n1^n, which is in L, with
|w| = 2n ≥ n. By the pumping lemma, w = xyz where |xy| ≤ n,
y ≠ ε, and xy^iz is in L for every i ≥ 0. Since |xy| ≤ n,
y consists only of 0’s. Let |y| = m > 0. Consider the string
xy^iz. The number of 0’s in it is n - m + im and the number
of 1’s is n. These two numbers are not equal unless i = 1.
For example, if i = 2, the number of 0’s is n + m, which is
different from n, so xy^2z is not in L.
This contradicts the lemma. Therefore L is not regular.
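The argument can be checked by brute force for small values of n. The
sketch below tries every split w = xyz with |xy| ≤ n and |y| > 0 and
confirms that pumping with i = 2 always leaves L. It is only an
illustration for a few values of n, not a replacement for the proof;
the function names are chosen here for illustration.

```python
def in_L(s):
    """Membership test for L = {0^n 1^n | n >= 0}."""
    half = len(s) // 2
    return len(s) % 2 == 0 and s == "0" * half + "1" * half

def every_split_fails(n):
    """True if xy^2z is outside L for every allowed split of 0^n 1^n."""
    w = "0" * n + "1" * n
    for xy_len in range(1, n + 1):        # |xy| <= n
        for x_len in range(xy_len):       # |y| = xy_len - x_len > 0
            x, y, z = w[:x_len], w[x_len:xy_len], w[xy_len:]
            if in_L(x + y * 2 + z):       # pump with i = 2
                return False
    return True

print(all(every_split_fails(n) for n in range(1, 8)))
```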
Pushdown Automata
A pushdown automaton P is a 7-tuple (Q, Σ, Γ, δ, q0, Z0,
F). Q is a finite set of states the machine can be in. Σ is
the input alphabet. The input is scanned from left to
right and is either accepted or rejected. P has a stack
which can grow without any limit. The symbols which
can be pushed on the stack are given by the stack
alphabet Γ. The state q0 in Q is the starting state of P. F
is a subset of Q (not necessarily proper). If P reaches a
state in F, it accepts the input seen so far. The string of
symbols on the stack is written from left to right, with
the leftmost symbol being on the top of the stack. When
P starts, it has only one symbol on the stack, the start
symbol Z0 from Γ. Moves of
P are given by the transition function δ. When the
current state of P is q, the input scanned is symbol a
from Σ, and the stack contains a symbol X from Γ on the
top, the next state can be given by p and symbol X can
be replaced by string γ from Γ*. So, δ(q, a, X) contains (p,
γ). Note that δ is, in general, multi-valued: δ(q, a, X) is
a finite set of pairs of the form (p, γ). This means PDA P
is in general non-deterministic. Not only that, P can
have ε-moves, that is, moves without consuming any
input. We write an ε-move as δ(q, ε, X), which is a set of
pairs of the form (p, γ).
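A non-deterministic PDA can be simulated by exploring all reachable
configurations (state, input position, stack). The sketch below makes
several assumptions: δ is a dictionary from (q, a, X) to a set of
pairs (p, γ), the empty string "" stands for ε, every stack symbol is
a single character, and acceptance is by final state. The sample PDA
for {0^n1^n | n ≥ 0} is hypothetical and not taken from these notes.
The search terminates for this example, but a PDA whose ε-moves grow
the stack forever would need a step limit.

```python
from collections import deque

def pda_accepts(delta, q0, Z0, F, w):
    """Breadth-first search over configurations; stack top is stack[0]."""
    start = (q0, 0, Z0)
    seen = {start}
    queue = deque([start])
    while queue:
        q, i, stack = queue.popleft()
        if i == len(w) and q in F:
            return True                       # all input read, final state
        if not stack:
            continue                          # no move on an empty stack
        X, rest = stack[0], stack[1:]
        moves = [(i, m) for m in delta.get((q, "", X), ())]   # ε-moves
        if i < len(w):
            moves += [(i + 1, m) for m in delta.get((q, w[i], X), ())]
        for j, (p, gamma) in moves:
            nxt = (p, j, gamma + rest)        # replace X by γ
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

# Hypothetical PDA: state q pushes 0's, state p pops them, f accepts.
DELTA = {
    ("q", "0", "Z"): {("q", "0Z")},
    ("q", "0", "0"): {("q", "00")},
    ("q", "", "Z"): {("p", "Z")},
    ("q", "", "0"): {("p", "0")},
    ("p", "1", "0"): {("p", "")},
    ("p", "", "Z"): {("f", "Z")},
}

for w in ["", "01", "0011", "001", "10"]:
    print(repr(w), pda_accepts(DELTA, "q", "Z", {"f"}, w))
```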
Normal Forms
A context-free grammar can be put in a normal form. A
normal form restricts the types of productions. We discuss
two normal forms: Chomsky Normal Form and Greibach
Normal Form.
Chomsky Normal Form (CNF)
A context-free grammar is in CNF if every production is
of the form A → BC or A → a where A, B and C are
variables and a is a terminal.
For any context-free language which is not empty and
which does not contain the null string ε, we can find a
context-free grammar which is in CNF.
To do this, we start with a context-free grammar for the
language and then follow the three steps:
1. Eliminate ε-Productions (Productions of the form A →
ε).
2. Eliminate Unit productions (of the form A → B).
3. Eliminate Useless Symbols.
A grammar symbol X (variable or terminal) is useful if S
=>* αXβ =>* γ for some γ in Σ*. That is X occurs in a
derivation from the start symbol which yields a string of
terminals. X is useless if it is not useful. To eliminate all
useless symbols, the following steps need to be followed
in proper order:
1. Eliminate non-generating variables. That is,
eliminate any variable A from which we cannot
derive a string of terminals.
2. Eliminate unreachable symbols. That is, eliminate
all grammar symbols which cannot be reached from
S.
After these steps, it is easy to convert the grammar into
CNF by the following procedure:
1. Keep all productions of the form A → a or A → BC
unchanged.
2. In every remaining production, replace each terminal a
in the body by a new variable whose only production
yields a, so that the remaining productions are of the
form A → α where α is a string of two or more variables.
3. Break the productions A → α where |α| > 2: Let α =
Bβ. Replace the production with A → BC and C → β, where
C is a new variable.
Repeat step 3 until every production is in CNF.
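The procedure can be sketched in Python. The sketch assumes the
grammar already has no ε-productions, unit productions or useless
symbols, that the variables are exactly the dictionary keys, and that
each body is a tuple of symbols. The generated names T_a and C_k are
hypothetical and are assumed not to clash with existing variables.

```python
def to_cnf(prods):
    """prods: dict mapping a variable to a list of bodies (tuples)."""
    variables = set(prods)
    new = {A: [] for A in prods}
    term_var = {}                       # terminal -> its new variable

    def var_for(a):
        if a not in term_var:
            term_var[a] = f"T_{a}"
            new[term_var[a]] = [(a,)]   # new production T_a -> a
        return term_var[a]

    counter = 0
    for A, bodies in prods.items():
        for body in bodies:
            if len(body) == 1:          # Step 1: A -> a is kept as it is
                new[A].append(body)
                continue
            # Step 2: make the body a string of variables only
            body = [s if s in variables else var_for(s) for s in body]
            # Step 3: A -> B beta becomes A -> B C and C -> beta
            head = A
            while len(body) > 2:
                counter += 1
                C = f"C_{counter}"
                new[head] = new.get(head, []) + [(body[0], C)]
                head, body = C, body[1:]
            new[head] = new.get(head, []) + [tuple(body)]
    return new

# Hypothetical grammar: S -> aSb | ab
for head, bodies in to_cnf({"S": [("a", "S", "b"), ("a", "b")]}).items():
    print(head, "->", bodies)
```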
Turing Machines
A Turing Machine (TM) is one of the most general models
of computation. TMs are the most powerful of the automata
discussed here in their ability to recognize languages. A TM
is a 7-tuple M = (Q, Σ, Γ, δ, q0, B, F). Q is a finite set of
states. The special state q0 in Q is the start state. M is in
the start state q0 when the machine starts. M has a tape
with infinitely many cells or squares, extending without
limit to the left and to the right, on which tape symbols can
be written. The symbols are from the tape alphabet Γ. Σ is the
input alphabet. In the beginning, the input is provided
on the tape between two infinite sequences of blanks.
The blank B is a special tape symbol which is not in Σ.
All symbols from Σ as well as B are part of Γ. F, a subset
of Q, is the set of final (accepting) states. M has a tape
head which is positioned over one of the cells. Initially,
the tape head is scanning the first input symbol, which is
immediately to the right of a blank.
The transition function δ is given by δ(q, X) = (p, Y, D)
where p and q are states, X and Y are tape symbols and D is
a direction, either left (L) or right (R). This means
that when M is in state q and the tape head is
scanning symbol X, the state changes to p, X is replaced
by Y and the head moves one cell in the direction given by D.
Note that p can be the same as q and Y can be the same
as X. In some cases, δ(q, X) might not even be defined.
Then, M has no further moves; it has halted in state q.
Instantaneous Description or ID of a TM
The ID is given by X_1X_2…X_{i-1}qX_iX_{i+1}…X_n. The n tape
symbols X_1 to X_n are the portion of the tape between the
leftmost non-blank symbol and the rightmost non-blank symbol.
The tape head is positioned over symbol X_i. The
current state is q. The moves of a TM can be visualized
as a change of IDs. For example, if δ(q, X_i) = (p, Y, R), the
ID changes to X_1X_2…X_{i-1}YpX_{i+1}…X_n. Similarly, if
δ(q, X_i) = (p, Y, L), the next ID is
X_1X_2…X_{i-2}pX_{i-1}YX_{i+1}…X_n.
Language of a TM
The language of a TM is the set of all strings w such that
starting from the ID q0w, M reaches a final state in F.
Once a final state is reached, we can assume, there is no
further move. The Turing Machine is said to halt if there
is no further move.
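Acceptance by a TM can be sketched with a small simulator. Everything
in the sketch is an assumption made for illustration: the tape is a
dictionary from cell positions to symbols, δ is a dictionary from
(q, X) to (p, Y, D), and a step limit is used because a TM need not
halt. The sample machine is a common textbook-style construction for
{0^n1^n | n ≥ 0}; it is not necessarily the machine shown in these
notes.

```python
def run_tm(delta, q0, blank, F, w, max_steps=10_000):
    """Return True if the TM accepts w, False if it halts without accepting."""
    tape = dict(enumerate(w))          # cells not stored hold the blank
    q, head = q0, 0                    # initial ID is q0 w
    for _ in range(max_steps):
        if q in F:
            return True                # final state reached: accept
        X = tape.get(head, blank)
        if (q, X) not in delta:
            return False               # δ(q, X) undefined: halted in q
        q, Y, D = delta[(q, X)]
        tape[head] = Y                 # replace X by Y
        head += 1 if D == "R" else -1  # move one cell left or right
    raise RuntimeError("step limit reached; the machine may not halt")

# Hypothetical machine: mark a 0 as X, find the matching 1, mark it as Y.
DELTA = {
    ("q0", "0"): ("q1", "X", "R"), ("q0", "Y"): ("q3", "Y", "R"),
    ("q0", "B"): ("q4", "B", "R"),     # empty input, the case n = 0
    ("q1", "0"): ("q1", "0", "R"), ("q1", "Y"): ("q1", "Y", "R"),
    ("q1", "1"): ("q2", "Y", "L"),
    ("q2", "0"): ("q2", "0", "L"), ("q2", "Y"): ("q2", "Y", "L"),
    ("q2", "X"): ("q0", "X", "R"),
    ("q3", "Y"): ("q3", "Y", "R"), ("q3", "B"): ("q4", "B", "R"),
}

for w in ["", "01", "0011", "001", "011", "10"]:
    print(repr(w), run_tm(DELTA, "q0", "B", {"q4"}, w))
```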
Example of TM
This is a TM which replaces a sequence of 0’s with 1’s
and accepts.
Example of TM which recognizes L = {0^n1^n | n ≥ 0}