
UNIT I

Theory of Computation

Automata theory (also known as Theory of Computation) is a theoretical branch of Computer


Science and Mathematics, which mainly deals with the logic of computation with respect to simple
machines, referred to as automata.

Automata theory enables scientists to understand how machines compute functions and solve problems. The main motivation behind developing automata theory was to develop methods to describe and analyze the dynamic behavior of discrete systems.

The word “automata” originates from “automaton”, which is closely related to “automation”.

Basic Terminologies of Theory of Computation:

Now, let’s understand the basic terminologies, which are important and frequently used in the
Theory of Computation.

Symbol:

A symbol (often also called a character) is the smallest building block; it can be any letter, digit, or picture.

Alphabets (Σ):

An alphabet is a finite set of symbols.


Conclusion:

For the alphabet {a, b}, the number of strings of length n that can be generated is 2^n.

Note: If the number of symbols in the alphabet Σ is |Σ|, then the number of strings of length n possible over Σ is |Σ|^n.
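As a quick sanity check, here is a small Python sketch (the alphabet and length are chosen arbitrarily) that enumerates all strings of length n over an alphabet and confirms the |Σ|^n count by brute force:

from itertools import product

sigma = ['a', 'b']   # the alphabet, |Σ| = 2
n = 3                # string length

strings = [''.join(p) for p in product(sigma, repeat=n)]
print(strings)                           # ['aaa', 'aab', 'aba', ..., 'bbb']
print(len(strings) == len(sigma) ** n)   # True: 8 == 2**3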
The central concepts of Automata theory
Some central concepts of automata theory include:

• Strings: A sequence of symbols from an alphabet, also sometimes called a word. The empty
string, denoted as ε, is a string with zero symbols.

• Formal languages: A language made up of words that follow a specific set of rules and are
derived from an alphabet.

• Automata: A finite representation of a formal language, which can be infinite.

Automata theory is the study of abstract computing devices, or machines, that convert information
from one form to another. It's a core part of computer science and helps us understand the
capabilities and limitations of computers.

Some applications of automata theory include:

• Spelling checkers and advisers: Finite automata can be used to describe large natural
vocabularies.

• Multi-language dictionaries: Finite automata can be used for multi-language dictionaries.

• Document indentation: Finite automata can be used to indent documents.

• Calculators: Finite automata can be used to evaluate complex expressions in calculators.

• Hardware design: Finite automata can be used in hardware design for circuit verification,
designing hardware boards, and more.

First, let’s understand the basic terminologies and concepts that are important and frequently used in Formal Languages and Automata Theory.

1. Symbol:

A symbol is the smallest building block (an abstract entity); it can be any letter, digit, or picture.

Example: Frequently used symbols are: a, b, c, …, z, A, B, …, Z, 0, 1, …, 9

2. Alphabet:

An alphabet is a finite, nonempty set of symbols. It is denoted by Σ(Sigma).

Example: Frequently used alphabets are:

Σ = {0,1} is an alphabet of binary digits

Σ = {0,1,2,….9} is an alphabet of decimal digits

Σ = {a,b,c} is one alphabet


Σ = {A,B,C,….,Z} is an alphabet of Capital Letters

Σ = {0,1,a,b} is one alphabet

Σ = {a,b,c,#} is one alphabet

3. String:

A string is a finite sequence of symbols chosen from some alphabet. It is generally denoted by w or s; a string is also called a word.

Examples:

• abaaab is a string over the alphabet Σ = {a, b}

• 01, 000, 1111, and 01101 are strings over the binary alphabet Σ = {0, 1}

4. Length of a string:

The length of a string is the number of symbols in the string. It is denoted by |w|.

Example: If w=abaa is a string then length is |w|=|abaa|=4.

5. Substring:

A part of a string is called a substring.

Ex: Let a string w=abcd , then

Substrings are: a , b , c , d , ab , bc , cd , abc , bcd , and abcd.

But acd is not a substring. Similarly, ac, ad, bd, and abd are not substrings.

6. Prefix of a string (or) Starting substring:

A prefix of a string is any number of leading symbols of that string.

Ex: Let a string w=abc , then

Prefixes are: ε , a , ab and abc.

7. Suffix of a string (or) Ending substring:

A suffix of a string is any number of trailing symbols of that string.

Ex: Let a string w=abc , then

Suffixes are: ε , c , bc and abc

8. Powers of an alphabet (Powers of Σ):

Let Σ be an alphabet; then Σ^K is the set of all strings of length K over Σ.

If Σ={a,b}

• Σ^1 is the set of all strings of length 1 over Σ:

Σ^1 = {a, b}, and the total number of strings is |Σ^1| = 2


• Σ^2 is the set of all strings of length 2 over Σ:

Σ^2 = Σ.Σ (concatenation) = {aa, ab, ba, bb}, and the total number of strings is |Σ^2| = 4

9. Empty String (ε):

The empty string consists of zero symbols; it is denoted by ε (epsilon).

Σ^0 is the set of all strings of length 0 over Σ:

Σ^0 = {ε}, the length of ε is |ε| = 0, and the total number of strings is |Σ^0| = 1

10. Kleene Closure (Σ*):


The set of all possible strings over an alphabet Σ, including the empty string ε, is called the Kleene closure of Σ and is denoted by Σ*.

Let Σ = {a, b}.

Σ* = Σ^0 ∪ Σ^1 ∪ Σ^2 ∪ Σ^3 ∪ … , i.e., the union of the sets of strings of all lengths.

Σ* = {ε} ∪ {a, b} ∪ {aa, ab, ba, bb} ∪ … , i.e., all possible strings over Σ.

Σ* = {ε, a, b, aa, ab, ba, bb, …}
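Since Σ* is infinite, it can only be enumerated up to a cutoff. A minimal Python sketch (the function name and cutoff are our own choices) that lists Σ* in order of increasing length:

from itertools import product

def kleene_star(sigma, max_len):
    # Yield every string over sigma of length 0..max_len; length 0 yields ε ('').
    for length in range(max_len + 1):
        for p in product(sigma, repeat=length):
            yield ''.join(p)

print(list(kleene_star(['a', 'b'], 2)))
# ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']

Starting the range at 1 instead of 0 would enumerate Σ+ instead, since the only difference is the exclusion of ε.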

11. Positive Kleene Closure (Σ+):

The set of all possible strings over an alphabet Σ, excluding the empty string ε. For Σ = {a, b}:

Σ+ = Σ^1 ∪ Σ^2 ∪ Σ^3 ∪ …

Σ+ = {a, b} ∪ {aa, ab, ba, bb} ∪ …

12. Language:

A set of strings, all of which are chosen from some Σ* (where Σ is a particular alphabet), is called a language. It is denoted by L.

Or

we can say a language is a subset of Σ*.

• Σ* is the superset of all languages over Σ, i.e., if L is a language, then L ⊆ Σ*


DFA (Deterministic finite automata)
o DFA refers to deterministic finite automata. Deterministic refers to the uniqueness of the computation: the machine reads an input string one symbol at a time, and each symbol determines a unique next state.

o In DFA, there is only one path for specific input from the current state to the next state.

o DFA does not accept the null move, i.e., the DFA cannot change state without any input
character.

o DFA can contain multiple final states. It is used in lexical analysis in compilers.

In the following diagram, we can see that from state q0 for input a, there is only one path which is
going to q1. Similarly, from q0, there is only one path for input b going to q2.

Formal Definition of DFA

A DFA is defined by a 5-tuple, the same as described in the definition of FA:

1. Q: finite set of states

2. ∑: finite set of input symbols

3. q0: initial state

4. F: set of final states

5. δ: transition function

Transition function can be defined as:

1. δ: Q × ∑ → Q
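Because δ returns exactly one state for every (state, symbol) pair, running a DFA is a simple loop. A minimal Python sketch (the encoding of δ as a dictionary is our own choice, not a standard API):

def run_dfa(delta, start, finals, w):
    # delta: dict mapping (state, symbol) -> state
    # Returns True iff the DFA accepts string w.
    state = start
    for ch in w:
        state = delta[(state, ch)]   # deterministic: exactly one next state
    return state in finals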

Graphical Representation of DFA

A DFA can be represented by a digraph called a state diagram, in which:


1. States are represented by vertices.

2. Arcs labeled with input characters show the transitions.

3. The initial state is marked with an arrow.

4. A final state is denoted by a double circle.

Example 1:

1. Q = {q0, q1, q2}

2. ∑ = {0, 1}

3. q0 = {q0}

4. F = {q2}

Solution:

Transition Diagram:

Transition Table:

Present State | Next State for Input 0 | Next State for Input 1
→q0           | q0                     | q1
q1            | q2                     | q1
*q2           | q2                     | q2

Example 2:

DFA with ∑ = {0, 1} that accepts all strings starting with 0.

Solution:
Explanation:

o In the above diagram, we can see that when the DFA in state q0 is given 0 as input, it changes state to q1 and, on any string starting with 0, stays in the final state q1. It can accept 00, 01, 000, 001, etc. It cannot accept any string that starts with 1, because it will never reach the final state on such a string.

Example 3:

DFA with ∑ = {0, 1} that accepts all strings ending with 0.

Solution:

Explanation:

In the above diagram, we can see that when the DFA in state q0 reads 0, it changes state to q1. It can accept any string that ends with 0, such as 00, 10, 110, 100, etc. It cannot accept any string that ends with 1, because it will never be in the final state q1 after reading a 1; any string ending with 1 is rejected.
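Using the run_dfa sketch above, this “ends with 0” machine can be encoded directly (the two-state encoding and state names are assumptions matching the example):

delta = {
    ('q0', '0'): 'q1', ('q0', '1'): 'q0',
    ('q1', '0'): 'q1', ('q1', '1'): 'q0',
}
print(run_dfa(delta, 'q0', {'q1'}, '110'))  # True:  ends with 0
print(run_dfa(delta, 'q0', {'q1'}, '101'))  # False: ends with 1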

Examples of DFA

Example 1:

Design an FA with ∑ = {0, 1} that accepts those strings which start with 1 and end with 0.

Solution:

The FA will have a start state q0, from which only the edge with input 1 goes to the next state.

In state q1, if we read 1, we stay in state q1; if we read 0 at state q1, we reach state q2, which is the final state. In state q2, if we read either 0 or 1, we go to state q2 or state q1, respectively. Note that if the input ends with 0, the FA will be in the final state.
Example 2:

Design an FA with ∑ = {0, 1} that accepts only the input 101.

Solution:

In the given solution, we can see that only the input 101 will be accepted; hence, no paths are shown for any other input.

Example 3:

Design an FA with ∑ = {0, 1} that accepts strings with an even number of 0's and an even number of 1's.

Solution:

This FA uses four different states to track inputs 0 and 1.

Here q0 is both the start state and the final state. Note carefully that the symmetry of 0's and 1's is maintained. We can associate a meaning with each state:

q0: state of even number of 0's and even number of 1's.


q1: state of odd number of 0's and even number of 1's.
q2: state of odd number of 0's and odd number of 1's.
q3: state of even number of 0's and odd number of 1's.
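Since each state simply records the pair (parity of 0's, parity of 1's), the machine can be simulated by tracking the two parities directly. A small Python sketch of this idea (an equivalent simulation, not a literal transition table):

def accepts_even_0s_and_1s(w):
    zeros, ones = 0, 0               # parities; (0, 0) corresponds to q0
    for ch in w:
        if ch == '0':
            zeros ^= 1               # flip parity of 0's
        else:
            ones ^= 1                # flip parity of 1's
    return (zeros, ones) == (0, 0)   # accept iff we are back in q0

print(accepts_even_0s_and_1s('0110'))  # True
print(accepts_even_0s_and_1s('010'))   # False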
Example 4:

Design an FA with ∑ = {0, 1} that accepts the set of all strings containing three consecutive 0's.

Solution:

The strings in this particular language are 000, 0001, 1000, 10001, …, in which the 0's appear in clumps of three. The transition graph is as follows:

Note that the sequence of triple zeros is maintained to reach the final state.

Example 5:

Design a DFA for L(M) = {w | w ∈ {0, 1}*, w does not contain consecutive 1's}.

Solution:


When two consecutive 1's occur, the DFA moves to a dead state:

Here a single 1, or 1's separated by 0's, are acceptable, hence

the states q0, q1, q2 are all final states. The DFA accepts the strings that do not contain consecutive 1's, like 10, 010, 101, etc.


Example 6:

Design an FA with ∑ = {0, 1} that accepts the strings with an even number of 0's followed by a single 1.

Solution:

The DFA can be shown by a transition diagram as:


NFA (non-deterministic finite automata)
o NFA stands for non-deterministic finite automata. It is easier to construct an NFA than a DFA for a given regular language.

o A finite automaton is called an NFA when there may exist many paths for a specific input from the current state to the next state.

o Not every NFA is a DFA, but each NFA can be translated into an equivalent DFA.

o NFA is defined in the same way as DFA, but with the following two exceptions: it may contain multiple next states for a given input, and it may contain ε-transitions.

In the following image, we can see that from state q0 for input a, there are two next states, q1 and q2; similarly, from q0 for input b, the next states are q0 and q1. Thus it is not fixed, for a particular input, where the automaton goes next; hence this FA is called a non-deterministic finite automaton.

Formal definition of NFA:

An NFA is also defined by a 5-tuple, like a DFA, but with a different transition function, as shown below:

δ: Q × ∑ → 2^Q

where,

1. Q: finite set of states

2. ∑: finite set of input symbols

3. q0: initial state

4. F: set of final states

5. δ: transition function
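Since δ now returns a set of states, a simulation must track all states the NFA could currently be in. A minimal Python sketch (δ encoded as a dict from (state, symbol) to a set of states; missing entries mean no move, an encoding of our own choosing):

def run_nfa(delta, start, finals, w):
    # Returns True iff at least one path through the NFA accepts w.
    current = {start}
    for ch in w:
        current = set().union(*(delta.get((q, ch), set()) for q in current))
    return bool(current & finals)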

Graphical Representation of an NFA

An NFA can be represented by a digraph called a state diagram, in which:


1. States are represented by vertices.

2. Arcs labeled with input characters show the transitions.

3. The initial state is marked with an arrow.

4. A final state is denoted by a double circle.

Example 1:

1. Q = {q0, q1, q2}

2. ∑ = {0, 1}

3. q0 = {q0}

4. F = {q2}

Solution:

Transition diagram:

Transition Table:

Present State | Next State for Input 0 | Next State for Input 1
→q0           | q0, q1                 | q1
q1            | q2                     | q0
*q2           | q2                     | q1, q2

In the above diagram, we can see that when the current state is q0, on input 0, the next state will be
q0 or q1, and on 1 input the next state will be q1. When the current state is q1, on input 0 the next
state will be q2 and on 1 input, the next state will be q0. When the current state is q2, on 0 input the
next state is q2, and on 1 input the next state will be q1 or q2.
Example 2:

NFA with ∑ = {0, 1} that accepts all strings starting with 01.

Solution:


Transition Table:

Present State | Next State for Input 0 | Next State for Input 1
→q0           | q1                     | ∅
q1            | ∅                      | q2
*q2           | q2                     | q2

(∅ indicates that no transition is defined.)

Example 3:

NFA with ∑ = {0, 1} that accepts all strings of length at least 2.

Solution:
Transition Table:

Present State | Next State for Input 0 | Next State for Input 1
→q0           | q1                     | q1
q1            | q2                     | q2
*q2           | q2                     | q2

Introduction to Finite Automata

Finite automata are abstract machines used to recognize patterns in input sequences, forming the basis for understanding regular languages in computer science. They consist of states, transitions, and input symbols, and they process each input symbol step by step. If the machine ends in an accepting state after processing the input, the input is accepted; otherwise, it is rejected. Finite automata come in deterministic (DFA) and non-deterministic (NFA) varieties, both of which can recognize the same set of regular languages. They are widely used in text processing, compilers, and network protocols.

Figure: Features of Finite Automata

Features of Finite Automata

• Input: Set of symbols or characters provided to the machine.

• Output: Accept or reject based on the input pattern.

• States of Automata: The conditions or configurations of the machine.

• State Relation: The transitions between states.

• Output Relation: Based on the final state, the output decision is made.
Formal Definition of Finite Automata

A finite automaton can be defined as a tuple:

{ Q, Σ, q, F, δ }, where:

• Q: Finite set of states

• Σ: Set of input symbols

• q: Initial state

• F: Set of final states

• δ: Transition function

Types of Finite Automata

There are two types of finite automata:

• Deterministic Finite Automata (DFA)

• Non-Deterministic Finite Automata (NFA)

1. Deterministic Finite Automata (DFA)

A DFA is represented as {Q, Σ, q, F, δ}. In DFA, for each input symbol, the machine transitions to one
and only one state. DFA does not allow any null transitions, meaning every state must have a
transition defined for every input symbol.

DFA consists of 5 tuples {Q, Σ, q, F, δ}.


Q: set of all states.
Σ: set of input symbols (symbols the machine takes as input).
q: initial state (the starting state of the machine).
F: set of final states.
δ (Delta): transition function, defined as δ: Q × Σ → Q (from one state to another state).

Example:

Construct a DFA that accepts all strings ending with ‘a’.

Given:

Σ = {a, b},

Q = {q0, q1},

F = {q1}

Fig 1. State Transition Diagram for DFA with Σ = {a, b}


State\Symbol | a  | b
q0           | q1 | q0
q1           | q1 | q0

In this example, if the string ends in ‘a’, the machine reaches state q1, which is an accepting state.

2) Non-Deterministic Finite Automata (NFA)

NFA is similar to DFA but includes the following features:

• It can transition to multiple states for the same input.

• It allows null (ϵ) moves, where the machine can change states without consuming any input.

Example:

Construct an NFA that accepts strings ending in ‘a’.

Given:

Σ = {a, b},

Q = {q0, q1},

F = {q1}

Fig 2. State Transition Diagram for NFA with Σ = {a, b}

State Transition Table for above Automaton,

State\Symbol | a        | b
q0           | {q0, q1} | q0
q1           | φ        | φ

In an NFA, if any possible sequence of transitions on the input leads to an accepting state, the string is accepted.


Comparison of DFA and NFA

Although NFAs appear more flexible, they do not have more computational power than DFAs. Every
NFA can be converted to an equivalent DFA, although the resulting DFA may have more states.

• DFA: Single transition for each input symbol, no null moves.

• NFA: Multiple transitions and null moves allowed.

• Power: Both DFA and NFA recognize the same set of regular languages.
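The standard proof of this equivalence is the subset construction: each DFA state is the set of NFA states that could be active at that point. A compact Python sketch under the same dict encoding used above (dead states appear as the empty frozenset):

def nfa_to_dfa(delta, start, finals, alphabet):
    dfa_delta, seen = {}, set()
    todo = [frozenset([start])]
    while todo:
        S = todo.pop()
        if S in seen:
            continue
        seen.add(S)
        for a in alphabet:
            # all NFA states reachable from any state in S on symbol a
            T = frozenset().union(*(delta.get((q, a), set()) for q in S))
            dfa_delta[(S, a)] = T
            todo.append(T)
    dfa_finals = {S for S in seen if S & finals}
    return dfa_delta, frozenset([start]), dfa_finals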

Source: https://www.prepbytes.com/blog/cs-subjects/applications-of-finite-automata/

Applications of finite automata


Finite automata (FA) are mathematical models of computation used to design and analyze algorithms
in a variety of fields. Here are some key applications of finite automata:

1. Lexical Analysis in Compilers

• Lexical analyzers (or tokenizers) use finite automata to recognize patterns in the source code
(such as keywords, operators, and identifiers). Regular expressions are often converted into
finite automata to perform this task efficiently.

2. Pattern Matching

• Finite automata are commonly used in string matching algorithms, such as searching for a
pattern in a text (e.g., finding a substring in a document or DNA sequence). The Knuth-
Morris-Pratt (KMP) algorithm, for instance, uses a form of finite automata.

3. Text Processing

• Text editors, spell checkers, and data validation tools often use finite automata to detect
errors or search for specific patterns. Finite automata are used to define regular languages
that describe valid strings.

4. Control Systems

• In hardware design, finite automata are used to model control circuits. For example, they
help in designing sequential circuits and digital logic by representing the system's states and
transitions between them.

5. Protocol Design and Verification

• In computer networks, finite automata are used to model the behavior of communication
protocols. They help ensure that communication occurs as intended, without errors or
deadlocks.

6. Natural Language Processing

• Finite automata can be applied to tokenization (splitting text into words or sentences) and
morphological analysis (handling word forms). Regular grammars, which finite automata
recognize, play a role in basic NLP tasks.
7. Speech Recognition

• In speech recognition systems, finite automata are used to model phoneme sequences and
recognize spoken words, especially when combined with probabilistic models like Hidden
Markov Models (HMMs).

8. Robotics and Artificial Intelligence

• Behavior modeling of robots and autonomous systems can use finite automata to represent
different states of operation and transitions based on sensor inputs. Finite state machines
are a key tool in behavior-based robotics.

9. Game Development

• Finite state machines (FSM) are widely used in video games to model the behavior of non-
playable characters (NPCs), game mechanics, and user interface states. They help manage
complex game logic by keeping track of the current state of entities.

10. Regular Expressions and Search Engines

• Finite automata are used to compile regular expressions into machine-executable code that
searches for patterns in text. This application is especially common in search engines, log
analyzers, and command-line tools like grep.

11. Software Testing

• Finite automata are employed in model-based testing to represent the possible states of a
system and generate test cases that cover all transitions between states. This is often used in
testing communication protocols and software with complex workflows.

12. Error Detection and Correction

• Finite automata can model error detection algorithms, such as parity checks or more
complex error-correcting codes. These automata help ensure data integrity during
transmission over unreliable networks.
Finite automata with Epsilon transitions

Finite automata with ε-transitions (also called epsilon-NFAs or ε-NFAs) are an extension of the non-deterministic finite automaton (NFA) that allows the automaton to move from one state to another without consuming any input symbol. The symbol ε (epsilon) represents an empty transition, meaning the automaton can jump to a new state without reading any input character.

Components of ε-NFA:

An ε-NFA is defined as a 5-tuple:

M = (Q, Σ, δ, q0, F)

Where:

• Q: a finite set of states.

• Σ: a finite set of input symbols (the alphabet).

• δ: a transition function that maps a state and an input symbol to a set of possible states. This is where ε-transitions come in: for every state q and input a ∈ Σ ∪ {ε}, the transition function δ gives a set of possible states.

• q0: the start state (one of the states in Q).

• F: a set of final states (a subset of Q).

Epsilon Transitions:

An ε-transition is a transition that allows the automaton to move to another state without
consuming an input symbol. It provides flexibility and non-determinism by allowing the machine to
"guess" when to make such transitions.

For example, if there is an ε-transition q1 → q2, then the automaton can move from state q1 to q2 without reading any input symbol.

How ε-NFA Works:

1. Starting at the initial state q0, the automaton can make ε-transitions before reading any input.

2. At any given state, it can either:

o Consume an input symbol and move to a new state, or

o Take an ε-transition to move to a new state without consuming any input.

3. This process continues until the input is consumed or no transitions are possible.

4. If the automaton ends in one of the final states F after consuming the input, the string is accepted.
Example:

Consider the following ε-NFA:

States: Q = {q0, q1, q2}

Alphabet: Σ = {a, b}

Start state: q0

Final state: F = {q2}

Transitions:

• δ(q0, ε) = {q1} (ε-transition from q0 to q1)

• δ(q1, a) = {q2} (on input 'a', move from q1 to q2)

• δ(q2, b) = {q2} (on input 'b', stay in q2)

This ε-NFA accepts strings consisting of an 'a' followed by any number of 'b's. It can take the ε-transition from q0 to q1 without consuming input, and then process the 'a' and 'b' transitions.

Conversion of ε-NFA to NFA:

Every ε-NFA can be converted into an equivalent NFA (without ε-transitions) using the ε-closure
operation. The ε-closure of a state is the set of states that the automaton can reach from that state
by making any number of ε-transitions.

Steps for Conversion:

1. Compute the ε-closure for each state in the ε-NFA.

2. Modify the transition function to include ε-closures:

o For each state and input, the new NFA will consider all the states reachable by
reading the input after making any number of ε-transitions.

3. Update the final states: A state in the NFA becomes a final state if any of the states in its ε-
closure is a final state in the ε-NFA.
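Step 1 of the conversion is a simple graph search. A minimal Python sketch of ε-closure (eps maps a state to the set of states reachable by one ε-move; the example transitions are hypothetical):

def epsilon_closure(eps, state):
    closure, stack = {state}, [state]
    while stack:
        q = stack.pop()
        for r in eps.get(q, set()):
            if r not in closure:      # follow chains of ε-moves
                closure.add(r)
                stack.append(r)
    return closure

eps = {'q0': {'q1'}, 'q1': {'q2'}}    # hypothetical ε-moves
print(epsilon_closure(eps, 'q0'))     # {'q0', 'q1', 'q2'}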
Unit II
What is Regular Expression?

Regular expressions are expressions that describe the languages accepted by finite automata. They are a concise and convenient way to represent such languages.

The languages described by regular expressions are referred to as regular languages.

Regular expressions are used to check and match character combinations in strings. String searching algorithms use these patterns to perform find operations on strings.

Let Σ be an alphabet that denotes the input set.

The regular expression over Σ can be defined as follows:-

1.) Φ is a regular expression that denotes the empty set.

2.) ε is a regular expression and denotes the set {ε}, whose only member is the null string.

3.) For each ‘x’ in Σ, ‘x’ is a regular expression and denotes the set {x}.

4.) If ‘a’ and ‘b’ are regular expressions that denote the languages L1 and L2, respectively, then:

a.) a+b is a regular expression denoting the union L1 ∪ L2.

b.) ab is a regular expression denoting the concatenation L1L2.

c.) a* is a regular expression denoting the closure L1*.

In a regular expression, a* means zero or more occurrences of a. It can generate {ε, a, aa, aaa, aaaa,
aaaaa, .....}.

In a regular expression, a+ means one or more occurrences of a. It can generate {a, aa, aaa, aaaa,
aaaaa, .....}.
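The same * / + distinction exists in Python's re module, which makes it easy to check these claims (a quick illustration, not part of the formal theory):

import re

print(bool(re.fullmatch(r'a*', '')))     # True:  a* matches zero a's (ε)
print(bool(re.fullmatch(r'a+', '')))     # False: a+ needs at least one a
print(bool(re.fullmatch(r'a+', 'aaa')))  # True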

Operations On Regular Language

The various operations on the regular language are:

1.) Union

If R and S are two regular languages, their union R U S is also a Regular Language.

R U S = {a | a is in R or a is in S}

2.) Intersection

If R and S are two regular languages, their intersection R ∩ S is also a regular language.

R ∩ S = {a | a is in R and a is in S}

3.) Kleene closure

If R is a regular language, its Kleene closure R* will also be a regular language.

R* = zero or more occurrences of strings from R.


Examples of Regular Expressions

Example 1

Consider the languages L1 = ∅ and L2 = {x}. Which one of the following represents

L1L2* ∪ L1* ?

a.) {ε}

b.) x*

c.) ∅

d.) {ε,x}

Ans. a.) {ε}

Explanation: In L1L2* ∪ L1*, the result of L1L2* is ∅, since the concatenation of ∅ with any other language is ∅.

L1* = ∅*, which is {ε}. The union of ∅ and {ε} is {ε}.

Example 2

Which one of the following languages over the alphabet {a,b} is described by the regular expression:
(a+b)*a(a+b)*a(a+b)*?

a.) Set of all strings containing the substring aa.

b.) Set of all strings containing at least two a’s.

c.) Set of all strings that begin and end with either a or b.

d.) Set of all strings containing at most two a’s

Ans. b.) Set of all strings containing at least two a’s.

Explanation: The regular expression has two a′s surrounded by (a+b)*, which means accepted strings must contain at least two a′s. The shortest possible string is ε a ε a ε = aa. The set of accepted strings is {aa, aaa, baa, aaba, aabaa, baabaab, …}. Thus, we can see that every accepted string has at least two a’s, with aa being the shortest.
Regular expressions are simple expressions that can describe the languages that finite automata accept, and they give a compact way of representing such languages. Regular languages are the languages that regular expressions describe. A regular expression can also be defined as a pattern sequence that defines a string.

To match character combinations in strings, regular expressions are used. String searching algorithms use these patterns to perform find operations on strings.

An expression is regular if and only if the following conditions are met:

• ɸ is a regular expression that stands for regular language ɸ.

• ɛ is a regular expression that stands for regular language {ɛ}.

• If a ∈ Σ (Σ represents the input alphabet), then a is a regular expression in language {a}.

• If a and b are regular expressions, then a + b is a regular expression with the language as
{a,b}.

• If a and b are regular expressions, then ab (a and b concatenation) is also regular.

• If a is a regular expression, then a* (0 or more times a) is also regular.

Note that two regular expressions are equivalent if the languages generated by them are the same.
For example,(a+b*)* and (a+b)* yield the same language. Every string produced by (a+b*)* is also
produced by (a+b)*, and vice versa.
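Equivalence can be tested exactly via the DFA-based decision procedures covered later, but a quick bounded check is easy to sketch in Python: compare the two expressions on every string up to a chosen length. Note this gives evidence, not a proof, and the + (union) of TOC notation becomes | in Python's re syntax:

import re
from itertools import product

def agree_up_to(r1, r2, sigma, max_len):
    # Compare r1 and r2 on all strings over sigma of length <= max_len.
    for n in range(max_len + 1):
        for p in product(sigma, repeat=n):
            w = ''.join(p)
            if bool(re.fullmatch(r1, w)) != bool(re.fullmatch(r2, w)):
                return False, w        # w witnesses the difference
    return True, None

print(agree_up_to(r'(a|b*)*', r'(a|b)*', 'ab', 6))  # (True, None)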

Examples of Regular Expression in TOC

Here * means 0 or more occurrences, and + means one or more occurrences.

• If the regular language is { }, then the regular expression is ϕ.

• If the regular language is {ϵ}, then the regular expression is ϵ.

• If the regular language is {r}, then the regular expression is r.

• If the regular languages are LR and LS, then the corresponding regular expressions are R and S.

• If the regular language is LR ∪ LS, then the regular expression is R+S.

• If the regular language is LR.LS, then the regular expression is R.S.

• If the regular language is LR*, then the regular expression is R*.

• If the regular language is LR+, then the regular expression is R+.

Properties of Regular Expressions using Operators

(a) Union operator satisfies commutative property and associative property.

a+b=b+a (commutative)

a+(b+c)=(a+b) +c (associative)

(b) The concatenation operator satisfies the associative property but not the commutative property.

a.b ≠ b.a (not commutative)

a.(b.c) = (a.b).c (associative)

(c) Both the left and right distributive properties of concatenation over union hold.

x.(y+z) = (x.y) + (x.z) (left distribution of . over +)

(x+y).z = (x.z) + (y.z) (right distribution of . over +)

(d) Neither the left nor the right distributive property of union over concatenation holds.

a+(b.c) ≠ (a+b).(a+c)

(a.b)+c ≠ (a+c).(b+c)

(e) The union operator satisfies the idempotent property, but the concatenation operator does not.

a+a = a (idempotent)

a.a ≠ a (not idempotent)

(f) Identity property:

R+∅ = ∅ +R =R (∅ is identity element with respect to union operation)

ε . R = R.ε = R (ε is identity element with respect to concatenation)

(g) Annihilator property:

R + Σ* = Σ* (Σ* is the annihilator with respect to the union operator)

R.∅ = ∅ (∅ is the annihilator with respect to the concatenation operator)


Applications of Regular Expression in TOC

Regular Expressions are helpful in a wide range of text processing tasks and string processing in
general, where the data does not have to be textual.

For example:

• Data validation

• Data scraping (particularly web scraping)

• Data wrangling

• Simple parsing

• The creation of syntax highlighting systems and a variety of other tasks are typical
applications.

While regular expressions could be helpful in Internet search engines, processing them across an entire database could consume substantial computing resources, depending on the complexity and design of the regular expressions.

Regular expressions (regex) are a tool used to find patterns in text. They help in searching, modifying,
and validating strings based on specific rules. Here are some everyday applications in simple terms:

1. Search and Replace in Text

• You can use regex to quickly find specific words or patterns in a document and replace them.
For example, in a book, you might want to replace all occurrences of the word "cat" with
"dog."

• Regular expressions can search for specific patterns in text and replace
them with desired text

2. Form Validation

• Websites use regex to check if you’ve entered correct information. For example, when you enter an email or phone number, regex checks if it’s in the right format (like name@example.com).

3. Programming

• Developers use regex to search for specific text patterns in code or data. For example, to
find all words that start with the letter "a" or to extract dates from a large text file.

4. Data Extraction

• Regex helps pull out specific information from text, like extracting all the email addresses or
phone numbers from a document.

• Regular expressions can extract data from structured or semi-structured text, such as names,
addresses, or dates.
5. File and Document Organization

• You can use regex to rename multiple files at once based on their name patterns. For
instance, renaming all files starting with "2023" to start with "2024."

6. Log Analysis

• When looking through large logs (such as error logs), regex helps find key information like
specific error codes, timestamps, or IP addresses.

7. Search Engines and Filters

• In search engines or websites, regex is used to filter or find content based on complex
criteria, like searching for all items priced between $10 and $50.

8. Email Filters

• Email services can use regex to automatically sort your emails into folders or mark them as
spam by detecting patterns, like messages with suspicious keywords.

9. Natural Language Processing

• Regex is used to break text into sentences or words, helping in applications like chatbots or
translating text.

10. Network Security

• In cybersecurity, regex helps detect suspicious activities in network logs by looking for
specific patterns (like unusual login attempts or hacker behavior).

11. Database Search

• In databases, regex can be used to search for information based on patterns, such as finding
all users with email addresses that end in ".com."

12. Version Control Systems

• Developers use regex to search for changes in code, like finding commits that have specific
keywords.

Regular expressions have many applications in theory of computation (TOC), including:

• Data validation

Regular expressions are used to validate user input, such as email addresses, phone numbers, and
passwords.

• Search and replace

Regular expressions can search for specific patterns in text and replace them with desired text.

• Data extraction

Regular expressions can extract data from structured or semi-structured text, such as names,
addresses, or dates.
• Web development

Regular expressions are used in web development for tasks such as URL routing, form validation, and
data extraction from HTML/XML documents.

• Log analysis

Regular expressions are used in system administration and troubleshooting to analyze and extract
information from log files.

• Data scraping

Regular expressions can be used to scrape websites for data, which can be messy and full of noise.

• Syntax highlighting

Regular expressions can be used to create syntax highlighting systems

Summary

Regular expressions make it easier to search, find, and manipulate text based on patterns. Whether
you're validating input in a web form, analyzing log files, or extracting information from documents,
regex can save a lot of time by automating these tasks.
Regular Expressions

Regular Expressions are used to denote regular languages. An expression is regular if:

• ∅ is a regular expression for the regular language ∅.

• ε is a regular expression for the regular language {ε}.

• If a ∈ Σ (Σ represents the input alphabet), a is a regular expression with language {a}.

• If a and b are regular expressions, a + b is also a regular expression with language {a, b}.

• If a and b are regular expressions, ab (concatenation of a and b) is also regular.

• If a is a regular expression, a* (0 or more times a) is also regular.

• Regular Expression | Regular Language

set of vowels: (a+e+i+o+u) | {a, e, i, o, u}

a followed by 0 or more b's: (a.b*) | {a, ab, abb, abbb, abbbb, …}

any number of vowels followed by any number of consonants: (v*.c*), where v = vowels and c = consonants | {ε, a, aou, aiou, b, abcd, …}, where ε represents the empty string (the case of 0 vowels and 0 consonants)

• Regular Grammar:

A grammar is regular if it has rules of the form A → a, A → aB, or A → ε, where ε is the special symbol denoting the empty string (NULL).

• Regular Languages:

A language is regular if it can be expressed in terms of a regular expression.

• Closure Properties of Regular Languages


• Union:

If L1 and L2 are two regular languages, their union L1 ∪ L2 will also be regular. For example, L1 = {a^n | n ≥ 0} and L2 = {b^n | n ≥ 0}; L3 = L1 ∪ L2 = {a^n ∪ b^n | n ≥ 0} is also regular.

• Intersection:

If L1 and L2 are two regular languages, their intersection L1 ∩ L2 will also be regular. For example, L1 = {a^m b^n | n ≥ 0 and m ≥ 0} and L2 = {a^m b^n ∪ b^n a^m | n ≥ 0 and m ≥ 0}; L3 = L1 ∩ L2 = {a^m b^n | n ≥ 0 and m ≥ 0} is also regular.

• Concatenation:

If L1 and L2 are two regular languages, their concatenation L1.L2 will also be regular. For example, L1 = {a^n | n ≥ 0} and L2 = {b^n | n ≥ 0}; L3 = L1.L2 = {a^m b^n | m ≥ 0 and n ≥ 0} is also regular.

• Kleene Closure:

If L1 is a regular language, its Kleene closure L1* will also be regular. For example, L1 = (a ∪ b); L1* = (a ∪ b)*.

• Complement:

If L(G) is a regular language, its complement L'(G) will also be regular. The complement of a language can be found by subtracting the strings in L(G) from the set of all possible strings. For example, L(G) = {a^n | n > 3}; L'(G) = {a^n | n ≤ 3}.

• Note:

Two regular expressions are equivalent if the languages generated by them are the same. For example, (a+b*)* and (a+b)* generate the same language: every string generated by (a+b*)* is also generated by (a+b)*, and vice versa.
Decision properties of regular languages
The decision properties of regular expressions pertain to various questions that can be answered
about the languages they describe and the expressions themselves. These properties help determine
how regular expressions behave and how they relate to automata and formal languages. Here are
key decision properties of regular expressions:

1. Emptiness Problem

Question: Does a given regular expression describe a language that contains no strings (i.e., is it an empty language)?

Solution: This problem can be solved efficiently. Convert the regular expression to its equivalent NFA (Nondeterministic Finite Automaton) and check if the NFA has any reachable accepting states. If no accepting states are reachable, the language is empty.
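The reachability check itself is an ordinary graph search. A minimal Python sketch (δ encoded as a dict from (state, symbol) to a set of states, as in the NFA examples earlier):

def is_empty(delta, start, finals):
    # Returns True iff no accepting state is reachable from start.
    seen, stack = {start}, [start]
    while stack:
        q = stack.pop()
        if q in finals:
            return False              # the language is non-empty
        for (p, _a), targets in delta.items():
            if p == q:
                for r in targets - seen:
                    seen.add(r)
                    stack.append(r)
    return True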

2. Finiteness Problem

Question: Does the language described by a given regular expression contain only a finite number of strings?

Solution: This can be determined by converting the regular expression to an equivalent NFA and checking whether the NFA accepts an infinite number of strings. If the NFA has a reachable loop on a path to an accepting state, it can generate infinitely many strings, so the language is infinite.

3. Membership Problem

Question: Given a regular expression and a string, does the string belong to the language described by the regular expression?

Solution: This problem is solvable using finite automata. Convert the regular expression to a DFA (Deterministic Finite Automaton) and check if the string is accepted by the DFA. The time complexity for this check is linear in the size of the string.

4. Equivalence Problem

Question: Are two regular expressions equivalent, i.e., do they describe the same language?

Solution: This problem can be solved by converting both regular expressions to equivalent DFAs and then checking if the two DFAs are equivalent (i.e., they accept the same set of strings). The equivalence of DFAs can be determined by minimizing both and checking if the minimized state transition graphs are isomorphic, which can be done efficiently.

5. Sub-expression Membership Problem

Question: Given two regular expressions, is the language described by the second regular expression a subset of the language described by the first?

Solution: This problem can be reduced to checking containment between the languages of the DFAs for the two regular expressions. This involves checking that the DFA for the second language accepts only strings that the DFA for the first language accepts.

6. Pattern Matching

Question: Given a regular expression and a set of strings, which strings in the set match the regular expression?

Solution: This is the practical application of regular expressions. By converting the regular expression into a DFA or NFA, one can efficiently match patterns in strings. This is commonly used in text processing and searching.

7. Containment Problem

Question: Does the language described by one regular expression contain the language described by another regular expression?

Solution: This can be solved by constructing DFAs for both regular expressions and checking if the language of the first DFA (say, A) contains the language of the second DFA (say, B). Formally, this is checked by verifying that the language of B minus the language of A is empty.

Summary

The decision properties of regular expressions are important for various theoretical and practical applications. These properties can be systematically addressed through conversions to finite automata and subsequent operations on those automata. Most of these problems can be solved efficiently due to the finite nature of the automata and the regularity of the languages involved.
Equivalence and minimization of automata.
Equivalence and minimization of automata are fundamental concepts in the theory of computation,
especially when working with finite automata. Here's a detailed look at each:

1. Equivalence of Automata

Definition

Two automata (DFA or NFA) are said to be equivalent if they recognize the same language. In other words, for every string over the input alphabet, both automata either accept it or reject it.

Testing Equivalence

To test whether two automata are equivalent, you can follow these general steps:

1. Convert to DFAs: If the automata are not already DFAs, convert them into equivalent DFAs. This is necessary because the equivalence check is straightforward for DFAs.

2. Minimize the DFAs: Apply the minimization algorithm (discussed below) to both DFAs. If the resulting minimized DFAs are identical (i.e., they have the same states and transitions), then the original automata are equivalent.

3. Direct Comparison: Alternatively, you can directly compare the state transition tables of the two DFAs. This involves:

- Ensuring that the two DFAs have the same number of states.

- Ensuring that the transition functions are the same for each corresponding state and input symbol.

4. Using a Product Automaton: Construct a product automaton whose states are pairs of states from the two automata. Use this product automaton to check whether there is any string that is accepted by one automaton but not the other.

2. Minimization of Automata

Purpose

The goal of minimization is to reduce a DFA to its smallest equivalent form, one with the fewest states. This is useful for simplifying automata and optimizing performance.

Minimization Algorithm

One of the most common algorithms for DFA minimization is Hopcroft's algorithm, but there are also other methods like Moore's algorithm. Here’s a high-level overview of the minimization process:

1. Remove Unreachable States:

- Identify and remove states that cannot be reached from the initial state. These states do not contribute to the accepted language and can be eliminated.

2. Partition States into Equivalence Classes:

- Initial Partition: Start by partitioning states into two groups: accepting states and non-accepting states.

- Refinement: Refine these partitions by examining transitions. States that transition into the same partitions on the same inputs should belong to the same partition. Continue refining until no further refinement is possible.

3. Construct the Minimized DFA:

- Create a new DFA where each state corresponds to a partition from the refinement step.

- Define the transition function based on the transitions between partitions.

- Designate a new start state and accepting states based on the partitions of the original start state and accepting states, respectively.
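A minimal Python sketch of the refinement loop (Moore-style; it assumes unreachable states were already removed and that delta is total):

def minimize(states, alphabet, delta, finals):
    # Initial partition: accepting vs non-accepting states.
    parts = [set(finals), set(states) - set(finals)]
    parts = [p for p in parts if p]
    changed = True
    while changed:
        changed = False
        new_parts = []
        for p in parts:
            # Group the states of p by which block each input symbol leads to.
            buckets = {}
            for q in p:
                key = tuple(next(i for i, b in enumerate(parts)
                                 if delta[(q, a)] in b)
                            for a in alphabet)
                buckets.setdefault(key, set()).add(q)
            new_parts.extend(buckets.values())
            if len(buckets) > 1:
                changed = True         # a block was split; refine again
        parts = new_parts
    return parts                       # each block is one minimized state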

Example of Minimization

Consider a DFA with the following states and transitions:

- States: q0, q1, q2

- Alphabet: {0, 1}

- Transitions:

- q0 --0--> q1

- q0 --1--> q2

- q1 --0--> q0

- q1 --1--> q2

- q2 --0--> q1

- q2 --1--> q0

- Start State: q0

- Accepting State: q0

Applying the minimization algorithm, we check whether q1 and q2 can be merged. They cannot: on input 0, q1 moves to the accepting state q0 while q2 moves to the non-accepting state q1, so the two states are distinguishable, and this particular DFA is already minimal.

Summary

- Equivalence of automata involves checking whether two automata recognize the same language, often achieved by comparing their minimized forms or using product automata.

- Minimization of DFAs aims to reduce the number of states in the automaton while preserving the language it recognizes. This process involves removing unreachable states, partitioning states into equivalence classes, and constructing a new minimized DFA.

Both concepts are crucial for understanding and working with finite automata, especially in applications involving pattern matching, lexical analysis, and formal verification.

Closure Properties of Regular Languages

Regular languages are closed under:

• Union: The union of two regular languages is regular.

• Intersection: The intersection of two regular languages is regular.

• Complement: The complement of a regular language is regular.

• Difference: The difference between two regular languages is regular.

• Concatenation: The concatenation of two regular languages is regular.

• Kleene Star: Applying the Kleene star to a regular language results in a regular language.

Decision Properties of Regular Languages

For regular languages, you can decide:

• Emptiness: Whether a language is empty.

• Finiteness: Whether a language contains a finite number of strings.

• Membership: Whether a given string belongs to the language.


Unit III
A context-free grammar (CFG) is a type of formal grammar that can describe the syntax or structure of a formal language. The grammar is a 4-tuple (V, T, P, S).

V - the collection of variables or non-terminal symbols.

T - the set of terminal symbols.

P - the set of production rules, which involve both terminals and non-terminals.

S - the start symbol.

A grammar is said to be context-free if every production is of the form:

A → (V ∪ T)*, where A ∈ V

• The left-hand side of a production can only be a variable; it cannot be a terminal.

• The right-hand side can be a variable, a terminal, or any combination of variables and terminals.

The above form states that every production whose right-hand side is any combination of variables from V and terminals from T is a context-free production.

For example, consider the grammar G with symbols {S, a, b} and the productions below:

• Here S is the start symbol and the only variable.

• {a, b} are the terminals, generally represented by lowercase characters.

S -> aS
S -> bSa

but

a -> bSa, or
a -> ba is not a CFG, as the left-hand side here is a terminal, which does not follow the CFG rule.

Let's consider the string “aba” and try to derive it from the given productions: we start with the symbol S and apply the production rules S -> bSa and S -> aS (with S -> a) to get the string “aba”.

Parse tree of string “aba”

In the computer science field, context-free grammars are frequently used, especially in the areas of
formal language theory, compiler development, and natural language processing. It is also used for
explaining the syntax of programming languages and other formal languages.
Limitations of Context-Free Grammar

Apart from all the uses and importance of context-free grammar in compiler design and computer science, there are some limitations. CFGs have limited expressiveness: neither full English nor every construct of a programming language can be expressed using a context-free grammar. A CFG can be ambiguous, meaning we can generate multiple parse trees for the same input. For some grammars, parsing with a CFG can be inefficient because of exponential time complexity. And error reporting based on CFGs is less precise, as it cannot always give detailed error messages and information.

Definition − A context-free grammar (CFG), consisting of a finite set of grammar rules, is a quadruple (N, T, P, S) where

• N is a set of non-terminal symbols.

• T is a set of terminals, where N ∩ T = ∅.

• P is a set of rules, P: N → (N ∪ T)*, i.e., the left-hand side of a production rule does not have any right or left context.

• S is the start symbol.

Example

• The grammar ({A}, {a, b, c}, P, A), P : A → aA, A → abc.

• The grammar ({S, a, b}, {a, b}, P, S), P: S → aSa, S → bSb, S → ε

• The grammar ({S, F}, {0, 1}, P, S), P: S → 00S | 11F, F → 00F | ε
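For small grammars like these, membership can be checked by brute-force search over derivations. A Python sketch (the pruning bound is a crude heuristic that works for grammars, like the palindrome grammar in the second example, whose sentential forms stay short):

from collections import deque

def derives(rules, start, target):
    # Breadth-first search over leftmost derivations.
    max_len = 2 * len(target) + 2          # crude pruning bound
    seen, queue = {start}, deque([start])
    while queue:
        form = queue.popleft()
        if form == target:
            return True
        i = next((k for k, c in enumerate(form) if c in rules), None)
        if i is None:
            continue                       # all terminals, but not the target
        for rhs in rules[form[i]]:         # expand the leftmost non-terminal
            new = form[:i] + rhs + form[i + 1:]
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return False

rules = {'S': ['aSa', 'bSb', '']}          # the palindrome grammar above
print(derives(rules, 'S', 'abba'))         # True
print(derives(rules, 'S', 'ab'))           # False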

What is a Parse Tree?

A parse tree, also called a syntax tree, is a tree-like hierarchical representation of the derivation of a string according to a formal grammar.

The parse tree is designed in such a way that an in-order traversal (left, root, right) produces the original input string. Parse trees provide a fundamental basis for explaining the syntactic structure of source code, and they assist in error detection and further stages of compilation.

Here we will study the concept and uses of Parse Tree in Compiler Design. First, let us check out two
terms:

• Parse: to resolve (a sentence) into its component parts and describe their syntactic roles; simply, the act of parsing a string or a text.

• Tree: a widely used abstract data type that simulates a hierarchical tree structure, with a root value and subtrees of children under a parent node, represented as a set of linked nodes.
What is Parse Tree?

A parse tree is the hierarchical representation of terminals and non-terminals; these symbols represent the derivation of the grammar that yields the input string. In parsing, the string is derived starting from the start symbol, and the start symbol of the grammar is used as the root of the parse tree. Leaf nodes represent the terminal symbols or tokens of the input string; interior nodes represent the non-terminal symbols and the production rules applied.

• Leftmost derivation tree: a parse tree that follows a leftmost derivation, where the leftmost non-terminal is always expanded first.

• Rightmost derivation tree: a parse tree that follows a rightmost derivation, where the rightmost non-terminal is expanded first.

Rules to Draw a Parse Tree

• All leaf nodes need to be terminals.

• All interior nodes need to be non-terminals.

• In-order traversal gives the original input string.

Example 1: Let us take an example of Grammar (Production Rules).

S -> sAB
A -> a
B -> b

The input string is “sab”, then the Parse Tree is:

Example-2: Let us take another example of Grammar (Production Rules).

S -> AB
A -> c/aA
B -> d/bB

The input string is “acbd”, then the Parse Tree is as follows:


Uses of Parse Tree

• It helps in making syntax analysis by reflecting the syntax of the input language.

• It uses an in-memory representation of the input with a structure that conforms to the
grammar.

• The advantage of using parse trees rather than semantic actions: you can make multiple passes over the information without having to re-parse the input.

Applications of Context Free Grammar (CFG)

• Context Free Grammars are used in Compilers (like GCC) for parsing. In this step, it takes a
program (a set of strings).

• Context Free Grammars are used to define the High-Level Structure of a Programming
Language.

• Every context-free grammar can be converted to a parser, which is a component of a compiler that identifies the structure of a program and converts the program into a tree.

• Document Type Definition in XML is a Context Free Grammar which describes the HTML tags
and the rules to use the tags in a nested fashion.

Applications of Context-Free Grammars (CFGs)

1. Compilers and Interpreters:

o CFGs define the syntax of programming languages. They help in parsing the source
code into a structure that can be understood and processed by the compiler.

2. Natural Language Processing (NLP):

o CFGs are used to model the syntax of human languages, enabling tasks like speech
recognition, machine translation, and text parsing.

3. Markup Languages:

o Languages such as HTML and XML can be defined using CFGs, allowing for structured
data representation and validation.

4. Mathematical Logic:

o CFGs can express the syntax of formal languages used in mathematical logic and
proof systems.

5. Graphical User Interfaces:

o UI frameworks can use CFGs to define the layout and behavior of user interface
components.

6. Programming Language Design:

o When designing new programming languages, CFGs help in specifying the grammar
and ensuring syntactical correctness.
Ambiguity in Grammar

A grammar is said to be ambiguous if there exists more than one leftmost derivation or more than
one rightmost derivation or more than one parse tree for the given input string. If the grammar is not
ambiguous, then it is called unambiguous.

If a grammar has ambiguity, it is not good for compiler construction. No method can automatically detect and remove ambiguity in general, but we can remove ambiguity by rewriting the whole grammar so that it is unambiguous.

Example 1:

Let us consider a grammar G with the production rule

1. E → I

2. E → E + E

3. E → E * E

4. E → (E)

5. I → ε | 0 | 1 | 2 | ... | 9

Solution:

For the string "3 * 2 + 5", the above grammar can generate two parse trees by leftmost derivation:

Since there are two parse trees for a single string "3 * 2 + 5", the grammar G is ambiguous.

Example 2:

Check whether the given grammar G is ambiguous or not.

1. E → E + E

2. E → E - E

3. E → id
Solution:

From the above grammar, the string "id + id - id" can be derived in two ways:

First Leftmost derivation

1. E → E + E

2. → id + E

3. → id + E - E

4. → id + id - E

5. → id + id- id

Second Leftmost derivation

1. E → E - E

2. → E + E - E

3. → id + E - E

4. → id + id - E

5. → id + id - id

Since there are two leftmost derivations for a single string "id + id - id", the grammar G is ambiguous.
Example 3:

Check whether the given grammar G is ambiguous or not.

1. S → aSb | SS

2. S → ε

Solution:

For the string "aabb" the above grammar can generate two parse trees

Since there are two parse trees for a single string "aabb", the grammar G is ambiguous.

Example 4:

Check whether the given grammar G is ambiguous or not.

1. A → AA

2. A → (A)

3. A → a

Solution:

For the string "a(a)aa" the above grammar can generate two parse trees:

Since there are two parse trees for a single string "a(a)aa", the grammar G is ambiguous.

A Pushdown Automaton (PDA) is a type of computational model that extends the capabilities of a
finite automaton by adding a stack as an auxiliary memory structure. It is particularly used to
recognize context-free languages, which are more powerful than regular languages but less powerful
than context-sensitive languages.
Components of a Pushdown Automaton:

1. States: A finite set of states, similar to those in a finite automaton.

2. Input Alphabet: A set of symbols that the automaton reads from the input string.

3. Stack Alphabet: A set of symbols that can be pushed to or popped from the stack.

4. Transition Function: Describes the rules for moving between states, which depend on:

o The current state.

o The current input symbol (can be the empty symbol ε).

o The top symbol on the stack.

o After the transition, the PDA can push new symbols to the stack, pop symbols from
the stack, or leave the stack unchanged.

5. Initial State: The starting state of the PDA.

6. Initial Stack Symbol: The symbol initially placed on the stack.

7. Accept States: A set of states in which the PDA accepts the input string if it reaches one of
these states.
How it Works:

• The PDA reads an input string one symbol at a time and uses its stack to store and retrieve
information as needed.

• The stack allows the PDA to handle non-regular languages (like those with nested structures,
such as balanced parentheses).

• It can transition between states based on the current input symbol and the stack's top
symbol. Depending on these, it can push new symbols onto the stack, pop symbols off the
stack, or perform no stack operation.

• A PDA accepts a string either when it reaches an accepting state or when its stack becomes
empty (depending on the type of PDA).

Types of PDAs:

1. Deterministic PDA (DPDA): Has a single transition for each possible input and stack symbol
pair. It is less powerful and cannot recognize all context-free languages.

2. Nondeterministic PDA (NPDA): Can have multiple possible transitions for the same input and
stack symbol, allowing it to explore different computational paths. This is more powerful and
can recognize all context-free languages.

Example:

One classic example of a language recognized by a PDA is

L = { a^n b^n | n ≥ 0 }. A PDA can use its stack to keep track of how many a's have been read and ensure there is a corresponding number of b's.
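That stack discipline is easy to simulate. A minimal Python sketch of a two-phase, PDA-style check for L = { a^n b^n | n ≥ 0 } (the state names are our own):

def accepts_anbn(w):
    stack, state = [], 'push'
    for ch in w:
        if state == 'push' and ch == 'a':
            stack.append('A')        # push one stack symbol per a
        elif ch == 'b' and stack:
            state = 'pop'            # after the first b, only b's may follow
            stack.pop()              # match one b against one pushed a
        else:
            return False             # out-of-order symbol or unmatched b
    return not stack                 # accept iff every a was matched

print(accepts_anbn('aabb'))  # True
print(accepts_anbn('aab'))   # False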

The languages of a PDA

A Pushdown Automaton (PDA) is a type of computational model used in formal language theory. It is
more powerful than a finite automaton but less powerful than a Turing machine. PDAs are
specifically designed to recognize a class of languages called context-free languages.

There are two primary types of languages that a PDA can recognize:

1. Context-Free Languages (CFLs):

• Definition: A context-free language is a language that can be generated by a context-free grammar (CFG), where each production rule has a single non-terminal on the left-hand side.

• Recognized by: Non-deterministic Pushdown Automaton (NPDA).

• Examples:

o The language of balanced parentheses: { "()", "(())", "()()", ... }

o The language of palindromes: { "a", "aa", "aba", "abba", ... }


2. Deterministic Context-Free Languages (DCFLs):

• Definition: These are a subset of context-free languages that can be recognized by a Deterministic Pushdown Automaton (DPDA). Not all context-free languages are deterministic, meaning they can't all be recognized by a DPDA.

• Recognized by: Deterministic Pushdown Automaton (DPDA).

• Examples:

o Some programming languages' syntax (like arithmetic expressions).

o Palindromes with a center marker, such as { w c w^R | w ∈ {a, b}* }. (Without the marker, even the even-length palindromes are not deterministic context-free, since a DPDA cannot tell where the middle of the input is.)

Important Points:

• A PDA uses a stack to store symbols and helps it recognize certain patterns that a regular
finite automaton (without a stack) cannot.

• The stack allows the PDA to handle more complex languages, especially those with recursive
structures (like nested parentheses).

• PDAs are usually studied in two forms:

o Non-deterministic PDA (NPDA): More powerful; can recognize all context-free languages.

o Deterministic PDA (DPDA): Less powerful; only recognizes deterministic context-free languages (a subset of CFLs).


Equivalence of PDA’s and CFG’s

The equivalence between Pushdown Automata (PDA) and Context-Free Grammars (CFG) is a key
concept in formal language theory. It shows that both these computational models have the same
expressive power when it comes to recognizing context-free languages. Let’s explore this
equivalence:

1. From CFG to PDA (Generating a PDA from a CFG)

A Context-Free Grammar (CFG) can be transformed into a Pushdown Automaton (PDA) that
recognizes the same language. The process involves simulating the grammar’s productions using the
stack of the PDA.

Context-free grammar (CFG) is a quadruple (N, T, P, S) made up of a finite collection of grammatical rules. Here N stands for the set of non-terminal symbols, T stands for the set of terminal symbols, P is a set of rules, and S is the start symbol. Also, N and T must be disjoint (N ∩ T = ∅).

• How it works:

o For each production rule in the grammar (like A → BC), the PDA pushes the
corresponding right-hand side (e.g., BC) onto the stack when it sees the left-hand
side (A).
o The PDA starts with the start symbol of the CFG on the stack and tries to match it
against the input string by applying the production rules.

o As it processes the input, it pops symbols from the stack and replaces them with the
corresponding production.

o If the input string is completely matched and the stack is empty, the PDA accepts the
string.

• Example: Consider a CFG for the language of balanced parentheses:

o Grammar rules:

1. S → SS

2. S → (S)

3. S → ε (epsilon = the empty string)

A corresponding PDA would push and pop parentheses on the stack to simulate these rules.
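Concretely, the standard single-state construction (the state name q is assumed here) would give these moves:

δ(q, ε, S) = {(q, SS), (q, (S)), (q, ε)} (expand the non-terminal on top of the stack)

δ(q, (, () = {(q, ε)} and δ(q, ), )) = {(q, ε)} (match each input parenthesis against the stack top)

Starting with S on the stack, this PDA accepts by empty stack exactly the balanced strings.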

2. From PDA to CFG (Generating a CFG from a PDA)

A Pushdown Automaton (PDA) can also be converted into a Context-Free Grammar (CFG) that
generates the same language. This shows that every language recognized by a PDA can also be
generated by a CFG.

In a manner similar to how we create a DFA for a regular grammar, a pushdown automaton is a means to implement a context-free grammar. A PDA can store an unbounded amount of information on its stack, whereas a DFA can store only a finite amount.

We can also call PDA “Finite state machine” + “stack”.

• How it works:

o The CFG is constructed based on the states of the PDA.

o The non-terminal symbols in the CFG are usually constructed to represent transitions
between states of the PDA, as well as the symbols on the stack.

o Each production in the CFG mimics the effect of a PDA transition (either pushing or
popping symbols on the stack).

o By systematically writing rules for the PDA's transitions, you can construct a grammar
that generates exactly the strings accepted by the PDA.

• Example: If you have a PDA that recognizes balanced parentheses by pushing and popping (
and ) on its stack, you can convert it to a CFG that uses rules like:

o S → (S)S

o S → ε
Formal Proof of Equivalence:

• Every context-free language can be generated by a context-free grammar (CFG).

• Every context-free language can be recognized by a pushdown automaton (PDA).

• Hence, CFGs and PDAs are equivalent in the sense that they both define the same class of
languages: context-free languages (CFLs).

• The process of conversion between CFGs and PDAs proves their equivalence.

• If a grammar G is context-free, we can build an equivalent nondeterministic PDA which accepts the language that is produced by the context-free grammar G. A parser can be built for the grammar G.
• Also, if P is a pushdown automaton, an equivalent context-free grammar G can be constructed where L(G) = L(P).
• In the next two topics, we will discuss how to convert from PDA to CFG and vice versa.

• Algorithm to find PDA corresponding to a given CFG


• Input − A CFG, G = (V, T, P, S)
• Output − Equivalent PDA, P = (Q, ∑, S, δ, q0, I, F)
• Step 1 − Convert the productions of the CFG into GNF.
• Step 2 − The PDA will have only one state {q}.
• Step 3 − The start symbol of CFG will be the start symbol in the PDA.
• Step 4 − All non-terminals of the CFG will be the stack symbols of the
PDA and all the terminals of the CFG will be the input symbols of the
PDA.
• Step 5 − For each production of the form A → aX, where a is a terminal and X is a (possibly empty) string of non-terminals, add the transition δ(q, a, A) ∋ (q, X): on reading a with A on top of the stack, pop A and push X.
• Problem
• Construct a PDA from the following CFG.
• G = ({S, X}, {a, b}, P, S)
• where the productions are −
• S → XS | ε , X → aXb | Xb | ab (the grammar is used as given here; strictly, Step 1 would first convert it to GNF, but the same expand-and-match construction applies)
• Solution
• Let the equivalent PDA,
• P = ({q}, {a, b}, {a, b, X, S}, δ, q, S)
• where δ −
• δ(q, ε , S) = {(q, XS), (q, ε )}
• δ(q, ε , X) = {(q, aXb), (q, Xb), (q, ab)}
• δ(q, a, a) = {(q, ε )}
• δ(q, b, b) = {(q, ε )}
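As a hedged sketch (the string encoding of stacks and the pruning bound are implementation choices, not part of the formal construction), the constructed single-state PDA can be simulated by a nondeterministic search over (input position, stack):

```python
# Transitions of the single-state PDA built above (the state q is implicit).
EXPAND = {                      # epsilon-moves: replace the top non-terminal
    "S": ["XS", ""],            # delta(q, eps, S) = {(q, XS), (q, eps)}
    "X": ["aXb", "Xb", "ab"],   # delta(q, eps, X) = {(q, aXb), (q, Xb), (q, ab)}
}
MIN_YIELD = {"S": 0, "X": 2, "a": 1, "b": 1}  # shortest terminal yield per symbol

def accepts(w: str) -> bool:
    """Nondeterministic acceptance by empty stack, via depth-first search."""
    def search(pos: int, stack: str) -> bool:
        if not stack:                                    # empty stack: accept iff input consumed
            return pos == len(w)
        if sum(MIN_YIELD[s] for s in stack) > len(w) - pos:
            return False                                 # prune branches that cannot finish
        top, rest = stack[0], stack[1:]
        if top in EXPAND:                                # try every production for the top symbol
            return any(search(pos, rhs + rest) for rhs in EXPAND[top])
        # top is a terminal: delta(q, a, a) / delta(q, b, b) pop on a match
        return pos < len(w) and w[pos] == top and search(pos + 1, rest)

    return search(0, "S")                                # the start symbol starts on the stack

assert accepts("") and accepts("ab") and accepts("aabb") and accepts("abab")
assert not accepts("ba") and not accepts("aab")
```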

• Algorithm to find CFG corresponding to a given PDA
• Input − A PDA, P = (Q, ∑, S, δ, q0, I, F)
• Output − Equivalent CFG, G = (V, T, P, S), such that the non-terminals of the grammar G will be {Xwx | w, x ∈ Q} and the start symbol will be Xq0,F. Intuitively, Xwx generates exactly the input strings that can take the PDA from state w to state x with no net change to the stack.
• Step 1 − For every w, x, y, z ∈ Q, m ∈ S and a, b ∈ ∑, if δ(w, a, ε) contains (y, m) and δ(z, b, m) contains (x, ε), add the production rule Xwx → a Xyz b in grammar G.
• Step 2 − For every w, x, y, z ∈ Q, add the production rule Xwx →
XwyXyx in grammar G.
• Step 3 − For w ∈ Q, add the production rule Xww → ε in grammar G.
UNIT IV
Deterministic Pushdown Automata: Normal forms for CFGs
In formal language theory, normal forms for Context-Free Grammars (CFGs) are standardized ways
of writing the production rules. These normal forms make it easier to study and process CFGs. The
two most common normal forms are Chomsky Normal Form (CNF) and Greibach Normal Form (GNF); this section focuses on CNF.

For example, a grammar whose rules all have the form A → BC or A → a (call it G1) is in CNF, because every production satisfies the rules specified for CNF. A grammar G2 containing a production such as S → aZ, a terminal followed by a non-terminal, is not in CNF, since that shape violates the rules given below.

Note –

• For a given grammar, there can be more than one equivalent CNF.

• A grammar converted to CNF generates the same language as the original CFG.

• CNF is used as a preprocessing step for many CFG algorithms, like CYK (a membership algorithm), bottom-up parsers, etc.

• Deriving a string w of length n requires exactly 2n - 1 production steps in CNF (n - 1 applications of A → BC and n applications of A → a).

• Any context-free grammar that does not have ε in its language has an equivalent CNF.

1. Chomsky Normal Form (CNF)

A context free grammar (CFG) is in Chomsky Normal Form (CNF) if all production rules satisfy one of
the following conditions:

• A non-terminal generating a terminal (e.g., X → x)

• A non-terminal generating two non-terminals (e.g., X → YZ)

• The start symbol generating ε (e.g., S → ε)

In Chomsky Normal Form (CNF), every production rule in the grammar is of one of the following two
types:

1. A→BC (where A, B, and C are non-terminal symbols, and B and C are not start symbols)

2. A→ a (where A is a non-terminal and a is a terminal symbol)

Additionally, there can be a special rule S→ϵ if the language contains the empty string.

Why Chomsky Normal Form is useful:

• It simplifies many parsing algorithms (like the CYK algorithm for determining if a string
belongs to a context-free language).

• It restricts the structure of the grammar, which can be useful in theoretical proofs and
automatic processing.
How to convert a CFG to CNF:

1. Eliminate ε-productions: Remove all productions of the form A→ϵ (where ϵ is the empty
string) unless A is the start symbol.

2. Eliminate unit productions: Replace rules of the form A→B (where both A and B are non-
terminals) with equivalent rules.

3. Convert terminals in long productions: If a rule has both terminal and non-terminal symbols
or more than two non-terminal symbols, break it down into multiple two-symbol rules.


Steps for converting CFG into CNF

Step 1: Eliminate the start symbol from the RHS. If the start symbol S appears on the right-hand side of any
production, create a new production as:

1. S1 → S

Where S1 is the new start symbol.

Step 2: In the grammar, remove the null, unit and useless productions. You can refer to
the Simplification of CFG.

Step 3: Eliminate terminals from the RHS of the production if they exist with other non-terminals or
terminals. For example, production S → aA can be decomposed as:

1. S → RA

2. R → a

Step 4: Eliminate RHS with more than two non-terminals. For example, S → ASB can be decomposed
as:

1. S → RB

2. R → AS

Example:
Convert the given CFG to CNF. Consider the given grammar G1:

1. S → a | aA | B

2. A → aBB | ε

3. B → Aa | b

Solution:

Step 1: We will create a new production S1 → S, so that the original start symbol S never appears on the right-hand side of any production. The
grammar will be:

1. S1 → S

2. S → a | aA | B

3. A → aBB | ε

4. B → Aa | b

Step 2: As grammar G1 contains A → ε null production, its removal from the grammar yields:

1. S1 → S

2. S → a | aA | B

3. A → aBB

4. B → Aa | b | a

Now, as grammar G1 contains the unit production S → B, its removal yields:

1. S1 → S

2. S → a | aA | Aa | b

3. A → aBB

4. B → Aa | b | a

Also remove the unit production S1 → S; its removal from the grammar yields:

1. S1 → a | aA | Aa | b

2. S → a | aA | Aa | b

3. A → aBB

4. B → Aa | b | a

Step 3: In the production rules S1 → aA | Aa, S → aA | Aa, A → aBB and B → Aa, the terminal a exists on the
RHS together with non-terminals. So we will replace terminal a with a new non-terminal X:

1. S1 → a | XA | AX | b

2. S → a | XA | AX | b

3. A → XBB

4. B → AX | b | a
5. X → a

Step 4: In the production rule A → XBB, RHS has more than two symbols, removing it from grammar
yield:

1. S1 → a | XA | AX | b

2. S → a | XA | AX | b

3. A → RB

4. B → AX | b | a

5. X → a

6. R → XB

Hence, for the given grammar, this is the required CNF.
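As a quick sanity check, here is a small hedged Python sketch (the dict-of-strings grammar encoding, and writing S1 as the single letter "T", are assumptions made purely for this sketch) that verifies every production of the final grammar has one of the CNF shapes listed earlier:

```python
def is_cnf(grammar: dict, start: str) -> bool:
    """Check that every production is A -> BC, A -> a, or S -> epsilon.

    `grammar` maps each non-terminal (an upper-case letter here) to a
    list of right-hand sides; terminals are lower-case letters and ""
    encodes epsilon. (The extra restriction that the start symbol not
    appear inside an A -> BC body is not checked by this sketch.)
    """
    nonterminals = set(grammar)
    for head, bodies in grammar.items():
        for body in bodies:
            if body == "":                                   # epsilon rule
                if head != start:
                    return False
            elif len(body) == 1:                             # must be A -> a
                if body in nonterminals:
                    return False                             # unit production
            elif len(body) == 2:                             # must be A -> BC
                if not all(s in nonterminals for s in body):
                    return False
            else:
                return False                                 # RHS too long
    return True

# The final grammar above, with S1 written as "T" so every symbol is one letter:
g = {
    "T": ["a", "XA", "AX", "b"],
    "S": ["a", "XA", "AX", "b"],
    "A": ["RB"],
    "B": ["AX", "b", "a"],
    "X": ["a"],
    "R": ["XB"],
}
assert is_cnf(g, start="T")
```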

The pumping lemma for CFGs

Theorem

• Any context free language may be generated by a context free grammar in Chomsky Normal Form

• To show how this is possible we must be able to convert any CFG into CNF

1. Eliminate all ε rules of the form A -> ε

2. Eliminate all unit rules of the form A -> B

3. Convert any remaining rules into the form A -> BC

Why is this useful?

• Because we know lots of things about binary trees

• We can now apply these things to context-free grammars since any CFG can be placed into the CNF
format

• For example: if the yield of a parse tree is a terminal string w, and n is the height of the tree (the length of its longest path), then |w| ≤ 2^(n-1), because a binary tree of height n has at most 2^(n-1) leaves.

The Pumping Lemma for CFL’s

• The result above (|w| ≤ 2^(n-1)) lets us define the pumping lemma for CFL's

• The pumping lemma gives us a technique to show that certain languages are not context free – Just
like we used the pumping lemma to show certain languages are not regular – But the pumping
lemma for CFL’s is a bit more complicated than the pumping lemma for regular languages

• Informally – The pumping lemma for CFL’s states that for sufficiently long strings in a CFL, we can
find two, short, nearby substrings that we can “pump” in tandem and the resulting string must also
be in the language.
Pumping Lemma for Context-free Languages (CFL): the pumping lemma for CFLs states that for any context-free language L, every sufficiently long string of L can be broken into five parts such that the second and fourth parts can be 'pumped' any number of times with the result still in L. It is used as a tool to prove that a language is not context-free: if even one sufficiently long string of the language fails the conditions, the language is not a CFL.

Formally: if L is a CFL, there exists an integer n such that for every z ∈ L with |z| ≥ n, there exist u, v, w, x, y ∈ Σ* with z = uvwxy, and

(1) |vwx| ≤ n

(2) |vx| ≥ 1

(3) for all i ≥ 0: u v^i w x^i y ∈ L

For example, {0^k 1^k | k ≥ 0} is a CFL, as any sufficiently long string can be pumped at two places, one among the 0's and the other among the 1's. Let us prove that L012 = {0^k 1^k 2^k | k ≥ 0} is not context-free. Assume L012 is context-free; then the pumping lemma applies with some constant n. Let z = 0^n 1^n 2^n, so z ∈ L012 and |z| ≥ n, and let u, v, w, x, y satisfy (1)-(3). We show that no choice of u, v, w, x, y satisfies all of (1)-(3). Since |vwx| ≤ n, vwx cannot contain both 0's and 2's: either vwx has no 0's, or vwx has no 2's, giving two cases to consider. Suppose vwx has no 0's. By (2), vx contains a 1 or a 2. Thus uwy has n 0's, and uwy either has fewer than n 1's or fewer than n 2's. But (3) with i = 0 tells us that uwy = u v^0 w x^0 y ∈ L012, so uwy must have equal numbers of 0's, 1's and 2's, a contradiction. The case where vwx has no 2's is similar and also gives a contradiction. Thus L012 is not context-free. Source: John E. Hopcroft, Rajeev Motwani, Jeffrey D. Ullman (2003). Introduction to Automata Theory, Languages, and Computation.

Pumping Lemma for Context Free Languages


Context-free grammar generates context-free languages (CFLs). The set of all context-free languages
is the same as the set of languages accepted by the pushdown Automata. The set of regular
languages is a subset of context-free languages.

Pumping Lemma for CFL states that for any Context-Free Language L, it is possible to find two
substrings that can be 'pumped' any number of times and still be in the same language. We break its
strings into five parts for any language L and pump the second and fourth substrings. If any string
does not satisfy its conditions, then the language is not CFL.

Also, see - Pumping Lemma in Theory of Computation

Theorem

If L is a context-free language, there is a constant n that depends only on L, such that if w ∈ L
and |w| ≥ n, then w can be divided into five pieces, w = uvxyz, meeting the following requirements.

• |vxy| ≤ n

• |vy| ≥ 1 (that is, vy ≠ ε)

• For all k ≥ 0, the string u v^k x y^k z ∈ L

Steps to apply pumping lemma

• Assume that L is context-free.

• Let the pumping length be n.

• All strings of length at least n can be pumped: |w| ≥ n.

• Find a string w in L such that |w| ≥ n.

• Consider every way that w can be divided into uvxyz with |vxy| ≤ n and |vy| ≥ 1.

• For each such division, show that u v^k x y^k z ∉ L for some constant k, i.e., that no division satisfies all three pumping conditions simultaneously.

• Then w cannot be pumped: a contradiction, so L is not context-free. (A positive example, pumping a string that is in a CFL, is sketched below.)
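To see the conditions in action on a language that is context-free, this hedged Python sketch pumps one valid decomposition of w = aaabbb in L = { a^n b^n } (this particular split is one choice among many):

```python
def pump(u: str, v: str, x: str, y: str, z: str, k: int) -> str:
    """Build u v^k x y^k z, the pumped string from the lemma."""
    return u + v * k + x + y * k + z

# w = aaabbb is in L = { a^n b^n }; one split with |vxy| <= n and |vy| >= 1:
u, v, x, y, z = "aa", "a", "", "b", "bb"
for k in range(5):
    s = pump(u, v, x, y, z, k)
    n_a, n_b = s.count("a"), s.count("b")
    assert n_a == n_b and s == "a" * n_a + "b" * n_b   # every pumped string stays in L
```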

Closure properties of CFLs.

The closure properties of Context-Free Languages (CFLs) refer to how CFLs behave under various
operations. Let's go through the main closure properties of CFLs:

1. Union
• CFLs are closed under union.

o If L₁ and L₂ are context-free languages, then L₁ ∪ L₂ is also a context-free language.

o You can construct a new context-free grammar (CFG) by combining the grammars of
L₁ and L₂, creating a new start symbol that produces either the start symbol of L₁ or
L₂.

o In short: if L1 and L2 are two context-free languages, their union L1 ∪ L2 will also be context-free (a small construction sketch follows below).
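A minimal sketch of the union construction (the dict-of-lists grammar encoding and the fresh start symbol name "U" are assumptions of this sketch):

```python
def union_cfg(g1, s1, g2, s2, fresh="U"):
    """Build a CFG for L1 ∪ L2 from CFGs for L1 and L2.

    Grammars are dicts mapping a non-terminal to a list of right-hand
    sides, each a list of symbols. Assumes the non-terminal sets of g1
    and g2 are disjoint (rename first if not) and `fresh` is unused.
    """
    combined = {fresh: [[s1], [s2]]}   # U -> S1 | S2
    combined.update(g1)
    combined.update(g2)
    return combined, fresh

# Example: g1 generates a^n b^n, g2 generates c*
g1 = {"S1": [["a", "S1", "b"], []]}    # S1 -> a S1 b | epsilon
g2 = {"S2": [["c", "S2"], []]}         # S2 -> c S2 | epsilon
g, start = union_cfg(g1, "S1", g2, "S2")
# g now has the productions U -> S1 | S2 plus all rules of g1 and g2.
```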

2. Concatenation

• CFLs are closed under concatenation.

o If L₁ and L₂ are CFLs, then L₁L₂ (strings formed by concatenating a string from L₁ and a
string from L₂) is also a CFL.

o A new CFG can be constructed by using the start symbols of the grammars of L₁ and
L₂ to form the concatenated language.

o Concatenation: If L1 and L2 are two context-free languages, their concatenation L1.L2 will also be context-free.

3. Kleene Star

• CFLs are closed under Kleene star.

o If L is a CFL, then L* (zero or more concatenations of strings from L) is also a CFL.

o A CFG can be constructed for L* by allowing recursive applications of the grammar of L.

o If L1 is context-free, its Kleene closure L1* will also be context-free.

4. Intersection with Regular Languages

• CFLs are closed under intersection with regular languages.

o If L₁ is a CFL and L₂ is a regular language, then L₁ ∩ L₂ is a CFL.

o This works because a PDA for L₁ and a DFA for L₂ can be run in parallel (a product construction on their states), yielding a PDA that recognizes the intersection.

5. Complement

• CFLs are not closed under complement.

o If L is a CFL, the complement of L (i.e., all strings not in L) is not necessarily a CFL.

o One way to see this: CFLs are closed under union, so if they were also closed under complement, De Morgan's laws would make them closed under intersection, which they are not (see below).

6. Intersection
• CFLs are not closed under intersection (except with regular languages, as noted above).

o If L₁ and L₂ are both CFLs, L₁ ∩ L₂ is not necessarily a CFL.

o For example, L₁ = { a^n b^n c^m } and L₂ = { a^m b^n c^n } are both context-free, but their intersection is { a^n b^n c^n }, which the pumping lemma for context-free languages shows is not a CFL.

7. Reversal

• CFLs are closed under reversal.

o If L is a CFL, then the reversal of L (the language consisting of the reverse of every
string in L) is also a CFL.

o This can be done by constructing a new CFG in which the right-hand side of every production rule is reversed.

8. Homomorphism

• CFLs are closed under homomorphism.

o A homomorphism is a function that replaces each symbol of a string with a string from another alphabet. If L is a CFL, applying a homomorphism to L results in another CFL, as in the small sketch below.
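For instance (the mapping h below is invented purely for illustration):

```python
# A homomorphism h : {0,1}* -> {a,b}* defined symbol by symbol.
h = {"0": "ab", "1": ""}    # h(0) = ab, h(1) = epsilon (an invented example)

def apply_hom(h: dict, w: str) -> str:
    """Apply a homomorphism by replacing each symbol of w with h(symbol)."""
    return "".join(h[ch] for ch in w)

assert apply_hom(h, "0110") == "abab"
```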

9. Inverse Homomorphism

• CFLs are closed under inverse homomorphism.

o If L is a CFL and h is a homomorphism, then the inverse homomorphism of L (the set of all strings whose image under h belongs to L) is also a CFL.

These closure properties help in understanding the computational capabilities and limitations of
context-free languages, especially in parsing and designing compilers or automata.

UNIT V
The Turing machine

A Turing machine (TM) is a mathematical model of computation used to explore the limits of what can be computed. It is a theoretical device that performs computations by reading and writing symbols on an infinite tape.

Here are some characteristics of a Turing machine:

• Tape: The tape is divided into cells, each of which can hold a single symbol from a finite set
of symbols. The tape is theoretically infinite.

• Head: The machine has a reading head that moves left and right across the cells.

• State register: The state register stores the state of the Turing machine.

• Instructions: The machine has a finite set of instructions (its transition function) that specifies, for each state and scanned symbol, the next action.

• Input: The input is written on the tape as a finite string of input symbols (in Turing's original formulation, numbers were written in unary).

• Acceptance: If the TM reaches a final (accepting) state, the input string is accepted. Otherwise, it is rejected.

Alan Turing invented the Turing machine in 1936. He called it an "a-machine" (automatic
machine). Turing machines proved that there are fundamental limitations on the power of
mechanical computation.

The Turing machine was invented in 1936 by Alan Turing. It is an accepting device which accepts the recursively enumerable languages, generated by type-0 grammars.

There are various features of the Turing machine:

1. It has an external memory which can remember an arbitrarily long sequence of input.

2. It has unlimited memory capability.

3. The head can move left or right over the tape, so input anywhere on the tape can be read easily.

4. The machine can produce a certain output based on its input. Sometimes it may be required that the same input be used to generate the output, so in this machine the distinction between input and output has been removed; a common alphabet is used for both.

Formal definition of Turing machine

A Turing machine can be defined as a collection of 7 components:

Q: the finite set of states
∑: the finite set of input symbols
T: the finite set of tape symbols
q0: the initial state
F: the set of final states
B: the blank symbol, a tape symbol that fills every cell not holding input
δ: the transition (mapping) function, δ: Q × T → Q × T × {L, R}.
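As a runnable illustration of this 7-tuple (the state names, the marker symbols X and Y, and the sparse-tape representation are all assumptions of this sketch), here is a Python simulation of a TM accepting L = { 0^n 1^n | n ≥ 0 }:

```python
def run_tm(tape_input: str) -> bool:
    """Simulate a deterministic TM accepting L = { 0^n 1^n | n >= 0 }.

    delta maps (state, symbol) -> (new state, written symbol, move),
    with 'B' as the blank symbol; a missing entry means "reject".
    """
    delta = {
        ("q0", "0"): ("q1", "X", +1),  # mark a 0 and go look for its 1
        ("q0", "Y"): ("q3", "Y", +1),  # all 0's marked: verify the rest
        ("q0", "B"): ("acc", "B", +1), # empty input: accept
        ("q1", "0"): ("q1", "0", +1),  # scan right over unmarked 0's
        ("q1", "Y"): ("q1", "Y", +1),  # ... and already-marked 1's
        ("q1", "1"): ("q2", "Y", -1),  # mark the matching 1, turn back
        ("q2", "0"): ("q2", "0", -1),  # scan left ...
        ("q2", "Y"): ("q2", "Y", -1),
        ("q2", "X"): ("q0", "X", +1),  # back at the marker: repeat
        ("q3", "Y"): ("q3", "Y", +1),  # only Y's may remain ...
        ("q3", "B"): ("acc", "B", +1), # ... then blank: accept
    }
    tape = dict(enumerate(tape_input))   # sparse tape, blank by default
    state, head = "q0", 0
    while state != "acc":
        symbol = tape.get(head, "B")
        if (state, symbol) not in delta:
            return False                 # no transition defined: reject
        state, written, move = delta[(state, symbol)]
        tape[head] = written
        head += move
    return True

assert run_tm("") and run_tm("01") and run_tm("0011")
assert not run_tm("001") and not run_tm("10")
```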
Extensions to the basic Turing machine model have been developed to explore various aspects of
computational power, efficiency, and expressiveness. These extensions help us understand practical
limitations, alternative computation models, and the boundaries of what can be computed. Here are
some of the most significant extensions to the basic Turing machine:

1. Multi-Tape Turing Machine

• Description:

o In a multi-tape Turing machine, there are multiple tapes, each with its own head that
can read, write, and move independently.

• Behavior:

o At each step, the machine reads the symbols from all tapes and can write on any of
the tapes and move the heads independently.

• Purpose:

o Multi-tape machines can simulate basic single-tape Turing machines, but they are
often more efficient in terms of time complexity because they allow more direct
access to different portions of data.

• Equivalence:

o While a multi-tape Turing machine is computationally equivalent to a single-tape machine (they can recognize the same set of languages), a multi-tape machine can often solve problems faster than a single-tape machine (for example, in linear time instead of quadratic time).

2. Non-Deterministic Turing Machine (NTM)

• Description:

o A non-deterministic Turing machine allows multiple possible transitions from a single state and symbol. This means it can explore many computation paths simultaneously.

• Behavior:

o At each step, the machine can choose from several possible actions (write symbols,
change states, move left or right). The machine "accepts" the input if at least one
sequence of choices leads to an accepting state.

• Purpose:

o This extension explores non-deterministic algorithms, which are helpful in theoretical analysis of complexity classes like NP (nondeterministic polynomial time).

• Equivalence:
o A non-deterministic Turing machine is computationally equivalent to a deterministic
Turing machine in terms of recognizing the same set of languages, but it helps in
understanding problems for which no efficient deterministic solution is known.

3. Multi-Dimensional Turing Machine

• Description:

o In a multi-dimensional Turing machine, the tape is no longer one-dimensional. It could be two-dimensional, three-dimensional, or higher-dimensional.

• Behavior:

o The tape head moves in multiple directions (e.g., up, down, left, right in two
dimensions) instead of just left or right.

• Purpose:

o This model can represent certain types of computations more naturally, such as
those involving two-dimensional arrays or matrices.

• Equivalence:

o Multi-dimensional Turing machines are equivalent to single-tape Turing machines in terms of computational power, although multi-dimensional machines can sometimes lead to more efficient algorithms for specific problems.

4. Off-Line Turing Machine

• Description:

o In this model, there are multiple tapes, but only one is used for input, and the other
tapes are used for working (memory).

• Behavior:

o The input tape is read-only, while the working tapes can be both read and written to.

• Purpose:

o This model simulates how real computers use read-only memory (ROM) and working
memory (RAM), separating input from processing.

• Equivalence:

o The off-line Turing machine is computationally equivalent to the standard model but
allows better simulation of real-world computations involving fixed input and
variable memory.

5. Oracle Turing Machine

• Description:
o An oracle Turing machine has access to an "oracle" – a black box that can instantly
solve specific problems (e.g., a decision problem).

• Behavior:

o The machine can query the oracle at any step of the computation to get an answer
for a specific question. The oracle's response is provided in a single step.

• Purpose:

o Oracle machines help explore the idea of relativized complexity classes like P^NP
(polynomial time with access to an NP oracle) and PH (polynomial hierarchy).

• Equivalence:

o An oracle Turing machine extends the idea of computation by introducing an external "helper" that solves certain problems, making it useful for theoretical studies of complexity classes.

6. Probabilistic Turing Machine

• Description:

o A probabilistic Turing machine is a non-deterministic machine where each possible move has a certain probability.

• Behavior:

o At each step, the machine chooses between possible transitions based on probabilities. The machine accepts or rejects based on the likelihood of reaching an accepting state.

• Purpose:

o This machine model is used to study algorithms that rely on randomness, such as
those in randomized complexity classes like BPP (bounded-error probabilistic
polynomial time).

• Equivalence:

o It models computations where randomness plays a key role and helps explore
problems solvable efficiently by probabilistic methods.

7. Quantum Turing Machine (QTM)

• Description:

o A quantum Turing machine operates based on the principles of quantum mechanics. It uses quantum bits (qubits) instead of classical bits, allowing superposition and entanglement.

• Behavior:
o Transitions between states involve quantum operations, and computation can
proceed in parallel over a superposition of different states. The outcome is
probabilistic, and the machine halts after measuring the qubits.

• Purpose:

o This model helps in understanding quantum computation and forms the theoretical
basis for real-world quantum computers.

• Equivalence:

o The quantum Turing machine extends the capabilities of classical Turing machines
and is believed to solve certain problems (like factoring large numbers) much faster
than classical machines (see Shor's algorithm).

8. Alternating Turing Machine (ATM)

• Description:

o An alternating Turing machine combines both existential states (like a non-deterministic machine) and universal states (where all branches must lead to acceptance).

• Behavior:

o In each state, the machine alternates between "existential" choices (where it accepts
if at least one path leads to acceptance) and "universal" choices (where it accepts
only if all paths lead to acceptance).

• Purpose:

o This model is useful for exploring complexity classes like AP (alternating polynomial
time) and PSPACE.

• Equivalence:

o Alternating Turing machines extend the non-deterministic model and are used to
simulate complex decision processes involving both existential and universal
quantification.

9. Linear Bounded Automaton (LBA)

• Description:

o A linear bounded automaton is a Turing machine with a tape whose length is bounded by a function of the input size (typically linear in the input size).

• Behavior:

o The machine operates similarly to a standard Turing machine but can only use a fixed
portion of the tape for computation.

• Purpose:
o LBAs are used to study problems in context-sensitive languages and serve as a
bridge between finite automata and unrestricted Turing machines.

• Equivalence:

o While less powerful than full Turing machines, LBAs are more powerful than
pushdown automata, making them useful for understanding languages that require
more memory than context-free languages.

10. Turing Machine with Multiple Heads

• Description:

o This extension involves a machine with multiple tape heads on a single tape.

• Behavior:

o The multiple heads can independently read from and write to different parts of the
tape.

• Purpose:

o Multiple heads allow for more efficient parallel processing of different sections of
data, simulating multitasking in computation.

• Equivalence:

o This model is computationally equivalent to a standard Turing machine but may offer time complexity improvements for certain problems.

Turing machines and computers

What is a Turing machine?

A Turing machine can be defined as a computer which can only ever exist theoretically. It has an infinite tape (unbounded memory) but only a finite set of states. A Turing machine is a hypothetical machine defined by a precise mathematical model so that it functions as an abstract computing device. To say a programming language is Turing complete simply means it can simulate, or function equivalently to, a Turing machine.

Turing Machine was invented by Alan Turing in 1936 and it is used to accept Recursive Enumerable
Languages (generated by Type-0 Grammar).

Turing machines are a fundamental concept in the theory of computation and play an important role
in the field of computer science. They were first described by the mathematician and computer
scientist Alan Turing in 1936 and provide a mathematical model of a simple abstract computer.

In the context of automata theory and the theory of computation, Turing machines are used to study
the properties of algorithms and to determine what problems can and cannot be solved by
computers. They provide a way to model the behavior of algorithms and to analyze their
computational complexity, which is the amount of time and memory they require to solve a problem.

A Turing machine is a finite-state machine equipped with an infinitely long tape on which it can read, write, and erase symbols. The tape is divided into squares, and each square contains a symbol. The Turing machine can only read one symbol at a time, and it uses a set of rules (the transition function) to determine its next action based on the current state and the symbol it is reading.

The Turing machine’s behavior is determined by a finite state machine, which consists of a finite set
of states, a transition function that defines the actions to be taken based on the current state and the
symbol being read, and a set of start and accept states. The Turing machine begins in the start state
and performs the actions specified by the transition function until it reaches an accept or reject
state. If it reaches an accept state, the computation is considered successful; if it reaches a reject
state, the computation is considered unsuccessful.

Turing machines are an important tool for studying the limits of computation and for understanding
the foundations of computer science. They provide a simple yet powerful model of computation that
has been widely used in research and has had a profound impact on our understanding of algorithms
and computation.

A Turing machine consists of a tape of infinite length on which read and write operations can be performed. The tape consists of infinite cells, each of which contains either an input symbol or a special symbol called blank. It also has a head pointer which points to the cell currently being read, and the head can move in both directions.

Figure: Turing Machine


A TM is expressed as a 7-tuple (Q, T, B, ∑, δ, q0, F) where:

• Q is a finite set of states

• T is the tape alphabet (symbols which can be written on Tape)

• B is blank symbol (every cell is filled with B except input alphabet initially)

• ∑ is the input alphabet (symbols which are part of input alphabet)

• δ is a transition function which maps Q × T → Q × T × {L, R}. Depending on its present state and the tape symbol under the head, the machine moves to a new state, rewrites the tape symbol (or leaves it unchanged), and moves the head either left or right.

• q0 is the initial state

• F is the set of final states. If any state of F is reached, input string is accepted.

Here’s a comparison between a Turing machine and a computer, organized in a table:

| Feature       | Turing Machine                                  | Computer                                           |
|---------------|-------------------------------------------------|----------------------------------------------------|
| Concept       | Abstract theoretical model of computation       | Physical device for executing programs             |
| Memory        | Infinite tape for storage and manipulation      | Finite memory (RAM and storage)                    |
| Operation     | Reads and writes symbols based on rules         | Executes instructions from software                |
| Functionality | Simulates any computable function               | Performs specific tasks and runs applications      |
| Complexity    | Purely mechanical, can be complex theoretically | Practical, optimized for speed and usability       |
| Purpose       | Used in theoretical computer science            | Used for practical applications in various fields  |
| Limitations   | Not practical for real-world problems           | Limited by hardware specifications                 |

https://steemit.com/computation/@dana-edwards/real-computers-vs-turing-machines
