0% found this document useful (0 votes)
6 views68 pages

Toc Unit 1

Uploaded by

Kashish Bhalla
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
6 views68 pages

Toc Unit 1

Uploaded by

Kashish Bhalla
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 68

Theory of Computation

Presented By
Dr Saravanan S ME,Ph.D
School of Computing Science and Engineering
VIT Bhopal University, Bhopal
Theory of Automata

• Theory of automata is a theoretical branch of


computer science and mathematical. It is the
study of abstract machines and the
computation problems that can be solved using
these machines. The abstract machine is called
the automata. The main motivation behind
developing the automata theory was to develop
methods to describe and analyse the dynamic
behaviour of discrete systems.
• The State is represented by circles, and
the Transitions is represented by arrows.
Symbols:
• Symbols are an entity or individual objects,
which can be any letter, alphabet or any
picture.
Example:
1, a, b, #
Alphabets:
• Alphabets are a finite set of symbols. It is denoted by ∑.
• Examples:
• ∑ = {a, b}

• ∑ = {A, B, C, D}

• ∑ = {0, 1, 2}

• ∑ = {0, 1, ....., 5]

• ∑ = {#, β, Δ}
String:
• It is a finite collection of symbols from the alphabet. The string is
denoted by w.
Example 1:
• If ∑ = {a, b}, various string that can be generated from ∑ are {ab,
aa, aaa, bb, bbb, ba, aba.....}.
• A string with zero occurrences of symbols is known as an empty
string. It is represented by ε.
• The number of symbols in a string w is called the length of a string.
It is denoted by |w|.
• Example 2:
w = 010
Number of Sting |w| = 3
Language:
• A language is a collection of appropriate
string. A language which is formed over Σ can
be Finite or Infinite.
• Example: 1
L1 = {Set of string of length 2} = {aa, bb, ba,
bb}
L2 = {Set of all strings starts with 'a'} = {a, aa,
aaa, abb, abbb, ababb}
Finite Automata

• Finite automata are used to recognize patterns.


• It takes the string of symbol as input and changes
its state accordingly. When the desired symbol is
found, then the transition occurs.
• At the time of transition, the automata can either
move to the next state or stay in the same state.
• Finite automata have two states, Accept
state or Reject state. When the input string is
processed successfully, and the automata reached
its final state, then it will accept.
Formal Definition of FA

• A finite automaton is a collection of 5-tuple


(Q, ∑, δ, q0, F), where:
• Q: finite set of states
• ∑: finite set of the input symbol
• q0: initial state
• F: final state
• δ: Transition function
Finite Automata Model

• Finite automata can be represented by input tape


and finite control.
• Input tape: It is a linear tape having some
number of cells. Each input symbol is placed in
each cell.
• Finite control: The finite control decides the next
state on receiving particular input from input tape.
The tape reader reads the cells one by one from
left to right, and at a time only one input symbol
is read.
Transition Diagram
Transition Table

Present State Next state for Next State of Input 1


Input 0

→q0 q1 q2

q1 q0 q2

*q2 q2 q2
Types of Automata
Deterministic Finite Automata(DFA)
• DFA refers to deterministic finite automata.
Deterministic refers to the uniqueness of the
computation. The finite automata are called
deterministic finite automata if the machine is
read an input string one symbol at a time.
• In DFA, there is only one path for specific input
from the current state to the next state.
• DFA does not accept the null move, i.e., the DFA
cannot change state without any input character.
• DFA can contain multiple final states. It is used in
Lexical Analysis in Compiler
Formal Definition of DFA

A DFA is a collection of 5-tuples same as we


described in the definition of FA
• Q: finite set of states
• ∑: finite set of the input symbol
• q0: initial state
• F: final state
• δ: Transition function Q x ∑→Q
Example 1

• Q = {{q0, q1, q2} ,∑ = {0, 1},


δ ,q0 = {q0} ,F = {q2} }= ?

State 0 1

→q0 q0 q1

q1 q2 q1

*q2 q2 q2
Example 2

DFA with ∑ = {0, 1} accepts all starting with 0.


Example 3

DFA with ∑ = {0, 1} accepts all ending with 0.


Non-Deterministic Finite
Automata(NFA or NDFA)
• NFA stands for non-deterministic finite automata.
It is easy to construct an NFA than DFA for a
given regular language.
• The finite automata are called NFA when there
exist many paths for specific input from the
current state to the next state.
• Every NFA is not DFA, but each NFA can be
translated into DFA.
• NFA is defined in the same way as DFA but with
the following two exceptions, it contains multiple
next states, and it contains ε transition.
Example
Formal definition of NFA

A DFA is a collection of 5-tuples same as we


described in the definition of FA
• Q: finite set of states
• ∑: finite set of the input symbol
• q0: initial state
• F: final state
• δ: Transition function δ: Q x ∑ →2Q
Example

Present State Next state for Next State of


Input 0 Input 1

→q0 q0, q1 q1
q1 q2 q0
*q2 q2 q1, q2
Example 2
Present State 0 1

→q0 q0, q1 q0, q2


q1 q3 ε
q2 q2, q3 q3
*q3 q3 q3
Example 3
Problem 1
• Let M =({ q0,q1,q2,q3},{0,1},δ,q0,q0) and the
transition functions are
δ (q0, 0) =q2 δ (q2, 0) =q0
δ (q0, 1) =q1 δ (q2, 1) =q3
δ (q1, 0) =q3 δ (q3, 0) =q1
δ (q1, 1) =q0 δ (q3, 1) =q2
a) Represent M by its State table
b) Represents M by its State diagram
c) Which of the following strings are accepted by
1)0101 2) 110101 3) 01001
Problem 2

• String are{ 001, 1001,01101}


Conversion from NFA to DFA

Let, M = (Q, ∑, δ, q0, F) is an NFA which accepts


the language L(M). There should be equivalent
DFA denoted by M' = (Q', ∑', q0', δ', F') such that
L(M) = L(M').
Steps for converting NFA to DFA

• Step 1: Initially Q' = ϕ


• Step 2: Add q0 of NFA to Q'. Then find the
transitions from this start state.
• Step 3: In Q', find the possible set of states for
each input symbol. If this set of states is not in
Q', then add it to Q'.
• Step 4: In DFA, the final state will be all the
states which contain F(final states of NFA)
Example 1: Convert the given NFA to DFA.

State 0 1

→q0 {q0, q1} {q1}

*q1 ϕ {q0, q1}


Solution
State 0 1

→[q0] [q0, q1] [q1]

*[q1] ϕ [q0, q1]

*[q0, q1] [q0, q1] [q0, q1]


Conversion from NFA with ε to DFA

Steps for converting NFA with ε to DFA:


Step 1: We will take the ε-closure for the starting state of NFA
as a starting state of DFA.
Step 2: Find the states for each input symbol that can be
traversed from the present. That means the union of transition
value and their closures for each state of NFA present in the
current state of DFA.
Step 3: If we found a new state, take it as current state and
repeat step 2.
Step 4: Repeat Step 2 and Step 3 until there is no new state
present in the transition table of DFA.
Step 5: Mark the states of DFA as a final state which contains
the final state of NFA.
Example

Convert the given NFA into its equivalent DFA


Solution
• ε-closure(q0) = {q0, q1, q2}
• ε-closure(q1) = {q1, q2}
• ε-closure(q2) = {q2}
• δ'(A, 0) = A
• δ'(A, 1) = B
• δ'(A, 2) = C
• δ'(B, 0) = ϕ
• δ'(B, 1) = B
• δ'(B, 2) = C
Minimization of DFA

• Step 1: Remove all the states that are


unreachable from the initial state via any set
of the transition of DFA.
• Step 2: Draw the transition table for all pair of
states.
• Step 3: Now split the transition table into two
tables T1 and T2. T1 contains all final states,
and T2 contains non-final states.
• Step 4: Find similar rows from T1 such that:
COMPILER
DEFINITION
 Compiler is a program that reads a program written in
one language – source program – and translate it into
an equivalent program in another language – target
language
 Compiler reports to its user the presence of error in
source program

source program Target program


COMPILER

Error Message
Types of Compiler

There are 4 types


– Single – Pass
– Multi- Pass
– Load and Go
– Debugging
Analysis – Synthesis Model

There are 2 pars to compilation


1.Analysis Model
 breaks up the source program into
constituent pieces and creates an
intermediate representation of the source
program
 Tree Syntax tree
Analysis – Synthesis model
2.Synthesis Model
Constructs the desired target program from the
intermediate representation
Software Tools :
 Structural Editors
Pretty Printers
Static Checkers
Interpreters
Text formatters
Silicon compilers
Query interpreters
Context of compilers

Figure 1. A Language-Processing System


Analysis of the source program
Analysis consists of 3 phases
1.Linear analysis
 source program is read from left-to-right
 Grouped into tokens  sequences of
characters having a collective meaning.
2.Hierarchical analysis
 tokens are grouped hierarchically into
nested collections with collective meaning.

3.semantic analysis
 to ensure that the components of a
program fit together meaningfully.
Phases of a Compiler
Source Program

1
Lexical Analyzer

2
Syntax Analyzer

3
Semantic Analyzer

Symbol-table Manager Error Handler


4 Intermediate Code
Generator

5
Code Optimizer

6
Code Generator

Target Program
Lexical Analysis
• Also called scanning
Scans Input & Identify tokens
Removes White spaces and comments

An Example
Position := initial + rate * 60 ;

All are tokens


Token

• Three types of tokens:


• Terminal Symbols (TRM)-
• Keywords and Operators, Literals (LIT),
• Identifiers (IDN).
Example
• int a = 10;
Tokens
int (keyword),
a(identifier),
=(operator),
10(constant)
;(punctuation-semicolon)
Total token =5
Example 2
int main()
{
// printf() sends the string inside quotation to
// the standard output (the display)
printf("Welcome to GeeksforGeeks!"); return 0;
}
• Tokens 'int', 'main', '(', ')', '{', 'printf', '(', '
"Welcome to GeeksforGeeks!" ', ')', ';', 'return',
'0', ';', '}‘

Total number of tokens = 14


Lexeme

• It is a sequence of characters in the source


code that are matched by given predefined
language rules for every lexeme to be specified
as a valid token. int, goto,main
Example
main is lexeme of type identifier(token)
(,),{,} are lexemes of type punctuation(token)
Pattern

• It specifies a set of rules that a scanner follows


to create a token.
• It must start with the alphabet, followed by
the alphabet or a digit.
• The sequence of characters that make the
keyword.
Example
(, ), {, }
LEX

• Lex is a program that generates lexical


analyzer. It is used with YACC parser
generator.
• The lexical analyzer is a program that
transforms an input stream into a sequence of
tokens.
The function of Lex is as follows:
• Lex file format
{definitions } %% { rules } %% { user subroutines }
Syntax Analysis
 The Syntactic Analysis is also called Parsing.

 Tokens are grouped into grammatical phrases represented by a Parse Tree

 The hierarchical structure is expressed by recursive rules, called


Productions.

Parse Tree: An Example


Semantic Analysis
 The Semantic Analysis phase checks the program for
semantic errors (Type Checking) and gathers type information
for the successive phases.

 Type Checking: Check types of operands .No real number as


index for array; etc.
Intermediate Code Generation
 An intermediate code is generated as a program for an abstract machine.

 The intermediate code should be easy to translate into the target program.

 As intermediate code we consider the three-address code, similar to


assembly:

• sequence of instructions with at most three operands such that:

 There is at most one operator, in addition to the assignment.

• make explicit the operators precedence.

 Temporary names must be generated to compute intermediate


operations.
Code Optimization
 This phase attempts to improve the intermediate code so that
faster-running

• machine code can be obtained.

 Different compilers adopt different optimization techniques.

Code Generation
This phase generates the target code consisting of assembly code.
1. Memory locations are selected for each variable;
2. Instructions are translated into a sequence of assembly instructions;
3. Variables and intermediate results are assigned to memory registers.
The Structure of a Compiler
Code Generator
[Intermediate Code Generator]

Non-optimized Intermediate Code


Scanner
[Lexical Analyzer]

Tokens

Code Optimizer
Parser
[Syntax Analyzer]
Optimized Intermediate Code
Parse tree

Code Optimizer
Semantic Process
[Semantic analyzer] Target machine code

Abstract Syntax Tree w/ Attributes


Symbol Table Management
• A symbol table is a data structure containing a record for
each identifier, with fields for the attributes of the
identifier.

• The data structure allows us to find the record for each


identifier quickly and to store or retrieve data from that
record quickly.
Error Detection and Reporting
• Each phase encounter errors
 The lexical phase can detect errors where the characters
remaining in the input do not form any token of the language.
 Errors where the token stream violates the structure rules
(syntax)of the language are determined by the syntax analysis
phase.
 During semantic analysis the compiler tries to detect constructs
that have the right syntactic structure but no meaning to the
operation involved
 e.g., if we try to add two identifiers, one of which is the name
of an array, and the other the name of a procedure
Compiler-Construction Tools
Other names are
 Compiler – compiler
 Compiler – generators
 Translator – writing system

For compiler construction there are special tools like :


• 1- Scanner generators Eg.Lex and Flex
• 2- Parser generators Eg. Yacc and Bison
• 3- Syntax-directed translation engines
• 4- Code-generator generators
• 5- Data-flow analysis engines

You might also like