Compiler Design - Unit I

Uploaded by

Saisubramanian V

Compiler Design

Textbook:
Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman,
“Compilers: Principles, Techniques, and Tools”
Addison-Wesley, 1986.

Unit – I Syllabus

Compilers – Analysis of the source program – Phases of a compiler –
Cousins of the Compiler – Grouping of Phases – Compiler construction
tools – Lexical Analysis – Role of Lexical Analyzer – Input Buffering –
Specification of Tokens – Finite automaton – Deterministic finite
automaton – Non-deterministic finite automaton – Transition Tables –
Acceptance of Input Strings by Automata – State Diagrams and Regular
Expressions – Conversion of regular expression to NFA – Thompson’s
method – Conversion of NFA to DFA – Simulation of an NFA – Converting
regular expression directly to DFA – Minimization of DFA – Minimization
of NFA – Design of lexical analysis (LEX)
Jeya R 3

Compiler - Introduction
• A compiler is a program that can read a program in one language - the
source language - and translate it into an equivalent program in
another language - the target language.
• A compiler acts as a translator, transforming human-oriented
programming languages into computer-oriented machine languages.
• It lets the programmer ignore machine-dependent details.

COMPILERS
• A compiler is a program that takes a program written in a
source language and translates it into an equivalent
program in a target language.

source program → COMPILER → target program
                     ↓
               error messages

• The source program is normally written in a high-level programming language.
• The target program is normally the equivalent program in machine code (a relocatable object file).
Compiler vs Interpreter

• An interpreter is another common kind of language
processor. Instead of producing a target program as a
translation, an interpreter appears to directly execute
the operations specified in the source program on
inputs supplied by the user.
• The machine-language target program produced by a
compiler is usually much faster than an interpreter at
mapping inputs to outputs.
• An interpreter, however, can usually give better error
diagnostics than a compiler, because it executes the source
program statement by statement.

Compiler Applications
• Machine Code Generation
– Convert source language program to machine understandable one
– Takes care of semantics of varied constructs of source language
– Considers limitations and specific features of target machine
– Automata theory helps in syntactic checks that separate valid programs from invalid ones
– Compilation also generates code for syntactically correct programs
Structure of a Compiler

Analysis
• Breaks the source program into pieces and fits them into a
grammatical structure
• If this part detects that the source is syntactically ill-formed
or semantically unsound, the error is reported to the user
• It collects information about the source program and stores it
in a data structure, the symbol table

Synthesis
• Constructs the target program from the available symbol table and
intermediate representation

Phases of A Compiler

Source Program → Lexical Analyzer → Syntax Analyzer → Semantic Analyzer →
Intermediate Code Generator → Code Optimizer → Code Generator → Target Program

• Each phase transforms the source program from one representation
into another representation.
• They communicate with error handlers.
• They communicate with the symbol table.

Lexical Analyzer
• The lexical analyzer reads the source program character by character and returns
the tokens of the source program.
• A token describes a pattern of characters that have the same meaning in the source
program (such as identifiers, operators, keywords, numbers, delimiters and so on).
Ex: newval := oldval + 12 => tokens:
newval    identifier
:=        assignment operator
oldval    identifier
+         add operator
12        a number
• Puts information about identifiers into the symbol table.
• Regular expressions are used to describe tokens (lexical constructs).
• A (deterministic) finite state automaton can be used in the implementation of a
lexical analyzer.
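
The example above can be sketched in Python. This is only an illustration, not the implementation the slides have in mind: the token names, the regular expressions and the `tokenize` function are assumptions chosen to reproduce the newval := oldval + 12 example, and the symbol table is a plain dictionary.

```python
import re

# Hypothetical token names and patterns, chosen to match the slide's example.
TOKEN_SPEC = [
    ("number", r"\d+"),
    ("identifier", r"[A-Za-z][A-Za-z0-9]*"),
    ("assignment operator", r":="),
    ("add operator", r"\+"),
    ("whitespace", r"\s+"),
]
# One alternative per token class; group gN corresponds to TOKEN_SPEC[N].
MASTER = re.compile("|".join(f"(?P<g{i}>{p})" for i, (_, p) in enumerate(TOKEN_SPEC)))


def tokenize(source):
    """Return (tokens, symbol_table) for a source string."""
    tokens, symbol_table = [], {}
    pos = 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise SyntaxError(f"no token pattern matches at position {pos}")
        name = TOKEN_SPEC[int(m.lastgroup[1:])][0]
        lexeme = m.group()
        pos = m.end()
        if name == "whitespace":
            continue  # whitespace separates tokens but is not returned
        if name == "identifier":
            # Record each identifier once; the value is its table index.
            symbol_table.setdefault(lexeme, len(symbol_table))
        tokens.append((lexeme, name))
    return tokens, symbol_table
```

Calling `tokenize("newval := oldval + 12")` yields the five (lexeme, token) pairs listed on the slide and a symbol table holding newval and oldval.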
Phases of Compiler-Lexical Analysis
• It is also called scanning.
• This phase scans the source code as a stream of characters and groups it
into meaningful lexemes.
• For each lexeme, the lexical analyzer produces as output a token of
the form <token-name, attribute-value>.
• It passes the token on to the subsequent phase, syntax analysis.
• token-name is an abstract symbol that is used during syntax analysis.
• attribute-value points to an entry in the symbol table for this token.
Information from the symbol-table entry is needed for semantic analysis and
code generation.

Token, Pattern and Lexeme

• Token: A token is a sequence of characters that can
be treated as a single logical entity. Typical tokens
are 1) identifiers 2) keywords 3) operators 4) special
symbols 5) constants
• Pattern: A set of strings in the input for which the
same token is produced as output. This set of strings
is described by a rule called a pattern associated
with the token.
• Lexeme: A lexeme is a sequence of characters in
the source program that is matched by the pattern
for a token.
Phases of Compiler-Symbol Table Management
• The symbol table is a data structure holding information about all symbols defined in
the source program
• It is not part of the final code, but it is used as a reference by all phases of a
compiler
• Typical information stored there includes name, type, size, and relative offset
of variables
• Generally created by the lexical analyzer and syntax analyzer
• Good data structures are needed to minimize searching time
• The data structure may be flat or hierarchical
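
A minimal sketch of a hierarchical symbol table, assuming one dictionary per scope with a link to the enclosing scope. The class name `SymbolTable`, its methods, and the stored fields are hypothetical; they only mirror the kinds of information listed above (name, type, size, relative offset).

```python
class SymbolTable:
    """One scope; `parent` links to the enclosing scope (None for the outermost)."""

    def __init__(self, parent=None):
        self.entries = {}      # name -> attributes; a dict keeps lookups fast
        self.parent = parent
        self.next_offset = 0   # relative offset of the next variable in this scope

    def declare(self, name, type_name, size):
        if name in self.entries:
            raise KeyError(f"{name} already declared in this scope")
        self.entries[name] = {"type": type_name, "size": size, "offset": self.next_offset}
        self.next_offset += size
        return self.entries[name]

    def lookup(self, name):
        """Search this scope first, then the enclosing scopes."""
        scope = self
        while scope is not None:
            if name in scope.entries:
                return scope.entries[name]
            scope = scope.parent
        return None
```

A flat table is the special case in which every `SymbolTable` has `parent=None`; the hierarchical form lets an inner declaration hide an outer one with the same name.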
Syntax Analysis
• A syntax analyzer creates the syntactic structure (generally a parse tree) of the
given program.
• A syntax analyzer is also called a parser.
• A parse tree describes a syntactic structure.
• In a parse tree, all terminals are at leaves.
• All inner nodes are non-terminals in a context-free grammar.
Phases of Compiler-Syntax Analysis
• This is the second phase; it is also called parsing.
• It takes the tokens produced by lexical analysis as input and generates a parse
tree (or syntax tree).
• In this phase, token arrangements are checked against the source
code grammar, i.e. the parser checks if the expression made by the tokens is
syntactically correct.

Syntax Analyzer versus Lexical Analyzer


• Which constructs of a program should be recognized by the
lexical analyzer, and which ones by the syntax analyzer?
• Both do similar things, but the lexical analyzer deals with simple
non-recursive constructs of the language.
• The syntax analyzer deals with recursive constructs of the language.
• The lexical analyzer simplifies the job of the syntax analyzer.
• The lexical analyzer recognizes the smallest meaningful units (tokens) in a
source program.
• The syntax analyzer works on the smallest meaningful units (tokens) in a
source program to recognize meaningful structures in our programming
language.
Semantic Analysis

Phases of Compiler-Semantic Analysis
• Semantic analysis checks whether the parse tree constructed follows the
rules of the language.
• The semantic analyzer uses the syntax tree and the information in the
symbol table to check the source program for semantic consistency with
the language definition.
• It also gathers type information and saves it in either the syntax
tree or the symbol table, for subsequent use during intermediate-code
generation.
• An important part of semantic analysis is type checking.
Phases of Compiler-Semantic Analysis
• Suppose that position, initial, and rate have been declared to be
floating-point numbers and that the lexeme 60 by itself forms an integer.
• The type checker in the semantic analyzer discovers that the operator
* is applied to a floating-point number rate and an integer 60.
• In this case, the integer may be converted into a floating-point
number.
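
The coercion described above can be sketched as a small type checker. This is an illustration under assumptions: expressions are nested tuples, `DECLARED` is a hypothetical stand-in for the symbol table, and the conversion node is named `inttofloat` here purely for the example.

```python
# Expression trees are nested tuples: ("num", 60), ("id", "rate"), ("*", left, right).
# DECLARED is a hypothetical symbol table mapping declared names to their types.
DECLARED = {"position": "float", "initial": "float", "rate": "float"}


def check(node):
    """Return (typed_node, type), inserting inttofloat where an int meets a float."""
    kind = node[0]
    if kind == "num":
        return node, "float" if isinstance(node[1], float) else "int"
    if kind == "id":
        return node, DECLARED[node[1]]
    left, ltype = check(node[1])
    right, rtype = check(node[2])
    if ltype != rtype:
        # Mixed int/float operands: wrap the integer side in a conversion.
        if ltype == "int":
            left = ("inttofloat", left)
        else:
            right = ("inttofloat", right)
        return (kind, left, right), "float"
    return (kind, left, right), ltype
```

For the expression initial + rate * 60, `check` returns a tree in which the leaf 60 is wrapped as `("inttofloat", ("num", 60))` and the overall type is float.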
Intermediate Code Generation

Phases of Compiler-Intermediate Code Generation
• After semantic analysis the compiler generates intermediate code from
the source code, as a step towards code for the target machine.
• It represents a program for some abstract machine.
• It is in between the high-level language and the machine language.
• This intermediate code should be generated in such a way that it is
easy to translate into the target machine code.
• A compiler may produce explicit intermediate code representing the
source program.
• Intermediate code is generally machine (architecture) independent,
but its level is close to the level of machine code.
Phases of Compiler-Intermediate Code Generation
• An intermediate form called three-address code is commonly used.
• It consists of a sequence of assembly-like instructions with three
operands per instruction. Each operand can act like a register.
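
A sketch of how three-address code could be produced from an expression tree. The tuple-based tree format, the temporary names t1, t2, … and the function names are assumptions made for this example; real compilers differ in detail.

```python
import itertools


def to_three_address(node, code, temps):
    """Append instructions for `node` to `code`; return the name holding its value."""
    kind = node[0]
    if kind in ("id", "num"):
        return str(node[1])          # leaves need no instruction
    if kind == "inttofloat":
        arg = to_three_address(node[1], code, temps)
        result = f"t{next(temps)}"
        code.append(f"{result} = inttofloat({arg})")
        return result
    # Binary operator: evaluate both operands, then emit one instruction.
    left = to_three_address(node[1], code, temps)
    right = to_three_address(node[2], code, temps)
    result = f"t{next(temps)}"
    code.append(f"{result} = {left} {kind} {right}")
    return result


def translate_assignment(target, expr):
    """Return the list of three-address instructions for `target = expr`."""
    code = []
    value = to_three_address(expr, code, itertools.count(1))
    code.append(f"{target} = {value}")
    return code
```

Each instruction has at most one operator on its right-hand side, and a fresh temporary holds every intermediate value, which is what makes the form easy to map onto registers later.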


Code Optimization

Phases of Compiler-Code Optimization
• The next phase does code optimization of the intermediate code.
• Optimization removes unnecessary lines of code and rearranges the
sequence of statements in order to speed up program execution without
wasting resources (CPU, memory).
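
As an illustration of removing unnecessary code, the sketch below applies two simple rewrites to three-address instructions: it performs an integer-to-float conversion of a constant at compile time, and it drops a final copy out of a temporary. The instruction format `(dest, op, arg1, arg2)`, the `copy` and `inttofloat` operation names, and the assumption that temporaries are named t1, t2, … and used only once are all choices made for this example.

```python
def is_temp(name):
    """Temporaries are assumed to be named t1, t2, ... in this sketch."""
    return isinstance(name, str) and name[:1] == "t" and name[1:].isdigit()


def optimize(code):
    """code: list of (dest, op, arg1, arg2); returns a shorter equivalent list."""
    constants = {}   # temporaries known to hold a compile-time constant
    out = []
    for dest, op, a, b in code:
        a = constants.get(a, a)
        b = constants.get(b, b)
        if op == "inttofloat" and isinstance(a, int):
            constants[dest] = float(a)   # do the conversion once, at compile time
            continue
        if op == "copy" and out and is_temp(a) and out[-1][0] == a:
            # The previous instruction computed the temporary being copied:
            # let it write straight into the destination instead.
            prev = out.pop()
            out.append((dest,) + prev[1:])
            continue
        out.append((dest, op, a, b))
    return out
```

Applied to the four instructions for position = initial + rate * 60, this leaves two instructions: a multiplication by the constant 60.0 and an addition that stores directly into position.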
Code Generation

Phases of Compiler-Code Generation
• In this phase, the code generator takes the optimized representation of the
intermediate code and maps it to the target machine language.
• If the target language is machine code, registers or memory locations are
selected for each of the variables used by the program.
• Then, the intermediate instructions are translated into sequences
of machine instructions that perform the same task.
• It produces the target language for a specific architecture.
• The target program is normally a relocatable object file containing the
machine code.
Phases of Compiler-Code Generation
• For example, using registers R1 and R2, the intermediate code
might get translated into the machine code
[Figure: generated machine code]
• The first operand of each instruction specifies a destination. The F
in each instruction tells us that it deals with floating-point
numbers.

Phases of Compiler-Translation of assignment statement

Cousins of Compiler-Language Processing System

Compiler Construction Tool

Role of a Lexical Analyzer

• Role of lexical analyzer


• Specification of tokens
• Recognition of tokens
• Lexical analyzer generator
• Finite automata
• Design of lexical analyzer generator
By Nagadevi

Why separate lexical analysis and parsing?

1. Simplicity of design
2. Improving compiler efficiency
3. Enhancing compiler portability

The role of lexical analyzer

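Source program → Lexical Analyzer → (token) → Parser → to semantic analysis
• The parser requests each token with getNextToken.
• Both the lexical analyzer and the parser consult the symbol table.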
CS416 Compiler Design 34

Lexical Analyzer
• The lexical analyzer reads the source program character by character to
produce tokens.
• Normally a lexical analyzer doesn’t return a list of all tokens in one shot;
it returns one token each time the parser asks for one.

source program → Lexical Analyzer → (token) → Parser
                 Lexical Analyzer ← (get next token) ← Parser
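
The on-demand interaction can be sketched as a lexer object whose `get_next_token` method scans only as far as one token. The class, the method name and the three token classes (`num`, `id`, `op`) are assumptions for this illustration.

```python
import re

# Optional leading whitespace, then exactly one token from one of three classes.
TOKEN = re.compile(r"\s*(?:(?P<num>\d+)|(?P<id>[A-Za-z]\w*)|(?P<op>:=|[-+*/=()]))")


class Lexer:
    def __init__(self, source):
        self.source = source.rstrip()
        self.pos = 0                  # how far the input has been scanned

    def get_next_token(self):
        """Scan just far enough to return one token; None signals end of input."""
        if self.pos >= len(self.source):
            return None
        m = TOKEN.match(self.source, self.pos)
        if m is None:
            raise SyntaxError(f"unexpected character {self.source[self.pos]!r}")
        self.pos = m.end()
        return (m.lastgroup, m.group(m.lastgroup))
```

A parser would call `get_next_token()` whenever it needs more input, so the rest of the source stays unscanned until it is actually required.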

Lexical errors
• Some errors are beyond the power of the lexical analyzer to
recognize:
• fi (a == f(x)) … (fi is a valid identifier, so only the parser can tell that if was misspelled)
• However, it may be able to recognize errors like:
• d = 2r
• Such errors are recognized when no pattern for tokens
matches a character sequence

Error recovery
• Panic mode: successive characters are ignored until we
reach a well-formed token
• Delete one character from the remaining input
• Insert a missing character into the remaining input
• Replace a character by another character
• Transpose two adjacent characters
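
Panic-mode recovery, the first strategy in the list, might look like the sketch below. The token patterns and the function name `scan_with_panic_mode` are assumptions; in particular, a number is required not to be followed directly by a letter, so that an input such as 2r is reported as an error instead of being split silently.

```python
import re

# A number must not run straight into a letter, digit or underscore.
TOKEN = re.compile(r"(?P<num>\d+(?!\w))|(?P<id>[A-Za-z_]\w*)|(?P<op>[=+\-*/();])")


def scan_with_panic_mode(source):
    """Return (tokens, skipped_fragments) for the whole input."""
    tokens, errors = [], []
    pos = 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1
            continue
        m = TOKEN.match(source, pos)
        if m:
            tokens.append((m.lastgroup, m.group()))
            pos = m.end()
            continue
        # Panic mode: skip characters until some token pattern matches again.
        start = pos
        pos += 1
        while pos < len(source) and not source[pos].isspace() and not TOKEN.match(source, pos):
            pos += 1
        errors.append(source[start:pos])
    return tokens, errors
```

The scanner keeps going after an error, so one bad character sequence does not hide later errors in the same input.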

Token
• A token represents a set of strings described by a pattern.
• Identifier represents the set of strings which start with a letter and continue with letters and
digits
• The actual string (newval) is called a lexeme.
• Tokens: identifier, number, addop, delimiter, …
• Since a token can represent more than one lexeme, additional information should be held
for that specific lexeme. This additional information is called the attribute of the token.
• For simplicity, a token may have a single attribute which holds the required information for
that token.
• For identifiers, this attribute is a pointer to the symbol table, and the symbol table holds
the actual attributes for that token.

Token
• Some attributes:
• <id,attr> where attr is pointer to the symbol table
• <assgop,_> no attribute is needed (if there is only one assignment operator)
• <num,val> where val is the actual value of the number.
• The token type and its attribute together uniquely identify a lexeme.
• Regular expressions are widely used to specify patterns.

Tokens, Patterns and Lexemes


• A token is a pair: a token name and an optional token value
• A pattern is a description of the form that the lexemes of a
token may take
• A lexeme is a sequence of characters in the source program
that matches the pattern for a token

Example

Token        Informal description                     Sample lexemes
if           Characters i, f                          if
else         Characters e, l, s, e                    else
comparison   < or > or <= or >= or == or !=           <=, !=
id           Letter followed by letters and digits    pi, score, D2
number       Any numeric constant                     3.14159, 0, 6.02e23
literal      Anything but ", surrounded by "          "core dumped"

printf("total = %d\n", score);

Terminology of Languages
• Alphabet: a finite set of symbols (for example, the ASCII characters)
• String:
• Finite sequence of symbols over an alphabet
• The terms sentence and word are also used for string
• ε is the empty string
• |s| is the length of string s.
• Language: a set of strings over some fixed alphabet
• ∅, the empty set, is a language.
• {ε}, the set containing only the empty string, is a language
• The set of well-formed C programs is a language
• The set of all possible identifiers is a language.

Terminology of Languages
• Operators on Strings:
• Concatenation: xy represents the concatenation of strings
x and y. sε = s and εs = s
• Exponentiation: sⁿ = s s s … s (n times), s⁰ = ε

Input buffering
• Sometimes the lexical analyzer needs to look ahead some symbols to decide
which token to return
• In C: after reading -, = or < we need to look at the next character to decide
what token to return
• In Fortran: DO 5 I = 1.25 (only the decimal point shows that this is an assignment
to DO5I and not the start of a DO loop)
• We need to introduce a two-buffer scheme to handle large look-aheads
safely

E = M * C * * 2 eof

Sentinels

E = M eof * C * * 2 eof eof

switch (*forward++) {
    case eof:
        if (forward is at end of first buffer) {
            reload second buffer;
            forward = beginning of second buffer;
        }
        else if (forward is at end of second buffer) {
            reload first buffer;
            forward = beginning of first buffer;
        }
        else /* eof within a buffer marks the end of input */
            terminate lexical analysis;
        break;
    /* cases for the other characters */
}
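
The pseudocode above can be tried out with the following Python sketch. It is an approximation under stated assumptions: the sentinel is the NUL character (assumed never to occur in the source), the buffer halves hold only `N = 4` characters so that reloads happen quickly, and the class `TwoBufferReader` with its `next_char` method is hypothetical. The lexemeBegin pointer of the full scheme is left out.

```python
import io

EOF = "\0"   # sentinel character, assumed never to occur in real source text
N = 4        # tiny half-buffer size so that reloads are easy to observe


class TwoBufferReader:
    def __init__(self, stream):
        self.stream = stream
        # Two halves of N characters, each followed by one sentinel slot.
        self.buf = [EOF] * (2 * N + 2)
        self._load(0)
        self.forward = 0

    def _load(self, start):
        """Fill the half beginning at `start`; unused slots hold the sentinel."""
        chunk = self.stream.read(N)
        for i in range(N):
            self.buf[start + i] = chunk[i] if i < len(chunk) else EOF
        self.buf[start + N] = EOF     # sentinel that closes this half

    def next_char(self):
        """Return the next character, or None at the real end of input."""
        ch = self.buf[self.forward]
        self.forward += 1
        if ch != EOF:
            return ch                 # common case: one test per character
        if self.forward == N + 1:     # sentinel at end of first half
            self._load(N + 1)
            return self.next_char()
        if self.forward == 2 * N + 2: # sentinel at end of second half
            self._load(0)
            self.forward = 0
            return self.next_char()
        return None                   # eof inside a half marks the end of input
```

The point of the sentinel is visible in `next_char`: ordinary characters cost a single comparison, and the buffer-boundary checks run only when a sentinel is actually read.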

Specification of tokens
• In the theory of compilation, regular expressions are used to
formalize the specification of tokens
• Regular expressions are a means of specifying regular
languages
• Example:
• letter_ (letter_ | digit)*
• Each regular expression is a pattern specifying the form of
strings

Regular expressions
• ε is a regular expression, L(ε) = {ε}
• If a is a symbol in Σ then a is a regular expression, L(a) = {a}
• (r) | (s) is a regular expression denoting the language L(r) ∪ L(s)
• (r)(s) is a regular expression denoting the language L(r)L(s)
• (r)* is a regular expression denoting (L(r))*
• (r) is a regular expression denoting L(r)

Regular definitions

d1 -> r1
d2 -> r2
…
dn -> rn

• Example:
letter_ -> A | B | … | Z | a | b | … | z | _
digit -> 0 | 1 | … | 9
id -> letter_ (letter_ | digit)*

Extensions
• One or more instances: (r)+
• Zero or one instances: r?
• Character classes: [abc]

• Example:
• letter_ -> [A-Za-z_]
• digit -> [0-9]
• id -> letter_ (letter_ | digit)*

Operations on Languages
• Concatenation:
• L1L2 = { s1s2 | s1 ∈ L1 and s2 ∈ L2 }

• Union
• L1 ∪ L2 = { s | s ∈ L1 or s ∈ L2 }

• Exponentiation:
• L⁰ = {ε}   L¹ = L   L² = LL

• Kleene Closure
• L* = L⁰ ∪ L¹ ∪ L² ∪ … (the union of Lⁱ for all i ≥ 0)

• Positive Closure
• L+ = L¹ ∪ L² ∪ L³ ∪ … (the union of Lⁱ for all i ≥ 1)

Example

• L1 = {a,b,c,d} L2 = {1,2}

• L1L2 = {a1,a2,b1,b2,c1,c2,d1,d2}

• L1 ∪ L2 = {a,b,c,d,1,2}

• L1³ = all strings of length three (using a,b,c,d)

• L1* = all strings using the letters a,b,c,d, including the empty string
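
These operations are easy to check on small finite languages represented as Python sets of strings. The function names are assumptions for this sketch, and because L* is infinite for any non-empty alphabet, `closure_up_to` only computes a finite approximation of it.

```python
from itertools import product


def concat(l1, l2):
    """L1L2: every string of L1 followed by every string of L2."""
    return {s1 + s2 for s1, s2 in product(l1, l2)}


def power(lang, n):
    """L^n, with L^0 = {ε}; the empty string ε is written ""."""
    result = {""}
    for _ in range(n):
        result = concat(result, lang)
    return result


def closure_up_to(lang, max_power):
    """Finite approximation of L*: the union of L^0 .. L^max_power."""
    result = set()
    for i in range(max_power + 1):
        result |= power(lang, i)
    return result
```

With L1 = {a,b,c,d}, `power(L1, 3)` contains 4 × 4 × 4 = 64 strings, and union is simply the built-in set operator `|`.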

Regular Definitions
• Writing a regular expression for some languages can be difficult, because the expression can
be quite complex. In those cases, we may use regular definitions.
• We can give names to regular expressions, and we can use these names as symbols to define other
regular expressions.
• A regular definition is a sequence of definitions of the form:

d1 → r1
d2 → r2
…
dn → rn

where each di is a distinct name and each ri is a regular expression over the symbols in
Σ ∪ {d1,d2,...,di-1}, that is, the basic symbols together with the previously defined names.

Regular Definitions (cont.)


• Ex: Identifiers in Pascal
letter → A | B | ... | Z | a | b | ... | z
digit → 0 | 1 | ... | 9
id → letter (letter | digit ) *
• If we try to write the regular expression representing identifiers without using regular
definitions, that regular expression will be complex.
(A|...|Z|a|...|z) ( (A|...|Z|a|...|z) | (0|...|9) ) *

• Ex: Unsigned numbers in Pascal


digit → 0 | 1 | ... | 9
digits → digit +
opt-fraction → ( . digits ) ?
opt-exponent → ( E (+|-)? digits ) ?
unsigned-num → digits opt-fraction opt-exponent

Recognition of tokens
• The starting point is the language grammar, which tells us what the
tokens are:
stmt -> if expr then stmt
      | if expr then stmt else stmt
expr -> term relop term
      | term
term -> id
      | number

Recognition of tokens (cont.)

• The next step is to formalize the patterns:
digit -> [0-9]
digits -> digit+
number -> digits (. digits)? (E [+-]? digits)?
letter -> [A-Za-z_]
id -> letter (letter | digit)*
if -> if
then -> then
else -> else
relop -> < | > | <= | >= | = | <>
• We also need to handle whitespaces:
ws -> (blank | tab | newline)+
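
The patterns above translate almost directly into Python's `re` syntax. The sketch below is one possible transcription; the constant names and the helper `matches` are assumptions, and `re.fullmatch` is used so that a string counts as a lexeme only if the whole string fits the pattern.

```python
import re

DIGITS = r"[0-9]+"                                    # digits -> digit+
NUMBER = rf"{DIGITS}(\.{DIGITS})?(E[+-]?{DIGITS})?"   # optional fraction and exponent
ID = r"[A-Za-z_][A-Za-z_0-9]*"                        # letter (letter | digit)*
RELOP = r"<=|>=|<>|<|>|="                             # longer operators listed first
WS = r"[ \t\n]+"                                      # (blank | tab | newline)+


def matches(pattern, text):
    """True if the whole of `text` is a lexeme of `pattern`."""
    return re.fullmatch(pattern, text) is not None
```

Listing the two-character operators first in `RELOP` does not matter for `fullmatch`, but it does matter when the same pattern is used to scan a prefix of the input, because Python tries alternatives from left to right.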
Design of a Lexical Analyzer (LEX)

Design of a Lexical Analyzer
• LEX is a software tool that automatically constructs a lexical
analyzer from a Lex program
• The Lex program will be of the form
P1 {action 1}
P2 {action 2}
…
• Each pattern Pi is a regular expression and action i is a program
fragment that is to be executed whenever a lexeme matched
by Pi is found in the input
• If two or more patterns match the longest lexeme, the first
listed matching pattern is chosen
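
The two disambiguation rules (prefer the longest lexeme, then the first listed pattern) can be sketched in a few lines. This is not how Lex is implemented internally; the rule list and the function `next_lexeme` are assumptions used only to show the selection logic.

```python
import re

# Rules are tried in listed order, as in a Lex specification; names are illustrative.
RULES = [
    ("IF", r"if"),
    ("ID", r"[A-Za-z_][A-Za-z_0-9]*"),
    ("NUMBER", r"[0-9]+"),
    ("WS", r"[ \t\n]+"),
]
COMPILED = [(name, re.compile(pattern)) for name, pattern in RULES]


def next_lexeme(text, pos):
    """Return (rule_name, lexeme) for the longest match at pos; earliest rule wins ties."""
    best = None
    for name, regex in COMPILED:
        m = regex.match(text, pos)
        # Strict '>' keeps the earlier rule when two matches have equal length.
        if m and m.end() > pos and (best is None or m.end() > best[1]):
            best = (name, m.end())
    if best is None:
        raise SyntaxError(f"no rule matches at position {pos}")
    return best[0], text[pos:best[1]]
```

On the input `if` both IF and ID match two characters, so the first listed rule (IF) wins; on `iffy` the ID rule matches four characters and wins on length.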
Design of a Lexical Analyzer
• Here the Lex compiler constructs a transition table for a finite automaton
from the regular expression patterns in the Lex specification
• The lexical analyzer itself consists of a finite automaton simulator that
uses this transition table to look for the regular expression patterns in
the input buffer
General format
• A Lex program has three sections, separated by %%: declarations,
translation rules, and auxiliary functions.
• The declarations section includes declarations of variables and manifest
constants (identifiers declared to stand for a constant, e.g., the
name of a token)
• The translation rules each have the form
Pattern { Action }
• Each pattern is a regular expression, which may use the regular
definitions of the declaration section.
• The actions are fragments of code, typically written in C, although many
variants of Lex using other languages have been created.
• The third section holds whatever additional functions are used in the actions.

Lexical Analyzer Generator - Lex

Lex source program lex.l → Lex compiler → lex.yy.c
lex.yy.c → C compiler → a.out
Input stream → a.out → Sequence of tokens

Finite Automata
• Regular expressions = specification
• Finite automata = implementation
• Recognizer: a recognizer for a language is a program that takes as input
a string x and answers ‘yes’ if x is a sentence of the language and ‘no’ otherwise.
• A better way to convert a regular expression to a recognizer is to construct
a generalized transition diagram from the expression. This diagram is
called a finite automaton.
• A finite automaton can be
• Deterministic
• Non-deterministic

Finite Automata

• A finite automaton consists of
• An input alphabet Σ
• A set of states S
• A start state n
• A set of accepting states F ⊆ S
• A set of transitions state →input state

Finite Automata
• Transition
s1 →a s2
• Is read
In state s1 on input “a” go to state s2
• If end of input
• If in accepting state => accept, otherwise => reject
• If no transition possible => reject

Finite Automata State Graphs
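• A state: drawn as a circle
• The start state: a circle with an incoming arrow that has no source state
• An accepting state: drawn as a double circle
• A transition: an arrow between two states, labeled with the input symbol (for example a)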

Finite Automata
• A recognizer for a language is a program that takes a string x, and answers “yes” if x is a sentence of that
language, and “no” otherwise.
• We call the recognizer of the tokens a finite automaton.
• A finite automaton can be deterministic (DFA) or non-deterministic (NFA)
• This means that we may use a deterministic or non-deterministic automaton as a lexical analyzer.
• Both deterministic and non-deterministic finite automata recognize regular sets.
• Which one?
• deterministic – faster recognizer, but it may take more space
• non-deterministic – slower, but it may take less space
• Deterministic automata are widely used in lexical analyzers.
• First, we define regular expressions for tokens; then we convert them into a DFA to get a lexical
analyzer for our tokens.
• Algorithm 1: Regular Expression → NFA → DFA (two steps: first to NFA, then to DFA)
• Algorithm 2: Regular Expression → DFA (directly convert a regular expression into a DFA)
Non-Deterministic Finite Automaton (NFA)

• A non-deterministic finite automaton (NFA) is a mathematical model that consists of:
• S - a set of states
• Σ - a set of input symbols (alphabet)
• move – a transition function that maps state-symbol pairs to sets of states.
• s0 - a start (initial) state
• F – a set of accepting states (final states)

• ε-transitions are allowed in NFAs. In other words, we can move from one state to
another one without consuming any symbol.
• An NFA accepts a string x if and only if there is a path from the starting state to one of the
accepting states such that the edge labels along this path spell out x.
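
The acceptance condition above can be checked by tracking the set of all states the NFA could be in after each input symbol. In this sketch the transition function `move` is a dictionary from (state, symbol) pairs to sets of states, and ε is represented by `None`; both choices, like the function names, are assumptions of the example.

```python
EPS = None  # label used here for ε-transitions (an assumption of this sketch)


def eps_closure(states, move):
    """All states reachable from `states` using ε-transitions only."""
    stack, closure = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in move.get((s, EPS), ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure


def nfa_accepts(move, start, accepting, text):
    """True if some path from `start` to an accepting state spells out `text`."""
    current = eps_closure({start}, move)
    for ch in text:
        reached = set()
        for s in current:
            reached |= set(move.get((s, ch), ()))
        current = eps_closure(reached, move)
    return bool(current & accepting)
```

Because the simulation keeps a whole set of states, it never needs to guess or backtrack: the string is accepted exactly when at least one state in the final set is accepting.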

Deterministic and Nondeterministic Automata

• Deterministic Finite Automata (DFA)


• One transition per input per state
• No ε-moves
• Nondeterministic Finite Automata (NFA)
• Can have multiple transitions for one input in a given state
• Can have ε-moves
• Finite automata have finite memory
• Need only to encode the current state
A Simple Example
• A finite automaton that accepts only “1”
• A finite automaton accepts a string if we can follow transitions labeled
with the characters in the string from the start to some accepting state

Another Simple Example
• A finite automaton accepting any number of 1’s followed by a single 0
• Alphabet: {0,1}

• Check that “1110” is accepted.
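
The check can be carried out mechanically with a table-driven simulation. The state names A and B and the dictionary layout are assumptions; the slide's own diagram is not available here, so this is one DFA that fits the description "any number of 1's followed by a single 0".

```python
def dfa_accepts(transitions, start, accepting, text):
    """Run a DFA given as a dict from (state, symbol) to the next state."""
    state = start
    for ch in text:
        if (state, ch) not in transitions:
            return False              # no transition possible => reject
        state = transitions[(state, ch)]
    return state in accepting         # end of input: accept only in an accepting state


# Hypothetical state names: A is the start state, B the only accepting state.
ONES_THEN_ZERO = {("A", "1"): "A", ("A", "0"): "B"}
```

On “1110” the automaton loops in A for the three 1's and moves to B on the 0, so `dfa_accepts(ONES_THEN_ZERO, "A", {"B"}, "1110")` returns True.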

NFA

Transition Table

Converting A Regular Expression into A NFA (Thompson’s Construction)
• This is one way to convert a regular expression into an NFA.
• There can be other (more efficient) ways to do the conversion.
• Thompson’s Construction is a simple and systematic method.
It guarantees that the resulting NFA will have
exactly one final state and one start state.
• Construction starts from the simplest parts (alphabet symbols).
• To create an NFA for a complex regular expression, the NFAs of
its sub-expressions are combined to create its NFA.

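Thompson’s Construction (cont.)

• To recognize an empty string ε: start state i, final state f, and one transition i --ε--> f
• To recognize a symbol a in the alphabet Σ: start state i, final state f, and one transition i --a--> f
• If N(r1) and N(r2) are NFAs for regular expressions r1 and r2
• For regular expression r1 | r2 (NFA for r1 | r2): a new start state i has ε-transitions to the
start states of N(r1) and N(r2), and the final states of N(r1) and N(r2) have ε-transitions
to a new final state f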

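Thompson’s Construction (cont.)

• For regular expression r1 r2 (NFA for r1 r2): i → N(r1) → N(r2) → f
The start state of N(r1) becomes the start state i, N(r1) leads directly into N(r2), and the
final state of N(r2) becomes the final state of N(r1r2)

• For regular expression r* (NFA for r*): a new start state i and a new final state f, with
ε-transitions from i to the start of N(r), from the end of N(r) to f, from the end of N(r)
back to its start, and from i directly to f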
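
The construction rules can be sketched as a recursive function over a regular-expression syntax tree. Several things here are assumptions of the sketch: the tree is given as nested tuples (no regex parser is included), ε is represented by `None`, the class and function names are invented, and concatenation links the two fragments with an extra ε-edge instead of merging the final state of N(r1) with the start state of N(r2), which accepts the same language with one more transition.

```python
import itertools

EPS = None  # label used for ε-transitions in this sketch


class Builder:
    def __init__(self):
        self.ids = itertools.count()
        self.move = {}                 # (state, label) -> set of states

    def edge(self, src, label, dst):
        self.move.setdefault((src, label), set()).add(dst)

    def build(self, node):
        """Return (start, final) of the NFA fragment for a regex syntax tree."""
        kind = node[0]
        if kind == "cat":
            # Simplification: join the fragments with an ε-edge instead of merging states.
            s1, f1 = self.build(node[1])
            s2, f2 = self.build(node[2])
            self.edge(f1, EPS, s2)
            return s1, f2
        i, f = next(self.ids), next(self.ids)
        if kind == "eps":
            self.edge(i, EPS, f)
        elif kind == "sym":
            self.edge(i, node[1], f)
        elif kind == "or":
            for sub in node[1:]:
                s, e = self.build(sub)
                self.edge(i, EPS, s)
                self.edge(e, EPS, f)
        elif kind == "star":
            s, e = self.build(node[1])
            self.edge(i, EPS, s)       # enter the loop body
            self.edge(e, EPS, f)       # leave after an iteration
            self.edge(e, EPS, s)       # repeat the body
            self.edge(i, EPS, f)       # zero iterations
        else:
            raise ValueError(f"unknown node kind {kind!r}")
        return i, f


def accepts(move, start, final, text):
    """Simulate the built NFA on `text` by tracking the set of reachable states."""
    def closure(states):
        stack, seen = list(states), set(states)
        while stack:
            for t in move.get((stack.pop(), EPS), ()):
                if t not in seen:
                    seen.add(t)
                    stack.append(t)
        return seen

    current = closure({start})
    for ch in text:
        current = closure({t for s in current for t in move.get((s, ch), ())})
    return final in current
```

Every call to `build` returns a fragment with exactly one start state and one final state, which is the property that lets fragments be combined without inspecting their insides.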

Thompson’s Construction (Example - (a|b)* a)

[Figure: NFAs built step by step for a, b, (a | b), (a|b)*, and (a|b)* a]

Converting a NFA into a DFA (subset construction)

put ε-closure({s0}) as an unmarked state into the set of DFA states (DS)
while (there is an unmarked state S1 in DS) do
begin
    mark S1
    for each input symbol a do
    begin
        S2 ← ε-closure(move(S1,a))
        if (S2 is not in DS) then
            add S2 into DS as an unmarked state
        transfunc[S1,a] ← S2
    end
end

• ε-closure({s0}) is the set of all states that are accessible from s0 by ε-transitions.
• move(S1,a) is the set of states to which there is a transition on a from a state s in S1.
• a state S in DS is an accepting state of DFA if a state in S
is an accepting state of NFA
• the start state of DFA is ε-closure({s0})
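
The pseudocode above maps onto Python fairly directly. In this sketch, which rests on a few assumptions, the NFA transition function is a dictionary from (state, symbol) to a set of states, ε is represented by `None`, DFA states are frozensets of NFA states, and the list `unmarked` plays the role of the unmarked members of DS.

```python
EPS = None  # label used for ε-transitions in this sketch


def eps_closure(states, move):
    """All NFA states reachable from `states` by ε-transitions alone."""
    stack, closure = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in move.get((s, EPS), ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure


def subset_construction(move, start, accepting, alphabet):
    """Return (dfa_start, transfunc, dfa_accepting) for the given NFA."""
    start_set = frozenset(eps_closure({start}, move))
    dstates = [start_set]              # DS: every DFA state found so far
    unmarked = [start_set]             # members of DS not yet processed
    transfunc = {}
    while unmarked:
        s1 = unmarked.pop()            # "mark S1"
        for a in alphabet:
            moved = set()
            for s in s1:
                moved |= set(move.get((s, a), ()))
            s2 = frozenset(eps_closure(moved, move))
            if s2 not in dstates:
                dstates.append(s2)
                unmarked.append(s2)
            transfunc[(s1, a)] = s2
    dfa_accepting = {s for s in dstates if s & accepting}
    return start_set, transfunc, dfa_accepting
```

If some state set has no transition on a symbol, `s2` becomes the empty frozenset, which acts as a dead state; the pseudocode leaves that case implicit.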

Converting a NFA into a DFA (Example)


[Figure: NFA for (a|b)*a with states 0 to 8. Labeled transitions: 2 --a--> 3, 4 --b--> 5, 7 --a--> 8.
ε-transitions: 0 → 1, 0 → 7, 1 → 2, 1 → 4, 3 → 6, 5 → 6, 6 → 1, 6 → 7.]

S0 = ε-closure({0}) = {0,1,2,4,7}      S0 into DS as an unmarked state

⇓ mark S0
ε-closure(move(S0,a)) = ε-closure({3,8}) = {1,2,3,4,6,7,8} = S1      S1 into DS
ε-closure(move(S0,b)) = ε-closure({5}) = {1,2,4,5,6,7} = S2      S2 into DS
transfunc[S0,a] ← S1      transfunc[S0,b] ← S2
⇓ mark S1
ε-closure(move(S1,a)) = ε-closure({3,8}) = {1,2,3,4,6,7,8} = S1
ε-closure(move(S1,b)) = ε-closure({5}) = {1,2,4,5,6,7} = S2
transfunc[S1,a] ← S1      transfunc[S1,b] ← S2
⇓ mark S2
ε-closure(move(S2,a)) = ε-closure({3,8}) = {1,2,3,4,6,7,8} = S1
ε-closure(move(S2,b)) = ε-closure({5}) = {1,2,4,5,6,7} = S2
transfunc[S2,a] ← S1      transfunc[S2,b] ← S2

Converting a NFA into a DFA (Example – cont.)


S0 is the start state of DFA since 0 is a member of S0 = {0,1,2,4,7}
S1 is an accepting state of DFA since 8 is a member of S1 = {1,2,3,4,6,7,8}

State    a     b
S0       S1    S2
S1       S1    S2
S2       S1    S2

Minimization of DFA

Example-Minimization of DFA