
Lexical Analysis Finite Automata

Finite automata (FA) are abstract models that decide whether to accept or reject strings. There are two types: non-deterministic FAs (NFAs) and deterministic FAs (DFAs). Regular expressions can describe any regular language and can be converted to equivalent FAs: Thompson construction turns a regular expression into an NFA, and subset construction turns that NFA into a DFA. DFAs are easier to implement than NFAs, so a DFA generated from the regular expressions is used to create a scanner program that tokenizes strings of the represented language.

Uploaded by

Rokibul Hasan

4b

Lexical analysis
Finite Automata
Finite Automata (FA)
• An FA is also called a Finite State Machine (FSM)
– Abstract model of a computing entity.
– Decides whether to accept or reject a string.
– Every RE can be represented as an FA and vice versa
• Two types of FAs:
– Non-deterministic (NFA): a state may have more than one transition for the same input symbol
– Deterministic (DFA): a state has at most one transition for a given input symbol
• Example: how do we write a program to recognize the Java
keyword “int”?

q0 --i--> q1 --n--> q2 --t--> q3
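As a minimal sketch of how the diagram for “int” could become a program, the transitions can be stored in a dictionary and followed one character at a time. The names `TRANSITIONS`, `ACCEPTING` and `accepts_int` are illustrative, and treating q3 as the only accepting state is an assumption read off the diagram.

```python
# Transitions of the "int" recognizer: (state, input symbol) -> next state.
# State names follow the diagram; q3 as the accepting state is an assumption.
TRANSITIONS = {
    ("q0", "i"): "q1",
    ("q1", "n"): "q2",
    ("q2", "t"): "q3",
}
START = "q0"
ACCEPTING = {"q3"}


def accepts_int(text: str) -> bool:
    """Return True only if the whole of `text` is the keyword "int"."""
    state = START
    for ch in text:
        state = TRANSITIONS.get((state, ch))
        if state is None:  # no arc for this symbol: reject
            return False
    return state in ACCEPTING


print(accepts_int("int"))   # True
print(accepts_int("into"))  # False: no arc leaves q3
```

A missing dictionary entry plays the role of a missing arc, so any unexpected character rejects the string immediately.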
RE and Finite State Automaton (FA)
• REs are a declarative way to describe the tokens
– They describe what a token is, but not how to recognize it
• FAs are used to describe how the token is recognized
– FAs are easy to simulate in a program
• FAs and REs are equivalent: every RE has an equivalent FA and vice versa
– A scanner generator (e.g., lex) bridges the gap between regular expressions and FAs.

[Figure: a scanner generator turns a regular expression into a finite automaton and then into a scanner program; the scanner program reads a string stream and produces tokens.]
Transition Diagram
• An FA can be represented using a transition diagram
• A transition diagram has:
– States represented by circles;
– An Alphabet (Σ) represented by labels on edges;
– Transitions represented by labeled directed edges between states. The label is the input symbol;
– One Start State, marked by an incoming arrow;
– One or more Final State(s) represented by double circles.
• Example transition diagram to recognize (a|b)*abb
[Diagram: q0 --a--> q1 --b--> q2 --b--> q3, with self-loops on a and on b at q0; q3 is the final state.]
Simple examples of FA
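a:       start --> 0 --a--> 1 (final)
a*:      start --> 0 (final), self-loop on a
a+:      start --> 0 --a--> 1 (final), self-loop on a at state 1
(a|b)*:  start --> 0 (final), self-loops on a and on b (equivalently, one self-loop labeled a, b)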
Defining a DFA/NFA
• Define input alphabet and initial state
• Draw the transition diagram
• Check
– Do all states have out-going arcs labeled with all the input symbols (DFA)?
– Any missing final states?
– Any duplicate states?
– Can all strings in the language be accepted?
– Are any strings not in the language accepted?
• Optionally name the states
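Two of the checks above are mechanical and can be sketched in code: finding states that lack an arc for some input symbol, and finding states that look like duplicates. The example automaton, its state names A, B, C, and the helper names `missing_arcs` and `duplicate_pairs` are hypothetical, chosen only for illustration.

```python
def missing_arcs(states, alphabet, delta):
    """List (state, symbol) pairs that have no out-going arc."""
    return sorted((s, a) for s in states for a in alphabet if (s, a) not in delta)


def duplicate_pairs(states, alphabet, delta, finals):
    """List pairs of states with the same final status and identical arcs."""
    symbols = sorted(alphabet)

    def signature(state):
        return (state in finals, tuple(delta.get((state, a)) for a in symbols))

    ordered = sorted(states)
    return [
        (p, q)
        for i, p in enumerate(ordered)
        for q in ordered[i + 1:]
        if signature(p) == signature(q)
    ]


# Hypothetical three-state automaton over {0, 1}, used only as test data.
states = {"A", "B", "C"}
alphabet = {"0", "1"}
delta = {
    ("A", "0"): "B", ("A", "1"): "C",
    ("B", "0"): "B", ("B", "1"): "C",
    ("C", "1"): "C",
}
finals = {"A", "B", "C"}

print(missing_arcs(states, alphabet, delta))             # [('C', '0')]
print(duplicate_pairs(states, alphabet, delta, finals))  # [('A', 'B')]
```

A reported missing arc is not necessarily an error: many diagrams leave out arcs that lead to an implicit rejecting state. The duplicate test here only compares direct arcs, so it is a quick check and not a full minimization.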
Example of constructing a FA
• Construct a DFA accepting a language L over the alphabet {0, 1}, where L is the set of strings with any number of 0s followed by any number of 1s
• Regular expression: 0*1*
• Σ = {0, 1}
• Draw initial state of the transition diagram

Start
Example of constructing a FA
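• Draft the transition diagram
[Diagram: three states from left to right. Start --> first state --0--> second state --1--> third state (final); self-loop on 0 at the second state, self-loop on 1 at the third state.]
• Is 111 accepted?
• No: the leftmost state is missing an arc for input 1
[Diagram: as above, plus an arc labeled 1 from the first state to the third state.]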
Example of constructing a FA

• Is 00 accepted?
• The leftmost two states must also be final states
– First state from the left: ε is also accepted
– Second state from the left: strings with “0”s only are also accepted
[Diagram: same states and arcs as before, with all three states now final.]
Example of constructing a FA
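• The leftmost two states are duplicates
– their arcs point to the same states with the same symbols
[Diagram: after merging the duplicates, two states remain. Start --> first state --1--> second state; self-loop on 0 at the first state, self-loop on 1 at the second state; both states are final.]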

• Check that the result is correct
– All strings in the language can be accepted
» ε, the empty string, is accepted
» strings with 0s/1s only are accepted
– No strings not in the language are accepted
• Naming all the states
[Diagram: start --> q0 --1--> q1; self-loop on 0 at q0, self-loop on 1 at q1; q0 and q1 are both final.]
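The finished two-state DFA for 0*1* can be sketched directly from the named diagram. The dictionary layout and the function name `accepts` are illustrative choices; the states q0 and q1 and their arcs come from the example.

```python
# DFA for 0*1*: q0 loops on 0, moves to q1 on 1; q1 loops on 1.
DELTA = {
    ("q0", "0"): "q0",
    ("q0", "1"): "q1",
    ("q1", "1"): "q1",
}
START = "q0"
FINALS = {"q0", "q1"}  # both states are final in the example


def accepts(text: str) -> bool:
    """Return True if `text` is some 0s followed by some 1s."""
    state = START
    for ch in text:
        state = DELTA.get((state, ch))
        if state is None:  # e.g. a 0 seen after a 1: no arc, reject
            return False
    return state in FINALS


for sample in ["", "00", "111", "0011", "10"]:
    print(repr(sample), accepts(sample))
```

The only way to reject is to read a 0 in state q1, which matches the intuition that no 0 may follow a 1.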
Transition table
• A transition table is a good way to implement an FSA
– One row for each state, S
– One column for each symbol, A
– Cell (S,A) is the set of states reachable from S on input A
• NFAs have at least one cell with more than one state
• DFAs have a single state in every cell

Transition table for (a|b)*abb (> marks the start state, * marks a final state):

STATE   INPUT a     INPUT b
>q0     {q0, q1}    {q0}
 q1     -           {q2}
 q2     -           {q3}
*q3     -           -
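A transition table like the one for (a|b)*abb maps naturally onto a nested dictionary. The sketch below simulates the NFA by tracking the set of states it could be in after each symbol; the names `TABLE` and `nfa_accepts` are illustrative, and the NFA is assumed to have no ε-transitions, as in the table.

```python
# One row per state, one entry per input symbol; a missing entry is an empty cell.
TABLE = {
    "q0": {"a": {"q0", "q1"}, "b": {"q0"}},
    "q1": {"b": {"q2"}},
    "q2": {"b": {"q3"}},
    "q3": {},
}
START = "q0"
FINALS = {"q3"}


def nfa_accepts(text: str) -> bool:
    """Simulate the NFA by following every possible transition at once."""
    current = {START}
    for ch in text:
        current = {t for s in current for t in TABLE[s].get(ch, set())}
    return bool(current & FINALS)


print(nfa_accepts("abb"))     # True
print(nfa_accepts("aababb"))  # True
print(nfa_accepts("abba"))    # False
```

The cell for (q0, a) holds two states, which is exactly what makes this table an NFA and why the simulation has to carry a set instead of a single state.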
DFA to program
• An NFA is more concise but not as easy to implement
• DFAs are easily simulated via an algorithm
• Every NFA can be converted to an equivalent DFA
– What does equivalent mean?
• There are general algorithms to ‘minimize’ a DFA
– Minimal in what sense?
• There are systems to convert REs to programs using a minimal DFA to recognize strings defined by the RE
• Learn more in 451 (automata theory) and 431 (Compiler design)
[Figure: RE --Thompson construction--> NFA --Subset construction--> DFA --Minimization--> Minimized DFA --DFA simulation--> Program. Also labeled: Scanner generator.]
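The subset construction step can be sketched as follows, under the assumption that the NFA has no ε-transitions (a full version would also take ε-closures). Each DFA state is a set of NFA states, and the resulting DFA is equivalent in the sense that it accepts exactly the same strings. The function names and the dictionary layout are illustrative; the sample NFA is the one for (a|b)*abb.

```python
from collections import deque


def subset_construction(nfa, start, finals, alphabet):
    """Build a DFA whose states are frozensets of NFA states (no ε-moves)."""
    start_set = frozenset({start})
    dfa = {}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        if current in dfa:
            continue  # already expanded
        dfa[current] = {}
        for symbol in alphabet:
            target = frozenset(
                t for s in current for t in nfa.get(s, {}).get(symbol, ())
            )
            if target:  # an empty target is left out: a missing arc means reject
                dfa[current][symbol] = target
                queue.append(target)
    dfa_finals = {state for state in dfa if state & finals}
    return dfa, start_set, dfa_finals


def dfa_accepts(dfa, start, finals, text):
    """Run the constructed DFA on `text`."""
    state = start
    for ch in text:
        state = dfa[state].get(ch)
        if state is None:
            return False
    return state in finals


# NFA for (a|b)*abb with q0 as start state and q3 as final state.
NFA = {
    "q0": {"a": {"q0", "q1"}, "b": {"q0"}},
    "q1": {"b": {"q2"}},
    "q2": {"b": {"q3"}},
    "q3": {},
}
dfa, dfa_start, dfa_finals = subset_construction(NFA, "q0", {"q3"}, "ab")
print(len(dfa))                                       # 4
print(dfa_accepts(dfa, dfa_start, dfa_finals, "babb"))  # True
```

Only subsets reachable from the start state are built, which is why this example yields four DFA states and not all sixteen possible subsets.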
