Two Issues in Lexical Analysis

This document discusses lexical analysis and regular expressions. It covers two main topics: 1) Recognizing tokens specified by regular expressions using non-deterministic finite automata (NFA). An NFA can be constructed from a regular expression and used to recognize strings in the language in O(|S|^2|X|) time, where |S| is the number of NFA states and |X| is the length of the input string. 2) Converting an NFA to a deterministic finite automaton (DFA) to improve the token recognition time complexity to O(|X|). The conversion algorithm marks states in the DFA as it is constructed from the NFA.

Two issues in lexical analysis

Specifying tokens (regular expression)

Identifying tokens specified by regular expressions
How to recognize tokens specified by regular expressions?
A recognizer for a language is a program that takes a string x as input and answers yes if x is a sentence of the language and no otherwise.
In the context of lexical analysis, given a string and a regular expression, a recognizer of the language specified by the regular expression answers yes if the string is in the language and no otherwise.
A regular expression can be compiled into a recognizer (automatically) by constructing a finite automaton, which can be deterministic or non-deterministic.
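As a quick illustration of what a recognizer does, the sketch below wraps Python's standard re module. The function name recognizes and the sample pattern are illustrative assumptions, not part of the slides, and Python's regex engine is not necessarily built on the automata described here; it only plays the role of the yes/no recognizer.

```python
import re

def recognizes(pattern: str, x: str) -> bool:
    """Answer yes (True) if the whole string x is in the language of pattern."""
    # fullmatch succeeds only if the entire string matches the pattern
    return re.fullmatch(pattern, x) is not None

print(recognizes(r"(a|b)*abb", "ababb"))  # True
print(recognizes(r"(a|b)*abb", "abab"))   # False
```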
Non-deterministic finite automata (NFA)
A non-deterministic finite automaton (NFA) is a mathematical model that consists of a 5-tuple (Q, Σ, δ, q0, F):
a set of states Q
a set of input symbols Σ
a transition function δ that maps state-symbol pairs to sets of states
a state q0 that is distinguished as the start (initial) state
a set of states F distinguished as accepting (final) states
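One possible way to hold the five components in a program is sketched below. The class name NFA, its field names, and the use of "" for the empty-string label are assumptions made for illustration; the tiny two-state automaton is a made-up example, not one from the slides.

```python
from dataclasses import dataclass

EPSILON = ""  # assumed encoding of the empty-string label

@dataclass(frozen=True)
class NFA:
    states: frozenset     # Q
    alphabet: frozenset   # set of input symbols
    delta: dict           # (state, symbol) -> frozenset of states
    start: int            # q0
    accepting: frozenset  # F

    def step(self, state, symbol):
        """Set of states reachable from `state` on `symbol` (empty if none)."""
        return self.delta.get((state, symbol), frozenset())

# Hypothetical two-state NFA: on 'a', state 0 may stay in 0 or go to 1.
nfa = NFA(
    states=frozenset({0, 1}),
    alphabet=frozenset({"a"}),
    delta={(0, "a"): frozenset({0, 1})},
    start=0,
    accepting=frozenset({1}),
)
print(sorted(nfa.step(0, "a")))  # [0, 1]
print(sorted(nfa.step(1, "a")))  # []
```

The transition function returns a set rather than a single state, which is exactly what distinguishes this model from a deterministic automaton.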
An NFA accepts an input string x if and only if there is some path in the transition graph from the start state to some accepting state whose edge labels spell out x.
Show an NFA example (page 116, Figure 3.21).
An NFA is non-deterministic in that (1) the same character can label two or more transitions out of one state, and (2) the empty string can label transitions.
For example, here is an NFA that recognizes the language ???.
[Transition diagram: states 0, 1, 2, 3. State 0 loops to itself on a and on b; 0 -a-> 1, 1 -b-> 2, 2 -b-> 3.]
An NFA can easily be implemented using a transition table.
State   a        b
0       {0, 1}   {0}
1       -        {2}
2       -        {3}
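The transition table above maps directly onto a nested dictionary. In this sketch the names TABLE and move are illustrative choices, and a missing entry stands in for the "-" cells.

```python
# TABLE[state][symbol] -> set of next states, copied from the table above.
# A missing entry plays the role of "-" (no transition).
TABLE = {
    0: {"a": {0, 1}, "b": {0}},
    1: {"b": {2}},
    2: {"b": {3}},
}

def move(states, symbol):
    """Union of the table entries on `symbol` for every state in `states`."""
    result = set()
    for s in states:
        result |= TABLE.get(s, {}).get(symbol, set())
    return result

print(sorted(move({0}, "a")))     # [0, 1]
print(sorted(move({0, 1}, "b")))  # [0, 2]
print(sorted(move({1, 2}, "a")))  # []
```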
The algorithm that recognizes the language accepted by an NFA.
Input: an NFA (transition table) and a string x (terminated by eof).
Output: yes if accepted, no otherwise.

S := e-closure({s0});
a := nextchar;
while a != eof do begin
    S := e-closure(move(S, a));
    a := nextchar;
end
if (intersect(S, F) != empty) then return yes
else return no

Note: e-closure(S) is the set of states that can be reached from states in S through zero or more transitions labeled by the empty string; move(S, a) is the set of states reachable from a state in S by one transition labeled a.
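The simulation loop can be sketched in Python as follows. The dictionary encoding, the use of "" for the empty-string label, and the choice of state 3 as the only accepting state of the earlier example NFA are assumptions for illustration; the end of the Python string plays the role of eof.

```python
EPS = ""  # assumed label for empty-string transitions

# delta[(state, label)] -> set of states; this example NFA has no EPS edges.
DELTA = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}, (2, "b"): {3}}
START, ACCEPTING = 0, {3}  # accepting state assumed to be 3

def e_closure(states, delta):
    """States reachable from `states` via zero or more EPS edges."""
    closure, stack = set(states), list(states)
    while stack:
        s = stack.pop()
        for t in delta.get((s, EPS), ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure

def move(states, a, delta):
    """States reachable from `states` by one edge labeled a."""
    return {t for s in states for t in delta.get((s, a), ())}

def accepts(x, delta=DELTA, start=START, accepting=ACCEPTING):
    S = e_closure({start}, delta)
    for a in x:
        S = e_closure(move(S, a, delta), delta)
    return bool(S & accepting)  # non-empty intersection with F

print(accepts("ababb"))  # True
print(accepts("abab"))   # False
```

On ababb the state set evolves as {0}, {0,1}, {0,2}, {0,1}, {0,2}, {0,3}, so the final set contains the assumed accepting state.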
Example: recognizing ababb from previous NFA
Example2: Use the example in Fig. 3.27 for recognizing ababb

Space complexity O(|S|), time complexity O(|S|^2|x|)??

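Construct an NFA from a regular expression:
Input: A regular expression r over an alphabet Σ
Output: An NFA N accepting L(r)
Algorithm (3.3, page 122):
For ε (the empty string), construct the NFA
[Figure: NFA for ε]
For a in Σ, construct the NFA
[Figure: NFA with a single transition labeled a]
Let N(s) and N(t) be NFAs for regular expressions s and t:
For s|t, construct the NFA N(s|t):
[Figure: N(s|t), built from N(s) and N(t)]
For st, construct the NFA N(st):
[Figure: N(st), built from N(s) followed by N(t)]
For s*, construct the NFA N(s*):
[Figure: N(s*), built from N(s)]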

Example: r = (a|b)*abb.

Example: using algorithm 3.3 to construct N(r) for r = (ab | a)*b* | b.
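The construction cases can be sketched as small Python combinators, one per case. This is a minimal sketch in the spirit of the textbook algorithm, not a transcription of it: the names Frag, sym, alt, cat and star are hypothetical, "" is an assumed encoding of the empty-string label, and concatenation glues the two fragments with an empty-string edge instead of merging states, which is a simplification.

```python
from collections import namedtuple
import itertools

EPS = ""  # assumed label for empty-string edges
Frag = namedtuple("Frag", "start accept edges")  # edges: (source, label, target)
_ids = itertools.count()

def fresh():
    return next(_ids)  # a new, unused state number

def sym(a):            # single symbol a (pass EPS for the empty string)
    i, f = fresh(), fresh()
    return Frag(i, f, [(i, a, f)])

def alt(s, t):         # N(s|t): new start and accept around both fragments
    i, f = fresh(), fresh()
    return Frag(i, f, s.edges + t.edges + [
        (i, EPS, s.start), (i, EPS, t.start),
        (s.accept, EPS, f), (t.accept, EPS, f)])

def cat(s, t):         # N(st): simplified, joined by an EPS edge
    return Frag(s.start, t.accept, s.edges + t.edges + [(s.accept, EPS, t.start)])

def star(s):           # N(s*): allow skipping s or repeating it
    i, f = fresh(), fresh()
    return Frag(i, f, s.edges + [
        (i, EPS, s.start), (s.accept, EPS, f),
        (s.accept, EPS, s.start), (i, EPS, f)])

def accepts(frag, x):
    """Simulate the fragment as an NFA on input x."""
    def closure(states):
        seen, stack = set(states), list(states)
        while stack:
            u = stack.pop()
            for p, lab, q in frag.edges:
                if p == u and lab == EPS and q not in seen:
                    seen.add(q)
                    stack.append(q)
        return seen
    S = closure({frag.start})
    for a in x:
        S = closure({q for p, lab, q in frag.edges if p in S and lab == a})
    return frag.accept in S

# r1 = (a|b)*abb
r1 = cat(cat(cat(star(alt(sym("a"), sym("b"))), sym("a")), sym("b")), sym("b"))
# r2 = (ab | a)*b* | b
r2 = alt(cat(star(alt(cat(sym("a"), sym("b")), sym("a"))), star(sym("b"))), sym("b"))
print(accepts(r1, "ababb"))  # True
print(accepts(r2, "aab"))    # True
print(accepts(r2, "ba"))     # False
```

Each combinator adds at most two new states, so the size of the resulting NFA grows linearly with the size of the regular expression.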
Using an NFA, we can recognize a token in O(|S|^2|X|) time. We can improve the time complexity by using a deterministic finite automaton (DFA) instead of an NFA.
An NFA is deterministic (a DFA) if
it has no transitions on the empty string
for each state s and input symbol a, there is at most one edge labeled a leaving s.
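The two conditions can be checked mechanically. In the sketch below, the function name is_deterministic, the dictionary encoding, and the "" label for the empty string are illustrative assumptions; the second automaton is a hand-written deterministic example.

```python
EPS = ""  # assumed label for empty-string transitions

def is_deterministic(delta):
    """delta maps (state, label) -> set of next states.

    Deterministic means: no EPS-labeled edge, and exactly one target
    for every (state, symbol) pair that has any edge at all.
    """
    return all(label != EPS and len(targets) == 1
               for (_, label), targets in delta.items() if targets)

nfa = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}, (2, "b"): {3}}
dfa = {(0, "a"): {1}, (0, "b"): {0}, (1, "a"): {1}, (1, "b"): {2}}
print(is_deterministic(nfa))  # False: state 0 has two edges labeled a
print(is_deterministic(dfa))  # True
```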
What is the time complexity to recognize a
token when a DFA is used?
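To see where the cost goes, here is a minimal sketch of DFA-based recognition. The table DTRAN, the state names A to D, and the choice of D as accepting state are a hand-built, hypothetical DFA for (a|b)*abb. Each input character costs one table lookup, so the loop body runs |X| times.

```python
# Hypothetical DFA for (a|b)*abb: DTRAN[state][symbol] -> exactly one state.
DTRAN = {
    "A": {"a": "B", "b": "A"},
    "B": {"a": "B", "b": "C"},
    "C": {"a": "B", "b": "D"},
    "D": {"a": "B", "b": "A"},
}
START, ACCEPTING = "A", {"D"}

def dfa_accepts(x):
    state = START
    for a in x:                      # one table lookup per input character
        state = DTRAN[state].get(a)
        if state is None:            # symbol outside the alphabet: reject
            return False
    return state in ACCEPTING

print(dfa_accepts("ababb"))  # True
print(dfa_accepts("abab"))   # False
```

Unlike the NFA simulation, no set of current states has to be maintained; a single current state is enough.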
Algorithm to convert an NFA to a DFA that accepts the
same language (algorithm 3.2, page 118)

initially e-closure(s0) is the only state in Dstates and it is unmarked

while there is an unmarked state T in Dstates do begin
    mark T;
    for each input symbol a do begin
        U := e-closure(move(T, a));
        if (U is not in Dstates) then
            add U as an unmarked state to Dstates;
        Dtran[T, a] := U;
    end
end;

Initial state = e-closure(s0), Final state = ?

Example: page 120, fig 3.27.
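The conversion algorithm can be sketched in Python as below. The function name nfa_to_dfa, the dictionary encoding, and the "" label for the empty string are assumptions; the sample input is the four-state NFA from the earlier transition table rather than the textbook figure. A worklist of unmarked states stands in for the marking step, and the sketch deliberately does not compute the final states.

```python
EPS = ""  # assumed label for empty-string transitions

def e_closure(states, delta):
    """States reachable from `states` via zero or more EPS edges."""
    closure, stack = set(states), list(states)
    while stack:
        s = stack.pop()
        for t in delta.get((s, EPS), ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)  # frozenset so it can be a DFA state / dict key

def move(T, a, delta):
    return {t for s in T for t in delta.get((s, a), ())}

def nfa_to_dfa(delta, s0, alphabet):
    start = e_closure({s0}, delta)
    dstates, unmarked, dtran = {start}, [start], {}
    while unmarked:
        T = unmarked.pop()                    # taking T off the list marks it
        for a in alphabet:
            U = e_closure(move(T, a, delta), delta)
            if U not in dstates:
                dstates.add(U)
                unmarked.append(U)
            dtran[T, a] = U
    return start, dstates, dtran

delta = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}, (2, "b"): {3}}
start, dstates, dtran = nfa_to_dfa(delta, 0, "ab")
print(sorted(sorted(T) for T in dstates))
# [[0], [0, 1], [0, 2], [0, 3]]
```

Each DFA state is a set of NFA states, which is why frozenset is used: it is hashable and can serve as a key in Dtran.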

Question:
For an NFA with |S| states, at most how many states can its corresponding DFA have?
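One way to explore this question is to count DFA states experimentally. Every DFA state produced by the conversion is a subset of the NFA's states, so the count can never exceed 2^|S|. The sketch below uses a family of NFAs chosen for illustration (not from the slides): "the n-th symbol from the end is a", which needs n+1 NFA states. The function name count_dfa_states is hypothetical.

```python
def count_dfa_states(n):
    """Run the subset construction on an (n+1)-state NFA for
    'the n-th symbol from the end is a' and count the DFA states."""
    delta = {(0, "a"): {0, 1}, (0, "b"): {0}}
    for i in range(1, n):
        delta[i, "a"] = {i + 1}
        delta[i, "b"] = {i + 1}
    start = frozenset({0})  # no empty-string edges, so no closure is needed
    dstates, unmarked = {start}, [start]
    while unmarked:
        T = unmarked.pop()
        for a in "ab":
            U = frozenset(t for s in T for t in delta.get((s, a), ()))
            if U not in dstates:
                dstates.add(U)
                unmarked.append(U)
    return len(dstates)

for n in range(1, 6):
    # NFA states, DFA states, upper bound 2^|S|
    print(n + 1, count_dfa_states(n), 2 ** (n + 1))
```

For this family the number of DFA states doubles each time one NFA state is added, which suggests that exponential growth is not just a theoretical worst case.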
