Lecture 6

Formal Languages and Compiler

Regular Expression, Regular Languages,

Implementing a recognizer of RE

Ziaurahman Hikmat

[email protected]
Nangarhar University Computer Science Faculty

31 October 2023

1 NANGARHAR UNIVERSITY COMPUTER SCIENCE FACULTY

LECTURE OVERVIEW
֎ Regular Expression
֎ Regular grammar Overview
֎ Introduction
֎ Regular Definition
֎ RE vs RG
֎ Implementing a Recognizer of RE’s: Automata
֎ DFA
֎ NFA
֎ From Regular Expression to NFA

TYPE 3 (REGULAR GRAMMAR)
Regular Grammars, also called Type 3 Grammars, are formal grammars G = (VT, VN, S, P)
such that every production in P has one of the following forms:
A → aB, or A → a
with A,B ∈ VN and a ∈ VT.
Furthermore, a rule of the form:
S → ε is allowed if S does not appear on the right side of any rule.
֎ The productions above define the Right-Regular Grammars. Productions of the form:
A → Ba, or A → a
define Left-Regular Grammars.
֎ Right-Regular and Left-Regular Grammars define the same set of Languages.
֎ Regular Grammars are commonly used to define the lexical structure of
programming languages.

REGULAR EXPRESSION
Each Regular Expression, say R, denotes a Language, L(R). The following
are the rules to build them over an alphabet V:
֎ If a ∈ V ∪ {𝜀} then a is a Regular Expression denoting the language {a};
֎ If R,S are Regular Expressions denoting the Languages L(R) and L(S)
then:
֎ R | S is a Regular Expression denoting L(R) ∪ L(S);
֎ R·S is a Regular Expression denoting the concatenation L(R) · L(S), i.e.,
L(R)·L(S) = {r·s | r ∈ L(R) and s ∈ L(S)};
֎ R∗ (Kleene closure) is a Regular Expression denoting L(R)∗, zero or more
concatenations of L(R), i.e., $L(R)^{*} = \bigcup_{i=0}^{\infty} L(R)^{i}$, where $L(R)^{0} = \{\varepsilon\}$;
֎ (R) is a Regular Expression denoting L(R).
Precedence of Operators: ∗ > · > |
E | F·G∗ = E | (F·(G∗))

EXAMPLE
֎ Let V = {a,b}
֎ The Regular Expression a | b denotes the Language {a,b}.
֎ The Regular Expression (a | b)(a | b) denotes the Language {aa,ab,ba,bb}.
֎ The Regular Expression a∗ denotes the Language of all strings of zero
or more a’s, {𝜀,a,aa,aaa,...}.
֎ The Regular Expression (a | b)∗ denotes the Language of all strings of
a’s and b’s.
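
The languages in this example can be reproduced with ordinary set operations. The following is a minimal sketch, assuming languages are represented as Python sets of strings; the function names `union`, `concat` and `star` are illustrative, and `star` only computes a finite approximation of the Kleene closure up to a chosen power (`max_power`), because the full closure is infinite.

```python
from itertools import product

def union(l1, l2):
    """L(R | S) = L(R) ∪ L(S)."""
    return l1 | l2

def concat(l1, l2):
    """L(R·S) = { r·s | r in L(R) and s in L(S) }."""
    return {r + s for r, s in product(l1, l2)}

def star(lang, max_power):
    """Finite approximation of L(R)*: the union of L(R)^i for i = 0..max_power."""
    result = {""}   # L(R)^0 = {ε}, with ε written as the empty string
    power = {""}
    for _ in range(max_power):
        power = concat(power, lang)   # L(R)^(i+1) = L(R)^i · L(R)
        result |= power
    return result

a, b = {"a"}, {"b"}
print(sorted(union(a, b)))                       # ['a', 'b']
print(sorted(concat(union(a, b), union(a, b))))  # ['aa', 'ab', 'ba', 'bb']
print(sorted(star(a, 3), key=len))               # ['', 'a', 'aa', 'aaa']
```

Following the precedence rule, E | F·G∗ would be written in this sketch as `union(E, concat(F, star(G, n)))`.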

REGULAR EXPRESSION SHORTHANDS
֎ Notational shorthands are introduced for frequently used
constructors.
֎ +: One or more instances. If R is a Regular Expression then R+ ≡ RR∗.
֎ ?: Zero or one instance. If R is a Regular Expression then R? ≡ 𝜀 | R.
֎ Character Classes. If a,b,...,z ∈ V then [abc] ≡ a | b | c, and
[a−z] ≡ a | b | ... | z.
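
The equivalences above can be spot-checked with Python's `re` module, whose syntax happens to support the same shorthands. This is only a sketch: the helper `same_language` and the sample list are assumptions made for illustration, and agreement on a finite sample is evidence, not a proof of equivalence.

```python
import re

def same_language(p1, p2, samples):
    """Return True if both patterns accept exactly the same sample strings."""
    return all(
        (re.fullmatch(p1, s) is not None) == (re.fullmatch(p2, s) is not None)
        for s in samples
    )

samples = ["", "a", "aa", "aaa", "b", "ab", "c", "z", "az"]

print(same_language(r"a+", r"aa*", samples))       # R+ ≡ RR*     -> True
print(same_language(r"a?", r"(|a)", samples))      # R? ≡ ε | R   -> True
print(same_language(r"[abc]", r"a|b|c", samples))  # class ≡ alternation -> True
```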

REGULAR DEFINITIONS
֎ Regular Definitions are used to give names to regular Expressions
and then to re-use these names to build new Regular Expressions.
֎ A Regular Definition is a sequence of definitions of the form:
D1 → R1
D2 → R2
...
Dn → Rn
֎ Where each Di is a distinct name and each Ri is a Regular Expression
over the extended alphabet V ∪ {D1,D2,...,Di−1}.
֎ Note: Such names for Regular Expressions will often be the Tokens
returned by the Lexical Analyzer. As a convention, names are printed
in boldface.

REGULAR DEFINITION EXAMPLE
Example 1.
Identifiers are usually strings of letters and digits beginning with a letter:
letter → A | B |...| Z | a | b | ... | z
digit → 0 | 1 |···| 9
id → letter(letter | digit)∗
Using Character Classes we can define identifiers as:
id → [A−Za−z][A−Za−z0−9]∗
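
As an illustration, the regular definition for identifiers can be composed step by step in Python, reusing earlier names to build later ones. The variable names (`letter`, `digit`, `ident`, `ident_cc`) and the helper `is_id` are assumptions for this sketch; `re.fullmatch` is used so that the whole string must belong to the language.

```python
import re

# Each definition may reuse the names defined before it.
letter = r"[A-Za-z]"
digit = r"[0-9]"
ident = rf"{letter}({letter}|{digit})*"   # id → letter(letter | digit)*

# The same language written directly with character classes.
ident_cc = r"[A-Za-z][A-Za-z0-9]*"

def is_id(text, pattern=ident):
    """Return True if the whole string is an identifier."""
    return re.fullmatch(pattern, text) is not None

for word in ["x", "count42", "4you", ""]:
    print(repr(word), is_id(word), is_id(word, ident_cc))
```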

REGULAR DEFINITION EXAMPLE
Example 2.
Numbers are usually strings such as 5230, 3.14, 6.45E4, 1.84E-4.
digit → 0 | 1 |···| 9
digits → digit+
optional-fraction → (.digits)?
optional-exponent → (E(+ |−)?digits)?
num → digits optional-fraction optional-exponent
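
The definition of num can be sketched the same way in Python. The variable names mirror the regular definition above, with hyphens replaced by underscores; the ASCII characters `+` and `-` are assumed for the sign of the exponent.

```python
import re

digit = r"[0-9]"
digits = rf"{digit}+"
optional_fraction = rf"(\.{digits})?"        # (.digits)?
optional_exponent = rf"(E[+-]?{digits})?"    # (E(+ | -)?digits)?
num = rf"{digits}{optional_fraction}{optional_exponent}"

def is_num(text):
    """Return True if the whole string matches the num definition."""
    return re.fullmatch(num, text) is not None

for s in ["5230", "3.14", "6.45E4", "1.84E-4", "3.", ".5", "E4"]:
    print(s, is_num(s))
```

The four sample numbers from the slide are accepted, while strings such as `3.` or `.5` are rejected because both digits and the fraction require at least one digit.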

RE VS RG
֎ Every Language captured by a Regular Expression can also be captured by a
Regular Grammar (Type 3 Grammar).
֎ Regular Expressions are a notational variant of Regular Grammars:
Usually they give a more compact representation.
֎ Example. The Regular Expression for numbers can be captured by
a Regular Grammar with the following Productions (num is the
start symbol and digit is a terminal symbol):
num → digit | digit Z
Z → digit | digit Z | . Frac-Exp | E Exp-Num
Frac-Exp → digit | digit Frac-Exp | digit Exp
Exp → E Exp-Num
Exp-Num → +Digits |−Digits | digit | digit Digits
Digits → digit | digit Digits
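
A right-regular grammar like this one can be used directly as a recognizer: read the input left to right and keep track of the nonterminals that may still have to be expanded. The sketch below is one possible encoding, not part of the lecture: productions A → aB are stored as pairs `(a, "B")`, productions A → a as `(a, None)`, and the names `GRAMMAR`, `terminal_of` and `derives` are assumptions. The terminal `digit` stands for any of 0 to 9, and the ASCII `-` is used for the minus sign.

```python
GRAMMAR = {
    "num": [("digit", None), ("digit", "Z")],
    "Z": [("digit", None), ("digit", "Z"), (".", "Frac-Exp"), ("E", "Exp-Num")],
    "Frac-Exp": [("digit", None), ("digit", "Frac-Exp"), ("digit", "Exp")],
    "Exp": [("E", "Exp-Num")],
    "Exp-Num": [("+", "Digits"), ("-", "Digits"), ("digit", None), ("digit", "Digits")],
    "Digits": [("digit", None), ("digit", "Digits")],
}

def terminal_of(ch):
    """Map an input character to the terminal symbol used in the grammar."""
    return "digit" if ch in "0123456789" else ch

def derives(grammar, start, text):
    """Return True if `text` can be derived from `start`."""
    current = {start}                      # nonterminals still to be expanded
    for i, ch in enumerate(text):
        is_last = i == len(text) - 1
        following = set()
        for nonterminal in current:
            for terminal, follow in grammar[nonterminal]:
                if terminal != terminal_of(ch):
                    continue
                if follow is None:
                    if is_last:            # A → a finishes the derivation
                        return True
                else:
                    following.add(follow)  # A → aB continues with B
        current = following
    return False

for s in ["5230", "3.14", "6.45E4", "1.84E-4", "3.", "E4"]:
    print(s, derives(GRAMMAR, "num", s))
```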

FINITE AUTOMATA
֎ We need a mechanism to recognize the Languages denoted by Regular Expressions.
֎ While Regular Expressions are a specification language, Finite
Automata are their implementation.
֎ Given an input string, x, and a Regular Language, L, they answer
“yes” if x ∈ L and “no” otherwise.

DETERMINISTIC FINITE AUTOMATA
A Deterministic Finite Automaton, DFA for short, is a tuple:
A = (S, V, δ, s0, F):
֎ S is a finite non-empty set of states;
֎ V is the input symbol alphabet;
֎ δ : S × V → S is a total function called the Transition Function;
֎ s0 ∈ S is the initial state;
֎ F ⊆ S is the set of final states.

TRANSITION GRAPH
֎ A DFA can be represented by Transition Graphs where the nodes
are the states and each labeled edge represents the transition
function.
֎ The initial state has an input arc marked start. Final states are
indicated by double circles.
֎ Example: DFA that accepts strings in the Language L((a|b)∗abb)

TRANSITION TABLE
֎ Transition Tables implement transition graphs, and thus Automata.
֎ A Transition Table has a row for each state and a column for each
input symbol.
֎ The value of the cell (si,aj) is the state that can be reached from state
si with input aj.
֎ Example: The table implementing the previous transition graph will
have 4 rows and 2 columns, let us call the table δ, then:
δ(0,a) = 1 δ(0,b) = 0
δ(1,a) = 1 δ(1,b) = 2
δ(2,a) = 1 δ(2,b) = 3
δ(3,a) = 1 δ(3,b) = 0
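
The table above translates almost literally into a table-driven recognizer. The following is a minimal sketch: the dictionary `DELTA` copies the eight entries of δ, while the choice of state 0 as initial state and state 3 as the only final state is an assumption (the transition graph itself is not reproduced in the text), chosen so that the automaton accepts L((a|b)∗abb).

```python
# Transition table δ: (state, input symbol) -> next state
DELTA = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 1, (2, "b"): 3,
    (3, "a"): 1, (3, "b"): 0,
}
START = 0      # assumed initial state
FINAL = {3}    # assumed set of final states

def dfa_accepts(text):
    """Run the DFA on `text`; answer True ("yes") or False ("no")."""
    state = START
    for ch in text:
        if (state, ch) not in DELTA:
            return False                  # symbol outside the alphabet {a, b}
        state = DELTA[(state, ch)]        # exactly one move per input symbol
    return state in FINAL

for s in ["abb", "aabb", "babb", "ab", "abba", ""]:
    print(repr(s), dfa_accepts(s))
```

Because δ is a total function, the loop performs one table lookup per input symbol, which is what makes DFA simulation fast.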

NONDETERMINISTIC FINITE AUTOMATA
A Nondeterministic Finite Automaton, NFA for short, is a tuple:
A = (S, V, δ, s0, F):
֎ S is a finite non-empty set of states;
֎ V is the input symbol alphabet;
֎ δ : S × (V ∪ {ε}) → 2^S is the Transition Function: it maps a state and an
input symbol (or ε) to a set of possible next states;
֎ s0 ∈ S is the initial state;
֎ F ⊆ S is the set of final states.

NFA EXAMPLE
Given an input string and an NFA there will be, in general, more than one
path that can be followed: An NFA accepts an input string if there is at
least one path ending in a final state.
Example. NFA that accepts strings in the Language L((a | b)∗abb).
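
"At least one path ends in a final state" can be checked without backtracking by following all paths at once, i.e., by tracking the set of states the NFA could currently be in. The sketch below assumes one common NFA for (a|b)∗abb with states 0 to 3 (the slide's figure is not reproduced in the text, so these transitions are an assumption); the names `NFA`, `EPS`, `eps_closure` and `nfa_accepts` are illustrative.

```python
EPS = ""   # label used for ε-transitions (none are needed in this example)

# δ: (state, symbol) -> set of possible next states
NFA = {
    (0, "a"): {0, 1},   # stay in 0 or guess that the final "abb" starts here
    (0, "b"): {0},
    (1, "b"): {2},
    (2, "b"): {3},
}

def eps_closure(states, delta):
    """All states reachable from `states` using only ε-transitions."""
    closure, stack = set(states), list(states)
    while stack:
        state = stack.pop()
        for target in delta.get((state, EPS), ()):
            if target not in closure:
                closure.add(target)
                stack.append(target)
    return closure

def nfa_accepts(delta, start, final, text):
    """Return True if at least one path for `text` ends in a final state."""
    current = eps_closure({start}, delta)
    for ch in text:
        moved = set()
        for state in current:
            moved |= delta.get((state, ch), set())
        current = eps_closure(moved, delta)
    return bool(current & final)

for s in ["abb", "aabb", "babb", "ab", "abba"]:
    print(s, nfa_accepts(NFA, 0, {3}, s))
```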

DFA VS NFA
֎ Both DFA and NFA are capable of recognizing all Regular Languages
/Expressions:
֎ L(NFA) = L(DFA)
֎ The main difference is a Space vs. Time tradeoff:
֎ DFA are faster to simulate than NFA (one state lookup per input symbol);
֎ DFA can be bigger than NFA (exponentially larger in the worst case).

FROM RE TO NFA
֎ To convert a regular expression to an NFA, use Thompson’s construction.
֎ Given an RE, say r, Thompson’s construction generates an NFA
accepting L(r).
֎ Thompson’s construction is a recursive procedure guided by the
structure of the regular expression.

FROM RE TO NFA
֎ The NFA resulting from Thompson’s construction has important
properties:
֎ It is an 𝜀-NFA: The automaton can make a transition without
consuming an input symbol, i.e., the automaton can
nondeterministically change state.
֎ It has exactly one final state.
֎ No edge enters the start state.
֎ No edge leaves the final state.

FROM RE TO NFA
֎ Algorithm for conversion of a regular expression to an NFA:
֎ Input: A regular expression R
֎ Output: An NFA accepting the language denoted by R
Method:
For ε, the NFA is: start state --ε--> final state

For a symbol a, the NFA is: start state --a--> final state

FROM RE TO NFA
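For a+b (also written a|b), the NFA is:
[Figure: a new start state with ε-edges to the NFA for a and to the NFA for b; ε-edges from both lead to a new final state]

For ab, the NFA is:
[Figure: the NFA for a followed by the NFA for b, giving a path labeled a then b]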

FROM RE TO NFA
For a*, the NFA is:
[Figure: the NFA for a extended with ε-edges so that a can be skipped or repeated]

EXAMPLE
Construct NFA for the regular expression ((a.b)|c)*
Step 1: construct NFA for r1 = a
[Figure: NFA for r1, a single edge labeled a]

EXAMPLE
Construct NFA for the regular expression ((a.b)|c)*
Step 2: construct NFA for r2 = b
[Figure: NFA for r1 (edge labeled a) and NFA for r2 (edge labeled b)]

EXAMPLE
Construct NFA for the regular expression ((a.b)|c)*
Step 3: construct NFA for r3 = r1.r2 = a.b
[Figure: NFA for r3, a path labeled a then b]

EXAMPLE
Construct NFA for the regular expression ((a.b)|c)*
Step 4: construct NFA for r4 = c
[Figure: NFA for r3 (path labeled a then b) and NFA for r4 (edge labeled c)]

EXAMPLE
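Construct NFA for the regular expression ((a.b)|c)*
Step 5: construct NFA for r5 = r3 | r4 = (a.b)|c
[Figure: NFA for r5, with ε-edges branching to the path labeled a then b and to the edge labeled c, and ε-edges joining them again]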

EXAMPLE
Construct NFA for the regular expression ((a.b)|c)*
Step 6: construct NFA for r5* = ((a.b)|c)*
[Figure: NFA for r5*, the NFA for r5 with additional ε-edges for repetition]
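
The six steps can be sketched in code as a bottom-up construction, one small function per operator. This is an illustration under stated assumptions, not the lecture's exact diagrams: the names `symbol`, `concat`, `union`, `star` and `accepts` are hypothetical, concatenation links the two sub-automata with an ε-edge instead of merging states, and `star` uses the textbook form with four ε-edges. Every fragment keeps the properties listed earlier: one final state, no edge into the start state, no edge out of the final state.

```python
from collections import defaultdict
from itertools import count

EPS = None                       # label for ε-edges
_new_state = count().__next__    # generator of fresh state numbers

def _frag(start, final, edges):
    return {"start": start, "final": final, "edges": edges}

def symbol(a):
    s, f = _new_state(), _new_state()
    return _frag(s, f, [(s, a, f)])

def concat(m, n):
    # Assumption: link the fragments with an ε-edge instead of merging states.
    edges = m["edges"] + n["edges"] + [(m["final"], EPS, n["start"])]
    return _frag(m["start"], n["final"], edges)

def union(m, n):
    s, f = _new_state(), _new_state()
    edges = m["edges"] + n["edges"] + [
        (s, EPS, m["start"]), (s, EPS, n["start"]),
        (m["final"], EPS, f), (n["final"], EPS, f),
    ]
    return _frag(s, f, edges)

def star(m):
    s, f = _new_state(), _new_state()
    edges = m["edges"] + [
        (s, EPS, m["start"]), (s, EPS, f),                   # enter or skip
        (m["final"], EPS, m["start"]), (m["final"], EPS, f)  # repeat or leave
    ]
    return _frag(s, f, edges)

def accepts(nfa, text):
    """Simulate the ε-NFA by tracking the set of reachable states."""
    delta = defaultdict(set)
    for src, label, dst in nfa["edges"]:
        delta[(src, label)].add(dst)

    def closure(states):
        seen, stack = set(states), list(states)
        while stack:
            for t in delta.get((stack.pop(), EPS), ()):
                if t not in seen:
                    seen.add(t)
                    stack.append(t)
        return seen

    current = closure({nfa["start"]})
    for ch in text:
        current = closure({t for s in current for t in delta.get((s, ch), ())})
    return nfa["final"] in current

r1, r2 = symbol("a"), symbol("b")   # steps 1 and 2
r3 = concat(r1, r2)                 # step 3: a.b
r4 = symbol("c")                    # step 4
r5 = union(r3, r4)                  # step 5: (a.b)|c
nfa = star(r5)                      # step 6: ((a.b)|c)*

for s in ["", "ab", "c", "abcab", "a", "ba"]:
    print(repr(s), accepts(nfa, s))
```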

EXAMPLE
Construct NFA for the regular expression a(a+b)*bb
[Figure: NFA with an edge labeled a, then the ε-NFA for (a+b)*, then two edges labeled b]


THIS IS ENOUGH!
Any questions?
Suggestions?
