67163118e98feCCWeek 03lecture05

Cc lecture 3

Uploaded by

Dawood Habib Khan

Compiler Construction

(CSC-320)
Lecture # 05
Course Instructor: M. Ramzan Shahid Khan

Department of Computer Science,

Namal University Mianwali
Fall Semester, 2024
Topics
• Lexical Analysis Phase (also known as the Scanner)
• Flex (Notations)
• Why NFA to DFA?

Lexical Analysis
• Input: Preprocessed code (without preprocessor directives), i.e., the output of the preprocessor
  • Pure high-level language
• Output: Valid tokens

Lexical Analysis
• Also called the Scanner
• It reads a stream of characters from the source code, left to right, and produces a stream of valid tokens.
• If it encounters an invalid token in the source code, it generates an error message indicating the line that contains the invalid token(s).

Valid or Invalid Tokens
• When a language is designed, a set of rules is written down for each type of token in the language. The tokens may be:
  • Constants
  • Identifiers
  • Punctuation
  • Operators
  • Keywords
• These rules are known as Patterns
• Valid and invalid tokens are told apart with the help of a DFA

Valid or Invalid Tokens
Constants, Identifiers, Punctuation, Operators, Keywords → Patterns → Regular Expressions → DFA

Valid or Invalid Tokens
• The Patterns are used to develop Regular Expressions

• The DFAs are built on the basis of Regular Expressions.

Valid or Invalid Tokens
• A DFA is a type of machine that differentiates between valid and invalid tokens.
• Tokens that are accepted by the DFA are valid tokens
• Tokens that are rejected by the DFA are invalid tokens

Valid or Invalid Tokens – Example
Identifiers
• Set of Rules (the Pattern)
  • Can start with _ (underscore)
  • Can start with a letter
  • Can’t start with a digit or any other special symbol

Valid or Invalid Tokens – Example
• Set of Rules
  • Can start with _ (underscore)
  • Can start with a letter
  • Can’t start with a digit or any other special symbol
• R.E (Regular Expression)
  id → letter (letter | digit)*
  letter → (a | b | c | … | z | A | B | C | … | Z | _)
  digit → (0 | 1 | 2 | … | 9)
• DFA (Deterministic Finite Automaton)
  start state (−) --letter--> accepting state (+)
  accepting state (+) --letter/digit--> accepting state (+)
• E.g., the inputs x and _ab lead to the accepting state; the input 2x is rejected by the DFA

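The identifier DFA described above can be simulated in a few lines. This is a minimal sketch in Python, not the lecture's own code: the state names START, ACCEPT and DEAD and the function names are assumptions made for illustration, and the explicit dead state is added only so that every character has a next state.

```python
import string

LETTERS = set(string.ascii_letters + "_")   # letter -> a..z | A..Z | _
DIGITS = set(string.digits)                 # digit  -> 0..9

# Hypothetical state names: START is the (-) state, ACCEPT is the (+) state.
START, ACCEPT, DEAD = "start", "accept", "dead"

def step(state, ch):
    """Return the next DFA state for one input character."""
    if state == START:
        return ACCEPT if ch in LETTERS else DEAD
    if state == ACCEPT:
        return ACCEPT if (ch in LETTERS or ch in DIGITS) else DEAD
    return DEAD  # once rejected, the input stays rejected

def is_identifier(text):
    """Run the DFA over text and report whether it ends in the accepting state."""
    state = START
    for ch in text:
        state = step(state, ch)
    return state == ACCEPT

for sample in ["x", "_ab", "2x"]:
    print(sample, "->", "accepted" if is_identifier(sample) else "rejected")
```

Running it reports x and _ab as accepted and 2x as rejected, which agrees with the example on the slide.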
Lexical Analysis
• Ignores (skips) whitespace characters
  • blanks (spaces),
  • tabs,
  • newlines
• Also ignores comments

Lexical Analysis

input (sequence of characters) → Scanner → output (stream of tokens)
Scanner → Error Message

Lexical Analysis
• An error generated by the lexical analysis part of a compiler is called a Lexical Error.

Lexical Analysis – Example 1
input:
while (i<5)
{
i = i + 1;
}
Scanner (reading from left to right)
output (tokens, in order):
while  (  i  <  5  )  {  i  =  i  +  1  ;  }

Lexical Analysis – Example 1
Tokens
while    keyword
(        operator
i        identifier
+        add operator
1        constant
Input: Stream of characters
Output: Valid tokens

Lexical Analysis – Example 2
input:
while (2i<5)
{
i = 2i + 1;
}
Scanner (reading from left to right)
output: Error Message, as 2i is an invalid token

Lexical Analysis – Example 3
input:
while (i<5)
{
i=i+1
}
Scanner (reading from left to right)
output (tokens, in order):
while  (  i  <  5  )  {  i  =  i  +  1  }
No error, as the missing semicolon is detected in Syntax Analysis

Lexical Analysis – Example 3
• For the same input, the Lexical Analyzer only identifies valid and invalid tokens

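The three examples above can be reproduced with a small regex-driven scanner. This is only a sketch in Python under stated assumptions: the category names, the keyword list (while, if, else) and the INVALID rule that treats digits followed directly by letters (such as 2i) as one invalid token are all hypothetical choices made for illustration. Brackets and the semicolon are filed under punctuation here, whereas the slide lists ( as an operator.

```python
import re

# Hypothetical token rules. Order matters: at each position the first
# alternative that matches is taken.
TOKEN_SPEC = [
    ("SKIP",        r"\s+"),                          # whitespace is ignored
    ("INVALID",     r"[0-9]+[A-Za-z_][A-Za-z_0-9]*"), # e.g. 2i
    ("CONSTANT",    r"[0-9]+"),
    ("KEYWORD",     r"\b(?:while|if|else)\b"),
    ("IDENTIFIER",  r"[A-Za-z_][A-Za-z_0-9]*"),
    ("OPERATOR",    r"[<>=+\-*/]"),
    ("PUNCTUATION", r"[(){};]"),
    ("UNKNOWN",     r"."),                            # any other character
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def scan(source):
    """Return (tokens, errors) for the source text, reading left to right."""
    tokens, errors = [], []
    line = 1
    for match in MASTER.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "SKIP":
            line += text.count("\n")      # keep track of the current line
        elif kind in ("INVALID", "UNKNOWN"):
            errors.append(f"line {line}: invalid token {text!r}")
        else:
            tokens.append((kind, text))
    return tokens, errors

tokens, errors = scan("while (i<5)\n{\n    i = 2i + 1\n}")
print([text for _, text in tokens])
print(errors)
```

The sample input combines Examples 2 and 3: the scanner reports 2i on line 3 but says nothing about the missing semicolon, because that is left to Syntax Analysis.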
FLEX
• Also called Fast Lexical Analyzer Generator
• A tool that helps us construct a Scanner: it generates the Scanner
• It takes R.E as input, converts the R.E to an NFA, and then converts the NFA to a DFA. The generated Scanner uses this DFA to differentiate between valid and invalid tokens

FLEX
• It is a tool used for generating scanners.
• We don’t have to write things from scratch
• You only need to do the following 2 things:
  • Identify the vocabulary of the language
  • Write Regular Expressions
• It will generate the Scanner for you

FLEX
• Let’s discuss the notations which can be used to write Regular Expressions

Flex Regular Expression Symbols
""      Anything in quotes is matched exactly.
[]      Matches any single character listed in the brackets. For example, [abc] matches one a, one b, or one c.
[^]     If there is a ^ character right after the opening bracket, it matches any single character except those in the brackets. For example, [^abc] matches any character except a, b, or c.
.       Matches any character except the newline.

Flex Regular Expression Symbols

\n      Matches a newline.
^       Matches the beginning of a line.
$       Matches the end of a line.

Flex Regular Expression Symbols
*       Matches zero or more copies of the preceding expression.
+       Matches one or more copies of the preceding expression.
?       Matches zero or one copy of the preceding expression.
|       Matches either the preceding expression or the following one.

Flex Regular Expression Symbols
()      Parentheses are used for grouping. For example, a(bc|de) matches abc or ade.
\*      A literal * character.
\"      A literal " character.
\^      A literal ^ character.

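Most of the symbols in these tables behave the same way in Python's re module, so they can be tried out without Flex. The sketch below is an illustration under that assumption, not Flex itself: Python has no "..." quoting operator, so re.escape stands in for it, and the helper name matches is hypothetical.

```python
import re

def matches(pattern, text):
    """True when the whole of text is matched by pattern."""
    return re.fullmatch(pattern, text) is not None

# (pattern, input, expected result)
checks = [
    ("[abc]",    "b",   True),    # one of the listed characters
    ("[abc]",    "d",   False),
    ("[^abc]",   "d",   True),    # anything except a, b or c
    ("[^abc]",   "a",   False),
    (".",        "x",   True),    # any character ...
    (".",        "\n",  False),   # ... except the newline
    ("ab*",      "a",   True),    # zero or more b
    ("ab+",      "a",   False),   # one or more b
    ("ab?",      "abb", False),   # at most one b
    ("a(bc|de)", "ade", True),    # grouping with alternation
    (r"\*",      "*",   True),    # escaped metacharacters are literal
    (r"\^",      "^",   True),
    (re.escape("a+b"), "a+b", True),   # stands in for Flex's "a+b"
]
for pattern, text, expected in checks:
    assert matches(pattern, text) == expected
    print(f"{pattern!r:12} vs {text!r:6} -> {expected}")

# ^ and $ anchor a match to the beginning and end of a line.
lines = re.findall(r"^.*$", "one\ntwo", re.MULTILINE)
print(lines)
```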
Regular Expression - Examples
• [a-zA-Z] matches any single letter.
• [a-zA-Z]+ matches any word (one or more letters).
• "hello" matches only the word hello.
• ^.*$ matches one entire line.

Regular Expression - Examples
• [a-z]
  • specifies the range of lowercase letters from a to z
  • This will match exactly one lowercase character.

Regular Expression - Examples
• [A-Za-z0-9]
  • The above expression combines three ranges and matches exactly one character, which may be
    • an uppercase letter,
    • a lowercase letter, or
    • a digit from 0 to 9.
• The brackets ([]) in the above expressions have a special meaning, i.e., they are used to specify the range.
• If you want to include a bracket as part of an expression, then you will need to escape it.

Regular Expression - Examples
• [\[0-9]
  • The above expression matches
    • an opening bracket OR
    • a digit in the range 0 to 9.
• But note that as we are programming in C++, the backslash must itself be escaped inside a C++ string literal, as follows:
  • [\\[0-9]

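The two layers of escaping can be seen directly in code. This sketch uses Python rather than C++ (an assumption made so that all examples here share one language); ordinary Python string literals treat the backslash the same way C++ string literals do, and the variable names are hypothetical.

```python
import re

escaped_text = "[\\[0-9]"   # ordinary string literal: \\ produces one backslash
raw_text = r"[\[0-9]"       # raw string literal: backslashes are kept as typed

# Both spellings produce the same 7-character regular expression [\[0-9]
assert escaped_text == raw_text
assert len(raw_text) == 7

pattern = re.compile(raw_text)
accepted = [s for s in ["[", "7", "a", "]"] if pattern.fullmatch(s)]
print(accepted)
```

Only the opening bracket and the digit are accepted, which matches the description of the expression above.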
Regular Expression - Examples
• [a-z]+
  • matches strings like a, aaa, abcd, softwaretesting, etc.
  • Note that it will never match an empty string.
• [a-z]*
  • will match an empty string or
  • any of the above strings.

Regular Expression - Examples
• (Xyz)+
  • If you want to specify a group of characters to match one or more times, then you can use parentheses as above.
  • The above expression will match Xyz, XyzXyz, XyzXyzXyz, etc.
• Examples implemented using:
  • https://regexr.com/

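The difference between + and *, and the effect of grouping, can be checked with a short script. This is a sketch in Python's re module (chosen here for brevity; the slides used an online tester), and the helper name full is hypothetical.

```python
import re

def full(pattern, text):
    """True when pattern matches the whole of text."""
    return re.fullmatch(pattern, text) is not None

# + needs at least one character, * also accepts the empty string
print(full(r"[a-z]+", "softwaretesting"))  # True
print(full(r"[a-z]+", ""))                 # False
print(full(r"[a-z]*", ""))                 # True

# Parentheses make the quantifier apply to the whole group Xyz
print(full(r"(Xyz)+", "XyzXyz"))           # True
print(full(r"(Xyz)+", "XyzX"))             # False: the last group is incomplete
```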
Regular Expression – Example

• ^[a-zA-Z_][a-zA-Z_0-9]*\.[a-zA-Z0-9]+$

Regular Expression – Example 1
• C++ regex Example
• Consider a regular expression that matches an MS-DOS filename as shown below.

Regular Expression – Example 1
• char regex_filename[] = "[a-zA-Z_][a-zA-Z_0-9]*\\.[a-zA-Z0-9]+";
• The above regex can be interpreted as follows:
  1. Match a letter (lowercase or uppercase) or an underscore.
  2. Then match zero or more characters, each of which may be a letter, an underscore or a digit.
  3. Then match a literal dot (.).
  4. After the dot, match one or more characters, each of which may be a letter or a digit, indicating the file extension.

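The four steps above can be tried against a few names. The sketch below uses Python's re module instead of C++ (an assumption made for brevity), and the sample file names are made up for illustration.

```python
import re

# Same expression as regex_filename; a raw string avoids doubling the backslash.
FILENAME = re.compile(r"[a-zA-Z_][a-zA-Z_0-9]*\.[a-zA-Z0-9]+")

def is_filename(name):
    """True when the whole name fits the pattern name.extension."""
    return FILENAME.fullmatch(name) is not None

for name in ["main.cpp", "_tmp1.o", "9lives.txt", "README", "a.b.c"]:
    print(name, is_filename(name))
```

Of these, only main.cpp and _tmp1.o are accepted: 9lives.txt starts with a digit, README has no dot, and a.b.c contains a second dot that the expression does not allow.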
Regular Expression – Example 2
• Define a C++ language int literal (decimal form) using a regular expression.
• ^(0|[1-9][0-9]*)$

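A quick check of this expression, sketched in Python (the function name is hypothetical). Note that the pattern covers decimal literals only; C++ also accepts octal, hexadecimal and suffixed forms, which are outside this example.

```python
import re

INT_LITERAL = re.compile(r"^(0|[1-9][0-9]*)$")

def is_int_literal(text):
    """True for 0 or for a digit string that does not start with 0."""
    return INT_LITERAL.fullmatch(text) is not None

for text in ["0", "7", "120", "007", "-5", "12a", ""]:
    print(repr(text), is_int_literal(text))
```

The inputs 0, 7 and 120 are accepted. 007 is rejected because of its leading zeros, and -5 is rejected because the sign is not part of the pattern.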
Regular Expression – Example 3
• Define a C++ language float literal using a regular expression:
• A float literal in C/C++ has an optional exponent part.
• If a float literal is written without an exponent part, then it must have a decimal point, which can appear at the start, at the end or in the middle of the digits, as in the following examples:
  • 123.456
  • .456
  • 456.

Regular Expression – Example 3
• [+-]?([0-9]*[.])?[0-9]+

• This will match:

• 123
• 123.456
• .456

Regular Expression – Example 3
• If you also want to match 123. (a period with no decimal part), then you'll need a slightly longer expression:
• [+-]?([0-9]+([.][0-9]*)?|[.][0-9]+)

Regular Expression – Example 3
• If a float literal is written with an exponent, then the decimal point in the mantissa part is optional, and the exponent is a whole number with an optional sign, as in the following examples:
  • 123e78
  • 123e+78
  • 123e-78
  • 123.456e78
  • .456e78
  • 456.e78

Regular Expression – Example 3
• [+-]?(\d+([.]\d*)?([eE][+-]?\d+)?|[.]\d+([eE][+-]?\d+)?)

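The three float expressions from this example can be compared side by side. The sketch below uses Python's re module; the names SIMPLE, WITH_DOT and WITH_EXP are labels invented here, and the last two inputs are extra negative cases that do not come from the slides.

```python
import re

SIMPLE   = r"[+-]?([0-9]*[.])?[0-9]+"
WITH_DOT = r"[+-]?([0-9]+([.][0-9]*)?|[.][0-9]+)"
WITH_EXP = r"[+-]?(\d+([.]\d*)?([eE][+-]?\d+)?|[.]\d+([eE][+-]?\d+)?)"

def accepts(pattern, text):
    """True when pattern matches the whole of text."""
    return re.fullmatch(pattern, text) is not None

samples = ["123", "123.456", ".456", "456.",
           "123e78", "123e+78", "123e-78", "123.456e78", ".456e78", "456.e78",
           "1e", "."]
print(f"{'input':12} simple with_dot with_exp")
for text in samples:
    row = [accepts(p, text) for p in (SIMPLE, WITH_DOT, WITH_EXP)]
    print(f"{text:12} {row[0]!s:6} {row[1]!s:8} {row[2]!s}")
```

The table shows that 456. is rejected by the first expression but accepted by the other two, and that only the last expression accepts the forms with an exponent. All three also accept 123, which has neither a decimal point nor an exponent.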
Regular Expression – Example 4
• Define a C++ language string literal using a regular expression.
• A string literal in C++ may contain escape sequences.

Regular Expression – Example 4
• If a string literal is written like this:

cout << "This\nis\na\ntest\n\nShe said, \"Sells she seashells on the seashore?\"\n";

• It will give you the result:

This
is
a
test

She said, "Sells she seashells on the seashore?"

Regular Expression – Example 4
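• ^([^"\\]|\\.)*$
  • This describes the characters between the enclosing double quotes: each one is either any character other than " and \, or a backslash followed by any single character (an escape sequence).
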
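The expression can be tested on the text between the quotes. In this Python sketch, BODY is the expression from the slide and LITERAL is an assumption added here: the same expression wrapped in the two enclosing double quotes. The function names are hypothetical.

```python
import re

BODY = re.compile(r'([^"\\]|\\.)*')        # expression from the slide
LITERAL = re.compile(r'"([^"\\]|\\.)*"')   # assumption: add the enclosing quotes

def is_string_body(text):
    return BODY.fullmatch(text) is not None

def is_string_literal(text):
    return LITERAL.fullmatch(text) is not None

# Raw strings keep the backslashes, as they would appear in source code.
print(is_string_body(r'She said, \"Sells she seashells?\"\n'))  # escapes are fine
print(is_string_body('unescaped " quote'))                      # bare quote
print(is_string_body('dangling backslash \\'))                  # backslash with nothing after it
print(is_string_literal(r'"This\nis\na\ntest"'))
print(is_string_literal(r'"unterminated'))
```

The first and fourth inputs are accepted; the others are rejected because of a bare quote, an incomplete escape sequence, and a missing closing quote.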
Extended Regular Exp Notations
• To describe Regular Languages, we write down Regular Expressions
• Strings generated from the Regular Expressions are the valid strings of that language

Extended Regular Exp Notations
1. a?
  • A question mark after any pattern, letter or character means
    • Zero or one
  • It makes that pattern or character optional: it may occur zero times or once
2. [A-Z]
  • Shows the Range
  • Ranges from A to Z (A or B or C up to Z)
3. R+
  • One or More
4. (X|Y|Z)
  • Either ‘X’ or ‘Y’ or ‘Z’ (can also be written as the character class [XYZ]; inside brackets, | is an ordinary character)

Extended Regular Exp Notations
5. [^ab]
  • Caret sign (circumflex) ^
  • Everything but ‘a’ and ‘b’
  • The part after the caret sign is excluded
  • A character class such as [^ab] matches a single character that is not in the set of characters (the ^ is the negating part).
  • To match a string which does not contain the multi-character sequence ab, you can use a negative lookahead in regex engines that support it (Flex does not):
    • ^(?:(?!ab).)+$

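The difference between excluding single characters and excluding a sequence is easy to miss, so here is a small check. It is a sketch in Python, whose re module supports lookahead; the constant names are made up for illustration.

```python
import re

NOT_A_OR_B = re.compile(r"[^ab]")               # exactly one character other than a or b
NO_AB_SEQUENCE = re.compile(r"^(?:(?!ab).)+$")  # non-empty string that never contains "ab"

print(bool(NOT_A_OR_B.fullmatch("c")))       # True
print(bool(NOT_A_OR_B.fullmatch("a")))       # False
print(bool(NOT_A_OR_B.fullmatch("cd")))      # False: more than one character

print(bool(NO_AB_SEQUENCE.fullmatch("ba")))    # True: a and b appear, but not as "ab"
print(bool(NO_AB_SEQUENCE.fullmatch("acb")))   # True
print(bool(NO_AB_SEQUENCE.fullmatch("cabd")))  # False: contains "ab"
```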
Extended Regular Exp Notations
6. R*
• Zero or More

Extended Regular Exp Notations
No.  Notation   Meaning
1    a?         Zero or one a
2    [A-Z]      Ranges from A to Z (A or B or C up to Z)
3    R+ = RR*   One or more
4    (X|Y|Z)    Either ‘X’ or ‘Y’ or ‘Z’
5    [^ab]      Everything but ‘a’ and ‘b’
6    R*         Zero or more

Extended Regular Exp Notations – Example 1
• Any variable name must start with a letter, followed by any number of letters or digits.
• Regular Expression
  letter(letter|digit)*
• letter and digit are non-terminals
• They need further definitions that say what letter and digit can be replaced by

Extended Regular Exp Notations – Example 1
• Regular Expression
  letter(letter|digit)*
  letter → [A-Za-z]
  digit → [0-9]

Extended Regular Exp Notations – Example 2
• Each string has exactly two a’s (no restriction on b’s)
• L = {aa, bbaa, aba, …}
• R.E = b*ab*ab*

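One way to gain confidence in this expression is to compare it with the plain definition of the language on every short string. The following Python sketch does that for all strings over {a, b} up to length 6; the length limit and the function name are arbitrary choices made here.

```python
import re
from itertools import product

TWO_AS = re.compile(r"b*ab*ab*")

def in_language(text):
    """True when the regular expression accepts the whole string."""
    return TWO_AS.fullmatch(text) is not None

# The expression should accept a string exactly when it contains two a's.
for length in range(7):
    for letters in product("ab", repeat=length):
        word = "".join(letters)
        assert in_language(word) == (word.count("a") == 2), word

print(in_language("aa"), in_language("bbaa"), in_language("aba"), in_language("aaa"))
```

The last line prints True True True False: the three sample strings from L are accepted and aaa, which has three a's, is not.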
Next…
• How Regular Expressions can be converted to NFA?

• Conversion of NFA to DFA

Regular Exp to NFA
1. a
  • The NFA constructed contains two states, i.e., the initial state and the final state
  • This machine (NFA) accepts only ‘a’, nothing else
  q0 --a--> qf

Regular Exp to NFA
2. ab
  • The NFA constructed contains three states
  q0 --a--> q1 --b--> qf

Regular Exp to NFA
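3. a|b = a+b = a∪b
  • Choice: either ‘a’ or ‘b’ is accepted at a time
  • The NFA is constructed with the help of null transitions (ϵ or λ)
  [Figure: NFA for a|b. From q0, one ϵ-transition leads to the upper branch (q1), which reads a, and one leads to the lower branch (q2), which reads b; each branch then reaches qf through an ϵ-transition.]
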
Regular Exp to NFA
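4. a*
  • Any number of a’s can be generated from this NFA
  • a* = {ϵ, a, aa, aaa, …}
  [Figure: NFA for a* with states q0, q1, q2 and qf, connected in a row by transitions labelled ϵ, a and ϵ, plus one more ϵ-transition drawn below them.]
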
Regular Exp to NFA
5. (a+b)*
  • Can be divided into two parts
    • First construct the NFA for (a+b), as constructed in the 3rd example
    • Then apply * to (a+b), in the same way as for a*

Regular Exp to NFA
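[Figure: NFA for (a+b)* with states q0, q1, q2, q3, q4, q5 and qf. The transition q2 --a--> q4 reads a and q3 --b--> q5 reads b; all other transitions are ϵ-transitions.]
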
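The five cases above follow the pattern usually known as Thompson's construction, which can be sketched compactly. The Python code below is an illustration under assumptions: the class and function names are hypothetical, the states are numbered automatically (so they will not match q0, q1, … in the figures), and concatenation joins the two parts with an ϵ-transition instead of merging states as the three-state figure for ab does.

```python
from itertools import count

EPS = None         # label used for epsilon (null) transitions
_ids = count()     # supplies fresh state numbers

class NFA:
    """An epsilon-NFA fragment with one start state and one final state."""
    def __init__(self, start, final, edges):
        self.start, self.final = start, final
        self.edges = edges             # list of (source, label, target)

def symbol(ch):                        # case 1: a
    s, f = next(_ids), next(_ids)
    return NFA(s, f, [(s, ch, f)])

def concat(x, y):                      # case 2: ab
    return NFA(x.start, y.final, x.edges + y.edges + [(x.final, EPS, y.start)])

def union(x, y):                       # case 3: a|b
    s, f = next(_ids), next(_ids)
    extra = [(s, EPS, x.start), (s, EPS, y.start), (x.final, EPS, f), (y.final, EPS, f)]
    return NFA(s, f, x.edges + y.edges + extra)

def star(x):                           # case 4: a*
    s, f = next(_ids), next(_ids)
    extra = [(s, EPS, x.start), (x.final, EPS, f),   # enter and leave
             (x.final, EPS, x.start),                # repeat
             (s, EPS, f)]                            # skip (empty string)
    return NFA(s, f, x.edges + extra)

def eps_closure(nfa, states):
    """All states reachable from states using only epsilon transitions."""
    stack, seen = list(states), set(states)
    while stack:
        state = stack.pop()
        for src, label, dst in nfa.edges:
            if src == state and label is EPS and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def accepts(nfa, text):
    """Simulate the NFA by tracking the set of states it could be in."""
    current = eps_closure(nfa, {nfa.start})
    for ch in text:
        moved = {dst for src, label, dst in nfa.edges if src in current and label == ch}
        current = eps_closure(nfa, moved)
    return nfa.final in current

a_or_b_star = star(union(symbol("a"), symbol("b")))   # case 5: (a+b)*
print(accepts(a_or_b_star, ""), accepts(a_or_b_star, "abba"), accepts(a_or_b_star, "abc"))
```

The simulation has to keep a whole set of possible states at every step, which is the cost that converting the NFA to a DFA removes.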
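Why NFA to DFA? Next…
• NFA to DFA conversion, because
  • The scanner generator takes Regular Expressions
  • It converts each Regular Expression to an NFA
  • It then converts the NFA to a DFA
  • After that, the Scanner is able to differentiate between valid and invalid tokens: a DFA has exactly one next state for each state and input character, so the Scanner can follow it directly without trying alternative paths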
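The conversion itself is left for the next lecture, but a preview may help. The sketch below applies the standard subset construction to a small hand-written ϵ-NFA for a*; the state numbers, the alphabet {a, b} and all names are assumptions made for illustration, not taken from the slides.

```python
EPS = None   # label used for epsilon transitions

# Hand-written epsilon-NFA for a*:
# 0 -eps-> 1 -a-> 2 -eps-> 3, plus 2 -eps-> 1 (repeat) and 0 -eps-> 3 (skip)
NFA_EDGES = [(0, EPS, 1), (1, "a", 2), (2, EPS, 3), (2, EPS, 1), (0, EPS, 3)]
NFA_START, NFA_FINAL = 0, 3
ALPHABET = "ab"

def eps_closure(states):
    """All NFA states reachable from states using only epsilon transitions."""
    stack, seen = list(states), set(states)
    while stack:
        state = stack.pop()
        for src, label, dst in NFA_EDGES:
            if src == state and label is EPS and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return frozenset(seen)

def move(states, ch):
    """NFA states reachable from states by reading the character ch."""
    return {dst for src, label, dst in NFA_EDGES if src in states and label == ch}

def subset_construction():
    """Build a DFA whose states are sets of NFA states."""
    start = eps_closure({NFA_START})
    table, todo = {}, [start]
    while todo:
        current = todo.pop()
        if current in table:
            continue
        table[current] = {}
        for ch in ALPHABET:
            target = eps_closure(move(current, ch))
            table[current][ch] = target
            todo.append(target)
    return start, table

def dfa_accepts(start, table, text):
    """Follow exactly one transition per character (text must use ALPHABET only)."""
    state = start
    for ch in text:
        state = table[state][ch]
    return NFA_FINAL in state      # accepting if the set contains the NFA's final state

START, TABLE = subset_construction()
for dfa_state, row in TABLE.items():
    print(sorted(dfa_state), {ch: sorted(target) for ch, target in row.items()})
print(dfa_accepts(START, TABLE, "aaa"), dfa_accepts(START, TABLE, "ab"))
```

For this NFA the construction yields three DFA states, one of which is the empty set and acts as the rejecting dead state. Once the table exists, scanning a string takes one table lookup per character.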
