Compiler Lecture 7
Objectives:
⮚ Understand the basic concept of a regular expression
⮚ Understand the algorithm for converting a regular expression to an NFA
Outcome:
⮚ Students should be able to design a nondeterministic finite automaton (NFA) from a
regular expression.
⮚ Students should know the applications of regular expressions.
Regular Expression
Definition: A regular expression is a sequence of symbols and characters expressing a
string or pattern to be searched for within a longer piece of text.
In other words, a regular expression is a notation used in programming for pattern
matching. Regular expressions provide a flexible and concise means of matching strings
of text.
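A minimal sketch of the "pattern searched for within a longer piece of text" idea, using Python's standard re module. The sample text and the pattern are made up for illustration and are not part of the lecture.

```python
import re

# Hypothetical sample text (an assumption of this sketch).
text = "error at line 42, warning at line 7, error at line 108"

# The pattern describes a set of strings: the literal prefix
# "error at line " followed by one or more decimal digits.
pattern = re.compile(r"error at line [0-9]+")

# findall returns every non-overlapping substring of `text`
# that belongs to the set described by the pattern.
matches = pattern.findall(text)
print(matches)  # ['error at line 42', 'error at line 108']
```

The pattern is much shorter than a hand-written scanning loop would be, which is the sense in which regular expressions are concise.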
Regular expressions are built recursively out of smaller regular expressions, using a
set of defining rules.
Each regular expression r denotes a language L(r), which is also defined recursively from
the languages denoted by r's subexpressions.
Here are the rules that define the regular expressions over some alphabet Σ and the
languages that those expressions denote.
⮚ Basis
⮚ Induction
⮚ Precedence
Rules of Regular Expression
BASIS: There are two rules that form the basis.
⮚ ε is a regular expression, and L(ε) is {ε}, that is, the language whose sole
member is the empty string.
⮚ If a is a symbol in Σ, then a is a regular expression, and L(a) = {a}, that is, the
language with one string, of length one, with a in its one position. Here, italics
are used for symbols and boldface for their corresponding regular expressions.
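INDUCTION: There are four parts to the induction. Suppose r and s are regular
expressions denoting languages L(r) and L(s), respectively.
⮚ (r)|(s) is a regular expression denoting the language L(r) ∪ L(s).
⮚ (r)(s) is a regular expression denoting the language L(r)L(s).
⮚ (r)* is a regular expression denoting (L(r))*.
⮚ (r) is a regular expression denoting L(r). This last rule says that we can add
additional pairs of parentheses around expressions without changing the
language they denote.
PRECEDENCE: To drop unnecessary parentheses, the unary operator * has the highest
precedence, concatenation has the second highest, and | has the lowest. All three
are left associative.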
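The recursive definition of L(r) can be sketched as a small Python function. This sketch assumes the usual induction rules for union, concatenation, and closure alongside the basis rules. The tuple encoding ("eps", "sym", "union", "concat", "star") and the function name language are hypothetical choices made for illustration. Because most of these languages are infinite, the function only returns the strings of length at most max_len.

```python
def language(r, max_len):
    """Return the strings of L(r) whose length is at most max_len (max_len >= 0)."""
    kind = r[0]
    if kind == "eps":       # basis: L(ε) = {ε}
        return {""}
    if kind == "sym":       # basis: L(a) = {a}
        return {r[1]} if max_len >= 1 else set()
    if kind == "union":     # (r)|(s) denotes L(r) ∪ L(s)
        return language(r[1], max_len) | language(r[2], max_len)
    if kind == "concat":    # (r)(s) denotes L(r)L(s)
        left = language(r[1], max_len)
        right = language(r[2], max_len)
        return {x + y for x in left for y in right if len(x + y) <= max_len}
    if kind == "star":      # (r)* denotes (L(r))*
        base = language(r[1], max_len)
        result = {""}       # zero repetitions give the empty string
        frontier = {""}
        while frontier:     # keep appending until no new short string appears
            frontier = {x + y for x in frontier for y in base
                        if len(x + y) <= max_len} - result
            result |= frontier
        return result
    raise ValueError(f"unknown node kind: {kind}")


# a|a*b, written with explicit structure: union(a, concat(star(a), b))
expr = ("union", ("sym", "a"),
        ("concat", ("star", ("sym", "a")), ("sym", "b")))
print(sorted(language(expr, 3), key=lambda s: (len(s), s)))
# ['a', 'b', 'ab', 'aab']
```

Parentheses need no case of their own here, because the nesting of the tuples already fixes the grouping.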
Examples of Regular Expressions
Let Σ = {a, b}.
⮚ 2. (a|b)(a|b) denotes {aa, ab, ba, bb}, the language of all strings of length
two over the alphabet Σ.
⮚ 3. a* denotes the language consisting of all strings of zero or more a's, that is,
{ε, a, aa, aaa, ...}.
⮚ 4. (a|b)* denotes the set of all strings consisting of zero or more instances of
a or b, that is, all strings of a's and b's: {ε, a, b, aa, ab, ba, bb, aaa, ...}.
⮚ 5. a|a*b denotes the language {a, b, ab, aab, aaab, ...}, that is, the string a and
all strings consisting of zero or more a's and ending in b.
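The examples above can be checked mechanically. The sketch below assumes that Python's re syntax agrees with the lecture's notation for |, *, and parentheses, which holds for these simple patterns. It enumerates every string over {a, b} up to a small length and keeps those that match the whole pattern. The helper names strings_up_to and members are hypothetical.

```python
import re
from itertools import product


def strings_up_to(alphabet, max_len):
    """Yield all strings over `alphabet` of length 0..max_len, shortest first."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def members(pattern, alphabet="ab", max_len=3):
    """List the strings of length <= max_len that the pattern matches entirely."""
    return [s for s in strings_up_to(alphabet, max_len)
            if re.fullmatch(pattern, s)]


print(members("(a|b)(a|b)"))  # ['aa', 'ab', 'ba', 'bb']
print(members("a*"))          # ['', 'a', 'aa', 'aaa']
print(members("a|a*b"))       # ['a', 'b', 'ab', 'aab']
```

re.fullmatch is used rather than re.search because membership in L(r) requires the whole string to match, not just a substring.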
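Operations of a Regular Expression
The operations on languages that regular expressions build on are:
⮚ Union of L and M: L ∪ M = {s | s is in L or s is in M}
⮚ Concatenation of L and M: LM = {st | s is in L and t is in M}
⮚ Kleene closure of L: L* = L^0 ∪ L^1 ∪ L^2 ∪ ..., where L^0 = {ε}
⮚ Positive closure of L: L+ = L^1 ∪ L^2 ∪ ...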
Example: a (b | c)*d
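As a rough illustration, the standard language operations (union, concatenation, and closure) can be modelled with Python sets of strings and then applied to the example a(b|c)*d. The function names are hypothetical. Since a closure is an infinite set, kleene_closure here is only a finite approximation that stops at a chosen power.

```python
def union(L, M):
    """L ∪ M: strings that are in L or in M."""
    return L | M


def concat(L, M):
    """LM: every string of L followed by every string of M."""
    return {s + t for s in L for t in M}


def power(L, i):
    """L^i: L concatenated with itself i times, with L^0 = {ε}."""
    result = {""}
    for _ in range(i):
        result = concat(result, L)
    return result


def kleene_closure(L, max_power):
    """L^0 ∪ L^1 ∪ ... ∪ L^max_power, a finite approximation of L*."""
    result = set()
    for i in range(max_power + 1):
        result |= power(L, i)
    return result


# a(b|c)*d, with the closure cut off after two repetitions
b_or_c = union({"b"}, {"c"})
example = concat(concat({"a"}, kleene_closure(b_or_c, 2)), {"d"})
print(sorted(example, key=lambda s: (len(s), s)))
# ['ad', 'abd', 'acd', 'abbd', 'abcd', 'acbd', 'accd']
```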
Regular Expression To NFA
Using Thompson's construction, draw the NFA for the following regular
expression:
Example: a (b | c)*d
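A minimal sketch of Thompson's construction for this example, written in Python. The class NFA, its method names, and the use of None as the ε label are assumptions of this sketch. One deliberate simplification: concatenation links the two fragments with an ε-edge instead of merging the accept state of the first with the start state of the second, which accepts the same language with one extra edge.

```python
from itertools import count

EPS = None  # label used for ε-transitions (a choice made for this sketch)


class NFA:
    def __init__(self):
        self.edges = {}      # state -> list of (label, target state)
        self._ids = count()  # generator of fresh state numbers

    def new_state(self):
        s = next(self._ids)
        self.edges[s] = []
        return s

    def add_edge(self, src, label, dst):
        self.edges[src].append((label, dst))

    # Every builder returns a fragment (start, accept).
    def symbol(self, a):
        s, f = self.new_state(), self.new_state()
        self.add_edge(s, a, f)
        return s, f

    def concat(self, x, y):
        self.add_edge(x[1], EPS, y[0])  # ε-link instead of merging states
        return x[0], y[1]

    def union(self, x, y):
        s, f = self.new_state(), self.new_state()
        self.add_edge(s, EPS, x[0])
        self.add_edge(s, EPS, y[0])
        self.add_edge(x[1], EPS, f)
        self.add_edge(y[1], EPS, f)
        return s, f

    def star(self, x):
        s, f = self.new_state(), self.new_state()
        self.add_edge(s, EPS, x[0])     # enter the loop body
        self.add_edge(s, EPS, f)        # or skip it (zero repetitions)
        self.add_edge(x[1], EPS, x[0])  # repeat the body
        self.add_edge(x[1], EPS, f)     # or leave the loop
        return s, f

    def eps_closure(self, states):
        """All states reachable from `states` using ε-edges only."""
        stack, seen = list(states), set(states)
        while stack:
            q = stack.pop()
            for label, t in self.edges[q]:
                if label is EPS and t not in seen:
                    seen.add(t)
                    stack.append(t)
        return seen

    def accepts(self, frag, text):
        """Simulate the NFA on `text` by tracking the set of current states."""
        current = self.eps_closure({frag[0]})
        for ch in text:
            moved = {t for q in current
                     for label, t in self.edges[q] if label == ch}
            current = self.eps_closure(moved)
        return frag[1] in current


# Build the NFA for a(b|c)*d from the inside out.
nfa = NFA()
a, b, c, d = (nfa.symbol(ch) for ch in "abcd")
frag = nfa.concat(nfa.concat(a, nfa.star(nfa.union(b, c))), d)

print(nfa.accepts(frag, "ad"))     # True
print(nfa.accepts(frag, "abcbd"))  # True
print(nfa.accepts(frag, "abdd"))   # False
```

Each operator adds at most two new states, so the number of states grows linearly with the length of the regular expression.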
Class Exercises
1. (a|b)*abc
2. (ab|bc(ab|c)*)*