
Lec 04-Regular Expression

The document discusses regular expressions and languages. It provides examples of regular expressions to define various languages, such as the language of strings with an even number of a's and b's (EVEN-EVEN). It also describes algorithms for determining if a string belongs to the EVEN-EVEN language by tracking the parity of a's and b's.

Uploaded by

Rooni Khan

Theory of Automata

Example

• The following equivalences show that we should not treat expressions as algebraic polynomials:

(a + b)* = (a + b)* + (a + b)*
(a + b)* = (a + b)* + a*
(a + b)* = (a + b)*(a + b)*
(a + b)* = a(a + b)* + b(a + b)* + Λ
(a + b)* = (a + b)*ab(a + b)* + b*a*

• The last equivalence may need some explanation:

– The first term on the right-hand side, (a + b)*ab(a + b)*, describes all the words that contain the substring ab.

– The second term, b*a*, describes all the words that do not contain the substring ab (i.e., all a’s, all b’s, Λ, or some b’s followed by some a’s).

– Every word either contains the substring ab or does not, so the two terms together cover (a + b)*.
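The last equivalence can be sanity-checked by brute force. The sketch below is an illustration, not part of the lecture: it assumes Python's `re` module, writes the union + as `|`, and compares the two terms only on words up to a chosen length, so it gives evidence rather than a proof. The helper name `all_words` and the bound of 8 are arbitrary choices.

```python
import re
from itertools import product

def all_words(max_len, alphabet="ab"):
    """Yield every string over the alphabet with length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

# The two terms of the right-hand side: (a + b)*ab(a + b)* and b*a*
contains_ab = re.compile(r"(?:a|b)*ab(?:a|b)*")
no_ab = re.compile(r"b*a*")

for w in all_words(8):
    in_first = contains_ab.fullmatch(w) is not None
    in_second = no_ab.fullmatch(w) is not None
    # Every word lies in exactly one of the two terms.
    assert in_first != in_second, w
    # The first term is exactly "contains the substring ab".
    assert in_first == ("ab" in w), w

print("the two terms split every word up to length 8")
```

Because each word falls into exactly one term, their union is every string of a's and b's up to the tested length, which is what (a + b)* describes.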

Example

• Let V be the language of all strings of a’s and b’s in which either the strings are all b’s, or else an a followed by some b’s (possibly none). Let V also contain the word Λ. Hence,
V = {Λ, a, b, ab, bb, abb, bbb, abbb, bbbb, …}
• We can define V by the expression
b* + ab*
where Λ is included in b*.
• Alternatively, we could define V by
(Λ + a)b*
which means that in front of the string of some b’s, we have
either an a or nothing.
Example contd.

• Hence,
(Λ + a)b* = b* + ab*

• Since b* = Λb*, we have
(Λ + a)b* = Λb* + ab*
which appears to be the distributive law at work.

• However, we must be extremely careful in applying the distributive law. Sometimes, it is difficult to determine if the law is applicable.
Product Set

• If S and T are sets of strings of letters (whether they are finite or infinite sets), we define the product set of strings of letters to be

ST = {all combinations of a string from S concatenated with a string from T in that order}

Example

• If S = {a, aa, aaa} and T = {bb, bbb} then

ST = {abb, abbb, aabb, aabbb, aaabb, aaabbb}

• Note that the words are not listed in lexicographic order.

• Using regular expressions, we can write this example as

(a + aa + aaa)(bb + bbb)
= abb + abbb + aabb + aabbb + aaabb + aaabbb
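For finite sets, the product set can be computed directly. The following is a minimal sketch, assuming Python sets of strings stand in for S and T; the function name `product_set` is made up for this illustration, and the approach does not extend to infinite sets.

```python
def product_set(S, T):
    """Return ST: every string of S followed by every string of T."""
    return {s + t for s in S for t in T}

S = {"a", "aa", "aaa"}
T = {"bb", "bbb"}
ST = product_set(S, T)

# Sort by length, then alphabetically, only to make the printout stable.
print(sorted(ST, key=lambda w: (len(w), w)))
# ['abb', 'aabb', 'abbb', 'aaabb', 'aabbb', 'aaabbb']
```

The order of the factors matters: `product_set(T, S)` puts the b's first and yields a different set.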

Example

• If M = {λ, x, xx} and N = {λ, y, yy, yyy, yyyy, …}, then
MN = {λ, y, yy, yyy, yyyy, …, x, xy, xyy, xyyy, xyyyy, …, xx, xxy, xxyy, xxyyy, xxyyyy, …}

• Using regular expressions:

(λ + x + xx)(y*) = y* + xy* + xxy*


Languages Associated with Regular Expressions
Definition

• The following rules define the language associated with any regular expression:

• Rule 1: The language associated with the regular expression that is just a single letter is that one-letter word alone, and the language associated with λ is just {λ}, a one-word language.

• Rule 2: If r1 is a regular expression associated with the language L1 and r2 is a regular expression associated with the language L2, then:
(i) The regular expression (r1)(r2) is associated with the product L1L2, that is, the language L1 times the language L2:

language(r1r2) = L1L2
Definition contd.

• Rule 2 (cont.):

(ii) The regular expression r1 + r2 is associated with the language formed by the union of L1 and L2:
language(r1 + r2) = L1 + L2

(iii) The language associated with the regular expression (r1)* is L1*, the Kleene closure of the set L1 as a set of words:
language(r1*) = L1*
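Rules 1 and 2 translate almost line by line into a recursive function. The sketch below is one possible encoding, not the lecture's: a regular expression is assumed to be either a one-letter string, the empty string `""` standing for λ, or a tuple `("cat", r1, r2)`, `("union", r1, r2)` or `("star", r1)`. Since a language may be infinite, the hypothetical function `language` returns only the words of length at most `max_len`.

```python
def language(r, max_len):
    """Words of length <= max_len in the language associated with r."""
    if isinstance(r, str):
        # Rule 1: a single letter, or "" standing for lambda.
        return {r} if len(r) <= max_len else set()
    op = r[0]
    if op == "cat":
        # Rule 2(i): product of the two languages.
        L1, L2 = language(r[1], max_len), language(r[2], max_len)
        return {u + v for u in L1 for v in L2 if len(u + v) <= max_len}
    if op == "union":
        # Rule 2(ii): union of the two languages.
        return language(r[1], max_len) | language(r[2], max_len)
    if op == "star":
        # Rule 2(iii): Kleene closure, built up until nothing new appears.
        L1 = language(r[1], max_len)
        closure = {""}
        while True:
            bigger = closure | {u + v for u in closure for v in L1
                                if len(u + v) <= max_len}
            if bigger == closure:
                return closure
            closure = bigger
    raise ValueError(f"unknown operator: {op!r}")

# (lambda + a)b*  versus  b* + ab*
left = ("cat", ("union", "", "a"), ("star", "b"))
right = ("union", ("star", "b"), ("cat", "a", ("star", "b")))
print(sorted(language(left, 3), key=lambda w: (len(w), w)))
print(language(left, 6) == language(right, 6))
```

Comparing two expressions this way only tests agreement up to the chosen length; it can refute an equivalence but not prove one.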
Finite Languages Are Regular
Theorem 5

• If L is a finite language (a language with only finitely many words), then L can be defined by a regular expression. In other words, all finite languages are regular.

• Proof

• Let L be a finite language. To make one regular expression that defines L, we turn all the words in L into boldface type and insert plus signs between them.

• For example, the regular expression that defines the language L = {baa, abbba, bababa} is baa + abbba + bababa

• This algorithm only works for finite languages because an infinite language would become a regular expression that is infinitely long, which is forbidden.
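The construction in the proof is mechanical enough to write down. This is a minimal sketch, assuming Python's `re` syntax (where union is `|` rather than +) and a non-empty set of words; the function name `finite_language_regex` is invented for this example.

```python
import re

def finite_language_regex(words):
    """Join the words with the union operator (| in Python's syntax)."""
    # re.escape guards against letters that are special in Python regexes.
    return "|".join(re.escape(w) for w in sorted(words))

L = {"baa", "abbba", "bababa"}
pattern = finite_language_regex(L)
print(pattern)  # abbba|baa|bababa

matcher = re.compile(pattern)
print(all(matcher.fullmatch(w) for w in L))  # True
print(matcher.fullmatch("ab") is None)       # True
```

The words are sorted only so that the output is reproducible; any order of the terms defines the same language.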

How Hard It Is To Understand A Regular Expression
Let us examine some regular expressions and see if we could understand something about the languages they represent.
Example

• Consider the expression

(a + b)*(aa + bb)(a + b)* = (arbitrary)(double letter)(arbitrary)

• This is the set of strings of a’s and b’s that at some point contain a double letter.

• Let us ask, “What strings do not contain a double letter?” Some examples are λ, a, b, ab, ba, aba, bab, abab, baba, …
Example contd.

• The expression (ab)* covers all of these except those that begin with b or end with a. Adding these choices gives us the expression:

(λ + b)(ab)*(λ + a)

• Combining the two expressions gives us one that defines the set of all strings:
(a + b)*(aa + bb)(a + b)* + (λ + b)(ab)*(λ + a)
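Both halves of this expression can be compared against a direct test for a repeated letter. The sketch below is an illustration under assumptions: it uses Python's `re` syntax, with λ written as an empty alternative, and checks only words of length up to 8. The helper `has_double_letter` is not from the lecture.

```python
import re
from itertools import product

# (a + b)*(aa + bb)(a + b)*: words that contain a double letter
double = re.compile(r"(?:a|b)*(?:aa|bb)(?:a|b)*")
# (lambda + b)(ab)*(lambda + a): words without a double letter
no_double = re.compile(r"(?:|b)(?:ab)*(?:|a)")

def has_double_letter(w):
    """Direct check: some letter is immediately repeated."""
    return any(x == y for x, y in zip(w, w[1:]))

for n in range(9):
    for letters in product("ab", repeat=n):
        w = "".join(letters)
        assert (double.fullmatch(w) is not None) == has_double_letter(w), w
        assert (no_double.fullmatch(w) is not None) == (not has_double_letter(w)), w

print("the two expressions are complementary up to length 8")
```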
Examples

• Note that
(a + b*)* = (a + b)*
since the internal * adds nothing to the language. However,

(aa + ab*)* ≠ (aa + ab)*

since the language on the left includes the word abbabb, whereas the language on the right does not. (The language on the right cannot contain any word with a double b.)
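The counterexample word can be tested directly. A small sketch, assuming Python's `re` syntax with `|` for union; the variable names `left` and `right` simply mirror the two sides of the inequality.

```python
import re

left = re.compile(r"(?:aa|ab*)*")    # (aa + ab*)*
right = re.compile(r"(?:aa|ab)*")    # (aa + ab)*

print(left.fullmatch("abbabb") is not None)   # True
print(right.fullmatch("abbabb") is not None)  # False
```

On the left, abbabb splits as (abb)(abb), each factor being of the form ab*. On the right, every b must come directly after an a, so no factorization exists.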
Example

• Consider the regular expression (a*b*)*.

• The language defined by this expression is all strings that can be made up of factors of the form a*b*.

• Since both the single letter a and the single letter b are words of the form a*b*, this language contains all strings of a’s and b’s. That is,
(a*b*)* = (a + b)*

• This equation casts serious doubt on the possibility of finding a set of algebraic rules to reduce one regular expression to another equivalent one.
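To see concretely why every string lies in (a*b*)*, one can split a word into factors of the form a*b*. The lecture's argument uses single-letter factors; the sketch below instead takes the longest factor at each step, which is an arbitrary choice made for this illustration. It assumes the input uses only the letters a and b, and the function name `factor` is hypothetical.

```python
import re

def factor(word):
    """Split a word over {a, b} into non-empty factors of the form a*b*."""
    # finditer also reports an empty match at the end; drop it.
    return [m.group() for m in re.finditer(r"a*b*", word) if m.group()]

print(factor("abba"))   # ['abb', 'a']
print(factor("bbaab"))  # ['bb', 'aab']
```

Concatenating the factors gives back the original word, so each word is a product of words from the language of a*b*.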
Introducing EVEN-EVEN

• Consider the regular expression

E = [aa + bb + (ab + ba)(aa + bb)*(ab + ba)]*

• This expression represents all the words that are made up of syllables of three types:
type1 = aa
type2 = bb
type3 = (ab + ba)(aa + bb)*(ab + ba)

• Every word of the language defined by E contains an even number of a’s and an even number of b’s.

• Conversely, all strings with an even number of a’s and an even number of b’s belong to the language defined by E.
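Both claims about E can be checked on short words by comparing the expression with a direct count of the letters. This is a sketch under assumptions: E is transcribed into Python's `re` syntax, the helper `even_even` is invented here, and the check covers only words of length up to 10.

```python
import re
from itertools import product

# E = [aa + bb + (ab + ba)(aa + bb)*(ab + ba)]*
E = re.compile(r"(?:aa|bb|(?:ab|ba)(?:aa|bb)*(?:ab|ba))*")

def even_even(w):
    """True when the word has an even number of a's and of b's."""
    return w.count("a") % 2 == 0 and w.count("b") % 2 == 0

for n in range(11):
    for letters in product("ab", repeat=n):
        w = "".join(letters)
        assert (E.fullmatch(w) is not None) == even_even(w), w

print("E agrees with the letter counts up to length 10")
```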

Algorithms for EVEN-EVEN

• We want to determine whether a long string of a’s and b’s has the property that the number of a’s is even and the number of b’s is even.

• Algorithm 1: Keep two binary flags, the a-flag and the b-flag. Every time an a is read, the a-flag is reversed (0 to 1, or 1 to 0); and every time a b is read, the b-flag is reversed. We start both flags at 0 and check that they are both 0 at the end.

• Algorithm 2: Keep only one binary flag, called the type3-flag. We read the letters in two at a time. If they are the same, we do not touch the type3-flag, since we have a factor of type1 or type2. If, however, the two letters do not match, we reverse the type3-flag. If the flag starts at 0 and is also 0 at the end, then the input string contains an even number of a’s and an even number of b’s. (A string of odd length cannot be read completely in pairs, and it cannot have both counts even.)

• If the input string is

(aa)(ab)(bb)(ba)(ab)(bb)(bb)(bb)(ab)(ab)(bb)(ba)(aa)

then, by Algorithm 2, the type3-flag is reversed 6 times and ends at 0.
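Both algorithms are short enough to sketch in code. The function names below are made up for this illustration, and rejecting odd-length input in the second function is an assumption added here, since such a word cannot be read entirely in pairs.

```python
def even_even_two_flags(word):
    """Algorithm 1: one parity flag per letter."""
    a_flag = b_flag = 0
    for letter in word:
        if letter == "a":
            a_flag ^= 1
        elif letter == "b":
            b_flag ^= 1
        else:
            raise ValueError(f"unexpected letter: {letter!r}")
    return a_flag == 0 and b_flag == 0

def even_even_type3_flag(word):
    """Algorithm 2: read two letters at a time, flip on unequal pairs.

    Returns (accepted, number_of_flips).
    """
    if len(word) % 2 == 1:
        # Assumption: an odd-length word cannot be split into pairs, so reject.
        return False, 0
    flag = flips = 0
    for i in range(0, len(word), 2):
        if word[i] != word[i + 1]:
            flag ^= 1
            flips += 1
    return flag == 0, flips

pairs = "aa ab bb ba ab bb bb bb ab ab bb ba aa".split()
word = "".join(pairs)
print(even_even_two_flags(word))    # True
print(even_even_type3_flag(word))   # (True, 6)
```

The second function returns the flip count alongside the verdict only so that the figure of 6 reversals from the example can be reproduced.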

• We give this language the name EVEN-EVEN. So,
EVEN-EVEN = {λ, aa, bb, aaaa, aabb, abab, abba, baab, baba, bbaa, bbbb, aaaaaa, aaaabb, aaabab, …}
